Amping up algorithmic transparency to govern online platform power
The algorithms driving social media platforms are some of the most powerful on the planet, shaping what billions of people read and watch on a daily basis. Understandably, those algorithms are under increasing scrutiny, including from Congress as it looks for ways to rein in that power.
On Tuesday, the Senate Judiciary Committee convened to ask platform representatives some pointed questions about harmful viral content, polarization and the incentives driving this algorithmically governed media world we all now inhabit.
Defending their platforms, representatives from Twitter, Facebook and YouTube coalesced around a couple of common retorts to deflect the pressure: user control and platform transparency.
YouTube’s Alexandra Veitch reminded the panel that the platform offers an off switch for autoplay, implying that users could just flip off the algorithm if they thought
it was causing trouble. Facebook’s Monika Bickert pointed to the “Most Recent” and “Favorites” options for sorting the News Feed. And Twitter’s Lauren Culbertson offered the “sparkle icon,” launched in 2018, which allows users to view an old-fashioned chronological timeline.
With platforms touting the control they offer to users, you might wonder what exactly happens when people do decide to turn off the algorithms.
A new empirical study we just published provides some insight into how algorithms can change what we see on social media platforms, specifically on Twitter. In our study, we found that Twitter’s timeline algorithm shifted exposure to different types of content, different topics and different accounts. In the chronological timeline, 51 percent of tweets contained an external link. But this rate fell to 18 percent in the algorithmic timeline. So, if you use Twitter to find news articles and other content from across the internet, you may want to click the sparkle icon to turn off the algorithm.
Similarly, you may want to switch off the algorithm if you prefer to see tweets only from accounts you actually follow. On average, we found that “suggested tweets” made up 55 percent of all tweets in algorithmic timelines. This significantly increased the number of accounts appearing in the timeline, from an average of 663 in chronological timelines to 1,169 in algorithmic timelines. Other notable impacts of the algorithm included less exposure to tweets with COVID-19 health information (e.g., “If you have diabetes, you are at higher risk…”) and a slight partisan echo chamber effect: both left-leaning and right-leaning accounts saw more tweets from like-minded accounts and fewer tweets from bipartisan accounts.
To test Twitter’s algorithm, we had to create a group of “sock puppet” accounts – bots that mimicked real users’ following patterns – that checked their timelines twice per day. Each time, they collected the first 50 tweets in their algorithmic “Top Tweets” timeline and the first 50 tweets in the chronological “Latest Tweets” timeline. We used that data to characterize just how the Twitter algorithm was shifting the content users were exposed to.
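To make that kind of measurement concrete, here is a minimal Python sketch of the exposure comparison described above. The tweet records and field names below are illustrative assumptions, not the study’s actual data or pipeline; they stand in for whatever a collected timeline snapshot would contain.

```python
def exposure_metrics(timeline):
    """Summarize one timeline snapshot: a list of simplified tweet records."""
    n = len(timeline)
    # Share of tweets containing an external link
    link_rate = sum(t["has_external_link"] for t in timeline) / n
    # Share of "suggested" tweets, i.e. tweets from accounts the user does not follow
    suggested_rate = sum(not t["from_followed_account"] for t in timeline) / n
    # Number of distinct accounts appearing in the timeline
    unique_accounts = len({t["author"] for t in timeline})
    return {"link_rate": link_rate,
            "suggested_rate": suggested_rate,
            "unique_accounts": unique_accounts}

# Toy snapshots standing in for one chronological and one algorithmic timeline
chronological = [
    {"author": "a", "has_external_link": True,  "from_followed_account": True},
    {"author": "b", "has_external_link": True,  "from_followed_account": True},
    {"author": "c", "has_external_link": False, "from_followed_account": True},
    {"author": "a", "has_external_link": True,  "from_followed_account": True},
]
algorithmic = [
    {"author": "a", "has_external_link": False, "from_followed_account": True},
    {"author": "x", "has_external_link": False, "from_followed_account": False},
    {"author": "y", "has_external_link": True,  "from_followed_account": False},
    {"author": "b", "has_external_link": False, "from_followed_account": True},
]

print(exposure_metrics(chronological))
print(exposure_metrics(algorithmic))
```

Aggregating these per-snapshot summaries over many accounts and many days is what yields population-level figures like the 51 percent versus 18 percent link rates reported above.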
Which brings us to the second argument platforms often come back to: transparency. But if the platforms are so transparent, why is it that academic researchers like us have to rig up laborious experiments to externally test their algorithms?
To their credit, platforms have generally come a long way in offering transparency about their algorithms and content moderation. But there’s still a huge gap. Social media platforms still do not provide meaningful transparency about how their distribution algorithms shape the texture of content people see. In some ways, the emphasis on user control mistakenly shifts the responsibility to end users. As Sen. Jon Ossoff (D-Ga.) noted, the power dynamics of platforms are severely tilted, and platforms still dominate.
Platforms know that the real power lever is not a sparkle icon or an on/off switch, but the algorithm itself. This is why Facebook and Twitter consistently announce structural changes to their algorithms: Just last week, Facebook turned down the dial for hateful content in anticipation of the Derek Chauvin verdict. Similarly, Twitter turned off many algorithmic features in the lead-up to the U.S. presidential election.
It is worth commending platforms for providing transparency in these situations, but the public still needs more information. Three easy wins would involve transparency around (1) algorithmic exposure metrics, (2) metrics about user controls and (3) on-platform optimization.
With regard to algorithmic exposure metrics, Sen. Chris Coons (D-Del.) noted that YouTube already collects metrics about how many times YouTube’s algorithm recommends videos. YouTube and other platforms could share these metrics with the public (particularly for videos found to violate content policies). In the past, some platforms have claimed that sharing such metrics could violate user privacy, but Facebook itself has demonstrated workarounds that protect user privacy while offering transparency and supporting academic research.
Also, if platforms point to user controls as a meaningful way of mitigating algorithmic harms, they should share metrics about those controls. How many users flip the on/off switch? What percentage of on-platform time do users spend in algorithmic feeds? For example, in 2018, YouTube Chief Product Officer Neal Mohan said the recommendation algorithm drives 70 percent of the time users spend on the platform. Has the new off switch for autoplay changed that? Our study provides only a narrow glimpse of what’s going on, but platforms have the full data to be transparent (in aggregate) about how the algorithm is shaping exposure to different types of content.
Finally, platforms could provide more transparency about on-platform optimization. There are many potential reasons that users in our study saw fewer external links. For example, Twitter may be responding to bots that share vast amounts of links. Another common theory, which repeatedly came up in the Senate hearing, is that platforms elevate content that keeps users on the platform, and thus suppress content that would take them off the platform. For example, we know that Facebook uses a metric called “L6/7,” the percentage of users who log in six of the last seven days. What are the other specific targets that platforms use for measurement and optimization?
While more drastic policy proposals may also be at hand, we believe that transparency can be one effective lever for governing platform algorithms. Algorithmic transparency policies should be developed in collaboration with the rich community of grassroots organizations and scientists who study these platforms and their associated harms, including those who study misinformation and content moderation as well as the algorithms themselves. Policies should also focus on the outcomes experienced by end users: the ways that algorithms can affect us as individuals, as communities and as a society.
Jack Bandy is a Ph.D. candidate in Northwestern’s Technology and Social Behavior program who conducts research in Nicholas Diakopoulos’s lab. Follow him on Twitter @jackbandy. Nicholas Diakopoulos is an associate professor in communication studies and computer science at Northwestern University, where he directs the Computational Journalism Lab. He is author of “Automating the News: How Algorithms are Rewriting the Media.” Follow him on Twitter @ndiakopoulos.