Twitter Algorithm Code Leaked and Released, An Explanation

Twitter’s algorithm code was recently leaked online via GitHub. The New York Times reported that parts of Twitter’s source code were publicly available on the platform before being taken down, following Twitter’s DMCA request to GitHub. The leaked information included “proprietary source code for Twitter’s platform and internal tools.”

Fevi Yu
April 5, 2023

The Algorithm and Its Components

Before the leak, Twitter, via Elon Musk, had already announced that it would open-source the code used to recommend tweets on March 31, 2023. On that Friday, Twitter released the code for the algorithm that determines which tweets appear on a user’s “For You” timeline. The company published the code on its official GitHub page, stating that the move was part of its effort to increase transparency and give developers a better understanding of how the platform operates.

The code for the algorithm includes the following components:

Different recommendation sources: These are the various sources Twitter uses to gather tweets that it thinks a user might be interested in seeing. They include accounts that the user follows, popular accounts, and tweets that are currently trending.
Machine learning model: This is the tool Twitter uses to rank the tweets it has gathered based on how relevant they are to the user. The model takes into account factors like the user’s activity on the platform, the content of the tweets, and the popularity of the tweets.
Filters: These are the final step in the algorithm and are used to remove tweets that are inappropriate, have been blocked by the user, or have already been seen by the user.

Twitter’s decision to release the algorithm code has been praised by some in the tech industry as a positive step towards greater transparency in social media. However, others have raised concerns about the potential for bad actors to use the code to exploit the platform’s vulnerabilities.

Twitter’s engineering team explained that the algorithm that determines the “top Tweets that ultimately show up on your device’s For You timeline” is “composed of many interconnected services and jobs.” The algorithm follows a three-step process: it gathers the best tweets from “different recommendation sources,” ranks them using a “machine learning model,” and filters out blocked tweets, inappropriate tweets, and posts the user has already seen.
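The three steps can be pictured with a short Python sketch. This is only an illustration of the gather, rank, and filter idea described above, not Twitter's code: the `Tweet` class, the function names, and the `score` field (a stand-in for whatever the machine learning model predicts) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    tweet_id: int
    author: str
    score: float                 # hypothetical stand-in for a model's relevance prediction
    inappropriate: bool = False  # hypothetical flag set by some upstream check


def gather_candidates(sources):
    """Step 1: merge tweets from every recommendation source, dropping duplicates."""
    seen_ids = set()
    candidates = []
    for source in sources:
        for tweet in source:
            if tweet.tweet_id not in seen_ids:
                seen_ids.add(tweet.tweet_id)
                candidates.append(tweet)
    return candidates


def rank(candidates):
    """Step 2: order candidates by score, highest first."""
    return sorted(candidates, key=lambda t: t.score, reverse=True)


def apply_filters(ranked, blocked_authors, seen_tweet_ids):
    """Step 3: drop blocked, inappropriate, or already-seen tweets."""
    return [
        t for t in ranked
        if t.author not in blocked_authors
        and not t.inappropriate
        and t.tweet_id not in seen_tweet_ids
    ]


def build_timeline(sources, blocked_authors, seen_tweet_ids):
    """Run the three steps in order and return the final list of tweets."""
    candidates = gather_candidates(sources)
    return apply_filters(rank(candidates), blocked_authors, seen_tweet_ids)
```

Calling `build_timeline` with two candidate lists, a set of blocked authors, and a set of already-seen tweet IDs returns the surviving tweets in ranked order. Filtering comes last here because the article describes it as the final step.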

Twitter also noted that the largest source of tweets is “In-Network Sources,” meaning the accounts a user follows. The top tweets from that pile are ranked based on how likely the user is to engage with each tweet’s author. For the “Out-of-Network Sources,” Twitter considers tweets that attracted engagement from the people a user follows, as well as tweets liked by people who like the same kinds of tweets as the user.
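To make the in-network and out-of-network distinction concrete, here is a minimal Python sketch. It covers only one of the out-of-network ideas mentioned above (tweets that people you follow have engaged with) and uses likes as the only engagement signal. The function name and the three dictionaries are hypothetical inputs invented for this example, not structures from Twitter's code.

```python
from collections import Counter


def split_candidates(user, follows, authors, likes):
    """Return (in_network, out_of_network) tweet IDs for one user.

    follows: user -> set of accounts that user follows   (hypothetical input)
    authors: tweet ID -> author of that tweet            (hypothetical input)
    likes:   user -> set of tweet IDs that user liked    (hypothetical input)
    """
    followed = follows.get(user, set())

    # In-network: tweets written by accounts the user follows.
    in_network = sorted(t for t, a in authors.items() if a in followed)

    # Out-of-network: tweets liked by followed accounts but written by
    # someone the user does not follow, counted by how many followees liked them.
    counts = Counter()
    for followee in followed:
        for tweet_id in likes.get(followee, set()):
            author = authors.get(tweet_id)
            if author is None or author == user or author in followed:
                continue
            counts[tweet_id] += 1

    # Most-liked first; ties are broken by tweet ID so the order is stable.
    out_of_network = sorted(counts, key=lambda t: (-counts[t], t))
    return in_network, out_of_network
```

In this sketch, a tweet liked by two followed accounts lands ahead of a tweet liked by one, which is a simple way to approximate "attracted engagement from people users follow."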

Identification Values and Categories

Once the code was public, many users pointed out some questionable considerations in Twitter’s recommendation algorithm. For instance, in the “HomeTweetTypePredicates.scala” file, users found seemingly discriminatory categories such as “author_is_elon,” “author_is_power_user,” “author_is_democrat,” and “author_is_republican.”

A Twitter engineer clarified that these identification values were “used purely for metrics collection” and to “track how often we are serving Tweets from these authors and how often their tweets are being impressed by users.” Twitter uses this information to validate that its A/B experimentation platform does not negatively impact one group over another.
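The idea of a label that is "used purely for metrics collection" can be sketched in Python as follows. This is an assumption-heavy illustration, not Twitter's implementation: the function name, the `author_has_many_followers` label, and its follower threshold are all hypothetical. The point is only that the labels feed counters and never change which tweets are shown.

```python
from collections import Counter


def record_served(timeline, predicates, bucket, counters):
    """Count served tweets per (experiment bucket, label).

    The timeline is returned untouched: labels are only observed, never used
    to reorder or remove tweets.
    """
    for tweet in timeline:
        for label, predicate in predicates.items():
            if predicate(tweet):
                counters[(bucket, label)] += 1
    return timeline


# Hypothetical label and threshold, invented purely for illustration.
predicates = {
    "author_has_many_followers": lambda tweet: tweet["author_followers"] >= 1000,
}

counters = Counter()
control_timeline = [
    {"id": 1, "author_followers": 5000},
    {"id": 2, "author_followers": 10},
]
served = record_served(control_timeline, predicates, "control", counters)
```

Comparing the counters for a "control" bucket against a "treatment" bucket would then show whether an experiment shifts how often a labeled group is served, which matches the A/B validation purpose described above.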

However, many users were still concerned about these categories, and during a Twitter Spaces audio session, Elon Musk expressed confusion about and criticism of the categories as well. Musk questioned why categories such as “Republican” and “Democrat” were included and suggested that they should not be there. He added that such categories only served to “divide people” and were “stupid embarrassing things.” Musk’s appearance on Twitter Spaces highlighted his plans to increase transparency on the platform by releasing the social media site’s code.

The publication of Twitter’s recommendation algorithm code has raised concerns over the platform’s use of seemingly discriminatory categories in its algorithm. While Twitter claims that these categories were used for metrics collection, the revelation has caused many to question the platform’s commitment to inclusion and diversity. However, Twitter’s release of the algorithm code is also a significant step towards transparency and accountability, giving users and researchers the opportunity to understand how the platform operates.

As someone who studies algorithms intensely, I found this a very interesting revelation, to say the least.

by Fevi Yu

SEO Consultant since 2008

Fevi Yu is a seasoned SEO consultant, digital agency founder, and the visionary behind the Basic Website Package—an innovative web and SEO solution crafted for business owners aiming for immediate results. She’s also the creator of the Unlimited WP Maintenance Package, which provides comprehensive WordPress support to enhance website performance and ensure long-term success. This article was written with the assistance of generative AI for enhanced clarity and precision.
