The Algorithm and Its Components
Before the leak, Twitter had already announced, via Elon Musk, that it would open-source the code used to recommend tweets on March 31, 2023. On Friday, Twitter released the code for the algorithm that determines which tweets appear on a user’s “For You” timeline. The company published the code on its official GitHub page, stating that the move was part of its effort to increase transparency and give developers a better understanding of how the platform operates.
The code for the algorithm includes the following components:
- Different recommendation sources: These are the various sources Twitter uses to gather tweets that it thinks a user might be interested in seeing. They include accounts that the user follows, popular accounts, and tweets that are currently trending.
- Machine learning model: This is the tool Twitter uses to rank the tweets it has gathered based on how relevant they are to the user. The model takes into account factors like the user’s activity on the platform, the content of the tweets, and the popularity of the tweets.
- Filters: Applied as the final step of the algorithm, these remove tweets that are inappropriate, that the user has blocked, or that the user has already seen.
Twitter’s decision to release the algorithm code has been praised by some in the tech industry as a positive step towards greater transparency in social media. However, others have raised concerns about the potential for bad actors to use the code to exploit the platform’s vulnerabilities.
Twitter’s engineering team explained that the algorithm behind the “top Tweets that ultimately show up on your device’s For You timeline” is “composed of many interconnected services and jobs.” It follows a three-step process: it gathers the best tweets from “different recommendation sources,” ranks them using a “machine learning model,” and filters out blocked tweets, inappropriate tweets, and posts the user has already seen.
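To make the three-step flow concrete, here is a minimal Python sketch of a gather, rank, and filter pipeline. It is an illustration only and is not taken from Twitter's released code: the `Tweet` class, the function names, and the hard-coded `score` field (standing in for the machine learning model's output) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    # Hypothetical record; field names are illustrative, not Twitter's.
    tweet_id: int
    author: str
    score: float  # stand-in for the ranking model's relevance score
    inappropriate: bool = False


def gather(sources):
    """Step 1: merge candidates from every source, dropping duplicates."""
    seen_ids = set()
    candidates = []
    for source in sources:
        for tweet in source:
            if tweet.tweet_id not in seen_ids:
                seen_ids.add(tweet.tweet_id)
                candidates.append(tweet)
    return candidates


def rank(candidates):
    """Step 2: order candidates by score, highest first."""
    return sorted(candidates, key=lambda t: t.score, reverse=True)


def apply_filters(ranked, blocked_authors, already_seen):
    """Step 3: drop blocked, inappropriate, or already-seen tweets."""
    return [
        t for t in ranked
        if t.author not in blocked_authors
        and not t.inappropriate
        and t.tweet_id not in already_seen
    ]


def build_timeline(sources, blocked_authors, already_seen):
    """Run the three steps in order and return the surviving tweets."""
    return apply_filters(rank(gather(sources)), blocked_authors, already_seen)


# Example run with made-up data.
in_network = [Tweet(1, "alice", 0.9), Tweet(2, "bob", 0.4)]
out_of_network = [
    Tweet(3, "carol", 0.7),
    Tweet(4, "dave", 0.8, inappropriate=True),
    Tweet(1, "alice", 0.9),  # duplicate of an in-network candidate
]
timeline = build_timeline(
    [in_network, out_of_network],
    blocked_authors={"bob"},
    already_seen={99},
)
print([t.tweet_id for t in timeline])  # [1, 3]
```

In this toy run, the duplicate tweet is dropped during gathering, the tweet from the blocked author and the inappropriate tweet are removed by the filters, and the remaining two tweets come out in score order. The real system, as Twitter describes it, spreads these steps across many services rather than a single function.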
Twitter also noted that the largest source of tweets is “In-Network Sources,” meaning accounts the user follows. The top tweets from that pool are ranked based on how likely the user is to engage with each tweet’s author. For “Out-of-Network Sources,” Twitter considers tweets that drew engagement from people the user follows, as well as tweets liked by people who like the same kinds of tweets as the user.
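The difference between the two kinds of sources can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Twitter's implementation: the dictionaries `follows`, `likes`, and `tweets_by_author` are hypothetical data structures, "engagement" is reduced to likes, and only the first out-of-network signal (engagement from followed accounts) is modeled.

```python
from collections import Counter


def in_network_candidates(user, follows, tweets_by_author):
    """Tweets written by accounts the user follows."""
    return [
        tweet
        for author in sorted(follows.get(user, set()))
        for tweet in tweets_by_author.get(author, [])
    ]


def out_of_network_candidates(user, follows, likes, tweets_by_author):
    """Tweets by unfollowed authors that followed accounts liked.

    Results are ordered by how many followed accounts liked each tweet,
    with ties broken by tweet id for a stable order.
    """
    followed = follows.get(user, set())
    author_of = {
        tweet: author
        for author, tweets in tweets_by_author.items()
        for tweet in tweets
    }
    counts = Counter()
    for account in followed:
        for tweet in likes.get(account, set()):
            author = author_of.get(tweet)
            # Skip unknown tweets, the user's own tweets, and in-network authors.
            if author is not None and author != user and author not in followed:
                counts[tweet] += 1
    return [t for t, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


# Example run with made-up data.
follows = {"u": {"alice", "bob"}}
tweets_by_author = {
    "alice": ["a1"],
    "bob": ["b1"],
    "carol": ["c1", "c2"],
    "dave": ["d1"],
}
likes = {"alice": {"c1", "b1", "d1"}, "bob": {"c1"}}

print(in_network_candidates("u", follows, tweets_by_author))
# ['a1', 'b1']
print(out_of_network_candidates("u", follows, likes, tweets_by_author))
# ['c1', 'd1']
```

Here the tweet `c1` ranks first among out-of-network candidates because two followed accounts liked it, while `b1` is excluded because its author is already followed and so belongs to the in-network pool.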