Design Twitter
Hozefa (Facebook, Wealthfront EM) answers the question, "Design Twitter"
Step 1: Define the problem
Ask clarifying questions
The problem space of designing X, formerly known as Twitter, is incredibly large. If the interviewer hasn’t provided a clear scope, ask clarifying questions to define the problem scope.
To start, I’d like to ask some clarifying questions about the key features and aspects we need to include in the design, for example:
- What is the scale of the product we are building? What’s the number of daily active users (DAUs)?
- What kind of data are we expected to support (text, images, video)?
- Are we building a web or mobile application, or both?
Define functional requirements
Based on the insights gathered from my clarifying questions, I can boil down the primary areas and features of Twitter to the following:
- User posts a tweet
- User follows/unfollows another user
- User views the user timeline (activity from the user)
- User views the newsfeed (activity from the people the user follows, whom we’ll call followees)
I’d also list some optional functional requirements and discuss their priorities with the interviewer. For example, let’s decide the following are low-priority or non-goals for this system design question.
- User searches keywords
- Edit/delete/retweet a tweet
- Add tweet tags
- Comments/shares/likes/emojis
- User account registration/login/authentication
- Notifications
Define non-functional requirements
When dealing with designs that must satisfy high scale, consider what’s most important for end users and what tradeoffs you’d make. The CAP theorem is a good place to start.
Given our functional requirements, it is safe to assume that our non-functional requirements include:
- High availability
- High scalability
- Low read latency, as our system should be read heavy
According to the CAP theorem, in a distributed system where network partitions are inevitable, choosing strong consistency means sacrificing availability, while choosing high availability means settling for eventual consistency.
Let’s say a user can tolerate a bit of latency before seeing their followees’ posts. Therefore, our system can offer eventual consistency.
Let’s also try to optimize the system for influencers (celebrity users on Twitter that have many followers). We should optimize the system so that influencers’ followers see their posts in a timely manner.
Back-of-the-envelope calculation
Assuming Twitter has 200M DAUs.
First we estimate the peak read and write query-per-second (QPS) numbers.
Average Read QPS = DAUs * average_number_of_request_per_day / 86400 seconds per day ~= 200M * 50 / 86400 ~= 100k
Peak Read QPS = 3 * Average Read QPS = 300k
Peak Write QPS = 10% Peak Read QPS = 30k
Then we work out the amount of storage needed to hold a year’s worth of data.
- Size per tweet:
- tweet_id - 8 bytes
- user_id - 8 bytes
- text - 560 bytes (twitter has a 280 character length limit for all non-subscribers, and let’s say we need two bytes to store a character without compression)
- Timestamp: 8 bytes
- Total: ~600 Bytes
- New tweet contents per day
- 200M DAU * average_number_of_tweets_per_user = 200M * 1 = 200M
- Overall storage to store the tweets per year
- 200M tweets per day * 600 B per tweet * 365 days per year ~= 44 TB
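The arithmetic above can be sketched as a quick script. All constants are the assumptions stated in the text (200M DAU, ~50 reads per user per day, peak = 3x average, writes at 10% of reads, ~600 bytes per tweet, one tweet per user per day):

```python
# Back-of-the-envelope estimates for the Twitter design above.

DAU = 200_000_000
READS_PER_USER_PER_DAY = 50
SECONDS_PER_DAY = 86_400

avg_read_qps = DAU * READS_PER_USER_PER_DAY / SECONDS_PER_DAY
peak_read_qps = 3 * avg_read_qps          # assume peak is 3x average
peak_write_qps = 0.10 * peak_read_qps     # writes are ~10% of reads

TWEET_SIZE_BYTES = 600                    # id + user_id + text + timestamp, rounded up
tweets_per_day = DAU * 1                  # assume 1 tweet per user per day
storage_per_year_tb = tweets_per_day * TWEET_SIZE_BYTES * 365 / 1e12

print(f"avg read QPS   ~= {avg_read_qps:,.0f}")   # ~116k; the text rounds to ~100k
print(f"peak read QPS  ~= {peak_read_qps:,.0f}")
print(f"peak write QPS ~= {peak_write_qps:,.0f}")
print(f"storage/year   ~= {storage_per_year_tb:.0f} TB")
```

Note that the exact average read QPS is closer to 116k; the text rounds down to 100k, which is fine for an order-of-magnitude estimate.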
For tips on estimating unknowns, check out our Estimation Strategies and Tricks lesson.
Step 2: Design a high-level system
Design your APIs
Because our system does not require the server to independently send or push data to the client, we can use standard REST APIs to facilitate communications between the client and the server.
Below we can mock some of our REST APIs mapped from our functional requirements.
Endpoints
- User relationship management
  - GET /users/{username}/followers: Get a list of followers for a user.
  - GET /users/{username}/following: Get a list of users a user is following.
  - POST /users/{username}/follow: Follow a user.
  - POST /users/{username}/unfollow: Unfollow a user.
- Tweet management
  - POST /tweets: Create a new tweet.
  - GET /tweets/{tweet_id}: Get a specific tweet.
- Timeline management
GET /users/{username}/timeline: Get the user's timeline (tweets from that particular user).
- Newsfeed management
GET /users/{username}/newsfeed: Get the user's newsfeed (tweets from all the user’s followees).
Response format
Responses should typically be in JSON format and include appropriate status codes, such as 200 for a successful request, 201 for successful creation, and 404 for not found.
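As a minimal in-memory sketch of two of these endpoint contracts (the handler names and dictionary storage are illustrative, not a real framework API; a production service would sit behind a web framework and a database):

```python
# Hypothetical in-memory handlers sketching the tweet-management endpoints.
import itertools
import json
import time

TWEETS = {}                 # tweet_id -> tweet record (stands in for the Tweets Table)
_next_id = itertools.count(1)

def post_tweet(username, text):
    """POST /tweets -- returns (status_code, json_body)."""
    if len(text) > 280:
        return 400, {"error": "tweet exceeds 280 characters"}
    tweet_id = next(_next_id)
    TWEETS[tweet_id] = {"tweet_id": tweet_id, "user": username,
                        "text": text, "created_at": time.time()}
    return 201, TWEETS[tweet_id]        # 201: successful creation

def get_tweet(tweet_id):
    """GET /tweets/{tweet_id} -- returns (status_code, json_body)."""
    tweet = TWEETS.get(tweet_id)
    return (200, tweet) if tweet else (404, {"error": "not found"})

status, body = post_tweet("alice", "hello world")
print(status, json.dumps(body)[:60])    # prints 201 and the created tweet
print(get_tweet(999_999)[0])            # prints 404
```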
Design your data model
We need to store at least two types of data to satisfy the functional requirements:
- Tweets data: Data that store the tweet text and pertinent information
- User relations: Store all the users that one user follows.
Tweet table:
We need an index on (UserID, CreatedAt) so we can efficiently fetch a given user’s most recent tweets first.
User relation table:
Consider one of our functional requirements, generating a newsfeed: we need to query the user relation table to get all the user IDs a user follows (their followees), and then join with the tweet table to fetch each followee’s recent tweets.
A natural way to implement the above data schema is to use a relational database such as MySQL, since it supports joins across structured tables.
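Here is a sketch of the two tables and the newsfeed join, using SQLite for a self-contained example in place of MySQL (table and column names are illustrative; the schema and query shape carry over):

```python
# Sketch of the Tweet and User Relation tables plus the newsfeed join.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE tweets (
    tweet_id   INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL,
    text       TEXT    NOT NULL,
    created_at INTEGER NOT NULL
);
-- supports "recent tweets for a given user"
CREATE INDEX idx_tweets_user_time ON tweets (user_id, created_at);

CREATE TABLE user_relation (
    from_user_id INTEGER NOT NULL,   -- the follower
    to_user_id   INTEGER NOT NULL,   -- the followee
    PRIMARY KEY (from_user_id, to_user_id)
);
""")

# user 1 follows users 2 and 3
db.executemany("INSERT INTO user_relation VALUES (?, ?)", [(1, 2), (1, 3)])
db.executemany("INSERT INTO tweets VALUES (?, ?, ?, ?)",
               [(10, 2, "hi from 2", 100), (11, 3, "hi from 3", 200),
                (12, 4, "not followed", 300)])

# Newsfeed for user 1: join the relation table with the tweets table.
rows = db.execute("""
    SELECT t.tweet_id, t.user_id, t.text
    FROM user_relation r JOIN tweets t ON t.user_id = r.to_user_id
    WHERE r.from_user_id = 1
    ORDER BY t.created_at DESC
    LIMIT 10
""").fetchall()
print(rows)   # [(11, 3, 'hi from 3'), (10, 2, 'hi from 2')]
```

This is exactly the join that becomes expensive at scale, which motivates the push model discussed later.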
Create a high-level design diagram
Let’s provide a high-level design that delivers the basic functional requirements.

The above diagram includes two core functionalities, one for following/unfollowing another user, and another for tweet-related services.
When a user follows/unfollows another user, the request goes through the Web Server and is routed to the User Relation Service, which inserts or deletes the corresponding (FromUserID, ToUserID) row in the User Relation Table.
The Tweet Service includes two data paths, the write path and the read path.
The write path is simple; the only use case is creating a new tweet. The client issues a write request to a web server, which forwards it to a Write API Server. The Write API Server then stores the tweet in the Tweets Table.
The read path includes two use cases.
- Timeline generation.
- The Client posts a timeline generation request to the Web Server.
- The Web Server forwards the request to the Read API server and the Timeline Generation Service inside the Tweet Service.
- The Timeline Generation Service retrieves the user’s tweets from the Tweets Table.
- Newsfeed.
- The Client posts a newsfeed generation request to the Web Server
- The Web Server forwards the request to the Read API server and the Newsfeed Generation Service inside the Tweet Service.
- The Newsfeed Generation Service retrieves the user’s followees from the User Relation Table.
- Given a list of followee IDs, the Newsfeed Generation Service then retrieves all their recent tweets from the Tweet Table.
- The Newsfeed Generation Service merges and sorts all the followees’ tweets by timestamp and returns the top K tweets.
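The merge-and-sort step above can be sketched as an n-way merge over each followee’s already-sorted tweet list (the data here is made up for illustration):

```python
# Pull-model merge step: each followee's tweets arrive sorted newest-first;
# merge the streams and keep the top K by timestamp.
import heapq

followee_tweets = {            # followee_id -> [(timestamp, tweet_id), ...] newest first
    2: [(300, "t5"), (100, "t1")],
    3: [(250, "t4"), (200, "t3")],
    4: [(150, "t2")],
}

def build_newsfeed(followee_tweets, k):
    # heapq.merge performs an n-way merge of already-sorted streams;
    # the key negates the timestamp so "newest first" sorts ascending.
    merged = heapq.merge(*followee_tweets.values(), key=lambda t: -t[0])
    return [tweet_id for _, tweet_id in list(merged)[:k]]

print(build_newsfeed(followee_tweets, 3))   # ['t5', 't4', 't3']
```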
At this point all the functional requirements should be met. Moving forward, we can identify points of weakness in the current system and how we can address them alongside our non-functional requirements.
Step 3: Deep-dive into the design
Assess tradeoffs
Below we can outline some common trade-offs within the context of the problem.
Relational vs. non-relational database
This is similar to “Design Instagram”. Please refer to its discussion.
Push vs. pull model
In our proposed high-level design, we use a pull model to construct a user’s newsfeed (similar for the timeline generation process). This makes our write process fast, as it only requires inserting a new entry in the Tweet Table.
However, this solution has a limitation. The newsfeed generation API will be called much more frequently than the tweet posting API (as our system is read heavy), so joining the User Relation and the Tweet Table will cost a huge amount of resources and incur a large latency.
One optimization we can do is to add caching layers in front of databases to speed up the join operation. And we can further cache each user’s newsfeed to speed up the read.
Another possible solution for speeding up newsfeed generation service is to use a push model. A similar idea can be applied to the timeline generation service but is omitted here for simplicity.

In a push model, we create a new Newsfeed Table to store all the tweet IDs from a particular user’s followees. Therefore, fetching a user’s Newsfeed becomes much easier. It simply fetches all the tweet IDs from the Newsfeed Table and then gets the tweet contents from the Tweets Table and returns the result to the user.
However, the write path becomes more complicated. In addition to writing to a Tweets Table, the write request also needs to populate the newly inserted tweet ID to all the followers’ entries in the Newsfeed Table. The process happens as follows:
- A user posts a new tweet.
- A new entry is created at the Tweet Table.
- Return success to the write request.
- Start the Async Fanout Service. It gets all the user’s followers from the User Relation Table. This service happens asynchronously so that it does not block the user’s further requests.
- The Async Fanout Service inserts the new tweet ID to the followers Newsfeed Table.
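The steps above can be sketched as follows. The fanout runs synchronously here for simplicity; in the actual design it runs as an async job after the write request has already returned success (all names and data structures are illustrative):

```python
# Push-model write path: persist the tweet, then fan the tweet ID out
# to every follower's precomputed newsfeed list.
from collections import defaultdict

followers = {2: [1, 3], 5: [1]}       # user_id -> follower IDs (User Relation Table)
newsfeed = defaultdict(list)          # user_id -> tweet IDs, newest first (Newsfeed Table)
tweets = {}                           # tweet_id -> record (Tweets Table)

def post_tweet(user_id, tweet_id, text):
    tweets[tweet_id] = (user_id, text)   # steps 1-2: write the Tweets Table
    fanout(user_id, tweet_id)            # steps 4-5: async in the real design
    return "ok"                          # step 3: success to the client

def fanout(user_id, tweet_id):
    # Insert the new tweet ID at the front of each follower's newsfeed.
    for follower in followers.get(user_id, []):
        newsfeed[follower].insert(0, tweet_id)

post_tweet(2, "t1", "hello")
post_tweet(5, "t2", "world")
print(newsfeed[1])   # ['t2', 't1'] -- reading user 1's feed is now a simple lookup
```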
Step 4: Identify bottlenecks and scale
Previously we discussed using a push model to improve the read latency and resource usage. But does a push model have its own limitation, compared to a pull model? The answer is yes, especially in an influencer's case.
An influencer can have a huge number of followers, so the Async Fanout Service might take a long time to populate all of their followers’ Newsfeed Tables. The followers then see an inconsistent newsfeed: some see the update much earlier than others.
A naive solution for this problem is to add more resources to speed up the Async Fanout Service. We can also filter out inactive users so their newsfeed tables don’t get populated. Beyond these, another potential solution is to use a mixed pull-and-push model.
The above table summarizes the pros and cons of the pull and push models. We can use a pull model for influencers and a push model for everyone else, giving us the best of both worlds.
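The hybrid read path can be sketched as follows: tweets from ordinary followees come from the precomputed (push) newsfeed, while influencers’ tweets are pulled at read time and merged in. The 10,000-follower cutoff and all names are assumptions for illustration:

```python
# Hybrid pull/push newsfeed read path.
INFLUENCER_THRESHOLD = 10_000   # assumed cutoff for "influencer"

def get_newsfeed(user_id, following, follower_counts,
                 pushed_feed, tweets_by_user, k=10):
    # Start from the pushed entries (written by the fanout service for
    # non-influencer followees), then pull influencers' tweets on demand.
    candidates = list(pushed_feed.get(user_id, []))
    for followee in following[user_id]:
        if follower_counts.get(followee, 0) >= INFLUENCER_THRESHOLD:
            candidates.extend(tweets_by_user.get(followee, []))
    # Each entry is (timestamp, tweet_id); newest first, top K.
    return [tid for _, tid in sorted(candidates, reverse=True)[:k]]

following = {1: [2, 9]}
follower_counts = {2: 5, 9: 2_000_000}      # user 9 is an influencer
pushed_feed = {1: [(100, "t1")]}            # fanned out from user 2
tweets_by_user = {9: [(300, "t9a"), (200, "t9b")]}
print(get_newsfeed(1, following, follower_counts, pushed_feed, tweets_by_user))
# ['t9a', 't9b', 't1']
```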
To scale up the system, we need to make sure there is no single point of failure. Here are some (but not exhaustive) optimization ideas we can discuss with the interviewer.
- Add a Load Balancer between the client and the Web Servers. The load balancer will choose the best server (in terms of availability, latency, utilization, etc.) for a given incoming request. A similar approach can be applied between the web servers and the API servers.
- Add replicas across our system’s app servers and database clusters. All writes will first go to the primary server and then will be replicated to secondary servers, either synchronously or asynchronously. This also makes the system more robust, and when failure happens, we can failover to a secondary server.
- Shard the database to even out the traffic. For example, we can shard the Tweet Table by TweetID so that tweets are evenly distributed and no single shard becomes a hotspot (caused by a heavy user). However, fetching a user’s timeline or newsfeed might then incur higher latency, because we need to query every partition to gather the recent tweets. We can mitigate this by encoding the CreatedAt timestamp into the TweetID itself. This avoids a secondary index on CreatedAt, saving write overhead, and lets us skip obsolete tweets during scans, speeding up reads as well.
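Encoding the timestamp into the tweet ID can be sketched as a Snowflake-style ID, where the high bits hold the creation time so sorting by TweetID alone gives time order. The bit widths here are illustrative, not Twitter’s actual layout:

```python
# Snowflake-style tweet ID: [timestamp_ms | shard_id | sequence].
SEQUENCE_BITS = 12     # up to 4096 IDs per millisecond per shard
SHARD_BITS = 10        # up to 1024 shards

def make_tweet_id(timestamp_ms, shard_id, sequence):
    # Timestamp occupies the high bits, so later tweets get larger IDs.
    return (timestamp_ms << (SHARD_BITS + SEQUENCE_BITS)) \
         | (shard_id << SEQUENCE_BITS) | sequence

def timestamp_of(tweet_id):
    # Recover the creation time without a separate CreatedAt index.
    return tweet_id >> (SHARD_BITS + SEQUENCE_BITS)

a = make_tweet_id(1_700_000_000_000, shard_id=3, sequence=0)
b = make_tweet_id(1_700_000_000_001, shard_id=1, sequence=0)
assert a < b                                  # later tweet -> larger ID
assert timestamp_of(a) == 1_700_000_000_000   # timestamp is recoverable
```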
Step 5: Review and summarize
The current state of our system works well as a distributed system and appropriately handles basic requests we scoped in our functional requirements.
If we have more time, we can continue to discuss other low priority functionalities such as tweet editing/deletion/retweeting, commenting, adding tags etc.
Hiring decision
- The candidate (TC) provides a clear scope of the problem and navigates the problem by considering realistic design constraints and tradeoffs.
- TC also demonstrates a logical thinking process and problem solving skills at each step of the interview and lands on a working solution.
- In most cases, TC shows knowledge of various system design technologies, evaluates the pros and cons for each and makes a sound decision based on the specific requirements of the system.
- TC is aware that the initial working solution has several drawbacks and is able to provide mitigations and optimizations.
- TC considers several corner cases of the system and is able to provide good coverage for these cases.
- TC works towards a final design with no single point of failure and improved performance. Considering these factors, I would be inclined to hire this candidate.
Exponent coach decision: Hire