"System Components
Data Collection Layer
Posts with hashtags are asynchronously sent to Kafka topics
Each message contains: hashtag, timestamp, userid, postid
Multiple Kafka partitions ensure scalability and fault tolerance
Processing Layer
Apache Flink processes streams in real time
Implements sliding window aggregation (1hr, 24hr, 7d windows)
Calculates topic popularity using weighted metrics:
Post count
User engagement (likes, comments)
Unique user"
- Usman B.
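A minimal Python sketch of the sliding-window scoring described in the answer above; it is not the Flink job itself. The weights, the `likes` and `comments` fields, and the `SlidingWindowScorer` name are assumptions made for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    hashtag: str
    timestamp: float  # seconds
    user_id: str
    post_id: str
    likes: int = 0     # assumed engagement fields
    comments: int = 0


# Hypothetical weights for the three metrics named in the answer.
W_POSTS, W_ENGAGEMENT, W_USERS = 1.0, 0.5, 2.0


class SlidingWindowScorer:
    """Keeps only events newer than `window_seconds` and scores hashtags."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()

    def add(self, event):
        self.events.append(event)

    def scores(self, now):
        # Evict events that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self.events and self.events[0].timestamp <= cutoff:
            self.events.popleft()

        posts, engagement, users = {}, {}, {}
        for e in self.events:
            posts[e.hashtag] = posts.get(e.hashtag, 0) + 1
            engagement[e.hashtag] = engagement.get(e.hashtag, 0) + e.likes + e.comments
            users.setdefault(e.hashtag, set()).add(e.user_id)

        # Weighted sum of post count, engagement, and unique users.
        return {
            tag: W_POSTS * posts[tag]
            + W_ENGAGEMENT * engagement[tag]
            + W_USERS * len(users[tag])
            for tag in posts
        }
```

One scorer per window length (for example 3600 seconds for the 1hr window) would be needed; a real stream processor would also have to handle late and out-of-order events.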
"Good Discussion on the distributed messaging queues (Complex topic with lot of nuances)
Liked the mind-map style drawing of requirements and metrics capture
Touched on different types of queue styles (point to point, pub-sub, fan-out/fan-in)
Storage and WAL usage was interesting
Some distributed queue challenges that could be helpful to highlight / expand are:
Message guarantee / semantics - Ordering of messages across different servers
Replication
Master slave architecture or Pe"
- Karthik R.
"The addition of an intermediate "sanitization" ML is something Neeraj used in the Uber Eats design and again seems kind of outside the scope here and redundant. This can simply be built into the AI response model to save a step. It's not clear what benefit this step provides, and he basically said we should have it "just because it would be good" and there's no concrete reasoning why to include it.
Adding a Kafka queue to handle the thumbs-down ratings? For what purpose do you need a queue othe"
- Robert H.
"Wrote up a simple cache system using python dict. Added TTL requirement. Then went into code-level concurrency issues for the cache."
- S R.
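The answer above does not show its code, so the following is only a sketch of what a dict-backed TTL cache with basic thread safety might look like. The `TTLCache` name, the single coarse lock, and the injectable `clock` parameter (used to make expiry testable) are all assumptions.

```python
import threading
import time


class TTLCache:
    """Dict-backed cache whose entries expire `ttl_seconds` after being set."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}                 # key -> (value, expires_at)
        self._lock = threading.Lock()   # guards every access to _data

    def set(self, key, value):
        with self._lock:
            self._data[key] = (value, self._clock() + self._ttl)

    def get(self, key, default=None):
        with self._lock:
            entry = self._data.get(key)
            if entry is None:
                return default
            value, expires_at = entry
            if self._clock() >= expires_at:
                # Lazy eviction: drop the expired entry on read.
                del self._data[key]
                return default
            return value
```

Holding one lock around the check-and-delete in `get` avoids the race where two threads both see an expired entry and one deletes a value the other has just refreshed. A finer-grained design is possible but harder to reason about.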
"Thanks a lot for showing us how a recommender system can be built. I see it was proposed to use collaborative filtering, which is a user-item matrix of dimension N * M (where N is the number of users and M is the number of songs). Though it was explained how it is going to be built, it is still unclear how all the user and song features are going to be used. In that matrix we have values in a cell (let's say i, j) like 1 when a specific user (i) clicked on song (j) when it was recommended, or 0 when the user"
- Dinar M.
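To make the N * M matrix concrete, here is a small item-based collaborative filtering sketch in plain Python. The toy matrix `R`, the cosine-similarity scoring, and the `recommend` helper are assumptions for illustration rather than the approach from the session. Note that it uses only the click matrix, with no user or song features, which is the gap the comment points at.

```python
import math

# Rows are users, columns are songs.
# 1 = the user clicked the song when it was recommended, 0 = otherwise.
R = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(matrix, user, top_k=1):
    """Rank songs the user has not clicked by similarity to songs they have."""
    columns = list(zip(*matrix))  # one vector of user clicks per song
    seen = {j for j, clicked in enumerate(matrix[user]) if clicked}
    scores = {
        j: sum(cosine(columns[j], columns[s]) for s in seen)
        for j in range(len(columns))
        if j not in seen
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

For user 0, song 2 ranks first because user 1 clicked it together with the two songs user 0 already clicked.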
"The question is bit vague (I guess deliberately) so I believe firstly we shall ask questions and resolve ambiguity. Some initial questions could be :
1) Is this one time activity or something that should be done on continuous basis. If continuous basis then at what frequency.
2) How much staleness is acceptable in SYSTEM Y data
3) Are there any limitation in SYSTEM Y and is it fair to assume that we would need some kind of transformation to bring data into SYSTEM Y schema.
4) What kind of vol"
- Kshitij A.
"Background
A. Objective
Lyft has a presence in Toledo, Ohio. At our current revenue per ride of $6, we can match 60% of consumers requesting a ride with a driver. Our goal is to maximize net revenues in the next 12 months by figuring out the optimal revenue per ride.
B. TLDR Summary
· With a target market of (~100K-138K) consumers, on a $25 charge to consumers, Lyft should pay $20.75 to drivers and fix its share at $4.25 to maximize its net revenues over a 12-mont"
- Eshan P.
"Great. I will start by understanding the goal of Google Photos and how it aligns with the goal and mission of Google. After defining goals, I will talk about the user actions which will contribute towards this goal. Based on these actions, we will define metrics.
Check in with the interviewer on the approach here. Assuming that this looks good for the interviewer to proceed.
Google Photos - Helping users organise & manage their pictures. Completely in line with Google's mission.
Thinking ab"
- Harshit G.
"Rate Limiter is to limit the number of request from a particular IP Address. Rate limiter will block the IP address to reduce the load on server. It should be highly available and handle concurrent requests. Blocked IP addresses should be kept in a pool which is present in shared cache. We need to keep threshold value after it reaches threshold value it should start blocking IP address.
All these ip address to be kept in No SQL DB. Batch will run that will clear the cache and delete all the bloc"
- Ashish G.
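A sketch of the threshold-then-block behaviour described above, under several assumptions: an in-memory dict and set stand in for the shared cache, requests are counted per fixed time window (the answer does not name a counting algorithm), and `IpRateLimiter`, `allow`, and `clear_blocked` are hypothetical names.

```python
import threading
import time


class IpRateLimiter:
    """Blocks an IP once it exceeds `threshold` requests within one window."""

    def __init__(self, threshold, window_seconds, clock=time.monotonic):
        self._threshold = threshold
        self._window = window_seconds
        self._clock = clock
        self._counts = {}      # ip -> (window_start, request_count)
        self._blocked = set()  # stand-in for the shared pool of blocked IPs
        self._lock = threading.Lock()

    def allow(self, ip):
        with self._lock:
            if ip in self._blocked:
                return False
            now = self._clock()
            start, count = self._counts.get(ip, (now, 0))
            if now - start >= self._window:
                # A new window begins; reset the counter.
                start, count = now, 0
            count += 1
            self._counts[ip] = (start, count)
            if count > self._threshold:
                self._blocked.add(ip)
                return False
            return True

    def clear_blocked(self):
        """What the periodic batch job might call to unblock everyone."""
        with self._lock:
            self._blocked.clear()
            self._counts.clear()
```

With several limiter instances behind a load balancer, the counters and the blocked pool would have to live in the shared cache itself, since a process-local lock only protects one process.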
"I think, robots.txt file is provided by websites which web-crawler is crawling. Am I wrong somewhere or missing some context?"
- Somya M.
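The comment's reading matches how crawlers normally use the file: each site publishes its own robots.txt and the crawler consults it before fetching. A short sketch using Python's standard `urllib.robotparser`; the rules in `ROBOTS_TXT`, the `example-crawler` user agent, and the `may_crawl` helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Example rules as a site might serve them at /robots.txt (made-up content).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url, user_agent="example-crawler"):
    """Return True if the site's rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)
```

In a real crawler the rules would be downloaded from each host (for example with `set_url` and `read`) and cached per host instead of being hard-coded.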
"At a high level, the core challenge here revolves around building an effective recommendation algorithm for news.
News is an inherently diverse category, spanning various topics and catering to a wide array of user types and personas, such as adults, business professionals, general readers, or specific cohorts with unique interests. Consequently, developing a single, one-size-fits-all recommendation algorithm is not feasible.
To enhance the personalization of the news recommendation algorithm,"
- Sai vuppalapati M.
"I started asking some questions regarding the constrains of the system:
An antena is emitting a signal that says if the tagged device was out of the room where the interview was happening.
I was able to decide which would be the schema for the Antena's message.
The antena is sending the info of multiple users.
The system doesn't need to push notification to the users when the user left the device behind.
Upon reflection, this is what I recollected doing.
I propuse the json schema a"
- Eduardo C.
"Inventory Service and Registration service can they be in SYNC always ?? The "experience layer" needs accessed by Hotel owners also ??? How does real time inventory come from all Hotels ???"
- Anup S.
"Designing a video streaming system like Netflix or Facebook Video involves addressing multiple aspects, such as scalability, availability, low latency, and high performance. Here's a high-level design:
System Requirements
Functional Requirements:
User Management:
User sign-up, login, and profile management.
Subscription plans and payment integration (for Netflix-like systems).
Content Management:
Upload, edit, and delete videos.
Categorize content (genres, recommendations).
Video Playback:
S"
- Kamal ..