Design Uber Eats

How would you design a food delivery app like Uber Eats or DoorDash? Watch Uma, Senior Software Engineer at Netflix, as he walks us through his solution to this system design interview question.

“Design Uber Eats” is a classic distributed system design question, where you're asked to design a read-heavy, search-heavy, location-aware CRUD system with multiple interacting actors (customers, restaurants, delivery partners).

In this design, the following trade-offs are central:

  • Scalability of a global system
  • Eventual consistency for restaurant listings
  • High availability for searches and viewing restaurants
  • Tradeoffs around storing menus, images, and metadata
  • Latency requirements for fast search & discovery

This write-up presents a structured, interview-ready solution.

Step 1: Define the problem

Ask clarifying questions

The problem space of designing Uber Eats is very large. If the interviewer hasn’t specified constraints or expectations, it’s important to ask clarifying questions to properly define the scope.

To begin, we can ask clarifying questions about the core features and components the system must support:

  1. What scale are we designing for? (number of customers, restaurants, and delivery drivers)
  2. How should restaurant search work? Should it support ranking, filtering, and personalization?
  3. Do we need real-time driver-order matching or a simpler allocation mechanism?
  4. Do we need live order tracking for customers?
  5. Should drivers be able to accept/decline orders?

Define functional requirements

Based on the clarifying questions and the problem scope, we can narrow Uber Eats down to its core purpose: enabling users to discover food, place orders, and receive deliveries in a fast and reliable way.

At its core, Uber Eats is a three-sided platform that connects customers, restaurants, and delivery partners. The system must coordinate interactions between all three actors while keeping the experience seamless and responsive.

The primary features of Uber Eats are the following:

  • Customers can search for restaurants and menu items and place orders.
  • The system matches delivery drivers with customer orders efficiently.
  • The system securely processes customer payments.

Define non-functional requirements

When dealing with designs that must satisfy high scale, consider what’s most important for end users and what tradeoffs you’d make. The CAP Theorem is a good place to start.

When designing a system at Uber Eats scale, non-functional requirements are just as important as functional ones. These requirements shape the architectural trade-offs we make and help determine which system qualities we prioritize.

  • Low latency for search and discovery
  • High availability for browsing and ordering
  • Horizontal scalability for peak traffic
  • Eventual consistency (within an SLA) for restaurant listings
  • Strong consistency for driver matching and payments

First and foremost, low latency is critical to the user experience. Customers expect restaurant search results to load quickly, menus to appear instantly, and order updates to feel real-time. Any noticeable delay in these flows can significantly degrade user satisfaction.

Second, the system must be highly available. Users should be able to browse restaurants and place orders even if certain backend components experience partial outages. Temporary failures in non-critical services should not render the entire platform unusable.

Third, the system must be highly scalable. Uber Eats needs to handle traffic spikes during peak meal times, support millions of concurrent users across regions, and scale independently across services such as search, ordering, and delivery.

In terms of consistency, not all data requires strict real-time accuracy. For example, restaurant listings, menu updates, and search results can tolerate eventual consistency without harming the user experience. This allows the system to favor availability and fault tolerance for read-heavy workloads like search and discovery.

However, certain operations—such as payment processing, order creation, and driver assignment—require strong consistency to prevent issues like double charges, lost orders, or incorrect assignments. As a result, the system selectively applies stronger consistency guarantees only where correctness is critical.

By intentionally making these trade-offs, the system balances performance, reliability, and correctness in a way that aligns with real-world user expectations.

Estimate the amount of data (back-of-envelope)

Data estimate

Assumptions:

  • 10 million Monthly Active Users (MAU)
  • Each user places 5 orders per month
  • Each order has:
    • Metadata JSON: ~10 KB
    • Supporting objects (payment receipts, delivery logs, map snapshots): ~200 KB
  • Total per order ≈ 210 KB

Monthly storage

10^7 users × 5 orders × 210 KB

= 5×10^7 × 2.1×10^5 B ≈ 1.05×10^13 bytes ≈ 10 TB per month

Yearly storage

10 TB/month × 12 ≈ 120 TB/year

Additional systems (restaurants, drivers, telemetry, payments, menus, search indices) add ~4–5× more:

≈ 500–600 TB/year total persistent storage (across S3 + SQL + NoSQL)

QPS estimate

  1. Users

10M MAU → ~3M Daily Active Users (DAU)

  2. Write QPS (Order Placements)

At 5 orders per user per month, the platform handles about 5×10^7 orders/month, or roughly 1.7 million orders/day:

10^7 × 5 / 30 ≈ 1.7×10^6 writes/day

1.7×10^6 / 86,400 ≈ 20 writes/sec

  3. Read QPS (Search, Restaurant Fetches, Tracking)

Each DAU performs 3 searches/day:

3 searches × 3×10^6 DAU / 86,400 ≈ 100 reads/sec per endpoint

Add map tiles + driver pings:

≈ 2,000–3,000 reads/sec globally
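
These estimates are easy to sanity-check in a few lines. The script below recomputes the numbers above; every constant comes straight from the assumptions stated in this section.

```python
# Back-of-envelope sanity check for the storage and QPS estimates above.
MAU = 10_000_000                 # monthly active users
DAU = 3_000_000                  # ~30% of MAU active on a given day
ORDERS_PER_USER_MONTH = 5
BYTES_PER_ORDER = 210_000        # ~10 KB metadata + ~200 KB supporting objects
SECONDS_PER_DAY = 86_400

monthly_bytes = MAU * ORDERS_PER_USER_MONTH * BYTES_PER_ORDER
print(f"Monthly order storage: {monthly_bytes / 1e12:.1f} TB")        # ~10.5 TB
print(f"Yearly order storage:  {monthly_bytes * 12 / 1e12:.0f} TB")   # ~126 TB

orders_per_day = MAU * ORDERS_PER_USER_MONTH / 30
print(f"Order write QPS: {orders_per_day / SECONDS_PER_DAY:.0f}/sec")  # ~19/sec

search_reads_per_day = DAU * 3   # 3 searches per DAU per day
print(f"Search read QPS: {search_reads_per_day / SECONDS_PER_DAY:.0f}/sec")  # ~104/sec
```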

For tips on estimating unknowns, check out our Estimation Strategies and Tricks lesson.

Step 2: Design a high-level system

Design the APIs

Once we understand the system’s core responsibilities, the next step is to define the APIs that allow clients and services to interact with the platform. These APIs form the contract between the mobile applications, backend services, and internal components, so they should be simple, well-scoped, and aligned with business workflows.

Rather than exposing a single monolithic API, the system is broken down into domain-specific services, each responsible for a clearly defined set of operations. This separation improves scalability, fault isolation, and long-term maintainability.

Search service

The Search Service allows customers to efficiently discover nearby restaurants and available menu items by querying an indexed search engine (e.g., Elasticsearch) that supports geolocation, text search, and ranking.

APIs

  • GET /restaurants?lat=&lng=&query=
  • GET /restaurant/{id}/menu
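
As a sketch of how the first endpoint might be served, the handler below combines full-text relevance with a geo-distance filter. The index name and fields (restaurants, name, cuisine, menu_items, location) are illustrative assumptions rather than a real Uber Eats schema, and the location field is assumed to be mapped as a geo_point.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local dev cluster

def search_restaurants(lat: float, lng: float, query: str, radius_km: int = 5):
    """Serve GET /restaurants?lat=&lng=&query= from the search index."""
    resp = es.search(
        index="restaurants",  # hypothetical index name
        query={
            "bool": {
                # Text relevance against name, cuisine, and menu item names.
                "must": [{"multi_match": {
                    "query": query,
                    "fields": ["name", "cuisine", "menu_items"],
                }}],
                # Hard filters: open restaurants within radius_km of the user.
                "filter": [
                    {"term": {"is_open": True}},
                    {"geo_distance": {
                        "distance": f"{radius_km}km",
                        "location": {"lat": lat, "lon": lng},
                    }},
                ],
            }
        },
        size=20,
    )
    return [hit["_source"] for hit in resp["hits"]["hits"]]
```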

Order service

The Order Service manages the entire lifecycle of a food order—validating restaurant availability, creating an order record, initiating payment, and tracking order status throughout preparation and delivery.

APIs

  • POST /order/create
  • POST /order/{id}/pay
  • GET /order/{id}/status

Delivery service

The Delivery Service continuously ingests driver location updates, matches orders to the nearest available driver using proximity-based algorithms, and orchestrates driver-order assignment and acceptance flows.

APIs

  • POST /driver/{id}/location
  • GET /driver/{id}/assignedOrders
  • POST /driver/{id}/acceptOrder

This API design reflects a few key principles:

  • Clear separation of concerns: Search, ordering, and delivery are handled by independent services.
  • Scalability: Read-heavy APIs (search) and write-heavy APIs (driver locations) can scale independently.
  • Resilience: Failures in one service do not cascade across the entire system.
  • Extensibility: New features such as personalization, promotions, or batching can be added without breaking existing clients.

By designing APIs around business workflows rather than technical implementation details, the system remains intuitive for clients and flexible for future growth.

Design the data model (core entities)

Once the APIs and service boundaries are defined, the next step is to design the data model that supports these workflows. A good data model balances performance, consistency, and scalability, while reflecting how the business actually operates.

At a high level, Uber Eats stores and processes three fundamentally different types of data: restaurant data, order data, and driver data. Each of these has different access patterns and consistency requirements, which strongly influences how and where the data is stored.

1. Restaurant and menu data

Restaurant and menu data is primarily read-heavy. Customers frequently browse restaurants, view menus, and search for specific items, while updates to this data happen relatively infrequently.

Because of this access pattern, the system maintains two representations of restaurant data:

  • A source of truth stored in a relational database, which ensures structured storage and reliable updates.
  • A search-optimized index (such as Elasticsearch), which enables fast text search, filtering, and geospatial queries.

This separation allows the system to serve search requests with very low latency, while still preserving data integrity in the primary database. Any changes to restaurant details or menu items are asynchronously propagated from the database to the search index, accepting a small window of eventual consistency in exchange for performance.

2. Order data

Order data sits at the core of the system and has the strictest consistency requirements. An order moves through multiple states: created, paid, prepared, picked up, and delivered. Each transition must be handled correctly to avoid issues such as duplicate charges or lost orders.

Because of these requirements, order data is best suited for a relational database that provides ACID guarantees. This ensures that operations such as order creation, payment confirmation, and driver assignment happen atomically and reliably.

Orders also serve as the central point of coordination between customers, restaurants, drivers, and payment systems, which makes a structured schema especially valuable.

3. Driver data (including locations)

Driver data, particularly location updates, has a very different profile from restaurant or order data. Drivers send frequent GPS updates—often every few seconds—resulting in extremely high write throughput.

Because this data is both high-volume and short-lived, storing it in a traditional relational database would be inefficient. Instead, driver locations are stored in in-memory or high-throughput NoSQL stores such as Redis or Cassandra, which provide low-latency reads and writes.

To ensure freshness, location entries are typically written with a time-to-live (TTL), so outdated positions are automatically removed. This keeps the dataset small and relevant for real-time matching.
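
A minimal ingestion sketch, assuming Redis and its geospatial commands: because sorted-set members cannot carry individual TTLs, each driver also gets a small expiring "last seen" key that serves as the freshness marker. The key names and the 30-second window are assumptions.

```python
import time

import redis

r = redis.Redis()  # assumed local Redis; key names below are illustrative

GEO_KEY = "drivers:geo"
FRESH_SECONDS = 30  # pings older than this are treated as stale (assumption)

def record_location(driver_id: str, lat: float, lng: float) -> None:
    """Ingest one GPS ping from a driver's phone."""
    r.geoadd(GEO_KEY, (lng, lat, driver_id))  # Redis geo takes lon, lat, member
    # Sorted sets have no per-member TTL, so freshness lives in a companion
    # key that Redis expires automatically.
    r.set(f"driver:{driver_id}:last_seen", int(time.time()), ex=FRESH_SECONDS)

def nearby_fresh_drivers(lat: float, lng: float, radius_km: float = 3.0) -> list[str]:
    """Find drivers within radius_km whose last ping is still fresh."""
    candidates = r.geosearch(GEO_KEY, longitude=lng, latitude=lat,
                             radius=radius_km, unit="km")
    return [d.decode() for d in candidates
            if r.exists(f"driver:{d.decode()}:last_seen")]
```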

Tables with schema

  • Restaurant: id, name, address, lat, long (geocoords), geohash, is_open, average_prep_time, metadata, image_ids
  • MenuItem: id, restaurant_id, name, description, price, image_id, available_flag
  • Customer: id, name, addresses, payment_methods (tokenized)
  • Order: id, customer_id, restaurant_id, items, total, status, created_at, assigned_driver_id
  • Driver: id, current_location (lat/lng), status (available/assigned), vehicle_info
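
To make the schema concrete, here is simplified DDL for two of these tables. sqlite3 is used only to keep the sketch self-contained and runnable; a production deployment would target PostgreSQL or MySQL, and several columns are abbreviated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE restaurant (
    id                INTEGER PRIMARY KEY,
    name              TEXT NOT NULL,
    address           TEXT,
    lat               REAL,
    lng               REAL,
    geohash           TEXT,               -- coarse cell for proximity lookups
    is_open           INTEGER DEFAULT 0,
    average_prep_time INTEGER             -- minutes
);

CREATE TABLE "order" (
    id                 INTEGER PRIMARY KEY,
    customer_id        INTEGER NOT NULL,
    restaurant_id      INTEGER NOT NULL REFERENCES restaurant(id),
    total_cents        INTEGER NOT NULL,  -- store money as integer cents
    status             TEXT NOT NULL DEFAULT 'CREATED',
    assigned_driver_id INTEGER,
    created_at         TEXT DEFAULT CURRENT_TIMESTAMP
);

-- Orders are looked up by customer and by restaurant far more often than
-- they are scanned, so index both foreign keys.
CREATE INDEX idx_order_customer   ON "order"(customer_id);
CREATE INDEX idx_order_restaurant ON "order"(restaurant_id);
""")
```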

High-level architecture

[Figure: high-level architecture of the Uber Eats system]

Now let's look at the high-level flow of requests through the system.

All client requests, whether from customer apps, restaurant dashboards, or driver apps, first pass through a load balancer, which distributes traffic across healthy service instances and provides a single entry point into the system.

From there, requests are routed to the appropriate backend service based on the user’s intent.

Search flow

When a customer opens the app and searches for nearby restaurants or food items, the request is routed to the Search Service. This service is optimized for low-latency, read-heavy traffic, as search and browsing represent the majority of user interactions.

Instead of querying the primary database directly, the Search Service relies on a search index (such as Elasticsearch) that is purpose-built for text search, filtering, and geospatial queries. This allows the system to quickly return relevant restaurants based on location, cuisine, availability, and ranking signals.

The relational database remains the source of truth for restaurant and menu data, but updates are asynchronously propagated to the search index. This introduces a small amount of eventual consistency, which is an acceptable tradeoff given the performance benefits and the fact that slight staleness in restaurant listings does not critically impact the user experience.
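
One way to implement that propagation, assuming change events are published to a Kafka topic after each database commit (the topic name and event shape here are hypothetical): the consumer re-indexes documents by id, so replaying an event is harmless.

```python
import json

from elasticsearch import Elasticsearch
from kafka import KafkaConsumer

es = Elasticsearch("http://localhost:9200")
consumer = KafkaConsumer(
    "restaurant-updates",                 # hypothetical change-event topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v),
)

# Each event is assumed to carry the full restaurant document as written to
# the source-of-truth database. Indexing by id makes the consumer idempotent:
# replaying an event simply rewrites the same document.
for event in consumer:
    doc = event.value
    es.index(index="restaurants", id=doc["id"], document=doc)
```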

Order placement flow

Once a customer decides to place an order, the request is handled by the Order Service, which manages the full lifecycle of an order.

The Order Service first validates the request, creates an order record, and persists it to a transactional SQL database. Because orders involve money and state transitions, this part of the system prioritizes strong consistency and durability.

After the order is created, the Order Service communicates with the Payment Service, which securely processes the transaction by interacting with third-party payment providers. Only after a successful payment confirmation does the system move the order forward.

At this point, the Order Service updates the order state and emits an event to notify downstream systems—most importantly, the Delivery Service—that the order is ready to be assigned to a driver.
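
A condensed sketch of this happy path is below. sqlite3 stands in for the transactional SQL database, and the payment call and event publisher are stubs for the real Payment Service and message bus.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    id TEXT PRIMARY KEY, customer_id INTEGER, total_cents INTEGER,
    status TEXT NOT NULL DEFAULT 'CREATED')""")

def charge_payment(order_id: str, amount_cents: int) -> bool:
    # Stand-in for the Payment Service; the real call would carry an
    # idempotency key so a retried charge cannot bill the customer twice.
    return True

def publish(event: dict) -> None:
    # Stand-in for emitting an event (e.g., to Kafka) for the Delivery Service.
    print("event:", event)

def place_order(customer_id: int, total_cents: int) -> str:
    order_id = str(uuid.uuid4())
    # 1. Durably record the order before any money moves.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, customer_id, total_cents) VALUES (?, ?, ?)",
            (order_id, customer_id, total_cents))
    # 2. Charge only after the order record exists.
    if not charge_payment(order_id, total_cents):
        with conn:
            conn.execute("UPDATE orders SET status='FAILED' WHERE id=?", (order_id,))
        raise RuntimeError("payment declined")
    # 3. Advance state, then notify downstream systems.
    with conn:
        conn.execute("UPDATE orders SET status='PAID' WHERE id=?", (order_id,))
    publish({"type": "order_paid", "order_id": order_id})
    return order_id
```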

Driver location + matching flow

Drivers continuously send GPS updates from their mobile devices, often every few seconds. These updates are ingested by the Driver Location Service, which is optimized for extremely high write throughput.

Rather than storing this data in a traditional database, driver locations are written to an in-memory or NoSQL store (such as Redis or DynamoDB), allowing for fast updates and efficient geospatial queries. Each location update is typically stored with a TTL to ensure that stale data is automatically discarded.

When a new order becomes available for delivery, the Delivery Service queries this location store to find nearby available drivers. Using proximity-based matching algorithms, it selects the most suitable driver and assigns the order.

Once a driver is selected, the system updates the order record with the assigned driver and notifies the driver in real time using push notifications or WebSocket connections. This avoids inefficient polling and enables fast acceptance or rejection flows.
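
A simplified matching sketch, assuming driver positions live in the Redis geo set described earlier: candidates come back nearest-first, and an atomic SET NX "claim" key prevents two concurrent orders from grabbing the same driver. The notification call is a stub.

```python
import redis

r = redis.Redis()

def notify_driver(driver_id: str, order_id: str) -> None:
    # Stand-in for a push notification or WebSocket message to the driver app.
    print(f"offer order {order_id} to driver {driver_id}")

def match_driver(order_id: str, pickup_lat: float, pickup_lng: float):
    # Nearest-first candidates within 5 km of the restaurant.
    candidates = r.geosearch("drivers:geo", longitude=pickup_lng,
                             latitude=pickup_lat, radius=5, unit="km",
                             sort="ASC", count=10)
    for member in candidates:
        driver_id = member.decode()
        # SET NX succeeds for exactly one caller, so two concurrent matches
        # can never claim the same driver; the TTL frees no-shows.
        if r.set(f"driver:{driver_id}:assignment", order_id, nx=True, ex=120):
            notify_driver(driver_id, order_id)
            return driver_id
    return None  # no nearby driver: widen the radius or retry shortly
```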

Step 3: Deep dive + trade-offs

Once the high-level architecture is defined, it’s important to examine some of the key design decisions in more detail and understand the trade-offs involved. Large-scale systems like Uber Eats rarely optimize for a single dimension; instead, they carefully balance consistency, availability, performance, and operational complexity.

Why SQL for orders?

Orders form the backbone of the Uber Eats platform. Every order involves money, multiple state transitions, and coordination between customers, restaurants, drivers, and payment providers. Because of this, strong consistency is non-negotiable.

Relational databases are a natural fit here. They provide ACID guarantees, which ensure that operations such as creating an order, confirming payment, and assigning a driver happen atomically and reliably. This prevents issues like duplicate charges, lost orders, or inconsistent order states.

The main trade-off is scalability. As order volume grows, SQL databases often require manual sharding to scale horizontally. However, this operational complexity is an acceptable cost given the correctness guarantees required for financial transactions.
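
A compact illustration of why these guarantees matter: if every state transition is a conditional update executed in a transaction, a duplicated or delayed request becomes a harmless no-op instead of a double charge. sqlite3 keeps the sketch runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO orders VALUES ('o1', 'CREATED')")

def transition(order_id: str, expected: str, new: str) -> bool:
    """Advance an order only if it is still in the expected state."""
    with conn:  # one atomic statement inside a transaction
        cur = conn.execute(
            "UPDATE orders SET status = ? WHERE id = ? AND status = ?",
            (new, order_id, expected))
    # rowcount == 0 means another writer already moved the order forward,
    # so a retried or duplicate request changes nothing.
    return cur.rowcount == 1

assert transition("o1", "CREATED", "PAID")       # first attempt succeeds
assert not transition("o1", "CREATED", "PAID")   # replay is rejected
```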

Why Elasticsearch for search?

Search and discovery are some of the most frequently used features in Uber Eats. Users expect fast responses when searching for restaurants, cuisines, or specific menu items near them.

Elasticsearch is well-suited for this use case because it supports:

  • Full-text search
  • Geospatial queries
  • Filtering and ranking

By offloading search traffic to Elasticsearch, the system avoids placing heavy read load on the primary database.

The trade-off is eventual consistency. Since Elasticsearch indexes are updated asynchronously from the source-of-truth database, users may briefly see slightly outdated data (for example, a menu item that was just marked unavailable). In practice, this is acceptable because minor staleness in listings does not significantly harm the user experience.

Why Redis for driver locations?

Driver location data is highly dynamic. Each driver sends frequent GPS updates, resulting in extremely high write throughput. Storing this data in a traditional database would be inefficient and slow.

Redis (or a similar in-memory store) is chosen because it offers:

  • Very low read and write latency
  • Support for geospatial queries
  • TTL-based expiration to automatically remove stale data

The trade-off is durability. Redis is not ideal for long-term storage or historical analysis. As a result, historical driver movement data—if needed—is typically sent to a separate analytics pipeline rather than stored in the serving layer.

Step 4: Identify bottlenecks and scale

As Uber Eats grows, certain parts of the system naturally become bottlenecks. Identifying these early and designing mitigation strategies is critical for maintaining performance and reliability.

High latency in read paths

Search, restaurant browsing, and menu viewing dominate user traffic. If these requests hit the primary database directly, latency increases as load grows.

To address this, the system relies heavily on:

  • Caching (Redis or Memcached) for frequently accessed data
  • CDNs for static assets like images
  • Read-optimized stores such as Elasticsearch or read replicas

This approach trades perfect freshness for speed. Cached or indexed data may be slightly stale, but users benefit from consistently fast responses.
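
As an example, menu reads fit the classic cache-aside pattern: serve from Redis when possible, fall back to SQL on a miss, and let a TTL bound the staleness. The key format, the menu_item table, and the 5-minute TTL are assumptions.

```python
import json
import sqlite3

import redis

r = redis.Redis()
db = sqlite3.connect("restaurants.db")  # assumed replica of the menu tables
MENU_TTL = 300  # seconds; menus change rarely, so 5 min of staleness is fine

def get_menu(restaurant_id: int) -> list:
    """Cache-aside read: Redis first, SQL on a miss, TTL bounds staleness."""
    key = f"menu:{restaurant_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)  # cache hit: no database work at all
    rows = db.execute(
        "SELECT name, price FROM menu_item "
        "WHERE restaurant_id = ? AND available_flag = 1",
        (restaurant_id,)).fetchall()
    # Populate the cache so subsequent reads are served from memory.
    r.set(key, json.dumps(rows), ex=MENU_TTL)
    return rows
```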

Write bottlenecks during order placement

Order creation involves multiple steps and dependencies, including payment processing and driver matching. Handling all of this synchronously would limit throughput and increase latency.

To scale writes safely:

  • Non-critical steps are handled asynchronously using message queues
  • Orders are sharded across databases to distribute load
  • Critical operations remain transactional to preserve correctness

While asynchronous processing may add a small delay before an order is fully confirmed, it allows the system to handle spikes in demand without failing.

High-frequency driver location updates

Millions of location updates per hour can overwhelm slower storage systems.

This is mitigated by:

  • Using in-memory or NoSQL stores for fast writes
  • Applying TTLs to keep only recent location data
  • Optionally batching updates to reduce write pressure

This design prioritizes real-time accuracy over long-term persistence, which aligns well with the needs of driver matching.

Real-time order matching

Matching orders to nearby drivers in dense urban areas can be computationally expensive.

To keep matching fast and scalable:

  • Geospatial indexes are used for proximity searches
  • Drivers are notified via push notifications or WebSockets instead of polling
  • Rate limiting is applied during peak load

In extreme scenarios, matching may be slightly sub-optimal, but the system remains responsive and available.

Step 5: Review and summary

Overall, this Uber Eats design satisfies the core functional requirements of search, ordering, delivery, and payments while meeting the non-functional goals of scalability, availability, and low latency.

The system:

  • Optimizes for fast reads during search and discovery
  • Enforces strong consistency where correctness matters most
  • Scales horizontally by separating concerns across services
  • Uses asynchronous workflows to absorb traffic spikes

By embracing eventual consistency in non-critical paths and strict consistency in financial workflows, the architecture achieves a practical balance suitable for a global, high-traffic platform.

If given more time, several additional improvements could further enhance the system.

Reverse proxy and security

Introducing a reverse proxy such as Nginx or Envoy in front of backend services can:

  • Terminate SSL and enforce HTTPS
  • Apply rate limiting and request validation
  • Act as an additional caching layer

This improves both security and performance without changing core business logic.

Cache update strategies

Choosing the right cache update policy is crucial for balancing freshness and performance.

Different strategies may be used depending on the data:

  • Write-through caching for critical data
  • Lazy or write-back caching for read-heavy data
  • TTL-based eviction for rapidly changing data like driver locations

Availability and fault tolerance

To avoid single points of failure:

  • Services should be deployed across multiple availability zones
  • Databases should use replicas and automatic failover
  • Clients should gracefully degrade by falling back to cached data when possible

Retries with exponential backoff help handle transient failures without overwhelming dependent services.
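
A small, generic sketch of that retry policy: the delay doubles on each attempt, and random jitter keeps many clients from retrying in lockstep against a recovering dependency.

```python
import random
import time

class TransientError(Exception):
    """Raised for retryable failures (timeouts, 503s, connection resets)."""

def call_with_backoff(fn, max_attempts: int = 5, base_delay: float = 0.1):
    """Retry fn on transient failures with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Delay doubles each attempt; jitter spreads retries out so many
            # clients don't hammer a recovering service at the same moment.
            delay = base_delay * (2 ** attempt)
            time.sleep(delay + random.uniform(0, delay))
```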

Idempotency and write reliability

Some operations involve multiple writes across different systems—for example, writing order metadata to a database and storing receipts in object storage.

To ensure correctness:

  • Writes to object storage should be idempotent
  • Message queues should retry failed operations
  • Deterministic identifiers should prevent duplicate uploads or charges

This ensures eventual consistency even in the presence of partial failures.
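
A common way to get these properties is to derive identifiers deterministically from the order itself, so a retry targets the same object or the same logical charge rather than creating a new one. The key formats below are illustrative.

```python
import hashlib

def receipt_object_key(order_id: str, doc_type: str) -> str:
    """Deterministic object-storage key for an order artifact.

    Because the key depends only on the order and document type, a retried
    upload overwrites the same object: the PUT is effectively idempotent,
    and partial failures can never leave duplicates behind.
    """
    digest = hashlib.sha256(f"{order_id}:{doc_type}".encode()).hexdigest()[:16]
    return f"receipts/{order_id}/{doc_type}-{digest}.json"

# The same idea protects payments: one idempotency key per logical charge,
# reused verbatim on every retry of that charge.
def charge_idempotency_key(order_id: str) -> str:
    return f"charge-{order_id}"
```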