Consistent Hashing
In this lesson, we explain how to discuss consistent hashing in system design interviews.
You'll remember from a previous lesson that caching allows for much quicker access to commonly-requested data than accessing primary storage each time a request comes in. Distributed caching simply partitions application data across multiple servers, allowing caches to scale. Consistent hashing (vs. traditional hashing) is a technique used to do this in a scalable, robust and dynamically adaptive way.
Why is consistent hashing important?
Let's take a few steps back first. Distributed systems should be both reliable and available, right? This means coping with variable traffic and the realities of network and hardware failure. It should make sense that in a distributed cache, each server has limited storage. Data need to be stored, searched, and accessed in a way that's systematic and fast. The data structure of choice for this is a hash table.

While traditional hashing works well enough when there are a fixed number of reliable servers, it is not efficient when servers are added, removed, or fail. Let's look into traditional hashing schemes to see why this is the case.
Limitations of traditional hash tables
Traditional hash tables consist of three main components:
- A key, or a piece of data that is being stored for easy access.
- A hash function, which takes the key as input and hashes it, or maps it to another piece of data (in most cases an integer) within a given range.
- The output of the hash function, or hash value.
The hash function transforms key values into an array index for faster lookup, which is critical: faster access is the entire reason for caching in the first place. So what's the problem with the traditional approach when scaling a distributed cache?
Let's look below. In this example, hash values range from 00 to 05. Conventional methods for distributing data among caches compute serverKey = hash_function(key) % N to determine which server a given key will ultimately be mapped to, as seen below.
Note that N, the number of servers in the network, is part of the mapping. Because of this, N must stay stable or every key-value pair will need to be remapped.
We can see that in the standard case, where the number of servers is the expected number N, the mapping is straightforward.
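To make this concrete, here is a minimal Python sketch of the traditional scheme. The key names, the server count of 3, and the use of MD5 as the hash function are all assumptions made purely for illustration:

```python
import hashlib

def hash_function(key: str) -> int:
    # Hash the key to a large integer (MD5 is used here purely for illustration).
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def server_for_key(key: str, n_servers: int) -> int:
    # Traditional scheme: the key's hash modulo N picks one of the N servers.
    return hash_function(key) % n_servers

# With N = 3 servers, every key maps to server 0, 1, or 2.
for key in ["user:1", "user:2", "user:3", "user:4"]:
    print(key, "-> server", server_for_key(key, 3))
```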

But what happens when one of the servers (say server 2) fails? It is likely that most of the keys will have to be redistributed across the remaining servers, because serverKey = hash_function(key) % N changes for most keys once N drops from 3 to 2.

A similar thing happens when servers are added. In the case where we add 3 more servers, almost all keys will have to be remapped to different servers. A natural question to ask is whether there is a more efficient solution for distributing data across servers. Is there some way in which we only have to remap a small number of keys when we add or remove servers?
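To get a feel for how disruptive this is, the hypothetical sketch below reuses the same modulo scheme and counts how many of 1,000 example keys land on a different server when N drops from 3 to 2:

```python
import hashlib

def server_for_key(key: str, n_servers: int) -> int:
    # Same modulo scheme as before: hash the key, then take it mod N.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n_servers

keys = [f"user:{i}" for i in range(1000)]

# Count the keys that move when the cluster shrinks from 3 servers to 2.
moved = sum(server_for_key(k, 3) != server_for_key(k, 2) for k in keys)
print(f"{moved} of {len(keys)} keys must be remapped")  # typically around two thirds
```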
Consistent hashing solution:
Consistent hashing is an ingenious solution to the limitations of more conventional hashing methods. It was introduced so that the remapping of key-value pairs to servers stays consistent even when servers are added or removed. In this method, servers and objects are mapped to positions on what is termed a hash circle. Let us suppose we have 3 servers and 4 keys. We can map each of these to a value on the circle using the hash function and then use a linear mapping to map the hash values to a value between 0 and 2π.
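A minimal sketch of that mapping, assuming MD5 as the hash function and a linear scaling of its 128-bit output onto [0, 2π), might look like this (the server and key names are made up):

```python
import hashlib
import math

def position_on_circle(name: str) -> float:
    # Hash the name to a 128-bit integer, then map it linearly onto [0, 2*pi).
    h = int(hashlib.md5(name.encode()).hexdigest(), 16)
    return (h / 2**128) * 2 * math.pi

# Both servers and keys are placed on the same circle.
for name in ["server-0", "server-1", "server-2", "key-A", "key-B", "key-C", "key-D"]:
    print(f"{name} -> {position_on_circle(name):.3f} radians")
```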

Now, we can establish some type of convention to assign objects to servers. For instance, one convention that works well is assigning each key to its closest server in the clockwise direction. To guarantee a more even distribution of load across servers, multiple points on the circle (often called virtual nodes) are associated with each server.

For instance, in the figure above, server 0 is mapped 3 times onto the hash circle, as are server 1 and server 2. Note that different weightings can be assigned to each server, which will produce a different desired load distribution across the servers. In this way, fewer keys will have to be remapped when servers are added or removed.
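Below is a small, hypothetical Python sketch of such a ring with virtual nodes. The class name, the vnodes parameter, and the choice of MD5 are illustrative assumptions, not a reference implementation:

```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    """A consistent hash ring with multiple points (virtual nodes) per server."""

    def __init__(self, servers, vnodes=3):
        self.vnodes = vnodes
        self.ring = {}        # position on the circle -> server name
        self.positions = []   # sorted positions, i.e. the circle itself
        for server in servers:
            self.add_server(server)

    def _hash(self, name: str) -> int:
        # Map a name onto the circle, represented here by MD5's 128-bit range.
        return int(hashlib.md5(name.encode()).hexdigest(), 16)

    def add_server(self, server: str) -> None:
        # Place `vnodes` points for this server on the circle.
        for i in range(self.vnodes):
            pos = self._hash(f"{server}#{i}")
            self.ring[pos] = server
            self.positions.append(pos)
        self.positions.sort()

    def remove_server(self, server: str) -> None:
        # Drop only this server's points; every remaining point is untouched.
        self.positions = [p for p in self.positions if self.ring[p] != server]
        self.ring = {p: s for p, s in self.ring.items() if s != server}

    def get_server(self, key: str) -> str:
        # Walk clockwise from the key's position to the next server point,
        # wrapping around past the "top" of the circle if needed.
        pos = self._hash(key)
        idx = bisect_right(self.positions, pos) % len(self.positions)
        return self.ring[self.positions[idx]]
```

Keeping the positions in a sorted list lets get_server find the next server clockwise with a binary search, so a lookup costs O(log n) in the number of virtual nodes.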
Let's take a look at what might happen when server 1 fails. Note that none of the keys have to be rehashed. Instead, all of the points on the circle associated with server 1 are removed. The keys that previously mapped to server 1 may then be reassigned to a different server (since the closest server in the clockwise direction is no longer server 1), while every other key stays where it is.
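Continuing the hypothetical ConsistentHashRing sketch above, the snippet below simulates server 1 failing and checks that the only keys that move are the ones that had been assigned to it:

```python
# Assumes the ConsistentHashRing class sketched above is already defined.
ring = ConsistentHashRing(["server-0", "server-1", "server-2"])
keys = [f"key-{i}" for i in range(12)]

before = {k: ring.get_server(k) for k in keys}
ring.remove_server("server-1")                 # simulate server 1 failing
after = {k: ring.get_server(k) for k in keys}

# Only the keys that lived on server-1 get a new home; the rest stay put.
moved = [k for k in keys if before[k] != after[k]]
print("remapped keys:", moved)
print("all moved keys were on server-1:", all(before[k] == "server-1" for k in moved))
```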
The advantages of this method include:
- Only the keys that were assigned to the affected server have to be reassigned to different servers.
- The hash values of the keys do not need to be recomputed.

Therefore, we can see the advantages that consistent hashing offers over traditional hashing methods! Just as an aside, in real life, caching systems like Memcached and Redis support out-of-the-box solutions for consistent hashing.
In Summary:
- Consistent hashing is a methodology for storing data (keys and objects) across different servers. It's advantageous over traditional hashing methods because it efficiently adapts to the addition or removal of servers from the system architecture.
Further Reading
- This article from Facebook's engineering blog tracks the development of Shard Manager, a generic sharding solution developed in-house to manage sharding at an incredible scale: Shard Manager monitors millions of shards across hundreds of thousands of servers. Consistent hashing pops up within the first few paragraphs, but the entire article is an excellent review of many of the concepts we've covered and will cover in coming lessons.