The rehashing problem
serverIndex = hash(key) % N
However, problems arise when new servers are added, or existing servers are
removed.
This means that when server 1 goes offline, most cache clients will
connect to the wrong servers to fetch data. This causes a storm of cache misses.
Consistent hashing
Quoted from Wikipedia: "Consistent hashing is a special kind of hashing such that when a hash table is re-sized and consistent hashing is used, only k/n keys need to be remapped on average, where k is the number of keys, and n is the number of slots. In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped [1]”.
Hash space and hash ring
Hash servers
Hash keys
Add a server
Remove a server
Two issues in the basic approach
• Map servers and keys on to the ring using a uniformly distributed hash function.
• To find out which server a key is mapped to, go clockwise from the key position until the first server on the ring is found.
First, it is impossible to keep the same size
of partitions on the ring for all servers considering a server can be added or removed
Second, it is possible to have a non-uniform key distribution on the ring. For instance, if servers are mapped to positions listed in Figure 5-11, most of the keys are stored on server 2. However, server 1 and server 3 have no data.
Virtual nodes
As the number of virtual nodes increases, the distribution of keys becomes more balanced.
However, more spaces are needed to store data about virtual nodes.
This is a tradeoff, and we can tune the number of virtual nodes to fit our system requirements.
Find affected keys
located between s3 and s4 need to be redistributed to s4.
keys located between s0 and s1 must be redistributed to s2.文章来源:https://www.toymoban.com/news/detail-733275.html
The benefits of consistent hashing include:
• Minimized keys are redistributed when servers are added or removed.
• It is easy to scale horizontally because data are more evenly distributed.
• Mitigate hotspot key problem. Excessive access to a specific shard could cause server
overload. Imagine data for Katy Perry, Justin Bieber, and Lady Gaga all end up on the
same shard. Consistent hashing helps to mitigate the problem by distributing the data more
evenly.文章来源地址https://www.toymoban.com/news/detail-733275.html
到了这里,关于CHAPTER 5: DESIGN CONSISTENT HASHING的文章就介绍完了。如果您还想了解更多内容,请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章,希望大家以后多多支持TOY模板网!