[Cloud Computing] Dynamo (a highly available key-value storage system)

What is the paper trying to do?

This paper sets out to “present the design and implementation of Dynamo, a highly available key-value storage system” that is used by Amazon’s core services. By sacrificing consistency under certain failure scenarios, Dynamo provides an “always-on” experience in terms of availability. This allows it to keep serving requests through server failures, data center failures and network partitions. Additionally, Dynamo is incrementally scalable: nodes can be added or removed while the system is up and running.

Dynamo combines several techniques: extensive use of object versioning, application-assisted conflict resolution, and data partitioning and replication via consistent hashing. Additionally, during updates, a quorum-like technique and a decentralized replica synchronization protocol are used to maintain consistency among replicas.
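To make the partitioning and replication idea concrete, here is a minimal sketch of a consistent-hash ring in Python. It is an illustration under stated assumptions, not Dynamo's implementation: the class name `HashRing`, the method names and the node names are all hypothetical, each node occupies a single position on the ring (virtual nodes are left out), and a key is replicated on the first N distinct nodes found walking clockwise from the key's position. MD5 is used for the ring position because the paper describes hashing keys with MD5.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a 128-bit integer position on the ring using MD5."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class HashRing:
    """Toy consistent-hash ring (hypothetical names, one position per node)."""

    def __init__(self, nodes, n_replicas=3):
        self.n_replicas = n_replicas
        # Sorted list of (position, node) pairs.
        self._ring = sorted((ring_position(node), node) for node in nodes)

    def add_node(self, node):
        """Insert a node; only keys adjacent to its position are affected."""
        bisect.insort(self._ring, (ring_position(node), node))

    def preference_list(self, key):
        """Return the first n_replicas nodes clockwise from the key's position."""
        positions = [pos for pos, _ in self._ring]
        # Index of the first node whose position is larger than the key's.
        start = bisect.bisect_right(positions, ring_position(key))
        count = min(self.n_replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


# Usage: which nodes would hold a given key, before and after adding a node?
ring = HashRing(["node-a", "node-b", "node-c", "node-d", "node-e"], n_replicas=3)
before = ring.preference_list("cart:42")
ring.add_node("node-f")
after = ring.preference_list("cart:42")
print(before, after)
```

Adding a node changes a key's replica set only if the new node lands among the first few positions clockwise from that key; every other key keeps its replicas, which is what makes incremental scaling cheap.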

What do you think is the contribution of the paper?

The main contribution of this paper is “the evaluation of how different techniques can be combined to provide a single highly-available system”. The production use of Dynamo also serves as an example of how “decentralized techniques can be combined to provide a single highly-available system”.

What are its major strengths?

The main strength of Dynamo is that its client applications “can tune the values of N, R and W to achieve their desired levels of performance, availability and durability”.
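The trade-off behind N, R and W can be sketched with a little arithmetic. Here N is the number of replicas per key, R the number of replicas that must answer a read, and W the number that must acknowledge a write. The helper below is a hypothetical illustration, not part of Dynamo. It assumes a simplified strict-quorum model and ignores Dynamo's failure-handling mechanisms. It checks whether read and write sets are guaranteed to overlap (R + W > N, the condition the paper gives for a quorum-like system) and how many unresponsive replicas each operation can tolerate.

```python
def quorum_properties(n: int, r: int, w: int) -> dict:
    """Summarise an (N, R, W) setting under a simplified strict-quorum model."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must each be between 1 and N")
    return {
        # True when every read set shares at least one replica with every write set.
        "read_write_overlap": r + w > n,
        # Replicas that may be unresponsive while a read still gets R answers.
        "read_failures_tolerated": n - r,
        # Replicas that may be unresponsive while a write still gets W acks.
        "write_failures_tolerated": n - w,
    }


# (3, 2, 2) is the configuration the paper reports as common.
print(quorum_properties(3, 2, 2))
# Lower R and W favour latency and availability at the cost of overlap.
print(quorum_properties(3, 1, 1))
```

Lowering R or W makes the corresponding operation faster and more available, since fewer replicas have to respond, while raising W improves durability at the cost of write latency.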

However, there is a scalability problem in Dynamo. Dynamo uses a full membership model, in which each node is aware of the data hosted by its peers. To achieve this, “each node actively gossips the full routing table with other nodes in the system”. While this might work well with hundreds of nodes, it does not scale well to tens of thousands of nodes because the overhead of maintaining the routing table grows with the system size.
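A back-of-the-envelope sketch shows why the full membership model gets expensive. It assumes, purely for illustration, that the routing table holds one entry per node in the system; the function name is hypothetical and the real table layout is not described here. Under that assumption each node stores a table that grows linearly with the cluster, and the state kept across the whole cluster grows quadratically.

```python
def routing_table_entries(num_nodes: int) -> dict:
    """Count routing entries, assuming one entry per node in every node's table."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    per_node = num_nodes
    return {"per_node": per_node, "cluster_total": per_node * num_nodes}


print(routing_table_entries(100))     # {'per_node': 100, 'cluster_total': 10000}
print(routing_table_entries(10_000))  # {'per_node': 10000, 'cluster_total': 100000000}
```

Going from 100 to 10,000 nodes multiplies the per-node table by 100 and the cluster-wide state by 10,000, and every membership change has to be gossiped into all of those tables.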
