[Redis] Redis Cluster

1) Overview

Provides better Scalability and Load Balancing - Redis Cluster allows your Redis data to be automatically sharded across multiple Redis nodes.
High Availability - Cluster provides ability to continue operations when a subset of the nodes experience failures or unable to communicate with the rest of the cluster; however, large scale failures may stop the operation.

2) Data Sharding

Data Sharding is a method to break up a big database into smaller parts.
The reason for data sharding is that, after a certain scale point, it is cheaper and more practical to scale horizontally (by adding more machines) than to grow it vertically (by adding buffier servers or adding more CPU/ram to servers).

3) How it works?

3.1) Redis Cluster TCP ports

Unlike sentinel, there is no dedicated monitoring. Instead, every cluster node have 2 TCP connections open. The first one is a standard redis TCP used to serve clients. Another is a cluster bus port (node-to-node communication channel using binary protocol), which is obtained by adding 10,000 to the standard redis TCP port number. This cluster bus is used by nodes for failure detection, configuration update, failover authorization and so forth.

Clients should never try to communicate with cluster bus port, but always with normal Redis command port. However, remember to keep these 2 ports open on your firewall for Redis cluster to communicate properly.

3.2) Redis Cluster Sharding

In a redis cluster system, data is spread across master nodes via hashing. The term used is hash slot. There are 16384 hash slots across master nodes, and the algorithm used is CRC16 of the key modulo 16384.

When we add a new node, some of the hash slots from the existing nodes will be moved to the new node. When we remove a node, the opposite happens. Moving hashslots do not require stop operations. Therefore, adding, removing or changing the percentage of hash slots held by each node does not required downtime.

3.3) Redis Master-Slave Model

A master node can have multiple slaves.

Slave node replicates master node. When master node fails, slave node is promoted as new master and continues to operate. However, if all master and slave nodes fail at the same time, then it will stop to operate.

3.4) Consistency

3.4.1) Asynchronous Replication

Redis is not able to provide strong consistency; meaning there are scenarios where Redis Cluster loses writes even though they are acknowledged by the system to the client.

The following is how the replication works:

Client writes to master A.
Master A replies OK to client
Master A propagates to write to its slaves A1, A2 and A3.

As you can see, master A does not wait for acknowledgement from its slaves (A1-3) before replying to the client. If master A crashes before it has completed writing to its slaves, the write is lost because the acknowledgement is sent to the client but the slave does not have that data.

This is similar to what happens to most databases. It is a tradeoff between performance and consistency.

3.4.2) Synchronous Replication

Redis cluster has support for synchronous writes; if really needed.

This is implemented via the WAIT command.

This makes losing writes a lot less likely – but strong consistency is not achieved even with synchronous replication. Under more complicated scenarios, it is possible for a slave that was not able to receive the write to be elected as the master.

3.4.3) Node timeout

Node timeout – the amount of time for a master node to fail to sense the other master nodes, and considers itself failing.

When is this useful? A notable scenario where Redis cluster loses writes is a client is isolated with a minority instances including at least 1 master. This is possible during network partitions.

For example: imagine we have node clusters that consists of 3 master nodes (A, B, C) and their respective slave nodes (A1, B1, C1). There is also a client called Z1.

After a partition occurs, it is possible that in one side of the partition, we have B, C, A1, B1, C1, and on the other side we have A and Z1. Z1 is still able to write to A because A is a master node, and thus, accepts the write. If the partition heals in a very short time, the cluster continues normally. However, if the node timeout is reached, master node A stops accepting writes and A1 is elected as new master. The write to master A is now lost.

4) Deployment

4.1) Configurations

Below are some notable settings in redis.conf file.

cluster-enabled <yes/no>:

If yes, enables Redis cluster in a specific redis instance. If no, starts redis as standalone.

cluster-config-file <filename>:

This is not user-editable, but automatically updated by cluster node.

cluster-node-time <ms>:

explained early.

cluster-slave-validity-factor <factor>:

If set to zero, a slave will always try to fail over a master, regardless of the time the link between master and slave is disconnected. If value is positive, a maximum disconnected time is calculated as “node timeout x factor”. If node is a slave, it will start a failover if master link was disconnected for more than the disconnected time. For example, if node timeout is set to 5 seconds, the validity factor is set to 10, a slaves disconnected from the master for more than 50 seconds will try to failover its master. (The slave will be promoted as the master node). Note that any value different from zero may result in Redis Cluster to be unavailable after a master failure if there is no slave able to failover it. In that case, the cluster will return back available only when the original master rejoins the cluster.

cluster-migration-barrier <count>:

minimum number of slaves a master will remain connected with.

5) Overview of more advanced concepts

Below are summary of more advanced concepts. Please do more research before proceeding with just the information below.

5.1) Resharding cluster

Resharding means to move hash slots from a set of nodes to another set of nodes.

The cmd command looks like:

redis-cli --cluster reshard <IP>

You only need to specify a single node, redis-cli will find the other nodes automatically.

5.2) Testing a Failover

We can trigger a failover by crashing a master.

The cmd command looks like:

redis-cli –p <port> debug segfault

5.3) Manual Failover

Sometimes, it is useful to force a failover without actually causing any problem on a master. For example, we might want to make a master into a slave with minimal impact.

They are safer than actual failovers because they avoid data loss in the process – by switching to the new master only when the new master processed all the replication stream from the old one.

5.4) Adding a new node.

Adding a new node is basically the process of adding an empty node and moving some data into it (if it is a new master), or to tell it to setup as replica of a known master (as a slave).

5.5) Removing node

It is very simple to remove a slave node. However, to remove a master node, you must reshard the master node first. Another option is to manually failover the master and remove it after the slave becomes the new master.

5.6) Replica Migration

It is possible to reconfigure a slave to replicate with a different master at any time.

Note that you can only tell the cluster to assign slave node to a particular master; and not which master's slave node to remove from.

The cluster will try to migrate a replica from the master that has the greatest number of replicas at the given moment. To benefit from replica migration, you have to just add a few more replicas to a single master in your cluster, it does not matter what master.

There is a configuration parameter that controls this – “cluster-migration-barrier” – in redis.conf file.

5.7) Upgrading nodes in a Redis Cluster

Upgrading slave nodes is easy. Stop, Upgrade, Start.

Upgrading master is more complicated. The suggested procedures are:

Use cluster failover to trigger manual failover of master to its slaves.
Wait for master to turn into slave
Upgrade node as you do for slaves
If you want master to be the node you upgraded, trigger manual failover again.

Search This Blog

NotAfraidOfWong