[Redis] Redis Sentinel
1) Overview
- A system designed to help manage Redis instances.
- Its primary purpose is to provide high availability through monitoring, notification and automatic failover.
- It does this by monitoring master and slave nodes. When the master node goes down, the Sentinels coordinate to promote a slave node to master.
2) 4 main tasks
2.1) Monitoring
Checks whether your master and slave instances are working as expected.
2.2) Notification
Notifies another program or the system administrator, via an API, when something goes wrong with one of the monitored instances.
2.3) Automatic Failover
On master failure, Sentinel promotes one of the slaves to master, then reconfigures the remaining slaves to use the new master.
2.4) Configuration Provider
Sentinel acts as a source of authority for client service discovery. Clients connect to Sentinels to ask for the address of the current Redis master responsible for a given service. If a failover occurs, Sentinels report the new address.
3) Distributed Nature
Sentinel is a distributed system, meaning it is designed to run in a configuration where multiple Sentinel processes work together. The advantages of having multiple processes cooperate are:
3.1) Failure Detection
Multiple Sentinels must agree that a given master is no longer available before it is treated as failed. This lowers the probability of false positives.
3.2) No Single Point of Failure
Sentinel works even if not all Sentinel processes are working, making the system robust against failures. There is no fun in having a failover system which is itself a single point of failure, after all.
4) Quorum
- The quorum is the number of Sentinels that need to agree that the master is not reachable in order to mark it as failing and start a failover procedure.
- However, the quorum is only used to detect the failure. To actually perform the failover, one of the Sentinels must be elected leader and be authorized to proceed. This only happens with the votes of a majority of the Sentinel processes.
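The difference between the quorum and the majority can be sketched in a few lines of Python. This is a simplified model, not Sentinel's actual implementation: the function names are made up for illustration, and it assumes every reachable Sentinel agrees that the master is down and votes for the same leader.

```python
def can_mark_master_down(agreeing: int, quorum: int) -> bool:
    """Failure detection: at least `quorum` Sentinels report the master unreachable."""
    return agreeing >= quorum


def can_authorize_failover(voting: int, total: int) -> bool:
    """Failover authorization: a strict majority of all known Sentinels must vote."""
    return voting > total // 2


def failover_possible(total: int, quorum: int, reachable: int) -> bool:
    # Simplifying assumption: all reachable Sentinels agree and vote for one leader.
    return (can_mark_master_down(reachable, quorum)
            and can_authorize_failover(reachable, total))


# 5 Sentinels, quorum 2, only 2 reachable: failure is detected, but no majority.
print(failover_possible(total=5, quorum=2, reachable=2))  # False
# Same setup with 3 reachable Sentinels: detection and majority both succeed.
print(failover_possible(total=5, quorum=2, reachable=3))  # True
```

In this model a two-Sentinel deployment can never fail over once one Sentinel is lost, which is one way to see why at least three Sentinels are recommended.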
5) Important Tips
- You should have at least three Sentinel instances.
- The Redis client must have Sentinel support.
- Popular client libraries have Sentinel support, but not all.
- Redis Sentinel is a specific execution mode of the Redis server itself.
- The Sentinel API provides information on the running instances (example commands: SENTINEL masters, SENTINEL slaves <master name>).
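To illustrate what "Sentinel support" in a client roughly involves, here is a minimal Python sketch of the lookup step: ask each known Sentinel in turn for the current master address (the SENTINEL get-master-addr-by-name command) and use the first answer. The `discover_master` helper and the injected `ask` callable are hypothetical names for this sketch, and the addresses are made up. A real client library such as redis-py ships its own implementation in `redis.sentinel.Sentinel`.

```python
from typing import Callable, Iterable, Optional, Tuple

Address = Tuple[str, int]


def discover_master(
    sentinels: Iterable[Address],
    master_name: str,
    ask: Callable[[Address, str], Optional[Address]],
) -> Address:
    """Return the master address reported by the first Sentinel that answers."""
    for sentinel in sentinels:
        try:
            reply = ask(sentinel, master_name)
        except OSError:
            # This Sentinel is unreachable: try the next one.
            continue
        if reply is not None:
            return reply
    raise LookupError(f"no Sentinel returned an address for {master_name!r}")


def fake_ask(sentinel: Address, master_name: str) -> Optional[Address]:
    # Stand-in for a network call, so the sketch runs without a Redis server.
    if sentinel == ("10.0.0.1", 26379):
        raise ConnectionError("sentinel down")
    return ("10.0.0.20", 6379)


sentinels = [("10.0.0.1", 26379), ("10.0.0.2", 26379)]
print(discover_master(sentinels, "mymaster", fake_ask))  # ('10.0.0.20', 6379)
```

Passing the query function in as a parameter keeps the lookup logic separate from the network code. After a failover, repeating the same lookup returns the new master address.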
6) Deployment Tips
- The three Sentinel instances should be placed on computers or virtual machines that are expected to fail independently, for example different physical servers or virtual machines running in different availability zones.
- A Sentinel + Redis distributed system does not guarantee that acknowledged writes are retained during failures, since Redis uses asynchronous replication. However, there are ways to deploy Sentinel that limit the window for losing writes to certain moments, while other ways of deploying it are less safe.
- No HA setup is safe unless you test it from time to time in development environments or, even better, in production. You may have a misconfiguration that only becomes apparent when it’s too late (at 3am when your master stops working).
- Sentinel should be mixed with care with Docker or other forms of Network Address Translation or port mapping: Docker performs port remapping, which breaks Sentinel's auto-discovery of other Sentinel processes and of the list of slaves for a master.
7) Running Sentinel
A configuration file is mandatory when running Sentinel.
To run Sentinel, use the redis-sentinel executable from the command line:
redis-sentinel /path/to/sentinel.conf
Or use the redis-server executable in Sentinel mode:
redis-server /path/to/sentinel.conf --sentinel
7.1) Config file
- You only need to specify the masters to monitor, giving each separate master a different name. There is no need to specify slaves, because they are auto-discovered.
- The configuration file is rewritten every time a slave is promoted to master during a failover and every time a new Sentinel is discovered.
7.1.1) Notable Settings
- down-after-milliseconds: the time (in ms) an instance must be unreachable before a Sentinel considers it down.
- parallel-syncs: the number of slaves that can be reconfigured to use the new master at the same time after a failover.
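To show how these settings look in a configuration file, here is a small Python helper that renders a minimal sentinel.conf. The helper name `render_sentinel_conf` is hypothetical, and the values (master name mymaster, address 127.0.0.1:6379, quorum 2, 30000 ms, 1 parallel sync) are only illustrative. Treat it as a sketch to adapt, not a recommended production configuration.

```python
def render_sentinel_conf(
    name: str,
    host: str,
    port: int,
    quorum: int,
    down_after_ms: int = 30000,
    parallel_syncs: int = 1,
) -> str:
    """Render the Sentinel directives for one monitored master."""
    lines = [
        # sentinel monitor <master-name> <ip> <port> <quorum>
        f"sentinel monitor {name} {host} {port} {quorum}",
        # How long the master must be unreachable before this Sentinel marks it down.
        f"sentinel down-after-milliseconds {name} {down_after_ms}",
        # How many slaves may resync with the new master at the same time.
        f"sentinel parallel-syncs {name} {parallel_syncs}",
    ]
    return "\n".join(lines) + "\n"


print(render_sentinel_conf("mymaster", "127.0.0.1", 6379, 2), end="")
```

The printed text can be saved as sentinel.conf and passed to redis-sentinel. Only the master is listed, since slaves and other Sentinels are auto-discovered.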