Skip to main content

Using Sentinel

KeyDB has options for using Sentinel as derived within Redis, however with the Active-Replication options provided, Sentinel is not necessary and can be more complicated to use. However if you are migrating over your Redis Project then KeyDB will still work with your Sentinel instance setup.

KeyDB Sentinel Documentation#

KeyDB Sentinel provides high availability for KeyDB. In practical terms this means that using Sentinel you can create a KeyDB deployment that resists without human intervention to certain kind of failures.

KeyDB Sentinel also provides other collateral tasks such as monitoring, notifications and acts as a configuration provider for clients.

This is the full list of Sentinel capabilities at a macroscopical level (i.e. the big picture):

  • Monitoring. Sentinel constantly checks if your master and slave instances are working as expected.
  • Notification. Sentinel can notify the system administrator, another computer programs, via an API, that something is wrong with one of the monitored KeyDB instances.
  • Automatic failover. If a master is not working as expected, Sentinel can start a failover process where a slave is promoted to master, the other additional slaves are reconfigured to use the new master, and the applications using the KeyDB server informed about the new address to use when connecting.
  • Configuration provider. Sentinel acts as a source of authority for clients service discovery: clients connect to Sentinels in order to ask for the address of the current KeyDB master responsible for a given service. If a failover occurs, Sentinels will report the new address.

Distributed nature of Sentinel#

KeyDB Sentinel is a distributed system:

Sentinel itself is designed to run in a configuration where there are multiple Sentinel processes cooperating together. The advantage of having multiple Sentinel processes cooperating are the following:

  1. Failure detection is performed when multiple Sentinels agree about the fact a given master is no longer available. This lowers the probability of false positives.
  2. Sentinel works even if not all the Sentinel processes are working, making the system robust against failures. There is no fun in having a fail over system which is itself a single point of failure, after all.

The sum of Sentinels, KeyDB instances (masters and slaves) and clients connecting to Sentinel and KeyDB, are also a larger distributed system with specific properties. In this document concepts will be introduced gradually starting from basic information needed in order to understand the basic properties of Sentinel, to more complex information (that are optional) in order to understand how exactly Sentinel works.

KeyDB Sentinel Spec#

Introduction#

KeyDB Sentinel is the name of the KeyDB high availability solution that's currently under development. It has nothing to do with KeyDB Cluster and is intended to be used by people that don't need KeyDB Cluster, but simply a way to perform automatic fail over when a master instance is not functioning correctly.

The plan is to provide a usable beta implementation of KeyDB Sentinel in a short time, preferably in mid July 2012.

In short this is what KeyDB Sentinel will be able to do:

  • Monitor master and slave instances to see if they are available.
  • Promote a slave to master when the master fails.
  • Modify clients configurations when a slave is elected.
  • Inform the system administrator about incidents using notifications.

So the three different roles of KeyDB Sentinel can be summarized in the following three big aspects:

  • Monitoring.
  • Notification.
  • Automatic failover.

The following document explains what is the design of KeyDB Sentinel in order to accomplish this goals.

KeyDB Sentinel idea#

The idea of KeyDB Sentinel is to have multiple "monitoring devices" in different places of your network, monitoring the KeyDB master instance.

However this independent devices can't act without agreement with other sentinels.

Once a KeyDB master instance is detected as failing, for the failover process to start, the sentinel must verify that there is a given level of agreement.

The amount of sentinels, their location in the network, and the configured quorum, select the desired behavior among many possibilities.

KeyDB Sentinel does not use any proxy: clients reconfiguration is performed running user-provided executables (for instance a shell script or a Python program) in a user setup specific way.

In what form it will be shipped#

KeyDB Sentinel is just a special mode of the keydb-server executable.

If the keydb-server is called with "keydb-sentinel" as argv[0] (for instance using a symbolic link or copying the file), or if --sentinel option is passed, the KeyDB instance starts in sentinel mode and will only understand sentinel related commands. All the other commands will be refused.

The whole implementation of sentinel will live in a separated file sentinel.c with minimal impact on the rest of the code base. However this solution allows to use all the facilities already implemented inside KeyDB without any need to reimplement them or to maintain a separated code base for KeyDB Sentinel.

Sentinels networking#

All the sentinels take persistent connections with:

  • The monitored masters.
  • All its slaves, that are discovered using the master's INFO output.
  • All the other Sentinels connected to this master, discovered via Pub/Sub.

Sentinels use the KeyDB protocol to talk with each other, and to reply to external clients.

KeyDB Sentinels export a SENTINEL command. Subcommands of the SENTINEL command are used in order to perform different actions.

For instance the SENTINEL masters command enumerates all the monitored masters and their states. However Sentinels can also reply to the PING command as a normal KeyDB instance, so that it is possible to monitor a Sentinel considering it a normal KeyDB instance.

The list of networking tasks performed by every sentinel is the following:

  • A Sentinel PUBLISH its presence using the master Pub/Sub multiple times every five seconds.
  • A Sentinel accepts commands using a TCP port. By default the port is 26379.
  • A Sentinel constantly monitors masters, slaves, other sentinels sending PING commands.
  • A Sentinel sends INFO commands to the masters and slaves every ten seconds in order to take a fresh list of connected slaves, the state of the master, and so forth.
  • A Sentinel monitors the sentinel Pub/Sub "hello" channel in order to discover newly connected Sentinels, or to detect no longer connected Sentinels. The channel used is __sentinel__:hello.

Sentinels discovering#

To make the configuration of sentinels as simple as possible every sentinel broadcasts its presence using the KeyDB master Pub/Sub functionality.

Every sentinel is subscribed to the same channel, and broadcast information about its existence to the same channel, including the Run ID of the Sentinel, and the IP address and port where it is listening for commands.

Every sentinel maintains a list of other sentinels Run ID, IP and port. A sentinel that does no longer announce its presence using Pub/Sub for too long time is removed from the list, assuming the Master appears to be working well. In that case a notification is delivered to the system administrator.

Detection of failing masters#

An instance is not available from the point of view of KeyDB Sentinel when it is no longer able to reply to the PING command correctly for longer than the specified number of seconds, consecutively.

For a PING reply to be considered valid, one of the following conditions should be true:

  • PING replied with +PONG.
  • PING replied with -LOADING error.
  • PING replied with -MASTERDOWN error.

What is not considered an acceptable reply:

  • PING replied with -BUSY error.
  • PING replied with -MISCONF error.
  • PING reply not received after more than a specified number of milliseconds.

PING should never reply with a different error code than the ones listed above but any other error code is considered an acceptable reply by KeyDB Sentinel.

Handling of -BUSY state#

The -BUSY error is returned when a script is running for more time than the configured script time limit. When this happens before triggering a fail over KeyDB Sentinel will try to send a "SCRIPT KILL" command, that will only succeed if the script was read-only.

Subjectively down and Objectively down#

From the point of view of a Sentinel there are two different error conditions for a master:

  • Subjectively Down (aka S_DOWN) means that a master is down from the point of view of a Sentinel.
  • Objectively Down (aka O_DOWN) means that a master is subjectively down from the point of view of enough Sentinels to reach the configured quorum for that master.

How Sentinels agree to mark a master O_DOWN.#

Once a Sentinel detects that a master is in S_DOWN condition it starts to send other sentinels a SENTINEL is-master-down-by-addr request every second. The reply is stored inside the state that every Sentinel takes in memory.

Ten times every second a Sentinel scans the state and checks if there are enough Sentinels thinking that a master is down (this is not specific for this operation, most state checks are performed with this frequency).

If this Sentinel has already an S_DOWN condition for this master, and there are enough other sentinels that recently reported this condition (the validity time is currently set to 5 seconds), then the master is marked as O_DOWN (Objectively Down).

Note that the O_DOWN state is not propagated among Sentinels. Every single Sentinel can reach independently this state.

The SENTINEL is-master-down-by-addr command#

Sentinels ask other Sentinels for the state of a master from their local point of view using the SENTINEL is-master-down-by-addr command. This command replies with a boolean value (in the form of a 0 or 1 integer reply, as a first element of a multi bulk reply).

However in order to avoid false positives, the command acts in the following way:

  • If the specified ip and port is not known, 0 is returned.
  • If the specified ip and port are found but don't belong to a Master instance, 0 is returned.
  • If the Sentinel is in TILT mode (see later in this document) 0 is returned.
  • The value of 1 is returned only if the instance is known, is a master, is flagged S_DOWN and the Sentinel is in TILT mode.

Duplicate Sentinels removal#

In order to reach the configured quorum we absolutely want to make sure that the quorum is reached by different physical Sentinel instances. Under no circumstance we should get agreement from the same instance that for some reason appears to be two or multiple distinct Sentinel instances.

This is enforced by an aggressive removal of duplicated Sentinels: every time a Sentinel sends a message in the Hello Pub/Sub channel with its address and runid, if we can't find a perfect match (same runid and address) inside the Sentinels table for that master, we remove any other Sentinel with the same runid OR the same address. And later add the new Sentinel.

For instance if a Sentinel instance is restarted, the Run ID will be different, and the old Sentinel with the same IP address and port pair will be removed.

Starting the failover: Leaders and Observers#

The fact that a master is marked as O_DOWN is not enough to star the failover process. What Sentinel should start the failover is also to be decided.

Also Sentinels can be configured in two ways: only as monitors that can't perform the fail over, or as Sentinels that can start the failover.

What is desirable is that only a Sentinel will start the failover process, and this Sentinel should be selected among the Sentinels that are allowed to perform the failover.

In Sentinel there are two roles during a fail over:

  • The Leader Sentinel is the one selected to perform the failover.
  • The Observers Sentinels are the other sentinels just following the failover process without doing active operations.

So the condition to start the failover is:

  • A Master in O_DOWN condition.
  • A Sentinel that is elected Leader.

Leader Sentinel election#

The election process works as follows:

  • Every Sentinel with a master in O_DOWN condition updates its internal state with frequency of 10 HZ to refresh what is the Subjective Leader from its point of view.

A Subjective Leader is selected in this way by every sentinel.

  • Every Sentinel we know about a given master, that is reachable (no S_DOWN state), that is allowed to perform the failover (this Sentinel-specific configuration is propagated using the Hello channel), is a possible candidate.
  • Among all the possible candidates, the one with lexicographically smaller Run ID is selected.

Every time a Sentinel replies with to the MASTER is-sentinel-down-by-addr command it also replies with the Run ID of its Subjective Leader.

Every Sentinel with a failing master (O_DOWN) checks its subjective leader and the subjective leaders of all the other Sentinels with a frequency of 10 HZ, and will flag itself as the Leader if the following conditions happen:

  • It is the Subjective Leader of itself.
  • At least N-1 other Sentinels that see the master as down, and are reachable, also think that it is the Leader. With N being the quorum configured for this master.
  • At least 50% + 1 of all the Sentinels involved in the voting process (that are reachable and that also see the master as failing) should agree on the Leader.

So for instance if there are a total of three sentinels, the master is failing, and all the three sentinels are able to communicate (no Sentinel is failing) and the configured quorum for this master is 2, a Sentinel will feel itself an Objective Leader if at least it and another Sentinel is agreeing that it is the subjective leader.

Once a Sentinel detects that it is the objective leader, it flags the master with FAILOVER_IN_PROGRESS and IM_THE_LEADER flags, and starts the failover process in SENTINEL_FAILOVER_DELAY (5 seconds currently) plus a random additional time between 0 milliseconds and 10000 milliseconds.

During that time we ask INFO to all the slaves with an increased frequency of one time per second (usually the period is 10 seconds). If a slave is turned into a master in the meantime the failover is suspended and the Leader clears the IM_THE_LEADER flag to turn itself into an observer.

Guarantees of the Leader election process#

As you can see for a Sentinel to become a leader the majority is not strictly required. A user can force the majority to be needed just setting the master quorum to, for instance, the value of 5 if there are a total of 9 sentinels.

However it is also possible to set the quorum to the value of 2 with 9 sentinels in order to improve the resistance to netsplits or failing Sentinels or other error conditions. In such a case the protection against race conditions (multiple Sentinels starting to perform the fail over at the same time) is given by the random delay used to start the fail over, and the continuous monitor of the slave instances to detect if another Sentinel (or a human) started the failover process.

Moreover the slave to promote is selected using a deterministic process to minimize the chance that two different Sentinels with full vision of the working slaves may pick two different slaves to promote.

However it is possible to easily imagine netsplits and specific configurations where two Sentinels may start to act as a leader at the same time, electing two different slaves as masters, in two different parts of the net that can't communicate. The KeyDB Sentinel user should evaluate the network topology and select an appropriate quorum considering his or her goals and the different trade offs.

How observers understand that the failover started#

An observer is just a Sentinel that does not believe to be the Leader, but still sees a master in O_DOWN condition.

The observer is still able to follow and update the internal state based on what is happening with the failover, but does not directly rely on the Leader to communicate with it to be informed by progresses. It simply observes the state of the slaves to understand what is happening.

Specifically the observers flags the master as FAILOVER_IN_PROGRESS if a slave attached to a master turns into a master (observers can see it in the INFO output). An observer will also consider the failover complete once all the other reachable slaves appear to be slaves of this slave that was turned into a master.

If a Slave is in FAILOVER_IN_PROGRESS and the failover is not progressing for too much time, and at the same time the other Sentinels start claiming that this Sentinel is the objective leader (because for example the old leader is no longer reachable), the Sentinel will flag itself as IM_THE_LEADER and will proceed with the failover.

Note: all the Sentinel state, including the subjective and objective leadership is a dynamic process that is continuously refreshed with period of 10 HZ. There is no "one time decision" step in Sentinel.

Selection of the Slave to promote#

If a master has multiple slaves, the slave to promote to master is selected checking the slave priority (a new configuration option of KeyDB instances that is propagated via INFO output), and picking the one with lower priority value (it is an integer similar to the one of the MX field of the DNS system). All the slaves that appears to be disconnected from the master for a long time are discarded (stale data).

If slaves with the same priority exist, the one with the lexicographically smaller Run ID is selected.

If there is no Slave to select because all the salves are failing the failover is not started at all. Instead if there is no Slave to select because the master never used to have slaves in the monitoring session, then the failover is performed nonetheless just calling the user scripts. However for this to happen a special configuration option must be set for that master (force-failover-without-slaves).

This is useful because there are configurations where a new Instance can be provisioned at IP protocol level by the script, but there are no attached slaves.

Fail over process#

The fail over process consists of the following steps:

  • 1) Turn the selected slave into a master using the SLAVEOF NO ONE command.
  • 2) Turn all the remaining slaves, if any, to slaves of the new master. This is done incrementally, one slave after the other, waiting for the previous slave to complete the synchronization process before starting with the next one.
  • 3) Call a user script to inform the clients that the configuration changed.
  • 4) Completely remove the old failing master from the table, and add the new master with the same name.

If Steps "1" fails, the fail over is aborted.

All the other errors are considered to be non-fatal.

TILT mode#

KeyDB Sentinel is heavily dependent on the computer time: for instance in order to understand if an instance is available it remembers the time of the latest successful reply to the PING command, and compares it with the current time to understand how old it is.

However if the computer time changes in an unexpected way, or if the computer is very busy, or the process blocked for some reason, Sentinel may start to behave in an unexpected way.

The TILT mode is a special "protection" mode that a Sentinel can enter when something odd is detected that can lower the reliability of the system. The Sentinel timer interrupt is normally called 10 times per second, so we expect that more or less 100 milliseconds will elapse between two calls to the timer interrupt.

What a Sentinel does is to register the previous time the timer interrupt was called, and compare it with the current call: if the time difference is negative or unexpectedly big (2 seconds or more) the TILT mode is entered (or if it was already entered the exit from the TILT mode postponed).

When in TILT mode the Sentinel will continue to monitor everything, but:

  • It stops acting at all.
  • It starts to reply negatively to SENTINEL is-master-down-by-addr requests as the ability to detect a failure is no longer trusted.

If everything appears to be normal for 30 second, the TILT mode is exited.

Sentinels monitoring other sentinels#

When a sentinel no longer advertises itself using the Pub/Sub channel for too much time (30 minutes more the configured timeout for the master), but at the same time the master appears to work correctly, the Sentinel is removed from the table of Sentinels for this master, and a notification is sent to the system administrator.

User provided scripts#

Sentinels can optionally call user-provided scripts to perform two tasks:

  • Inform clients that the configuration changed.
  • Notify the system administrator of problems.

The script to inform clients of a configuration change has the following parameters:

  • ip:port of the calling Sentinel.
  • old master ip:port.
  • new master ip:port.

The script to send notifications is called with the following parameters:

  • ip:port of the calling Sentinel.
  • The message to deliver to the system administrator is passed writing to the standard input.

Using the ip:port of the calling sentinel, scripts may call SENTINEL subcommands to get more info if needed.

Concrete implementations of notification scripts will likely use the "mail" command or some other command to deliver SMS messages, emails, tweets.

Implementations of the script to modify the configuration in web applications are likely to use HTTP GET requests to force clients to update the configuration, or any other sensible mechanism for the specific setup in use.

Setup examples#

Imaginary setup:

computer A runs the KeyDB master.
computer B runs the KeyDB slave and the client software.

In this naive configuration it is possible to place a single sentinel, with "minimal agreement" set to the value of one (no acknowledge from other sentinels needed), running on "B".

If "A" will fail the fail over process will start, the slave will be elected to master, and the client software will be reconfigured.

Imaginary setup:

computer A runs the KeyDB master
computer B runs the KeyDB slave
computer C,D,E,F,G are web servers acting as clients

In this setup it is possible to run five sentinels placed at C,D,E,F,G with "minimal agreement" set to 3.

In real production environments there is to evaluate how the different computers are networked together, and to check what happens during net splits in order to select where to place the sentinels, and the level of minimal agreement, so that a single arm of the network failing will not trigger a fail over.

In general if a complex network topology is present, the minimal agreement should be set to the max number of sentinels existing at the same time in the same network arm, plus one.

SENTINEL SUBCOMMANDS#

  • SENTINEL masters, provides a list of configured masters.
  • SENTINEL slaves <master name>, provides a list of slaves for the master with the specified name.
  • SENTINEL sentinels <master name>, provides a list of sentinels for the master with the specified name.
  • SENTINEL is-master-down-by-addr <ip> <port>, returns a two elements multi bulk reply where the first element is :0 or :1, and the second is the Subjective Leader for the failover.

TODO#

  • More detailed specification of user script error handling, including what return codes may mean, like 0: try again. 1: fatal error. 2: try again, and so forth.
  • More detailed specification of what happens when a user script does not return in a given amount of time.
  • Add a "push" notification system for configuration changes.
  • Document that for every master monitored the configuration specifies a name for the master that is reported by all the SENTINEL commands.
  • Make clear that we handle a single Sentinel monitoring multiple masters.

Quick Start#

Obtaining Sentinel#

The current version of Sentinel is called Sentinel 2. It is a rewrite of the initial Sentinel implementation using stronger and simpler to predict algorithms (that are explained in this documentation).

A stable release of KeyDB Sentinel is shipped since KeyDB 2.8.

New developments are performed in the unstable branch, and new features sometimes are back ported into the latest stable branch as soon as they are considered to be stable.

KeyDB Sentinel version 1, shipped with KeyDB 2.6, is deprecated and should not be used.

Running Sentinel#

If you are using the keydb-sentinel executable (or if you have a symbolic link with that name to the keydb-server executable) you can run Sentinel with the following command line:

keydb-sentinel /path/to/sentinel.conf

Otherwise you can use directly the keydb-server executable starting it in Sentinel mode:

keydb-server /path/to/sentinel.conf --sentinel

Both ways work the same.

However it is mandatory to use a configuration file when running Sentinel, as this file will be used by the system in order to save the current state that will be reloaded in case of restarts. Sentinel will simply refuse to start if no configuration file is given or if the configuration file path is not writable.

Sentinels by default run listening for connections to TCP port 26379, so for Sentinels to work, port 26379 of your servers must be open to receive connections from the IP addresses of the other Sentinel instances. Otherwise Sentinels can't talk and can't agree about what to do, so failover will never be performed.

Fundamental things to know about Sentinel before deploying#

  1. You need at least three Sentinel instances for a robust deployment.
  2. The three Sentinel instances should be placed into computers or virtual machines that are believed to fail in an independent way. So for example different physical servers or Virtual Machines executed on different availability zones.
  3. Sentinel + KeyDB distributed system does not guarantee that acknowledged writes are retained during failures, since KeyDB uses asynchronous replication. However there are ways to deploy Sentinel that make the window to lose writes limited to certain moments, while there are other less secure ways to deploy it.
  4. You need Sentinel support in your clients. Popular client libraries have Sentinel support, but not all.
  5. There is no HA setup which is safe if you don't test from time to time in development environments, or even better if you can, in production environments, if they work. You may have a misconfiguration that will become apparent only when it's too late (at 3am when your master stops working).
  6. Sentinel, Docker, or other forms of Network Address Translation or Port Mapping should be mixed with care: Docker performs port remapping, breaking Sentinel auto discovery of other Sentinel processes and the list of slaves for a master. Check the section about Sentinel and Docker later in this document for more information.

Configuring Sentinel#

The KeyDB source distribution contains a file called sentinel.conf that is a self-documented example configuration file you can use to configure Sentinel, however a typical minimal configuration file looks like the following:

sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
sentinel monitor resque 192.168.1.3 6380 4
sentinel down-after-milliseconds resque 10000
sentinel failover-timeout resque 180000
sentinel parallel-syncs resque 5

You only need to specify the masters to monitor, giving to each separated master (that may have any number of slaves) a different name. There is no need to specify slaves, which are auto-discovered. Sentinel will update the configuration automatically with additional information about slaves (in order to retain the information in case of restart). The configuration is also rewritten every time a slave is promoted to master during a failover and every time a new Sentinel is discovered.

The example configuration above, basically monitor two sets of KeyDB instances, each composed of a master and an undefined number of slaves. One set of instances is called mymaster, and the other resque.

The meaning of the arguments of sentinel monitor statements is the following:

sentinel monitor <master-group-name> <ip> <port> <quorum>

For the sake of clarity, let's check line by line what the configuration options mean:

The first line is used to tell KeyDB to monitor a master called mymaster, that is at address 127.0.0.1 and port 6379, with a quorum of 2. Everything is pretty obvious but the quorum argument:

  • The quorum is the number of Sentinels that need to agree about the fact the master is not reachable, in order for really mark the slave as failing, and eventually start a fail over procedure if possible.
  • However the quorum is only used to detect the failure. In order to actually perform a failover, one of the Sentinels need to be elected leader for the failover and be authorized to proceed. This only happens with the vote of the majority of the Sentinel processes.

So for example if you have 5 Sentinel processes, and the quorum for a given master set to the value of 2, this is what happens:

  • If two Sentinels agree at the same time about the master being unreachable, one of the two will try to start a failover.
  • If there are at least a total of three Sentinels reachable, the failover will be authorized and will actually start.

In practical terms this means during failures Sentinel never starts a failover if the majority of Sentinel processes are unable to talk (aka no failover in the minority partition).

Other Sentinel options#

The other options are almost always in the form:

sentinel <option_name> <master_name> <option_value>

And are used for the following purposes:

  • down-after-milliseconds is the time in milliseconds an instance should not be reachable (either does not reply to our PINGs or it is replying with an error) for a Sentinel starting to think it is down.
  • parallel-syncs sets the number of slaves that can be reconfigured to use the new master after a failover at the same time. The lower the number, the more time it will take for the failover process to complete, however if the slaves are configured to serve old data, you may not want all the slaves to re-synchronize with the master at the same time. While the replication process is mostly non blocking for a slave, there is a moment when it stops to load the bulk data from the master. You may want to make sure only one slave at a time is not reachable by setting this option to the value of 1.

Additional options are described in the rest of this document and documented in the example sentinel.conf file shipped with the KeyDB distribution.

All the configuration parameters can be modified at runtime using the SENTINEL SET command. See the Reconfiguring Sentinel at runtime section for more information.

Example Sentinel deployments#

Now that you know the basic information about Sentinel, you may wonder where you should place your Sentinel processes, how much Sentinel processes you need and so forth. This section shows a few example deployments.

We use ASCII art in order to show you configuration examples in a graphical format, this is what the different symbols means:

+--------------------+
| This is a computer |
| or VM that fails |
| independently. We |
| call it a "box" |
+--------------------+

We write inside the boxes what they are running:

+-------------------+
| KeyDB master M1 |
| KeyDB Sentinel S1 |
+-------------------+

Different boxes are connected by lines, to show that they are able to talk:

+-------------+ +-------------+
| Sentinel S1 |---------------| Sentinel S2 |
+-------------+ +-------------+

Network partitions are shown as interrupted lines using slashes:

+-------------+ +-------------+
| Sentinel S1 |------ // ------| Sentinel S2 |
+-------------+ +-------------+

Also note that:

  • Masters are called M1, M2, M3, ..., Mn.
  • Slaves are called R1, R2, R3, ..., Rn (R stands for replica).
  • Sentinels are called S1, S2, S3, ..., Sn.
  • Clients are called C1, C2, C3, ..., Cn.
  • When an instance changes role because of Sentinel actions, we put it inside square brackets, so [M1] means an instance that is now a master because of Sentinel intervention.

Note that we will never show setups where just two Sentinels are used, since Sentinels always need to talk with the majority in order to start a failover.

Example 1: just two Sentinels, DON'T DO THIS#

+----+ +----+
| M1 |---------| R1 |
| S1 | | S2 |
+----+ +----+
Configuration: quorum = 1
  • In this setup, if the master M1 fails, R1 will be promoted since the two Sentinels can reach agreement about the failure (obviously with quorum set to 1) and can also authorize a failover because the majority is two. So apparently it could superficially work, however check the next points to see why this setup is broken.
  • If the box where M1 is running stops working, also S1 stops working. The Sentinel running in the other box S2 will not be able to authorize a failover, so the system will become not available.

Note that a majority is needed in order to order different failovers, and later propagate the latest configuration to all the Sentinels. Also note that the ability to failover in a single side of the above setup, without any agreement, would be very dangerous:

+----+ +------+
| M1 |----//-----| [M1] |
| S1 | | S2 |
+----+ +------+

In the above configuration we created two masters (assuming S2 could failover without authorization) in a perfectly symmetrical way. Clients may write indefinitely to both sides, and there is no way to understand when the partition heals what configuration is the right one, in order to prevent a permanent split brain condition.

So please deploy at least three Sentinels in three different boxes always.

Example 2: basic setup with three boxes#

This is a very simple setup, that has the advantage to be simple to tune for additional safety. It is based on three boxes, each box running both a KeyDB process and a Sentinel process.

+----+
| M1 |
| S1 |
+----+
|
+----+ | +----+
| R2 |----+----| R3 |
| S2 | | S3 |
+----+ +----+
Configuration: quorum = 2

If the master M1 fails, S2 and S3 will agree about the failure and will be able to authorize a failover, making clients able to continue.

In every Sentinel setup, being KeyDB asynchronously replicated, there is always the risk of losing some write because a given acknowledged write may not be able to reach the slave which is promoted to master. However in the above setup there is an higher risk due to clients partitioned away with an old master, like in the following picture:

+----+
| M1 |
| S1 | <- C1 (writes will be lost)
+----+
|
/
/
+------+ | +----+
| [M2] |----+----| R3 |
| S2 | | S3 |
+------+ +----+

In this case a network partition isolated the old master M1, so the slave R2 is promoted to master. However clients, like C1, that are in the same partition as the old master, may continue to write data to the old master. This data will be lost forever since when the partition will heal, the master will be reconfigured as a slave of the new master, discarding its data set.

This problem can be mitigated using the following KeyDB replication feature, that allows to stop accepting writes if a master detects that is no longer able to transfer its writes to the specified number of slaves.

min-slaves-to-write 1
min-slaves-max-lag 10

With the above configuration (please see the self-commented keydb.conf example in the KeyDB distribution for more information) a KeyDB instance, when acting as a master, will stop accepting writes if it can't write to at least 1 slave. Since replication is asynchronous not being able to write actually means that the slave is either disconnected, or is not sending us asynchronous acknowledges for more than the specified max-lag number of seconds.

Using this configuration the old KeyDB master M1 in the above example, will become unavailable after 10 seconds. When the partition heals, the Sentinel configuration will converge to the new one, the client C1 will be able to fetch a valid configuration and will continue with the new master.

However there is no free lunch. With this refinement, if the two slaves are down, the master will stop accepting writes. It's a trade off.

Example 3: Sentinel in the client boxes#

Sometimes we have only two KeyDB boxes available, one for the master and one for the slave. The configuration in the example 2 is not viable in that case, so we can resort to the following, where Sentinels are placed where clients are:

+----+ +----+
| M1 |----+----| R1 |
| | | | |
+----+ | +----+
|
+------------+------------+
| | |
| | |
+----+ +----+ +----+
| C1 | | C2 | | C3 |
| S1 | | S2 | | S3 |
+----+ +----+ +----+
Configuration: quorum = 2

In this setup, the point of view Sentinels is the same as the clients: if a master is reachable by the majority of the clients, it is fine. C1, C2, C3 here are generic clients, it does not mean that C1 identifies a single client connected to KeyDB. It is more likely something like an application server, a Rails app, or something like that.

If the box where M1 and S1 are running fails, the failover will happen without issues, however it is easy to see that different network partitions will result in different behaviors. For example Sentinel will not be able to setup if the network between the clients and the KeyDB servers will get disconnected, since the KeyDB master and slave will be both not available.

Note that if C3 gets partitioned with M1 (hardly possible with the network described above, but more likely possible with different layouts, or because of failures at the software layer), we have a similar issue as described in Example 2, with the difference that here we have no way to break the symmetry, since there is just a slave and master, so the master can't stop accepting queries when it is disconnected from its slave, otherwise the master would never be available during slave failures.

So this is a valid setup but the setup in the Example 2 has advantages such as the HA system of KeyDB running in the same boxes as KeyDB itself which may be simpler to manage, and the ability to put a bound on the amount of time a master into the minority partition can receive writes.

Example 4: Sentinel client side with less than three clients#

The setup described in the Example 3 cannot be used if there are not enough three boxes in the client side (for example three web servers). In this case we need to resort to a mixed setup like the following:

+----+ +----+
| M1 |----+----| R1 |
| S1 | | | S2 |
+----+ | +----+
|
+------+-----+
| |
| |
+----+ +----+
| C1 | | C2 |
| S3 | | S4 |
+----+ +----+
Configuration: quorum = 3

This is similar to the setup in Example 3, but here we run four Sentinels in the four boxes we have available. If the master M1 becomes not available the other three Sentinels will perform the failover.

In theory this setup works removing the box where C2 and S4 are running, and setting the quorum to 2. However it is unlikely that we want HA in the KeyDB side without having high availability in our application layer.

Sentinel, Docker, NAT, and possible issues#

Docker uses a technique called port mapping: programs running inside Docker containers may be exposed with a different port compared to the one the program believes to be using. This is useful in order to run multiple containers using the same ports, at the same time, in the same server.

Docker is not the only software system where this happens, there are other Network Address Translation setups where ports may be remapped, and sometimes not ports but also IP addresses.

Remapping ports and addresses creates issues with Sentinel in two ways:

  1. Sentinel auto-discovery of other Sentinels no longer works, since it is based on hello messages where each Sentinel announce at which port and IP address they are listening for connection. However Sentinels have no way to understand that an address or port is remapped, so it is announcing an information that is not correct for other Sentinels to connect.
  2. Slaves are listed in the INFO output of a KeyDB master in a similar way: the address is detected by the master checking the remote peer of the TCP connection, while the port is advertised by the slave itself during the handshake, however the port may be wrong for the same reason as exposed in point 1.

Since Sentinels auto detect slaves using masters INFO output information, the detected slaves will not be reachable, and Sentinel will never be able to failover the master, since there are no good slaves from the point of view of the system, so there is currently no way to monitor with Sentinel a set of master and slave instances deployed with Docker, unless you instruct Docker to map the port 1:1.

For the first problem, in case you want to run a set of Sentinel instances using Docker with forwarded ports (or any other NAT setup where ports are remapped), you can use the following two Sentinel configuration directives in order to force Sentinel to announce a specific set of IP and port:

sentinel announce-ip <ip>
sentinel announce-port <port>

Note that Docker has the ability to run in host networking mode (check the --net=host option for more information). This should create no issues since ports are not remapped in this setup.

A quick tutorial#

In the next sections of this document, all the details about Sentinel API, configuration and semantics will be covered incrementally. However for people that want to play with the system ASAP, this section is a tutorial that shows how to configure and interact with 3 Sentinel instances.

Here we assume that the instances are executed at port 5000, 5001, 5002. We also assume that you have a running KeyDB master at port 6379 with a slave running at port 6380. We will use the IPv4 loopback address 127.0.0.1 everywhere during the tutorial, assuming you are running the simulation on your personal computer.

The three Sentinel configuration files should look like the following:

port 5000
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1

The other two configuration files will be identical but using 5001 and 5002 as port numbers.

A few things to note about the above configuration:

  • The master set is called mymaster. It identifies the master and its slaves. Since each master set has a different name, Sentinel can monitor different sets of masters and slaves at the same time.
  • The quorum was set to the value of 2 (last argument of sentinel monitor configuration directive).
  • The down-after-milliseconds value is 5000 milliseconds, that is 5 seconds, so masters will be detected as failing as soon as we don't receive any reply from our pings within this amount of time.

Once you start the three Sentinels, you'll see a few messages they log, like:

+monitor master mymaster 127.0.0.1 6379 quorum 2

This is a Sentinel event, and you can receive this kind of events via Pub/Sub if you SUBSCRIBE to the event name as specified later.

Sentinel generates and logs different events during failure detection and failover.

Asking Sentinel about the state of a master#

The most obvious thing to do with Sentinel to get started, is check if the master it is monitoring is doing well:

$ keydb-cli -p 5000
127.0.0.1:5000> sentinel master mymaster
1) "name"
2) "mymaster"
3) "ip"
4) "127.0.0.1"
5) "port"
6) "6379"
7) "runid"
8) "953ae6a589449c13ddefaee3538d356d287f509b"
9) "flags"
10) "master"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "735"
19) "last-ping-reply"
20) "735"
21) "down-after-milliseconds"
22) "5000"
23) "info-refresh"
24) "126"
25) "role-reported"
26) "master"
27) "role-reported-time"
28) "532439"
29) "config-epoch"
30) "1"
31) "num-slaves"
32) "1"
33) "num-other-sentinels"
34) "2"
35) "quorum"
36) "2"
37) "failover-timeout"
38) "60000"
39) "parallel-syncs"
40) "1"

As you can see, it prints a number of information about the master. There are a few that are of particular interest for us:

  1. num-other-sentinels is 2, so we know the Sentinel already detected two more Sentinels for this master. If you check the logs you'll see the +sentinel events generated.
  2. flags is just master. If the master was down we could expect to see s_down or o_down flag as well here.
  3. num-slaves is correctly set to 1, so Sentinel also detected that there is an attached slave to our master.

In order to explore more about this instance, you may want to try the following two commands:

SENTINEL slaves mymaster
SENTINEL sentinels mymaster

The first will provide similar information about the slaves connected to the master, and the second about the other Sentinels.

Obtaining the address of the current master#

As we already specified, Sentinel also acts as a configuration provider for clients that want to connect to a set of master and slaves. Because of possible failovers or reconfigurations, clients have no idea about who is the currently active master for a given set of instances, so Sentinel exports an API to ask this question:

127.0.0.1:5000> SENTINEL get-master-addr-by-name mymaster
1) "127.0.0.1"
2) "6379"

Testing the failover#

At this point our toy Sentinel deployment is ready to be tested. We can just kill our master and check if the configuration changes. To do so we can just do:

keydb-cli -p 6379 DEBUG sleep 30

This command will make our master no longer reachable, sleeping for 30 seconds. It basically simulates a master hanging for some reason.

If you check the Sentinel logs, you should be able to see a lot of action:

  1. Each Sentinel detects the master is down with an +sdown event.
  2. This event is later escalated to +odown, which means that multiple Sentinels agree about the fact the master is not reachable.
  3. Sentinels vote a Sentinel that will start the first failover attempt.
  4. The failover happens.

If you ask again what is the current master address for mymaster, eventually we should get a different reply this time:

127.0.0.1:5000> SENTINEL get-master-addr-by-name mymaster
1) "127.0.0.1"
2) "6380"

So far so good... At this point you may jump to create your Sentinel deployment or can read more to understand all the Sentinel commands and internals.

Sentinel API#

Sentinel provides an API in order to inspect its state, check the health of monitored masters and slaves, subscribe in order to receive specific notifications, and change the Sentinel configuration at run time.

By default Sentinel runs using TCP port 26379 (note that 6379 is the normal KeyDB port). Sentinels accept commands using the KeyDB protocol, so you can use keydb-cli or any other unmodified KeyDB client in order to talk with Sentinel.

It is possible to directly query a Sentinel to check what is the state of the monitored KeyDB instances from its point of view, to see what other Sentinels it knows, and so forth. Alternatively, using Pub/Sub, it is possible to receive push style notifications from Sentinels, every time some event happens, like a failover, or an instance entering an error condition, and so forth.

Sentinel commands#

The following is a list of accepted commands, not covering commands used in order to modify the Sentinel configuration, which are covered later.

  • PING This command simply returns PONG.
  • SENTINEL masters Show a list of monitored masters and their state.
  • SENTINEL master <master name> Show the state and info of the specified master.
  • SENTINEL slaves <master name> Show a list of slaves for this master, and their state.
  • SENTINEL sentinels <master name> Show a list of sentinel instances for this master, and their state.
  • SENTINEL get-master-addr-by-name <master name> Return the ip and port number of the master with that name. If a failover is in progress or terminated successfully for this master it returns the address and port of the promoted slave.
  • SENTINEL reset <pattern> This command will reset all the masters with matching name. The pattern argument is a glob-style pattern. The reset process clears any previous state in a master (including a failover in progress), and removes every slave and sentinel already discovered and associated with the master.
  • SENTINEL failover <master name> Force a failover as if the master was not reachable, and without asking for agreement to other Sentinels (however a new version of the configuration will be published so that the other Sentinels will update their configurations).
  • SENTINEL ckquorum <master name> Check if the current Sentinel configuration is able to reach the quorum needed to failover a master, and the majority needed to authorize the failover. This command should be used in monitoring systems to check if a Sentinel deployment is ok.
  • SENTINEL flushconfig Force Sentinel to rewrite its configuration on disk, including the current Sentinel state. Normally Sentinel rewrites the configuration every time something changes in its state (in the context of the subset of the state which is persisted on disk across restart). However sometimes it is possible that the configuration file is lost because of operation errors, disk failures, package upgrade scripts or configuration managers. In those cases a way to to force Sentinel to rewrite the configuration file is handy. This command works even if the previous configuration file is completely missing.

Reconfiguring Sentinel at Runtime#

Starting with KeyDB version 2.8.4, Sentinel provides an API in order to add, remove, or change the configuration of a given master. Note that if you have multiple sentinels you should apply the changes to all to your instances for KeyDB Sentinel to work properly. This means that changing the configuration of a single Sentinel does not automatically propagates the changes to the other Sentinels in the network.

The following is a list of SENTINEL sub commands used in order to update the configuration of a Sentinel instance.

  • SENTINEL MONITOR <name> <ip> <port> <quorum> This command tells the Sentinel to start monitoring a new master with the specified name, ip, port, and quorum. It is identical to the sentinel monitor configuration directive in sentinel.conf configuration file, with the difference that you can't use an hostname in as ip, but you need to provide an IPv4 or IPv6 address.
  • SENTINEL REMOVE <name> is used in order to remove the specified master: the master will no longer be monitored, and will totally be removed from the internal state of the Sentinel, so it will no longer listed by SENTINEL masters and so forth.
  • SENTINEL SET <name> <option> <value> The SET command is very similar to the CONFIG SET command of KeyDB, and is used in order to change configuration parameters of a specific master. Multiple option / value pairs can be specified (or none at all). All the configuration parameters that can be configured via sentinel.conf are also configurable using the SET command.

The following is an example of SENTINEL SET command in order to modify the down-after-milliseconds configuration of a master called objects-cache:

SENTINEL SET objects-cache-master down-after-milliseconds 1000

As already stated, SENTINEL SET can be used to set all the configuration parameters that are settable in the startup configuration file. Moreover it is possible to change just the master quorum configuration without removing and re-adding the master with SENTINEL REMOVE followed by SENTINEL MONITOR, but simply using:

SENTINEL SET objects-cache-master quorum 5

Note that there is no equivalent GET command since SENTINEL MASTER provides all the configuration parameters in a simple to parse format (as a field/value pairs array).

Adding or removing Sentinels#

Adding a new Sentinel to your deployment is a simple process because of the auto-discover mechanism implemented by Sentinel. All you need to do is to start the new Sentinel configured to monitor the currently active master. Within 10 seconds the Sentinel will acquire the list of other Sentinels and the set of slaves attached to the master.

If you need to add multiple Sentinels at once, it is suggested to add it one after the other, waiting for all the other Sentinels to already know about the first one before adding the next. This is useful in order to still guarantee that majority can be achieved only in one side of a partition, in the chance failures should happen in the process of adding new Sentinels.

This can be easily achieved by adding every new Sentinel with a 30 seconds delay, and during absence of network partitions.

At the end of the process it is possible to use the command SENTINEL MASTER mastername in order to check if all the Sentinels agree about the total number of Sentinels monitoring the master.

Removing a Sentinel is a bit more complex: Sentinels never forget already seen Sentinels, even if they are not reachable for a long time, since we don't want to dynamically change the majority needed to authorize a failover and the creation of a new configuration number. So in order to remove a Sentinel the following steps should be performed in absence of network partitions:

  1. Stop the Sentinel process of the Sentinel you want to remove.
  2. Send a SENTINEL RESET * command to all the other Sentinel instances (instead of * you can use the exact master name if you want to reset just a single master). One after the other, waiting at least 30 seconds between instances.
  3. Check that all the Sentinels agree about the number of Sentinels currently active, by inspecting the output of SENTINEL MASTER mastername of every Sentinel.

Removing the old master or unreachable slaves#

Sentinels never forget about slaves of a given master, even when they are unreachable for a long time. This is useful, because Sentinels should be able to correctly reconfigure a returning slave after a network partition or a failure event.

Moreover, after a failover, the failed over master is virtually added as a slave of the new master, this way it will be reconfigured to replicate with the new master as soon as it will be available again.

However sometimes you want to remove a slave (that may be the old master) forever from the list of slaves monitored by Sentinels.

In order to do this, you need to send a SENTINEL RESET mastername command to all the Sentinels: they'll refresh the list of slaves within the next 10 seconds, only adding the ones listed as correctly replicating from the current master INFO output.

Pub/Sub Messages#

A client can use a Sentinel as it was a KeyDB compatible Pub/Sub server (but you can't use PUBLISH) in order to SUBSCRIBE or PSUBSCRIBE to channels and get notified about specific events.

The channel name is the same as the name of the event. For instance the channel named +sdown will receive all the notifications related to instances entering an SDOWN (SDOWN means the instance is no longer reachable from the point of view of the Sentinel you are querying) condition.

To get all the messages simply subscribe using PSUBSCRIBE *.

The following is a list of channels and message formats you can receive using this API. The first word is the channel / event name, the rest is the format of the data.

Note: where instance details is specified it means that the following arguments are provided to identify the target instance:

<instance-type> <name> <ip> <port> @ <master-name> <master-ip> <master-port>

The part identifying the master (from the @ argument to the end) is optional and is only specified if the instance is not a master itself.

  • +reset-master <instance details> -- The master was reset.
  • +slave <instance details> -- A new slave was detected and attached.
  • +failover-state-reconf-slaves <instance details> -- Failover state changed to reconf-slaves state.
  • +failover-detected <instance details> -- A failover started by another Sentinel or any other external entity was detected (An attached slave turned into a master).
  • +slave-reconf-sent <instance details> -- The leader sentinel sent the SLAVEOF command to this instance in order to reconfigure it for the new slave.
  • +slave-reconf-inprog <instance details> -- The slave being reconfigured showed to be a slave of the new master ip:port pair, but the synchronization process is not yet complete.
  • +slave-reconf-done <instance details> -- The slave is now synchronized with the new master.
  • -dup-sentinel <instance details> -- One or more sentinels for the specified master were removed as duplicated (this happens for instance when a Sentinel instance is restarted).
  • +sentinel <instance details> -- A new sentinel for this master was detected and attached.
  • +sdown <instance details> -- The specified instance is now in Subjectively Down state.
  • -sdown <instance details> -- The specified instance is no longer in Subjectively Down state.
  • +odown <instance details> -- The specified instance is now in Objectively Down state.
  • -odown <instance details> -- The specified instance is no longer in Objectively Down state.
  • +new-epoch <instance details> -- The current epoch was updated.
  • +try-failover <instance details> -- New failover in progress, waiting to be elected by the majority.
  • +elected-leader <instance details> -- Won the election for the specified epoch, can do the failover.
  • +failover-state-select-slave <instance details> -- New failover state is select-slave: we are trying to find a suitable slave for promotion.
  • no-good-slave <instance details> -- There is no good slave to promote. Currently we'll try after some time, but probably this will change and the state machine will abort the failover at all in this case.
  • selected-slave <instance details> -- We found the specified good slave to promote.
  • failover-state-send-slaveof-noone <instance details> -- We are trying to reconfigure the promoted slave as master, waiting for it to switch.
  • failover-end-for-timeout <instance details> -- The failover terminated for timeout, slaves will eventually be configured to replicate with the new master anyway.
  • failover-end <instance details> -- The failover terminated with success. All the slaves appears to be reconfigured to replicate with the new master.
  • switch-master <master name> <oldip> <oldport> <newip> <newport> -- The master new IP and address is the specified one after a configuration change. This is the message most external users are interested in.
  • +tilt -- Tilt mode entered.
  • -tilt -- Tilt mode exited.

Handling of -BUSY state#

The -BUSY error is returned by a KeyDB instance when a Lua script is running for more time than the configured Lua script time limit. When this happens before triggering a fail over KeyDB Sentinel will try to send a SCRIPT KILL command, that will only succeed if the script was read-only.

If the instance will still be in an error condition after this try, it will eventually be failed over.

Slaves priority#

KeyDB instances have a configuration parameter called slave-priority. This information is exposed by KeyDB slave instances in their INFO output, and Sentinel uses it in order to pick a slave among the ones that can be used in order to failover a master:

  1. If the slave priority is set to 0, the slave is never promoted to master.
  2. Slaves with a lower priority number are preferred by Sentinel.

For example if there is a slave S1 in the same data center of the current master, and another slave S2 in another data center, it is possible to set S1 with a priority of 10 and S2 with a priority of 100, so that if the master fails and both S1 and S2 are available, S1 will be preferred.

For more information about the way slaves are selected, please check the slave selection and priority section of this documentation.

Sentinel and KeyDB authentication#

When the master is configured to require a password from clients, as a security measure, slaves need to also be aware of this password in order to authenticate with the master and create the master-slave connection used for the asynchronous replication protocol.

This is achieved using the following configuration directives:

  • requirepass in the master, in order to set the authentication password, and to make sure the instance will not process requests for non authenticated clients.
  • masterauth in the slaves in order for the slaves to authenticate with the master in order to correctly replicate data from it.

When Sentinel is used, there is not a single master, since after a failover slaves may play the role of masters, and old masters can be reconfigured in order to act as slaves, so what you want to do is to set the above directives in all your instances, both masters and slaves.

This is also usually a sane setup since you don't want to protect data only in the master, having the same data accessible in the slaves.

However, in the uncommon case where you need a slave that is accessible without authentication, you can still do it by setting up a slave priority of zero, to prevent this slave from being promoted to master, and configuring in this slave only the masterauth directive, without using the requirepass directive, so that data will be readable by unauthenticated clients.

In order for sentinels to connect to KeyDB server instances when they are configured with requirepass, the Sentinel configuration must include the sentinel auth-pass directive, in the format:

sentinel auth-pass <master-group-name> <pass>

Configuring Sentinel instances with authentication#

You can also configure the Sentinel instance itself in order to require client authentication via the AUTH command, however this feature is only available starting with KeyDB 5.0.1.

In order to do so, just add the following configuration directive to all your Sentinel instances:

requirepass "your_password_here"

When configured this way, Sentinels will do two things:

  1. A password will be required from clients in order to send commands to Sentinels. This is obvious since this is how such configuration directive works in KeyDB in general.
  2. Moreover the same password configured to access the local Sentinel, will be used by this Sentinel instance in order to authenticate to all the other Sentinel instances it connects to.

This means that you will have to configure the same requirepass password in all the Sentinel instances. This way every Sentinel can talk with every other Sentinel without any need to configure for each Sentinel the password to access all the other Sentinels, that would be very impractical.

Before using this configuration make sure your client library is able to send the AUTH command to Sentinel instances.

Sentinel clients implementation#

Sentinel requires explicit client support, unless the system is configured to execute a script that performs a transparent redirection of all the requests to the new master instance (virtual IP or other similar systems). The topic of client libraries implementation is covered in the document Sentinel clients guidelines.

More advanced concepts#

In the following sections we'll cover a few details about how Sentinel work, without to resorting to implementation details and algorithms that will be covered in the final part of this document.

SDOWN and ODOWN failure state#

KeyDB Sentinel has two different concepts of being down, one is called a Subjectively Down condition (SDOWN) and is a down condition that is local to a given Sentinel instance. Another is called Objectively Down condition (ODOWN) and is reached when enough Sentinels (at least the number configured as the quorum parameter of the monitored master) have an SDOWN condition, and get feedback from other Sentinels using the SENTINEL is-master-down-by-addr command.

From the point of view of a Sentinel an SDOWN condition is reached when it does not receive a valid reply to PING requests for the number of seconds specified in the configuration as is-master-down-after-milliseconds parameter.

An acceptable reply to PING is one of the following:

  • PING replied with +PONG.
  • PING replied with -LOADING error.
  • PING replied with -MASTERDOWN error.

Any other reply (or no reply at all) is considered non valid. However note that a logical master that advertises itself as a slave in the INFO output is considered to be down.

Note that SDOWN requires that no acceptable reply is received for the whole interval configured, so for instance if the interval is 30000 milliseconds (30 seconds) and we receive an acceptable ping reply every 29 seconds, the instance is considered to be working.

SDOWN is not enough to trigger a failover: it only means a single Sentinel believes a KeyDB instance is not available. To trigger a failover, the ODOWN state must be reached.

To switch from SDOWN to ODOWN no strong consensus algorithm is used, but just a form of gossip: if a given Sentinel gets reports that a master is not working from enough Sentinels in a given time range, the SDOWN is promoted to ODOWN. If this acknowledge is later missing, the flag is cleared.

A more strict authorization that uses an actual majority is required in order to really start the failover, but no failover can be triggered without reaching the ODOWN state.

The ODOWN condition only applies to masters. For other kind of instances Sentinel doesn't require to act, so the ODOWN state is never reached for slaves and other sentinels, but only SDOWN is.

However SDOWN has also semantic implications. For example a slave in SDOWN state is not selected to be promoted by a Sentinel performing a failover.

Sentinels and Slaves auto discovery#

Sentinels stay connected with other Sentinels in order to reciprocally check the availability of each other, and to exchange messages. However you don't need to configure a list of other Sentinel addresses in every Sentinel instance you run, as Sentinel uses the KeyDB instances Pub/Sub capabilities in order to discover the other Sentinels that are monitoring the same masters and slaves.

This feature is implemented by sending hello messages into the channel named __sentinel__:hello.

Similarly you don't need to configure what is the list of the slaves attached to a master, as Sentinel will auto discover this list querying KeyDB.

  • Every Sentinel publishes a message to every monitored master and slave Pub/Sub channel __sentinel__:hello, every two seconds, announcing its presence with ip, port, runid.
  • Every Sentinel is subscribed to the Pub/Sub channel __sentinel__:hello of every master and slave, looking for unknown sentinels. When new sentinels are detected, they are added as sentinels of this master.
  • Hello messages also include the full current configuration of the master. If the receiving Sentinel has a configuration for a given master which is older than the one received, it updates to the new configuration immediately.
  • Before adding a new sentinel to a master a Sentinel always checks if there is already a sentinel with the same runid or the same address (ip and port pair). In that case all the matching sentinels are removed, and the new added.

Sentinel reconfiguration of instances outside the failover procedure#

Even when no failover is in progress, Sentinels will always try to set the current configuration on monitored instances. Specifically:

  • Slaves (according to the current configuration) that claim to be masters, will be configured as slaves to replicate with the current master.
  • Slaves connected to a wrong master, will be reconfigured to replicate with the right master.

For Sentinels to reconfigure slaves, the wrong configuration must be observed for some time, that is greater than the period used to broadcast new configurations.

This prevents Sentinels with a stale configuration (for example because they just rejoined from a partition) will try to change the slaves configuration before receiving an update.

Also note how the semantics of always trying to impose the current configuration makes the failover more resistant to partitions:

  • Masters failed over are reconfigured as slaves when they return available.
  • Slaves partitioned away during a partition are reconfigured once reachable.

The important lesson to remember about this section is: Sentinel is a system where each process will always try to impose the last logical configuration to the set of monitored instances.

Slave selection and priority#

When a Sentinel instance is ready to perform a failover, since the master is in ODOWN state and the Sentinel received the authorization to failover from the majority of the Sentinel instances known, a suitable slave needs to be selected.

The slave selection process evaluates the following information about slaves:

  1. Disconnection time from the master.
  2. Slave priority.
  3. Replication offset processed.
  4. Run ID.

A slave that is found to be disconnected from the master for more than ten times the configured master timeout (down-after-milliseconds option), plus the time the master is also not available from the point of view of the Sentinel doing the failover, is considered to be not suitable for the failover and is skipped.

In more rigorous terms, a slave whose the INFO output suggests to be disconnected from the master for more than:

(down-after-milliseconds * 10) + milliseconds_since_master_is_in_SDOWN_state

Is considered to be unreliable and is disregarded entirely.

The slave selection only considers the slaves that passed the above test, and sorts it based on the above criteria, in the following order.

  1. The slaves are sorted by slave-priority as configured in the keydb.conf file of the KeyDB instance. A lower priority will be preferred.
  2. If the priority is the same, the replication offset processed by the slave is checked, and the slave that received more data from the master is selected.
  3. If multiple slaves have the same priority and processed the same data from the master, a further check is performed, selecting the slave with the lexicographically smaller run ID. Having a lower run ID is not a real advantage for a slave, but is useful in order to make the process of slave selection more deterministic, instead of resorting to select a random slave.

KeyDB masters (that may be turned into slaves after a failover), and slaves, all must be configured with a slave-priority if there are machines to be strongly preferred. Otherwise all the instances can run with the default run ID (which is the suggested setup, since it is far more interesting to select the slave by replication offset).

A KeyDB instance can be configured with a special slave-priority of zero in order to be never selected by Sentinels as the new master. However a slave configured in this way will still be reconfigured by Sentinels in order to replicate with the new master after a failover, the only difference is that it will never become a master itself.

Algorithms and internals#

In the following sections we will explore the details of Sentinel behavior. It is not strictly needed for users to be aware of all the details, but a deep understanding of Sentinel may help to deploy and operate Sentinel in a more effective way.

Quorum#

The previous sections showed that every master monitored by Sentinel is associated to a configured quorum. It specifies the number of Sentinel processes that need to agree about the unreachability or error condition of the master in order to trigger a failover.

However, after the failover is triggered, in order for the failover to actually be performed, at least a majority of Sentinels must authorize the Sentinel to failover. Sentinel never performs a failover in the partition where a minority of Sentinels exist.

Let's try to make things a bit more clear:

  • Quorum: the number of Sentinel processes that need to detect an error condition in order for a master to be flagged as ODOWN.
  • The failover is triggered by the ODOWN state.
  • Once the failover is triggered, the Sentinel trying to failover is required to ask for authorization to a majority of Sentinels (or more than the majority if the quorum is set to a number greater than the majority).

The difference may seem subtle but is actually quite simple to understand and use. For example if you have 5 Sentinel instances, and the quorum is set to 2, a failover will be triggered as soon as 2 Sentinels believe that the master is not reachable, however one of the two Sentinels will be able to failover only if it gets authorization at least from 3 Sentinels.

If instead the quorum is configured to 5, all the Sentinels must agree about the master error condition, and the authorization from all Sentinels is required in order to failover.

This means that the quorum can be used to tune Sentinel in two ways:

  1. If a the quorum is set to a value smaller than the majority of Sentinels we deploy, we are basically making Sentinel more sensible to master failures, triggering a failover as soon as even just a minority of Sentinels is no longer able to talk with the master.
  2. If a quorum is set to a value greater than the majority of Sentinels, we are making Sentinel able to failover only when there are a very large number (larger than majority) of well connected Sentinels which agree about the master being down.

Configuration epochs#

Sentinels require to get authorizations from a majority in order to start a failover for a few important reasons:

When a Sentinel is authorized, it gets a unique configuration epoch for the master it is failing over. This is a number that will be used to version the new configuration after the failover is completed. Because a majority agreed that a given version was assigned to a given Sentinel, no other Sentinel will be able to use it. This means that every configuration of every failover is versioned with a unique version. We'll see why this is so important.

Moreover Sentinels have a rule: if a Sentinel voted another Sentinel for the failover of a given master, it will wait some time to try to failover the same master again. This delay is the failover-timeout you can configure in sentinel.conf. This means that Sentinels will not try to failover the same master at the same time, the first to ask to be authorized will try, if it fails another will try after some time, and so forth.

KeyDB Sentinel guarantees the liveness property that if a majority of Sentinels are able to talk, eventually one will be authorized to failover if the master is down.

KeyDB Sentinel also guarantees the safety property that every Sentinel will failover the same master using a different configuration epoch.

Configuration propagation#

Once a Sentinel is able to failover a master successfully, it will start to broadcast the new configuration so that the other Sentinels will update their information about a given master.

For a failover to be considered successful, it requires that the Sentinel was able to send the SLAVEOF NO ONE command to the selected slave, and that the switch to master was later observed in the INFO output of the master.

At this point, even if the reconfiguration of the slaves is in progress, the failover is considered to be successful, and all the Sentinels are required to start reporting the new configuration.

The way a new configuration is propagated is the reason why we need that every Sentinel failover is authorized with a different version number (configuration epoch).

Every Sentinel continuously broadcast its version of the configuration of a master using KeyDB Pub/Sub messages, both in the master and all the slaves. At the same time all the Sentinels wait for messages to see what is the configuration advertised by the other Sentinels.

Configurations are broadcast in the __sentinel__:hello Pub/Sub channel.

Because every configuration has a different version number, the greater version always wins over smaller versions.

So for example the configuration for the master mymaster start with all the Sentinels believing the master is at 192.168.1.50:6379. This configuration has version 1. After some time a Sentinel is authorized to failover with version 2. If the failover is successful, it will start to broadcast a new configuration, let's say 192.168.1.50:9000, with version 2. All the other instances will see this configuration and will update their configuration accordingly, since the new configuration has a greater version.

This means that Sentinel guarantees a second liveness property: a set of Sentinels that are able to communicate will all converge to the same configuration with the higher version number.

Basically if the net is partitioned, every partition will converge to the higher local configuration. In the special case of no partitions, there is a single partition and every Sentinel will agree about the configuration.

Consistency under partitions#

KeyDB Sentinel configurations are eventually consistent, so every partition will converge to the higher configuration available. However in a real-world system using Sentinel there are three different players:

  • KeyDB instances.
  • Sentinel instances.
  • Clients.

In order to define the behavior of the system we have to consider all three.

The following is a simple network where there are 3 nodes, each running a KeyDB instance, and a Sentinel instance:

+-------------+
| Sentinel 1 |----- Client A
| KeyDB 1 (M) |
+-------------+
|
|
+-------------+ | +------------+
| Sentinel 2 |-----+-- // ----| Sentinel 3 |----- Client B
| KeyDB 2 (S) | | KeyDB 3 (M)|
+-------------+ +------------+

In this system the original state was that KeyDB 3 was the master, while KeyDB 1 and 2 were slaves. A partition occurred isolating the old master. Sentinels 1 and 2 started a failover promoting Sentinel 1 as the new master.

The Sentinel properties guarantee that Sentinel 1 and 2 now have the new configuration for the master. However Sentinel 3 has still the old configuration since it lives in a different partition.

We know that Sentinel 3 will get its configuration updated when the network partition will heal, however what happens during the partition if there are clients partitioned with the old master?

Clients will be still able to write to KeyDB 3, the old master. When the partition will rejoin, KeyDB 3 will be turned into a slave of KeyDB 1, and all the data written during the partition will be lost.

Depending on your configuration you may want or not that this scenario happens:

  • If you are using KeyDB as a cache, it could be handy that Client B is still able to write to the old master, even if its data will be lost.
  • If you are using KeyDB as a store, this is not good and you need to configure the system in order to partially prevent this problem.

Since KeyDB is asynchronously replicated, there is no way to totally prevent data loss in this scenario, however you can bound the divergence between KeyDB 3 and KeyDB 1 using the following KeyDB configuration option:

min-slaves-to-write 1
min-slaves-max-lag 10

With the above configuration (please see the self-commented keydb.conf example in the KeyDB distribution for more information) a KeyDB instance, when acting as a master, will stop accepting writes if it can't write to at least 1 slave. Since replication is asynchronous not being able to write actually means that the slave is either disconnected, or is not sending us asynchronous acknowledges for more than the specified max-lag number of seconds.

Using this configuration the KeyDB 3 in the above example will become unavailable after 10 seconds. When the partition heals, the Sentinel 3 configuration will converge to the new one, and Client B will be able to fetch a valid configuration and continue.

In general KeyDB + Sentinel as a whole are a an eventually consistent system where the merge function is last failover wins, and the data from old masters are discarded to replicate the data of the current master, so there is always a window for losing acknowledged writes. This is due to KeyDB asynchronous replication and the discarding nature of the "virtual" merge function of the system. Note that this is not a limitation of Sentinel itself, and if you orchestrate the failover with a strongly consistent replicated state machine, the same properties will still apply. There are only two ways to avoid losing acknowledged writes:

  1. Use synchronous replication (and a proper consensus algorithm to run a replicated state machine).
  2. Use an eventually consistent system where different versions of the same object can be merged.

KeyDB currently is not able to use any of the above systems, and is currently outside the development goals. However there are proxies implementing solution "2" on top of KeyDB stores such as SoundCloud Roshi, or Netflix Dynomite.

Sentinel persistent state#

Sentinel state is persisted in the sentinel configuration file. For example every time a new configuration is received, or created (leader Sentinels), for a master, the configuration is persisted on disk together with the configuration epoch. This means that it is safe to stop and restart Sentinel processes.

TILT mode#

KeyDB Sentinel is heavily dependent on the computer time: for instance in order to understand if an instance is available it remembers the time of the latest successful reply to the PING command, and compares it with the current time to understand how old it is.

However if the computer time changes in an unexpected way, or if the computer is very busy, or the process blocked for some reason, Sentinel may start to behave in an unexpected way.

The TILT mode is a special "protection" mode that a Sentinel can enter when something odd is detected that can lower the reliability of the system. The Sentinel timer interrupt is normally called 10 times per second, so we expect that more or less 100 milliseconds will elapse between two calls to the timer interrupt.

What a Sentinel does is to register the previous time the timer interrupt was called, and compare it with the current call: if the time difference is negative or unexpectedly big (2 seconds or more) the TILT mode is entered (or if it was already entered the exit from the TILT mode postponed).

When in TILT mode the Sentinel will continue to monitor everything, but:

  • It stops acting at all.
  • It starts to reply negatively to SENTINEL is-master-down-by-addr requests as the ability to detect a failure is no longer trusted.

If everything appears to be normal for 30 second, the TILT mode is exited.

Note that in some way TILT mode could be replaced using the monotonic clock API that many kernels offer. However it is not still clear if this is a good solution since the current system avoids issues in case the process is just suspended or not executed by the scheduler for a long time.

WARNING: This document is a draft and the guidelines that it contains may change in the future as the Sentinel project evolves.

Guidelines for KeyDB clients with support for KeyDB Sentinel#

KeyDB Sentinel is a monitoring solution for KeyDB instances that handles automatic failover of KeyDB masters and service discovery (who is the current master for a given group of instances?). Since Sentinel is both responsible to reconfigure instances during failovers, and to provide configurations to clients connecting to KeyDB masters or slaves, clients require to have explicit support for KeyDB Sentinel.

This document is targeted at KeyDB clients developers that want to support Sentinel in their clients implementation with the following goals:

  • Automatic configuration of clients via Sentinel.
  • Improved safety of KeyDB Sentinel automatic failover.

For details about how KeyDB Sentinel works, please check the KeyDB Documentation, as this document only contains information needed for KeyDB client developers, and it is expected that readers are familiar with the way KeyDB Sentinel works.

KeyDB service discovery via Sentinel#

KeyDB Sentinel identify every master with a name like "stats" or "cache". Every name actually identifies a group of instances, composed of a master and a variable number of slaves.

The address of the KeyDB master that is used for a specific purpose inside a network may change after events like an automatic failover, a manually triggered failover (for instance in order to upgrade a KeyDB instance), and other reasons.

Normally KeyDB clients have some kind of hard-coded configuration that specifies the address of a KeyDB master instance within a network as IP address and port number. However if the master address changes, manual intervention in every client is needed.

A KeyDB client supporting Sentinel can automatically discover the address of a KeyDB master from the master name using KeyDB Sentinel. So instead of a hard coded IP address and port, a client supporting Sentinel should optionally be able to take as input:

  • A list of ip:port pairs pointing to known Sentinel instances.
  • The name of the service, like "cache" or "timelines".

This is the procedure a client should follow in order to obtain the master address starting from the list of Sentinels and the service name.

Step 1: connecting to the first Sentinel#

The client should iterate the list of Sentinel addresses. For every address it should try to connect to the Sentinel, using a short timeout (in the order of a few hundreds of milliseconds). On errors or timeouts the next Sentinel address should be tried.

If all the Sentinel addresses were tried without success, an error should be returned to the client.

The first Sentinel replying to the client request should be put at the start of the list, so that at the next reconnection, we'll try first the Sentinel that was reachable in the previous connection attempt, minimizing latency.

Step 2: ask for master address#

Once a connection with a Sentinel is established, the client should retry to execute the following command on the Sentinel:

SENTINEL get-master-addr-by-name master-name

Where master-name should be replaced with the actual service name specified by the user.

The result from this call can be one of the following two replies:

  • An ip:port pair.
  • A null reply. This means Sentinel does not know this master.

If an ip:port pair is received, this address should be used to connect to the KeyDB master. Otherwise if a null reply is received, the client should try the next Sentinel in the list.

Step 3: call the ROLE command in the target instance#

Once the client discovered the address of the master instance, it should attempt a connection with the master, and call the ROLE command in order to verify the role of the instance is actually a master.

If the ROLE commands is not available (it was introduced in KeyDB 2.8.12), a client may resort to the INFO replication command parsing the role: field of the output.

If the instance is not a master as expected, the client should wait a short amount of time (a few hundreds of milliseconds) and should try again starting from Step 1.

Handling reconnections#

Once the service name is resolved into the master address and a connection is established with the KeyDB master instance, every time a reconnection is needed, the client should resolve again the address using Sentinels restarting from Step 1. For instance Sentinel should contacted again the following cases:

  • If the client reconnects after a timeout or socket error.
  • If the client reconnects because it was explicitly closed or reconnected by the user.

In the above cases and any other case where the client lost the connection with the KeyDB server, the client should resolve the master address again.

Sentinel failover disconnection#

Starting with KeyDB 2.8.12, when KeyDB Sentinel changes the configuration of an instance, for example promoting a slave to a master, demoting a master to replicate to the new master after a failover, or simply changing the master address of a stale slave instance, it sends a CLIENT KILL type normal command to the instance in order to make sure all the clients are disconnected from the reconfigured instance. This will force clients to resolve the master address again.

If the client will contact a Sentinel with yet not updated information, the verification of the KeyDB instance role via the ROLE command will fail, allowing the client to detect that the contacted Sentinel provided stale information, and will try again.

Note: it is possible that a stale master returns online at the same time a client contacts a stale Sentinel instance, so the client may connect with a stale master, and yet the ROLE output will match. However when the master is back again Sentinel will try to demote it to slave, triggering a new disconnection. The same reasoning applies to connecting to stale slaves that will get reconfigured to replicate with a different master.

Connecting to slaves#

Sometimes clients are interested to connect to slaves, for example in order to scale read requests. This protocol supports connecting to slaves by modifying step 2 slightly. Instead of calling the following command:

SENTINEL get-master-addr-by-name master-name

The clients should call instead:

SENTINEL slaves master-name

In order to retrieve a list of slave instances.

Symmetrically the client should verify with the ROLE command that the instance is actually a slave, in order to avoid scaling read queries with the master.

Connection pools#

For clients implementing connection pools, on reconnection of a single connection, the Sentinel should be contacted again, and in case of a master address change all the existing connections should be closed and connected to the new address.

Error reporting#

The client should correctly return the information to the user in case of errors. Specifically:

  • If no Sentinel can be contacted (so that the client was never able to get the reply to SENTINEL get-master-addr-by-name), an error that clearly states that KeyDB Sentinel is unreachable should be returned.
  • If all the Sentinels in the pool replied with a null reply, the user should be informed with an error that Sentinels don't know this master name.

Sentinels list automatic refresh#

Optionally once a successful reply to get-master-addr-by-name is received, a client may update its internal list of Sentinel nodes following this procedure:

  • Obtain a list of other Sentinels for this master using the command SENTINEL sentinels <master-name>.
  • Add every ip:port pair not already existing in our list at the end of the list.

It is not needed for a client to be able to make the list persistent updating its own configuration. The ability to upgrade the in-memory representation of the list of Sentinels can be already useful to improve reliability.

Subscribe to Sentinel events to improve responsiveness#

The Sentinel documentation shows how clients can connect to Sentinel instances using Pub/Sub in order to subscribe to changes in the KeyDB instances configurations.

This mechanism can be used in order to speedup the reconfiguration of clients, that is, clients may listen to Pub/Sub in order to know when a configuration change happened in order to run the three steps protocol explained in this document in order to resolve the new KeyDB master (or slave) address.

However update messages received via Pub/Sub should not substitute the above procedure, since there is no guarantee that a client is able to receive all the update messages.