In this article, we discuss split-brain situations, with the aim of making you aware of what they are, what the consequences are, and what you can or cannot do about them. When setting up an SQL cluster, you often have various configuration options, and it is important to be aware of the implications of split-brain in order to make an informed choice for your configuration.
Split-brain problems are the most annoying database problems you can experience in an SQL cluster, but what is a split-brain situation?
A split-brain situation is a situation where two or more servers in an SQL cluster can no longer see each other but are accessible to clients. Those clients then still write data to one of the servers, but the data is no longer written to all servers.
The consequence of this problem is that in your database cluster, at least one server gets different content than the other server(s). This is very complicated to fix and preventing the problem (where possible) is much more important than curing it.
This article covers aspects that apply to various SQL solutions. Where we go into the configuration, however, a MariaDB setup with MaxScale is assumed.
How does a split-brain situation occur?
Split-brain occurs, for example, when the following conditions are met:
- The connection between the SQL servers is lost (i.e. the SQL servers no longer see each other).
- The SQL services on the SQL servers still work.
- The SQL servers are still accessible to clients (i.e. clients who initiate write actions).
This can happen, for example, with a configuration error on a VPS, or interruption of the private network for whatever reason.
If there is no network connection to the outside world, or if you do not execute write queries to your SQL servers (except for when you update your website), then no split-brain occurs. The condition for the first example is that the interconnection of your SQL cluster will be restored before new write actions are executed.
In which split-brain situation can you not prevent the consequences?
The consequences of a split-brain situation are difficult to prevent. Simply put, you cannot avoid the consequences when you use an automatic failover or multi-master setup in which one master becomes unreachable for the other VPSs, and the other VPSs have no way of determining that the master is still publicly accessible.
When can such a situation arise, for example?
There is mainly one common scenario in redundant database setups in which such a situation can arise:
- You use separate connections for the traffic between the database servers and for the traffic with the outside world, for example a private network between the servers and a public network for the outside world or your web servers, and only the private network becomes unavailable while the public network remains available.
A more detailed example that could apply to our services would be: If you use two database servers and a problem occurs with the routing table at one of the routers which processes the traffic between two availability zones, a split-brain situation may arise. Such a routing problem can make the public WAN and private LAN between both locations unreachable at the same time, while the connection to the outside world remains intact. The result is that both VPSs operate as a master, without knowing about each other.
The reason that such a routing problem can cause a split-brain situation is that the VPSs try to find the fastest route to each other, so via the direct line between the two availability zones. An external user of your services will still be able to access your VPSs because it does not connect to your VPS via the direct line between the two availability zones. This is a rare situation and it may very well be that you will never be confronted with this specific scenario.
If one of these situations were to occur, you would need to merge your databases manually, after the network of your SQL cluster is restored.
One way to (mostly) prevent this is to use three availability zones with a VPS in each zone. For example, in case of a routing problem at one availability zone, the other two availability zones will still see each other and together keep the SQL cluster intact. This can be achieved by configuring your cluster in such a way that if a server cannot see half the other VPSs + 1 anymore, it'll no longer perform write actions (this is how Galera works and is called a quorum vote). In case of a routing problem at two availability zones, however, you would still have this problem. The chance that that happens is fortunately very small.
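The majority rule described above can be sketched in a few lines of shell. This is an illustration only and not Galera's actual implementation (Galera can, for example, also weight the votes of individual nodes). The function name has_quorum and its two arguments are hypothetical: the first is the number of nodes a server can still see, counting itself, and the second is the total number of nodes in the cluster.

```shell
#!/bin/bash
# has_quorum VISIBLE TOTAL (hypothetical helper, for illustration only)
# VISIBLE: number of nodes this node can see, including itself.
# TOTAL:   number of nodes in the cluster.
# Succeeds (exit status 0) only when VISIBLE is a strict majority of TOTAL.
has_quorum() {
    local visible=$1 total=$2
    local needed=$(( total / 2 + 1 ))
    [ "$visible" -ge "$needed" ]
}

# Three zones, one isolated: the two remaining nodes keep writing,
# the isolated node stops.
has_quorum 2 3 && echo "2 of 3: keep accepting writes"
has_quorum 1 3 || echo "1 of 3: stop accepting writes"
# Two nodes that lose sight of each other: neither has a majority.
has_quorum 1 2 || echo "1 of 2: stop accepting writes"
```

The last call shows why two servers are not enough for this approach: when they lose sight of each other, neither side can form a majority.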
Preventing split-brain consequences when the private network is unavailable
This section does not apply to a Galera cluster, or other setups where a quorum vote is executed.
The best way to prevent split-brain as much as possible is to use three database servers in three different availability zones. If one server is unreachable, the other two can still determine that together they form the majority and keep the cluster intact (five availability zones with a total of five database servers would be even better).
There are also two scenarios imaginable where you use only two database servers and the consequences of split-brain (but not the cause) can be prevented:
- You use two VPSs with a control panel such as Plesk or DirectAdmin with a master-slave setup and only the private network is unreachable, but not the public WAN. Please note that the makers of the control panels (and therefore also ourselves) do not officially support a cluster solution with their products.
- You use two servers in a colocation with two separate network adapters at your disposal, and only the private network is unreachable, but not the public WAN.
The key here is that you use a separate physical connection for your private and public network. Please note that this is not how the TransIP infrastructure works, but the network itself is already a redundant network, so the chances of such a scenario are extremely small.
In case of VPSs with a control panel, the web server and database server are always on the same VPS and can also be accessed via the public IP. Suppose you use a master-slave setup for that, then, in situations where only the private network is unreachable and there is no routing problem, you can use the public network to perform a heartbeat check. In the case of a colocation, this is also possible by using an extra physical network adapter.
MaxScale has the particularly useful feature that you can have a script be activated when a specific event occurs. You can use this property to prevent the consequences of a split-brain situation in the above scenarios. You do this by performing a series of checks at a masterdown event. If the script determines that the public IP is available, but not the private IP, then you automatically disable MariaDB on the slave VPS.
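The decision that such a script has to make can be reduced to one small function. The sketch below is not part of MaxScale; the function name decide_action and the action names it prints are hypothetical. It assumes both arguments are exit codes of a reachability test, where 0 means reachable.

```shell
#!/bin/bash
# decide_action PRIVATE_RC PUBLIC_SQL_RC (hypothetical helper)
# Both arguments are exit codes of a reachability test (0 = reachable).
# Prints the action the slave should take after a master_down event.
decide_action() {
    local private_rc=$1 public_sql_rc=$2
    if [ "$public_sql_rc" -ne 0 ]; then
        # The master does not answer on its public SQL port either,
        # so a normal failover is the expected outcome.
        echo "allow-failover"
    elif [ "$private_rc" -ne 0 ]; then
        # Clients can still write to the master: promoting the slave
        # would leave two masters that do not know about each other.
        echo "stop-local-mariadb: private link down, master still public"
    else
        echo "stop-local-mariadb: master reachable on both networks"
    fi
}

decide_action 1 0   # private network down, public SQL port up
decide_action 1 1   # master unreachable on both networks
```

Keeping the decision separate from the network checks makes it possible to test the logic without taking a network down.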
- Follow the steps below as the root user and only on your slave VPS unless otherwise indicated.
- Read the section below carefully before implementing it and decide whether it is applicable to your configuration.
First, create the script, a log file, and the necessary directories (you are free to use other directories / filenames):
mkdir /var/log/failover/
mkdir /etc/failover/
touch /var/log/failover/failover.log
touch /etc/failover/failover.sh
The permissions are set by default for the root user. However, MaxScale needs full rights to these files. Therefore, adjust the rights first (-R stands for recursive and ensures that all files and subdirectories are included):
chown -R maxscale:maxscale /var/log/failover
chown -R maxscale:maxscale /etc/failover/
chmod -R 744 /var/log/failover/
chmod -R 744 /etc/failover/
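If you want to check what a recursive chmod with 744 results in, you could inspect the mode afterwards. The sketch below assumes GNU coreutils stat (as found on most Linux distributions); the helper name mode_of is hypothetical, and the example works on a throwaway directory so that nothing on your system is changed.

```shell
#!/bin/bash
# mode_of PATH (hypothetical helper): print the octal permission bits of PATH.
# Assumes GNU coreutils stat, which supports the -c format option.
mode_of() {
    stat -c '%a' "$1"
}

# Demonstrated on a temporary directory instead of /etc/failover.
demo_dir=$(mktemp -d)
touch "$demo_dir/failover.sh"
chmod -R 744 "$demo_dir"
echo "directory mode: $(mode_of "$demo_dir")"
echo "file mode:      $(mode_of "$demo_dir/failover.sh")"
rm -rf "$demo_dir"
```

Mode 744 gives the owner read, write and execute rights, and everyone else read access only.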
Open your MaxScale configuration:
Add the following two lines to the [MariaDB-Monitor] part:
events=master_down
script=/etc/failover/failover.sh initiator=$INITIATOR event=$EVENT
- events: Indicates at which events the script is executed. In this case, we let the script trigger at a master_down event. All other events which you can use can be found on this page under Script events.
- script: Refers to the location of your script.
- initiator: Gives you information about the VPS that goes down. The output of $INITIATOR looks like initiator=[192.168.1.1]:3306
- event: Indicates whether a master_down event has taken place
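As a rough sketch of how a script could pick these values apart, assuming MaxScale hands them to the script as command-line arguments in the name=value form shown above: the function name parse_args and the variables initiator_host and initiator_port are hypothetical and only serve as illustration.

```shell
#!/bin/bash
# parse_args "initiator=[192.168.1.1]:3306" "event=master_down"
# Hypothetical helper: sets initiator, event, initiator_host, initiator_port.
parse_args() {
    local arg
    initiator=""
    event=""
    for arg in "$@"; do
        case "$arg" in
            initiator=*) initiator=${arg#initiator=} ;;
            event=*)     event=${arg#event=} ;;
        esac
    done
    # "[192.168.1.1]:3306" -> host "192.168.1.1", port "3306"
    initiator_host=${initiator#\[}
    initiator_host=${initiator_host%%\]*}
    initiator_port=${initiator##*:}
}

parse_args "initiator=[192.168.1.1]:3306" "event=master_down"
echo "event=$event host=$initiator_host port=$initiator_port"
```

Arguments of this form are not turned into shell variables automatically, so a script has to read them from its argument list before it can use them.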
The result then looks something like this:
[MariaDB-Monitor]
type=monitor
module=mariadbmon
servers=server1, server2
user=slave
passwd=D8FB67994030F6D8FFEAB1D69E384B84
monitor_interval=2000
events=master_down
script=/etc/failover/failover.sh initiator=$INITIATOR event=$EVENT
auto_failover=true
auto_rejoin=true
Save the changes and exit nano (ctrl + x > y > enter).
MaxScale has no rights to stop MariaDB (you set this in step 8). That is why we give MaxScale permission to stop MariaDB with root rights (for safety reasons, no more than that!). Open the sudoers file as follows:
Add the content below to an empty spot. To maintain overview, we recommend adding this under the other Cmnd_Alias commands.
Visudo opens in vi by default. If you normally use nano and are not familiar with vi, first press i. You will then see -- INSERT -- displayed at the bottom. This means you can add text.
## Allows maxscale to shut down MariaDB.
Cmnd_Alias SPLITBRAIN = /usr/bin/systemctl stop mariadb.service
Defaults!SPLITBRAIN !requiretty
maxscale ALL=(root) NOPASSWD: SPLITBRAIN
To save the changes, press the Esc > :wq! > enter keys in succession.
Install netcat (if it is not already on your VPSs):
yum -y install nc
Open the failover.sh script:
Add the following content to the file and save the changes (ctrl + x > y > enter). Under the code, an explanation of the complete content and why this approach is specifically used follows.
#!/bin/bash
TIMEOUT_SECONDS=2
HOST=192.168.1.1
HOST1=220.127.116.11
HOST2=18.104.22.168
PORT=80
PORT1=3306
PORT2=443
LOG=/var/log/failover/failover.log

#MaxScale passes initiator=... and event=... as arguments to this script.
#They are not shell variables yet, so they are read from the argument list here.
for ARG in "$@"
do
  case "$ARG" in
    initiator=*) initiator=${ARG#initiator=} ;;
    event=*) event=${ARG#event=} ;;
  esac
done

#Checks if the predefined port is reachable on the private LAN of the master.
#This first test is strictly for error reporting purposes (0=yes, 1=no)
nc -w $TIMEOUT_SECONDS $HOST $PORT </dev/null; NC=$?

#checks if the SQL port is reachable on the public WAN (0=yes, 1=no)
nc -w $TIMEOUT_SECONDS $HOST1 $PORT1 </dev/null; NC1=$?

#checks if Google can be reached (0=yes, 1=no)
nc -w $TIMEOUT_SECONDS $HOST2 $PORT2 </dev/null; NC2=$?

if [ $NC -eq 1 ]
then
  PLAN="Private network status: UNAVAILABLE! Current master at $initiator cannot be reached over port $PORT. Netcat returned $NC"
else
  PLAN="Private network status: ONLINE! Master down event most likely caused by human error."
fi

if [ $NC2 -eq 1 ]
then
  WAN="Slave WAN is down. Netcat Google test returned $NC2"
  echo "$WAN" >> $LOG
else
  WAN="Slave WAN is up. Netcat Google test returned $NC2"
  echo "$WAN" >> $LOG
fi

if [ $NC1 -eq 0 ]
then
  sudo /usr/bin/systemctl stop mariadb.service
  echo "** $(date): *** $event event from current master at $initiator ***" >> $LOG
  echo "$PLAN" >> $LOG
  echo "SQL status: ONLINE! Public SQL port $PORT1 on current master at $HOST1 available. Netcat returned $NC1" >> $LOG
  echo "MARIADB HAS BEEN SHUT DOWN TO PREVENT SPLIT-BRAIN!" >> $LOG
  echo "**********************************************************************" >> $LOG
fi
Explanation of the code
- TIMEOUT_SECONDS: Number of seconds after which netcat stops trying to connect.
- HOST: The private network IP address of your master VPS.
- HOST1: The public WAN IP address of your master VPS.
- HOST2: Google's public IP address.
- PORT: Any port that is accessible on the private network of the VPS. You can choose one from the list which you see with the command netstat -tulpen | less.
- PORT1: The SQL port, in this case, 3306.
- PORT2: The Web server port 443.
- LOG: The location of the log file in which you keep track of the findings and actions of this script.
- nc -w $TIMEOUT_SECONDS $HOST $PORT </dev/null; The syntax, item-by-item, means the following:
- nc Start of the netcat command.
- -w Lets a timeout occur if netcat receives no response.
- $TIMEOUT_SECONDS The number of seconds after which a timeout occurs. Normally, netcat connects in less than 0.1 seconds. With two seconds you give it more than enough time.
- $HOST The IP address that is being tested.
- $PORT The port being tested on the $HOST IP address.
- </dev/null Normally, netcat keeps a successful connection open until you cut it off. With this addition, netcat reads its input from /dev/null, which is empty, so it has nothing to send and closes the connection instead of keeping it active.
- ; Ends the command, so that the exit status can be stored directly after it.
- NC=$? : Stores the exit status of the preceding netcat command: 0 if netcat can connect to the specified IP address and port number, or 1 if that fails.
- if [ $NC -eq 1 ] .... fi: The slave VPS tests whether the private network of the master is available and writes the result to the variable $PLAN, see explanation under 'Why this approach?'.
- if [ $NC2 -eq 1 ] .... fi: The slave VPS tests whether Google is available and writes the result to the log, see explanation under 'Why this approach?'.
- if [ $NC1 -eq 0 ] .... fi: The slave VPS tests whether the SQL port of the master VPS is available on the public WAN. If so, the slave VPS stops its own MariaDB service and writes the date, the result of the first test, the message that the master is available, and the message that MariaDB has been shut down on the slave VPS to the log file ($LOG).
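The pattern of running a command and storing its exit status directly afterwards can be tried out without a network. In the sketch below, the commands true and false stand in for a successful and a failed netcat connection; the helper name probe is hypothetical.

```shell
#!/bin/bash
# probe CMD [ARGS...] (hypothetical helper)
# Runs a command with empty input, discards its output, prints its exit status.
probe() {
    "$@" </dev/null >/dev/null 2>&1
    local rc=$?      # read immediately: the next command overwrites $?
    echo "$rc"
}

NC=$(probe true)     # stands in for a successful netcat connection
NC1=$(probe false)   # stands in for a failed netcat connection

if [ "$NC1" -eq 1 ]; then
    echo "unreachable (status $NC1)"
fi
echo "reachable status: $NC"
```

The status has to be read on the very next command, which is why the script places NC=$? on the same line as the netcat call.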
Why this approach?
When the master VPS goes down, several checks are performed that test the connection and make troubleshooting easy.
- You test a random (non-SQL) port on the private network IP address of the master, to determine whether the private network is reachable or not.
- Whether the chosen port on the private network is reachable has no influence on whether MariaDB is stopped. There is a master_down event, so a failover is started anyway. The accessibility of the private network is only tested for logging / troubleshooting purposes.
- The Google test is used to determine whether your public WAN of your slave is still available. This also serves for troubleshooting and can possibly be omitted.
- If the SQL port of the master is reachable over the public WAN, MariaDB is disabled on the slave, because it is then clear that the master can still accept queries. Turn this script off temporarily (by commenting out the events and script lines in your MaxScale configuration and restarting MaxScale) when you take your SQL master offline for maintenance.
That concludes this article. You are now familiar with what split-brain means and know that, in very specific cases, you can use scripts to automatically take the database service on a VPS offline.