Fail-Over, Client Side
So far, I have shown how a server replaces a failed one. All is well when seen from within the replication array, but what about the poor client that was pointing at the now defunct master and keeps getting connection time-out errors?
Here comes the bad news: there is no silver bullet for this problem. Because it's on the application side, you must find an application solution. There are a few server-side solutions, but they are either limited to only two nodes or they depend heavily on a specific operating system (such as CARP, UCARP, or Linux-HA).
The consoling news is that MySQL Cluster is no better in this field. It may come as a surprise to many, but the answer for MySQL high availability is to use multiple data sources in Java. Unless some integration between operating system and MySQL comes up, the situation is that you are pretty much on your own when you need a high-availability system.
The good news is that if circular replication is satisfactory for your needs, a simple load balancer will be enough to guarantee at the same time a good spread of resources and a high-availability system. You can buy a hardware load balancer, or you can implement a software one within your application. You may also monitor the logs of the fail-over events we have seen in this article to exclude a dead node from the ones through which your balancer should loop. If you use such an approach and get random connections coming from the balancer, be sure to use them by transaction, not by query. Even if you don't use transactions, use only one connection within the same unit, be it a web page, a module function, or an administration program.
Let me try to explain further. You are using a device that gets you a random database connection to one of the nodes. You could just use such device for each query in your application, thus spreading the load among all nodes. However, this extreme randomization would not be healthy. It may happen that you do an
INSERT query in one node, and then a
SELECT query in another node, where the effects of your insert may not yet be replicated. To avoid this, your application should get a connection and use it to execute all queries that are logically related.
What I have shown in this article is, I believe, a big step forward in the direction of a more useful replication system. There are still several missing pieces to extend this proof of concept into a well-behaved engine.
One, I have to mention that MySQL AB is planning an extension of the current MySQL replication so that it will include some devices to resolve conflicts (such as the clash of unique keys). It is still in early stages, but feel free to explore.
I also need to mention some half-baked features in MySQL 5.x. You know that one of MySQL 5.0's major enhancements was the information schema database, which should offer a coherent view of database metadata objects. Unfortunately, a missing piece in this collection of metadata is all the data concerning replication. Therefore, because you can access replication status only through
SHOW SOMETHING statements (as of today, in MySQL 5.1.9), stored procedures cannot access this information. A further problem is that the parameters of
CHANGE MASTER TO must be text literals. Variables are not accepted, thus reducing the flexibility of stored procedures. This inconsistency has been reported through the appropriate channels, and we hope somebody will act on it. For the time being, it all means that you can achieve fine-grained replication administration only through external procedures.
(Actually, that is not exactly true. There are some undocumented and unofficial--even deprecated--practices that can overcome these limitations. For the brave, I will illustrate these techniques in Higher Order MySQL, a talk at the MySQL Users Conference 2006.)
Enhanced circular replication offers additional features that I have not shown here. It is possible, for instance, to exchange messages between servers. That is quite useful in the event of master replacement, when the slave could ask the master to perform a
RESET MASTER before resuming replication. I leave these amenities to some other article, to avoid burdening this one too much.
However, let me remind you that the code in this article is just a proof of concept, which needs some hardening before being used for production. A real-world application will need to double-check whether the master is really dead before switching to a new master, and the new master must be questioned before the switch to ensure that it's ready to take over; the other nodes should be informed; and so on. You can perform all these actions using the currently available technology, although they will be more effective and easier to implement when the currently planned improvements on the data dictionary are available--according to the information at hand, in MySQL version 5.1.
Playing with the System
Experimenting with replication is not so easy. You need to have several servers to play with master and slaves. For a test of fail-over, you need at least three servers.
Because not everyone can enjoy an abundance of hardware, I offer you MySQL 5 Replication Playground, where all the nodes are in the same box, and they just use different ports and sockets to simulate separate machines.
Should you wish to play with this system, all you need to do is install one instance of MYSQL 5.1 and then the replication playground in your home directory. (You don't need root access.) Download it, peruse the readme, run the installation script, and play with it.
Thanks to Patrizio Tassone and Alberto Coduti for the graphics. Thanks to Roland Bouman, Jay Pipes, and Lars Thalmann for catching my mistakes and for providing useful comments and corrections. I am also indebted to them for some rephrasing and definitions in this text. Thanks to Massimiliano Stucchi for providing a test environment for FreeBSD and for catching some more mistakes.
Giuseppe Maxia is a QA developer in MySQL community team. A system analyst with 20 years of IT experience, he has worked as a database consultant and designer for several years. He is a frequent speaker at open source events and he's the author of many articles. He lives in Sardinia (Italy).
Return to ONLamp.com.