MySQL Cluster Server is a fault-tolerant, redundant, scalable database architecture built on the open-source MySQL application, and capable of delivering 99.999% reliability. In this paper we describe the process we used to set up, configure, and test a three-node mySQL cluster server in a test environment.
Schematic
Hardware
We used four Sun Ultra Enterprise servers in our test environment, but the process for setting up a mySQL cluster server on other UNIX- or Linux-based platforms is very similar, and this setup guide should be applicable with little or no modification.
Our four machines each fall into one of three roles:
1. Storage nodes (mysql-ndb-1 and mysql-ndb-2)
2. API node (mysql-api-1)
3. Management server and management console (mgmt)
Note that the storage nodes are also API nodes, but the API node is not a storage node. The API node is a full member of the cluster, but it does not store any cluster data, and its state (whether it is up or down) does not affect the integrity or availability of the data on the storage nodes. It can be thought of as a "client" of the cluster. Applications such as web servers live on the API nodes and communicate with the mySQL server process running locally on the API node itself, which takes care of fetching data from the storage nodes. The storage nodes are API nodes as well, and technically additional applications could be installed there and communicate with the cluster via the mySQL server processes running on them, but for management and performance reasons this should be considered a sub-optimal configuration in a production environment.
Software
Sun Solaris 8 operating system
mysql-max-4.1.9
We used the precompiled binary distribution of mySQL server for Sun SPARC Solaris 8. Obviously, for implementation on other platforms, the appropriate binary distribution should be used. In all cases, the "max" mySQL distribution is required. The mySQL 4.1 downloads are available from the MySQL web site.
Procedure
Step 1. On both storage nodes, mysql-ndb-1 (192.168.0.33) and mysql-ndb-2 (192.168.0.34), obtain and install mySQL server:
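For reference, a typical binary-tarball installation looks something like this; the tarball filename is illustrative, so substitute whatever file you downloaded, and repeat the steps on both nodes:

mysql-ndb-1# groupadd mysql
mysql-ndb-1# useradd -g mysql mysql
mysql-ndb-1# cd /usr/local
mysql-ndb-1# gunzip -c /tmp/mysql-max-4.1.9-sun-solaris2.8-sparc.tar.gz | tar xf -
mysql-ndb-1# ln -s mysql-max-4.1.9-sun-solaris2.8-sparc mysql
mysql-ndb-1# cd mysql
mysql-ndb-1# scripts/mysql_install_db --user=mysql
mysql-ndb-1# chown -R root .
mysql-ndb-1# chown -R mysql data
mysql-ndb-1# chgrp -R mysql .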
Do not start the mysql servers yet.
Step 2. Set up the management server and management console on host mgmt (192.168.0.32). This requires only two executables to be extracted from the mysql distribution; the rest can be deleted.
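The two executables are ndb_mgmd (the management server daemon) and ndb_mgm (the management console). Something along these lines will extract just those two; again, the tarball filename is illustrative:

mgmt# mkdir -p /usr/local/mysql/bin /var/lib/mysql-cluster
mgmt# cd /tmp
mgmt# gunzip -c mysql-max-4.1.9-sun-solaris2.8-sparc.tar.gz | tar xf -
mgmt# cp mysql-max-4.1.9-sun-solaris2.8-sparc/bin/ndb_mgmd /usr/local/mysql/bin
mgmt# cp mysql-max-4.1.9-sun-solaris2.8-sparc/bin/ndb_mgm /usr/local/mysql/bin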
The file config.ini contains configuration information for the cluster:
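A minimal config.ini for this layout (two storage nodes, one management node, and four API slots, which matches the node IDs the management console reports later) looks roughly like this; a production deployment will want to tune memory and other parameters:

# /var/lib/mysql-cluster/config.ini -- minimal sketch
[NDBD DEFAULT]
NoOfReplicas=2
DataDir=/var/lib/mysql-cluster

# management server (mgmt)
[NDB_MGMD]
HostName=192.168.0.32
DataDir=/var/lib/mysql-cluster

# storage node mysql-ndb-1
[NDBD]
HostName=192.168.0.33

# storage node mysql-ndb-2
[NDBD]
HostName=192.168.0.34

# four API/SQL node slots; an empty slot accepts a connection from any host
[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]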
Start the management server and verify that it is running:
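Assuming the binaries were copied to /usr/local/mysql/bin and config.ini lives in /var/lib/mysql-cluster, something like:

mgmt# cd /var/lib/mysql-cluster
mgmt# /usr/local/mysql/bin/ndb_mgmd -f /var/lib/mysql-cluster/config.ini
mgmt# ps -ef | grep ndb_mgmd | grep -v grep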
Step 3. On both storage nodes, mysql-ndb-1 (192.168.0.33) and mysql-ndb-2 (192.168.0.34), configure the mySQL servers:
This is the configuration file (/etc/my.cnf) for the mysql server on both storage nodes:
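A minimal version that points both the mysqld server and the NDB processes at the management server looks like this:

# /etc/my.cnf on both storage nodes (minimal sketch)
[mysqld]
ndbcluster
ndb-connectstring=192.168.0.32

# read by ndbd and the other NDB utilities
[mysql_cluster]
ndb-connectstring=192.168.0.32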
On both storage nodes, start the NDB storage engine and mysql server and verify that they are running:
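A sketch of the startup; --initial is used only the very first time a storage node is started (it re-initializes that node's cluster data), and plain ndbd is used on subsequent restarts:

mysql-ndb-1# /usr/local/mysql/bin/ndbd --initial
mysql-ndb-1# /usr/local/mysql/bin/mysqld_safe --user=mysql &
mysql-ndb-1# ps -ef | egrep 'ndbd|mysqld' | grep -v grep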
If the mysql server did not start up properly, check the logfile in /usr/local/mysql/data/${HOSTNAME}.err and correct the problem.
Step 4. Start the management console on the management server machine (mgmt) and query the status of the cluster:
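The console is the ndb_mgm binary installed in step 2; the show command prints the cluster state. The output will resemble the listing in step 7 below, except that at this point the API node mysql-api-1 has not yet been added:

mgmt# /usr/local/mysql/bin/ndb_mgm
ndb_mgm> show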
Step 5. Create a test database, populate a table using the NDBCLUSTER engine, and verify correct operation:
On both storage nodes mysql-ndb-1 and mysql-ndb-2, create the test database:
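We used a database named foo (the same one queried later from the API node). In mySQL 4.1 the database itself must be created on every SQL node, so run this on both machines:

mysql-ndb-1# mysql -u root
mysql> CREATE DATABASE foo;

mysql-ndb-2# mysql -u root
mysql> CREATE DATABASE foo;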
Back on storage node mysql-ndb-1, populate the database with a table containing some simple data:
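The table name and contents below are placeholders; the essential part is ENGINE=NDBCLUSTER, which stores the table in the cluster rather than in the local storage engine:

mysql-ndb-1# mysql -u root foo
mysql> CREATE TABLE test1 (i INT) ENGINE=NDBCLUSTER;
mysql> INSERT INTO test1 VALUES (1), (2), (3);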
Now go to storage node mysql-ndb-2 and verify that the data is accessible:
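Using the placeholder table name from the previous step:

mysql-ndb-2# mysql -u root foo
mysql> SELECT * FROM test1;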
This is a good sign, but note that it does not actually prove that the data is being replicated. The storage node (mysql-ndb-2) is also a cluster API node, and this test merely shows that it is able to retrieve data from the cluster. It demonstrates nothing with respect to the underlying storage mechanism in the cluster. This can be more clearly demonstrated with the following test.
Kill off the NDB engine process (ndbd) on one of the storage nodes (mysql-ndb-2) in order to simulate failure of the storage engine:
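ndbd normally runs as a pair of processes (an "angel" monitor and the worker it watches), so both must go away; on Solaris, pkill is the simplest way:

mysql-ndb-2# ps -ef | grep ndbd | grep -v grep
mysql-ndb-2# pkill ndbd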
The management server will recognize that the storage engine on mysql-ndb-2 (192.168.0.34) has failed, but its API connection is still active:
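Running show in the management console at this point reports something along these lines (only the storage-node section is sketched here; the [mysqld(API)] section further down still lists the node's API connection):

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=2 @192.168.0.33 (Version: 4.1.9, Nodegroup: 0, Master)
id=3 (not connected, accepting connect from 192.168.0.34)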
On the first storage node (mysql-ndb-1) populate another new table with some test data:
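Again using a placeholder table name:

mysql-ndb-1# mysql -u root foo
mysql> CREATE TABLE test2 (i INT) ENGINE=NDBCLUSTER;
mysql> INSERT INTO test2 VALUES (10), (20), (30);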
Back on the second storage node (mysql-ndb-2) perform the same select command:
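The query still succeeds, because the mysqld on mysql-ndb-2 fetches the rows from the surviving storage engine on mysql-ndb-1:

mysql-ndb-2# mysql -u root foo
mysql> SELECT * FROM test2;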
The storage engine and the API server are two separate, distinct processes that are not inherently dependent on one another. Once the ndbd storage engine process is restarted on the second storage node, the data is replicated, as the following test demonstrates.
First, restart the storage engine process on mysql-ndb-2:
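On mysql-ndb-2, simply run ndbd again:

mysql-ndb-2# /usr/local/mysql/bin/ndbd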
Next, shut down the storage engine on mysql-ndb-1, either using the management console or a command-line kill:
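Either of the following works; node id 2 is mysql-ndb-1 in our configuration (see the show output in step 7). From the management console:

ndb_mgm> 2 stop

Or with a command-line kill on the node itself:

mysql-ndb-1# pkill ndbd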
Now, to determine if the SQL data was replicated when the storage engine on mysql-ndb-2 was restarted, try the query on either (or both) hosts:
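If the rows come back even though the storage engine that originally received the inserts is now down, the data was indeed replicated (placeholder table name as before):

mysql-ndb-2# mysql -u root foo
mysql> SELECT * FROM test2;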
This shows that the data is being replicated on both storage nodes. Restart the storage engine on mysql-ndb-1:
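As with mysql-ndb-2, simply run ndbd again:

mysql-ndb-1# /usr/local/mysql/bin/ndbd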
Step 6. Next, we add a cluster API node. This node is a full member of the cluster, but does not run the NDB storage engine. Data is not replicated on this node, and it functions essentially as a "client" of the cluster server. Typically, we would install applications that require access to the mySQL data (web servers, etc.) on this machine. The applications talk to the mySQL server on localhost, which then handles the underlying communication with the cluster in order to fetch the requested data.
First, install the mysql server on the API node mysql-api-1 (192.168.0.35):
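The installation is the same binary-tarball procedure used on the storage nodes in step 1; in condensed form (tarball name illustrative):

mysql-api-1# groupadd mysql && useradd -g mysql mysql
mysql-api-1# cd /usr/local
mysql-api-1# gunzip -c /tmp/mysql-max-4.1.9-sun-solaris2.8-sparc.tar.gz | tar xf -
mysql-api-1# ln -s mysql-max-4.1.9-sun-solaris2.8-sparc mysql
mysql-api-1# cd mysql && scripts/mysql_install_db --user=mysql
mysql-api-1# chown -R root . && chown -R mysql data && chgrp -R mysql .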
Install a simple /etc/my.cnf file:
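Only the [mysqld] section is needed here; it enables the cluster storage engine and points at the management server:

# /etc/my.cnf on mysql-api-1 (minimal sketch)
[mysqld]
ndbcluster
ndb-connectstring=192.168.0.32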
Now start the mySQL server:
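Assuming the same installation layout as the storage nodes:

mysql-api-1# /usr/local/mysql/bin/mysqld_safe --user=mysql &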
At this point you can check the cluster status on the management console and verify that the API node is now connected:
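From the management console on mgmt; the new API node appears in the [mysqld(API)] section with its address (the full listing is shown in step 7):

ndb_mgm> show
...
[mysqld(API)] 4 node(s)
id=6 @192.168.0.35 (Version: 4.1.9)
...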
Our configuration now resembles the diagram at the top of the page.
Step 7. Finally, we should verify the fault-tolerance of the cluster when servicing queries from the API node.
With the cluster up and operating correctly, use the API node to create a new table and insert some test data:
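The database and table here match what is queried later in this step; note that the foo database must also be created locally on the API node before it can be used there (the rows themselves are inserted next):

mysql-api-1# mysql -u root
mysql> CREATE DATABASE IF NOT EXISTS foo;
mysql> USE foo;
mysql> CREATE TABLE test3 (i INT) ENGINE=NDBCLUSTER;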
Now, insert some random data into the table, either by hand or with a quick script:
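One quick way is a plain Bourne-shell loop that lets mySQL generate the values (FLOOR(RAND()*100) yields a random integer between 0 and 99); any equivalent loop will do:

mysql-api-1# for n in 0 1 2 3 4 5 6 7 8 9
> do
>   mysql -u root -e 'INSERT INTO test3 VALUES (FLOOR(RAND()*100));' foo
> done
mysql-api-1# mysql -u root -e 'SELECT COUNT(*) FROM test3;' foo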
Looks good. Now, disconnect the network cable from the first storage node so that it falls out of the cluster. Within a few seconds, the management console will recognize that it has disappeared:
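Running show in the management console now lists the unplugged storage node as not connected; the output looks roughly like this (remaining sections omitted):

ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=2 (not connected, accepting connect from 192.168.0.33)
id=3 @192.168.0.34 (Version: 4.1.9, Nodegroup: 0, Master)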
Is the cluster data still available to the API node?
mysql-api-1# mysql -u root
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 258552 to server version: 4.1.9-max

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> use foo;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select * from test3;
+------+
| i |
+------+
| 54 |
| 91 |
| 79 |
| 52 |
| 92 |
| 20 |
| 18 |
| 84 |
| 49 |
| 22 |
+------+
10 rows in set (0.02 sec)
Now, plug the disconnected storage node back into the network. It will attempt to rejoin the cluster, but will probably be shut down by the management server, and something similar to the following will appear in the error log (/var/lib/mysql-cluster/ndb_2_error.log):
Date/Time: Saturday 12 February 2005 - 12:46:21
Type of error: error
Message: Arbitrator shutdown
Fault ID: 2305
Problem data: Arbitrator decided to shutdown this node
Object of reference: QMGR (Line: 3796) 0x0000000a
ProgramName: /usr/local/mysql/bin/ndbd
ProcessID: 1185
TraceFile: /var/lib/mysql-cluster/ndb_2_trace.log.3
***EOM***
Restart the ndb storage engine process on that node and verify that it rejoins the cluster properly:
mysql-ndb-1# /usr/local/mysql/bin/ndbd
ndb_mgm> show
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=2 @192.168.0.33 (Version: 4.1.9, Nodegroup: 0)
id=3 @192.168.0.34 (Version: 4.1.9, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1 @192.168.0.32 (Version: 4.1.9)

[mysqld(API)] 4 node(s)
id=4 (Version: 4.1.9)
id=5 (Version: 4.1.9)
id=6 @192.168.0.35 (Version: 4.1.9)
id=7 (not connected, accepting connect from any host)
Miscellaneous