Table of Contents
MySQL Cluster is a high-availability, high-redundancy version of
MySQL adapted for the distributed computing environment. It uses the
NDB Cluster storage engine to enable running
several MySQL servers in a cluster. This storage engine is available
in MySQL 5.0 binary releases and in RPMs compatible
with most modern Linux distributions. (Note that both the
mysql-server and mysql-max
RPMs must be installed in order to have MySQL Cluster capability.)
The operating systems on which MySQL Cluster is currently available are Linux, Mac OS X, and Solaris. (Some users have reported success with running MySQL Cluster on FreeBSD, although this is not yet officially supported by MySQL AB.) We are working to make Cluster run on all operating systems supported by MySQL, including Windows, and will update this page as new platforms are supported.
This chapter represents a work in progress, and its contents are subject to revision as MySQL Cluster continues to evolve. Additional information regarding MySQL Cluster can be found on the MySQL AB Web site at http://www.mysql.com/products/cluster/.
Additional resources
Answers to some commonly asked questions about Cluster may be found in the Section 15.10, “MySQL Cluster FAQ”.
The MySQL Cluster mailing list: http://lists.mysql.com/cluster.
The MySQL Cluster forum: http://forums.mysql.com/list.php?25.
If you are new to MySQL Cluster, you may find our Developer Zone article How to set up a MySQL Cluster for two servers to be helpful.
MySQL Cluster is a technology which enables clustering of in-memory databases in a share-nothing system. The share-nothing architecture allows the system to work with very inexpensive hardware, and without any specific requirements on hardware or software. It also does not have any single point of failure because each component has its own memory and disk.
MySQL Cluster integrates the standard MySQL server with an
in-memory clustered storage engine called NDB.
In our documentation, the term NDB refers to
the part of the setup that is specific to the storage engine,
whereas “MySQL Cluster” refers to the combination of
MySQL and the NDB storage engine.
A MySQL Cluster consists of a set of computers, each running a number of processes including MySQL servers, data nodes for NDB Cluster, management servers, and (possibly) specialized data access programs. The relationship of these components in a cluster is shown here:

All these programs work together to form a MySQL Cluster. When data is stored in the NDB Cluster storage engine, the tables are stored in the data nodes. Such tables are directly accessible from all other MySQL servers in the cluster. Thus, in a payroll application storing data in a cluster, if one application updates the salary of an employee, all other MySQL servers that query this data can see this change immediately.
The data stored in the data nodes for MySQL Cluster can be mirrored; the cluster can handle failures of individual data nodes with no other impact than that a small number of transactions are aborted due to losing the transaction state. Since transactional applications are expected to handle transaction failure, this should not be a source of problems.
By bringing MySQL Cluster to the Open Source world, MySQL makes clustered data management with high availability, high performance, and scalability available to all who need it.
NDB is an in-memory storage engine offering high-availability and data-persistence features.
The NDB storage engine can be configured with a range of failover and load-balancing options, but it is easiest to start with the storage engine at the cluster level. MySQL Cluster's NDB storage engine contains a complete set of data, dependent only on other data within the cluster itself.
We will now describe how to set up a MySQL Cluster consisting of an NDB storage engine and some MySQL servers.
The cluster portion of MySQL Cluster is currently configured independently of the MySQL servers. In a MySQL Cluster, each part of the cluster is considered to be a node.
Note: In many contexts, the term "node" is used to indicate a computer, but when discussing MySQL Cluster it means a process. There can be any number of nodes on a single computer, for which we use the term cluster host.
There are three types of cluster nodes, and in a minimal MySQL Cluster configuration, there will be at least three nodes, one of each of these types:
The management (MGM) node: The role of this
type of node is to manage the other nodes within the MySQL
Cluster, such as providing configuration data, starting and
stopping nodes, running backup, and so forth. Because this
node type manages the configuration of the other nodes, a node
of this type should be started first, before any other node.
An MGM node is started with the command
ndb_mgmd.
The data node: This is the type of node that stores the cluster's data. There are as many data nodes as there are replicas, times the number of fragments. For example, with two replicas, each having two fragments, you will need four data nodes. It is not necessary to have more than one replica. A data node is started with the command ndbd.
The SQL node: This is the
node that accesses the cluster data. In the case of MySQL
Cluster, a client node is a traditional MySQL server that uses
the NDB Cluster storage engine. An SQL node is typically
started with the command mysqld
--ndbcluster or simply by using
mysqld with ndbcluster
added to my.cnf.
Configuration of a cluster involves configuring each individual node in the cluster and setting up individual communication links between nodes. MySQL Cluster is currently designed with the intention that storage nodes are homogenous in terms of processor power, memory space, and bandwidth. In addition, in order to provide a single point of configuration, all configuration data for the cluster as a whole is located in one configuration file.
The management server (MGM node) manages the cluster configuration file and the cluster log. Each node in the cluster retrieves the configuration data from the management server, and so requires a way to determine where the management server resides. When interesting events occur in the data nodes, the nodes transfer information about these events to the management server, which then writes the information to the cluster log.
In addition, there can be any number of cluster client processes or applications. These are of two types:
Standard MySQL clients: These are no different for MySQL Cluster than they are for standard (non-Cluster) MySQL. In other words, MySQL Cluster can be accessed from existing MySQL applications written in PHP, Perl, C, C++, Java, Python, Ruby, and so on.
Management clients: These clients connect to the management server and provide commands for starting and stopping nodes gracefully, starting and stopping message tracing (debug versions only), showing node versions and status, starting and stopping backups, and so on.
This section is a “How-To” in which we describe the basics for how to plan, install, configure, and run a MySQL Cluster. Unlike the example in Section 15.4, “MySQL Cluster Configuration”, the result of following the guidelines and procedures outlined below should be a usable MySQL Cluster which meets minimum requirements for availability and safeguarding of data.
In this section, we will cover hardware and software requirements; networking issues; installation of MySQL Cluster; configuration issues; starting, stopping, and restarting the cluster; loading of a sample database; and performing queries.
Basic Assumptions
This How-To makes the following assumptions:
We are setting up our cluster with 4 nodes, each on a separate host, and each with a fixed network address on a typical Ethernet as shown here:
| Node | IP Address |
| Management (MGM) node | 192.168.0.10 |
| MySQL server (SQL) node | 192.168.0.20 |
| Data (NDBD) node "A" | 192.168.0.30 |
| Data (NDBD) node "B" | 192.168.0.40 |
This may be made clearer in the following diagram:

Note: In the interest of
simplicity (and reliability), we will use only numeric IP
addresses in this How-To. However, if DNS resolution is
available on your network, then it is possible to use
hostnames in lieu of IP addresses in configuring Cluster.
Alternatively, you can also use the
/etc/hosts file or your operating
system's equivalent for providing a means to do host lookup if
such is available.
Each host in our scenario is an Intel-based desktop PC running a common, generic Linux distribution installed to disk in a standard configuration, and running no unnecessary services. The core OS with a standard TCP/IP networking client should be sufficient. Also for the sake of simplicity, we also assume that the filesystems on all hosts are set up identically. In the event that they are not, you will need to adapt these instructions accordingly.
Standard 100 Mbps or 1 gigabit Ethernet cards are installed on each machine, along with the proper drivers for the cards, and that all 4 hosts are connected via a standard-issue Ethernet networking applicance such as a switch. (All machines should use network cards with the same throughout; that is, all 4 machines in the cluster should have 100 Mbps cards or all 4 machines should have 1 Gbps cards.) MySQL Cluster will work in a 100 Mbps network; however, gigabit Ethernet will provide better performance.
Note that MySQL Cluster is not intended for use in a network whose connectivity is less than 100 Mbps. For this reason (among others), attempting to run a MySQL Cluster over a public network such as the Internet is not likely to be successful, and is not recommended.
For our sample data, we will use the world
database which is available for download from the MySQL AB
website. As this database takes up a relatively small amount
of space, we assume that each machine has 256 MB RAM, which
should be sufficient for running the operating system, host
NDB process, and (for the data nodes) for storing the
database.
While we refer to a Linux operating system in this How-To, the instructions and procedures that we provide here should be easily adaptable to either Solaris or Mac OS X. We also assume that you already know how to perform a minimal installation and configuration of the operating system with networking capability, or that you are able to obtain assistance in this elsewhere if needed.
We discuss MySQL Cluster hardware, software, and networking requirements in somewhat greater detail in the next section. (See Section 15.3.1, “Hardware, Software, and Networking”.)
One of the strengths of MySQL Cluster is that it can be run on commodity hardware and has no ususual requirements in this regard, other than for large amounts of RAM, due to the fact that all live data storage is done in memory. (Note that this is subject to change, and that we intend to implement disk-based storage in a future MySQL Cluster release.) Naturally, multiple and faster CPUs will enhance performance. Memory requirements for Cluster processes are relatively small.
The software requirements for Cluster are also modest. Host
operating systems do not require any unusual modules, services,
applications, or configuration to support MySQL Cluster. For Mac
OS X or Solaris, the standard installation is sufficient. For
Linux, a standard, “out of the box” installation
should be all that is necessary. The MySQL software requirements
are simple: all that is needed is a production release of
MySQL-max 5.0; you must use the
-max version of MySQL in order to have
Cluster support. It is not necessary to compile MySQL yourself
merely to be able to use Cluster. In this How-To, we assume that
you are using the -max binary appropriate to
your Linux. Solaris, or Mac OS X operating system, available via
the MySQL software downloads page at
http://dev.mysql.com/downloads/.
For inter-node communication, Cluster supports TCP/IP networking in any standard topology, and the minimum expected for each host is a standard 100 Mbps Ethernet card, plus a switch, hub, or router to provide network connectivity for the cluster as a whole. We strongly recommend that a MySQL Cluster be run on its own subnet which is not shared with non-Cluster machines for the following reasons:
Security: Communications between Cluster nodes are not encrypted or shielded in any way. The only means of protecting transmissions within a MySQL Cluster is to run your Cluster on a protected network. If you intend to use MySQL Cluster for Web applications, the cluster should definitely reside behind your firewall and not in your network's De-Militarised Zone (DMZ) or elsewhere.
Efficiency: Setting up a MySQL Cluster on a private or protected network allows for the cluster to make exclusive use of bandwidth between cluster hosts. Using a separate switch for your MySQL Cluster not only helps protect against unauthorised access to Cluster data, it also ensures that Cluster nodes are shielded from interference caused by transmissions between other computers on the network. For enhanced reliability, you can use dual switches and dual cards to remove the network as a single point of failure; many device drivers support failover for such communication links.
It is also possible to use the high-speed Scalable Coherent Interface (SCI) with MySQL Cluster, but this is not a requirement. See Section 15.7, “Using High-Speed Interconnects with MySQL Cluster” for more about this protocol and its use with MySQL Cluster.
Each MySQL Cluster host computer running storage or SQL nodes must have installed on it a MySQL-max binary. For management nodes, it is not necessary to install the MySQL server binary, but you do have to install the MGM server daemon and client binaries (ndb_mgmd and ndb_mgm, respectively). In this section, we will cover the steps necessary to install the correct binaries for each type of Cluster node.
MySQL AB provides precompiled binaries which support Cluster,
and there is generally no need to compile these yourself. (If
you do require a custom binary, see
Section 2.8.3, “Installing from the Development Source Tree”.) Therefore, the first
step in the installation process for each cluster host is to
download the file
mysql-max-5.0.16-pc-linux-gnu-i686.tar.gz
from the MySQL downloads
area. We assume that you have placed it in each
machine's /var/tmp directory.
RPMs are also available for both 32-bit and 64-bit Linux
platforms; the -max binaries installed by the
RPMs support the NDBCluster storage engine.
If you choose to use these rather than the binary files, be
aware that you must install both the
-server and -max packages
on all machines that are to host cluster nodes. (See
Section 2.4, “Installing MySQL on Linux” for more information about
installing MySQL using the RPMs.) After installing from RPM, you
will still need to configure the cluster as discussed in
Section 15.3.3, “Configuration”.
Note: After completing the installation, do not yet start any of the binaries. We will show you how to do so following the configuration of all nodes.
Storage and SQL Node Installation
On each of the 3 machines designated to host storage or SQL nodes, perform the following steps as the system root user:
Check your /etc/passwd and
/etc/group files (or use whatever tools
are provided by your operating system for manging users and
groups) to see whether or not there are already a
mysql group and mysql
user on the system, as some OS distributions create these as
part of the operating system installation process. If these
are not already present, then create a new
mysql user group, then add a
mysql user to this group:
groupadd mysql useradd -g mysql mysql
Change to the directory containing the downloaded file; unpack the archive; create a symlink to the mysql-max executable. Note that the actual file and directory names will vary according to the MySQL version number.
cd /var/tmp tar -xzvf -C /usr/local/bin mysql-max-5.0.16-pc-linux-gnu-i686.tar.gz ln -s /usr/local/bin/mysql-max-5.0.16-pc-linux-gnu-i686 mysql
Change to the mysql directory, and run
the supplied script for creating the system databases:
cd mysql scripts/mysql_install_db --user=mysql
Set the necessary permissions for the MySQL server and data directories:
chown -R root . chown -R mysql data chgrp -R mysql .
Note that the data directory on each machine hosting a data
node is /usr/local/mysql/data. We will
make use of this piece of information when we configure the
management node. (See Section 15.3.3, “Configuration”.)
Copy the MySQL startup script to the appropriate directory, make it executable, and set it to start when the operating system is booted up:
cp support-files/mysql.server /etc/rc.d/init.d/ chmod +x /etc/rc.d/init.d/mysql.server chkconfig --add mysql.server
Here we use Red Hat's chkconfig for creating links to the startup scripts; use whatever means is appropriate for this purpose on your operating system/distribution, such as update-rc.d on Debian.
Remember that the steps listed above must be performed separately for each machine on which a storage or SQL node is to reside.
Management Node Installation
For the MGM (management) node, it is not necessary to install
the mysqld executable, only the binaries for
the MGM server and client, which can be found in the downloaded
-max archive. Again we assume that you have
placed this file in /var/tmp. As system
root (that is, after using sudo, su
root, or your system's equivalent for temporarily
assuming the system administrator account's privileges), perform
the following steps to install ndb_mgmd and
ndb_mgm on the Cluster management node host:
Move to the /var/tmp directory, and
extract the ndb_mgm and
ndb_mgmd from the archive into a suitable
directory such as /usr/local/bin:
cd /var/tmp tar -zxvf mysql-max-5.0.16-pc-linux-gnu-i686.tar.gz /usr/local/bin '*/bin/ndb_mgm*'
Move to the directory into which you unpacked the files, and then make both of these executable:
cd /usr/local/bin chmod +x ndb_mgm*
In Section 15.3.3, “Configuration”, we will create and write configuration files for all of the nodes in our example Cluster.
For our 4-node, 4-host MySQL Cluster, we will need to write 4 configuration files, 1 per node/host.
Each data node or SQL node will require a
my.cnf file that provides two pieces of
information: a
connectstring telling the
node where to find the MGM node, and a line telling the
MySQL server on this host (the machine hosting the data
node) to run in NDB mode.
For more information on connectstrings, see
Section 15.4.4.2, “The MySQL Cluster connectstring”.
The management node will need a
config.ini file telling it how many
replicas are to be maintained, how much memory to allocate
for data and indexes on each data node, where to find the
data nodes, where data will be saved to disk on each data
node, and where to find any SQL nodes.
Configuring the Storage and SQL Nodes
The my.cnf file needed for the data nodes
is fairly simple. The configuration file should be located in
the /etc directory and can be edited (and
created if necessary) using any text editor, for example:
vi /etc/my.cnf
For each data node and SQL node in our example setup,
my.cnf should look like this:
# Options for mysqld process: [MYSQLD] ndbcluster # run NDB engine ndb-connectstring=192.168.0.10 # location of MGM node # Options for ndbd process: [MYSQL_CLUSTER] ndb-connectstring=192.168.0.10 # location of MGM node
After entering the above, save this file and exit the text editor. Do this for the machines hosting data node "A", data node "B", and the SQL node.
Configuring the Management Node
The first step in configuring the MGM node is to create the directory in which the configuration file can be found and then to create the file itself. For example (running as root):
mkdir /var/lib/mysql-cluster cd /var/lib/mysql-cluster vi config.ini
We show vi being used here to create the file, but any text editor should work just as well.
For our representative setup, the
config.ini file should read as follows:
# Options affecting ndbd processes on all data nodes:
[NDBD DEFAULT]
NoOfReplicas=2 # Number of replicas
DataMemory=80M # How much memory to allocate for data storage
IndexMemory=18M # How much memory to allocate for index storage
# For DataMemory and IndexMemory, we have used the
# default values. Since the "world" database takes up
# only about 500KB, this should be more than enough for
# this example Cluster setup.
# TCP/IP options:
[TCP DEFAULT]
portnumber=2202 # This the default; however, you can use any
# port that is free for all the hosts in cluster
# Note: It is recommended beginning with MySQL 5.0 that
# you do not specify the portnumber at all and simply allow
# the default value to be used instead
# Management process options:
[NDB_MGMD]
hostname=192.168.0.10 # Hostname or IP address of MGM node
datadir=/var/lib/mysql-cluster # Directory for MGM node logfiles
# Options for data node "A":
[NDBD]
# (one [NDBD] section per data node)
hostname=192.168.0.30 # Hostname or IP address
datadir=/usr/local/mysql/data # Directory for this data node's datafiles
# Options for data node "B":
[NDBD]
hostname=192.168.0.40 # Hostname or IP address
datadir=/usr/local/mysql/data # Directory for this data node's datafiles
# SQL node options:
[MYSQLD]
hostname=192.168.0.20 # Hostname or IP address
# (additional mysqld connections can be
# specified for this node for various
# purposes such as running ndb_restore)
(NOTE: The "world" database can be downloaded from http://dev.mysql.com/doc/ where it can be found listed under "Examples".)
Once all the configuration files have been created and these minimal options have been specified, you are ready to proceed with starting the cluster and verifying that all processes are running. We discuss how this is done in Section 15.3.4, “Initial Startup”.
For more detailed information about the available MySQL Cluster configuration parameters and their uses, see Section 15.4.4, “Configuration File” and Section 15.4, “MySQL Cluster Configuration”. For configuration of MySQL Cluster as relates to making backups, see Section 15.6.5.4, “Configuration for Cluster Backup”.
Note: The default port for Cluster management nodes is 1186; the default port for data nodes is 2202. Beginning with MySQL 5.0.3, this restriction is lifted, and the cluster will automatically allocate ports for data nodes from those that are already free.
Starting the cluster is not very difficult once it has been configured. Each cluster node process must be started separately, and on the host where it resides. While it is possible to start the nodes in any order, it is recommended that the management node be started first, followed by the storage nodes, and then finally by any SQL nodes:
On the management host, issue the following command from the system shell to start the MGM node process:
shell> ndb_mgmd -f /var/lib/mysql-cluster/config.ini
Note that ndb_mgmd must be told where to
find its configuration file, using the -f
or --config-file option. (See
Section 15.5.3, “ndb_mgmd, the Management Server Process” for details.)
On each of the data node hosts, run this command to start the NDBD process for the first time:
shell> ndbd --initial
Note that it is very important to use the
--initial parameter
only when starting
ndbd for the first time, or when
restarting after backup/restore or configuration change.
This is because this parameter will cause the node to delete
any files created by earlier ndbd
instances needed for recovery, including the recovery log
files.
If you have used the RPMs to install MySQL on the cluster
host where the SQL node is to reside, you can (and should)
use the startup script installed to
/etc/init.d to start the MySQL server
process on the SQL node. Note that you need to install the
-max server RPM in addition
to the Standard server RPM in order to run the
-max server binary.
If all has gone well, and the cluster has been set up correctly, the cluster should now be operational. You can test this by invoking the ndb_mgm management node client; the output should look like what is shown below:
shell> ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> SHOW
Connected to Management Server at: localhost:1186
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=2 @192.168.0.30 (Version: 5.0.16, Nodegroup: 0, Master)
id=3 @192.168.0.40 (Version: 5.0.16, Nodegroup: 0)
[ndb_mgmd(MGM)] 1 node(s)
id=1 @192.168.0.10 (Version: 5.0.16)
[mysqld(SQL)] 1 node(s)
id=4 (Version: 5.0.16)
You may see some slight differences in the output depending upon the exact version of MySQL that you are using.
Note: If you are using an older
version of MySQL, you may see the SQL node referenced as
‘[mysqld(API)]’. This reflects an
older usage which is now deprecated.
You should now be ready to work with databases, tables, and data in MySQL Cluster. See Section 15.3.5, “Loading Sample Data and Performing Queries” for a brief discussion.
Working with data in MySQL Cluster is not much different than working with it MySQL without Cluster. There are two points to keep in mind when doing so:
Tables must be created with the ENGINE=NDB
or ENGINE=NDBCLUSTER option, or be altered
(using ALTER TABLE) to use the NDB
CLuster storage engine in order to have them replicated in
the cluster. If you are importing tables from an existing
database using the output of mysqldump,
you can open the SQL script(s) in a text editor and add this
option to any table creation statements, or replace any
existing ENGINE (or
TYPE) option(s) with one of these. For
example, suppose that you have the sample
world database on another MySQL server
(that does not support MySQL Cluster), and you wish to
export the definition for the City table:
shell> mysqldump --add-drop-table world City > city_table.sql
The resulting city_table.sql file will
contain this table creation statement (and the
INSERT statements necessary to import the
table data):
DROP TABLE IF EXISTS City; CREATE TABLE City ( ID int(11) NOT NULL auto_increment, Name char(35) NOT NULL default '', CountryCode char(3) NOT NULL default '', District char(20) NOT NULL default '', Population int(11) NOT NULL default '0', PRIMARY KEY (ID) ) ENGINE=MyISAM; INSERT INTO City VALUES (1,'Kabul','AFG','Kabol',1780000); INSERT INTO City VALUES (2,'Qandahar','AFG','Qandahar',237500); INSERT INTO City VALUES (3,'Herat','AFG','Herat',186800); # (remaining INSERT statements omitted)
You will need to make sure that MySQL uses the NDB storage
engine for this table. There are two ways that this can be
accomplished. One of these is, before
importing the table into the Cluster database, to modify its
definition so that it reads (still using
City as an example):
DROP TABLE IF EXISTS City; CREATE TABLE City ( ID int(11) NOT NULL auto_increment, Name char(35) NOT NULL default '', CountryCode char(3) NOT NULL default '', District char(20) NOT NULL default '', Population int(11) NOT NULL default '0', PRIMARY KEY (ID) ) ENGINE=NDBCLUSTER; INSERT INTO City VALUES (1,'Kabul','AFG','Kabol',1780000); INSERT INTO City VALUES (2,'Qandahar','AFG','Qandahar',237500); INSERT INTO City VALUES (3,'Herat','AFG','Herat',186800); # (etc.)
This will need to be done for the definition of each table
that is to be part of the clustered database. The easiest
way to accomplish this is simply to do a search-and-replace
on the world.sql file and replace all
instances of TYPE=MyISAM or
ENGINE=MyISAM with
ENGINE=NDBCLUSTER. If you do not wish to
modify the file, you can also use ALTER
TABLE; see below for particulars.
Assuming that you have already created a database named
world on the SQL node of the cluster, you
can then use the mysql command-line
client to read city_table.sql, and
create and populate the corresponding table in the usual
manner:
shell> mysql world < city_table.sql
It is very important to keep in mind that the above command must be executed on the host where the SQL node is running -- in this case, on the machine with the IP address 192.168.0.20.
To create a copy of the world database on
the SQL node, save the file to
/usr/local/mysql/data, then run
shell>cd /usr/local/mysql/datashell>mysql world < world.sql
Of course, the SQL script must be readable by the
mysql system user. If you save the file
to a different location, adjust the above accordingly.
It is important to note that NDB Cluster
in MySQL 5.0 does not support autodiscovery of
databases. (See
Section 15.8, “Known Limitations of MySQL Cluster”.) This means
that, once the world database and its
tables have been created on one data node, you need to issue
the command CREATE DATABASE world;
(beginning with MySQL 5.0.2, you may use CREATE
SCHEMA world; instead), followed by FLUSH
TABLES; on each SQL node in the cluster. This will
cause the node to recognise the database and read its table
definitions.
Running SELECT queries on the SQL node is no different than running them on any other instance of a MySQL server. To run queries from the command line, you first need to log in to the MySQL Monitor in the usual way:
shell>mysql -u root -pEnter password: Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 to server version: 5.0.16 Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql>
If you did not modify the ENGINE= clauses
in the table definitions prior to importing the SQL script,
then you should at this point run the following commands:
mysql>USE world;mysql>ALTER TABLE City ENGINE=NDBCLUSTER;mysql>ALTER TABLE Country ENGINE=NDBCLUSTER;mysql>ALTER TABLE CountryLanguage ENGINE=NDBCLUSTER;
Note that we simply use the MySQL server's default
root account with its empty password
here. Of course, in a production setting, you should
always follow the standard security
precautions for installing a MySQL server, including the
setting of a strong root password and creation of a user
account with only those privileges required to accomplish
the tasks necessary for that user. For more information
about these, see Section 5.7, “The MySQL Access Privilege System”.
It is worth taking into account that Cluster nodes do not
make use of the MySQL privileges system when accessing one
another, and setting or changing MySQL user accounts
(including the root account) has no
effect on interaction between nodes, only on applications
accessing the SQL node.
Selecting a database and running a SELECT query against a table in that database is also accomplished in the usual manner, as is exiting the MySQL Monitor:
mysql>USE world;mysql>SELECT Name, Population FROM City ORDER BY Population DESC LIMIT 5;+-----------+------------+ | Name | Population | +-----------+------------+ | Bombay | 10500000 | | Seoul | 9981619 | | São Paulo | 9968485 | | Shanghai | 9696300 | | Jakarta | 9604900 | +-----------+------------+ 5 rows in set (0.34 sec) mysql>\qBye shell>
Applications using MySQL can use standard APIs. It is
important to remember that your application must access the
SQL node, and not the MGM or storage nodes. This brief
example shows how we might execute the same query as above
using PHP 5's mysqli extension running on
a Web server elsewhere on the network:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1">
<title>SIMPLE mysqli SELECT</title>
</head>
<body>
<?php
# connect to SQL node:
$link = new mysqli('192.168.0.20', 'root', '', 'world');
# parameters for mysqli constructor are:
# host, user, password, database
if( mysqli_connect_errno() )
die("Connect failed: " . mysqli_connect_error());
$query = "SELECT Name, Population
FROM City
ORDER BY Population DESC
LIMIT 5";
# if no errors...
if( $result = $link->query($query) )
{
?>
<table border="1" width="40%" cellpadding="4" cellspacing ="1">
<tbody>
<tr>
<th width="10%">City</th>
<th>Population</th>
</tr>
<?
# then display the results...
while($row = $result->fetch_object())
printf(<tr>\n <td align=\"center\">%s</td><td>%d</td>\n</tr>\n",
$row->Name, $row->Population);
?>
</tbody
</table>
<?
# ...and verify the number of rows that were retrieved
printf("<p>Affected rows: %d</p>\n", $link->affected_rows);
}
else
# otherwise, tell us what went wrong
echo mysqli_error();
# free the result set and the mysqli connection object
$result->close();
$link->close();
?>
</body>
</html>
We assume that the process running on the Web server can reach the IP address of the SQL node.
In a similar fashion, you can use the MySQL C API, Perl-DBI, Python-mysql, or MySQL AB's own Connectors to perform the tasks of data definition and manipulation just as you would normally with MySQL.
Also remember that each NDB
table must have a primary key. If no primary key
is defined by the user when a table is created, the
NDB Cluster storage engine will
automatically generate a hidden one.
(Note: This hidden key
takes up space just as does any other table index. It is not
uncommon to encounter problems due to insufficient memory
for accommodating these automatically created keys.)
To shut down the cluster simply enter the following in a shell on the machine hosting the MGM node:
shell> ndb_mgm -e shutdown
This will cause the ndb_mgm,
ndb_mgmd, and any ndbd
processes to terminate gracefully. Any SQL nodes can be
terminated using mysqladmin shutdown and
other means. Note that the -e option here is
used to pass a command to the ndb_mgm client
from the shell. See Section 4.3.1, “Using Options on the Command Line”.
To restart the cluster, simply run these commands:
On the management host (192.168.0.10 in
our setup):
shell> ndb_mgmd -f /var/lib/mysql-cluster/config.ini
On each of the data node hosts
(192.168.0.30 and
192.168.0.40):
shell> ndbd
Remember not to invoke this
command with the --initial option when
restarting an NDBD node normally.
And on the SQL host (192.168.0.20):
shell> mysqld &
For information on making Cluster backups, see Section 15.6.5.2, “Using The Management Server to Create a Backup”.
To restore the cluster from backup requires the use of the ndb_restore command. This is covered in Section 15.6.5.3, “How to Restore a Cluster Backup”.
More information on configuring MySQL Cluster can be found in Section 15.4, “MySQL Cluster Configuration”.
A MySQL server that is part of a MySQL Cluster differs in only one
respect from a normal (non-clustered) MySQL server, in that it
employs the NDB Cluster storage engine. This
engine is also referred to simply as NDB, and
the two forms of the name are synonomous.
In order to avoid unnecessary allocation of resources, the server
is configured by default with the NDB storage
engine disabled. To enable NDB, you will need
to modify the server's my.cnf configuration
file, or start the server with the --ndbcluster
option.
Since the MySQL server is a part of the cluster, it will also need
to know how to access an MGM node in order to obtain the cluster
configuration data. The default behavior is to look for the MGM
node on localhost. However, should you need to
specify its location elsewhere, this can be done in
my.cnf or on the MySQL server command line.
Before the NDB storage engine can be used, at
least one MGM node must be operational, as well as any desired
data nodes.
NDB, the Cluster storage engine, is available
in binary distributions for Linux, Mac OS X, and Solaris. It is
not yet supported on Windows, but we intend to make it available
for win32 and other platforms in the near future.
If you choose to build from a source tarball or the MySQL
5.0 BitKeeper tree, be sure to use the
--with-ndbcluster option when running
configure. You can also use the
BUILD/compile-pentium-max build script. Note
that this script includes OpenSSL, so you must have or obtain
OpenSSL to build successfully; otherwise you will need to modify
compile-pentium-max to exclude this
requirement. Of course, you can also just follow the standard
instructions for compiling your own binaries, then perform the
usual tests and installation procedure. See
Section 2.8.3, “Installing from the Development Source Tree”.
In the next few sections, we assume that you are already familiar with installing MySQL, and here we cover only the differences between configuring MySQL Cluster and configuring MySQL without clustering. (See Chapter 2, Installing MySQL if you require more information about the latter.)
You will find Cluster configuration easiest if you have already
have all management and data nodes running first; this is likely
to be the most time-consuming part of the configuration. Editing
the my.cnf file is fairly straightforward,
and this section will cover only any differences from
configuring MySQL without clustering.
In order to familiarise you with the basics, we will describe the simplest possible configuration for a functional MySQL Cluster. After this, you should be able to design your desired setup from the information provided in the other relevant sections of this chapter.
First, you need to create a configuration directory such as
/var/lib/mysql-cluster, by executing the
following command as the system root user:
shell> mkdir /var/lib/mysql-cluster
In this directory, create a file named
config.ini with the following information,
substituting appropriate values for HostName
and DataDir as necessary for your system.
# file "config.ini" - showing minimal setup consisting of 1 data node, # 1 management server, and 3 MySQL servers. # The empty default sections are not required, and are shown only for # the sake of completeness. # Data nodes must provide a hostname but MySQL Servers are not required # to do so. # If you don't know the hostname for your machine, use localhost. # The DataDir parameter also has a default value, but it is recommended to # set it explicitly. # Note: DB, API, and MGM are aliases for NDBD, MYSQLD, and NDB_MGMD # respectively. DB and API are deprecated and should not be used in new # installations. [NDBD DEFAULT] NoOfReplicas= 1 [MYSQLD DEFAULT] [NDB_MGMD DEFAULT] [TCP DEFAULT] [NDB_MGMD] HostName= myhost.example.com [NDBD] HostName= myhost.example.com DataDir= /var/lib/mysql-cluster [MYSQLD] [MYSQLD] [MYSQLD]
You can now start the management server as follows:
shell>cd /var/lib/mysql-clustershell>ndb_mgmd
Then start a single DB node by running ndbd.
When starting ndbd for a given DB node for
the very first time, you should use the
--initial option as shown here:
shell> ndbd --initial
For subsequent ndbd starts, you will generally not want to use this option:
shell> ndbd
This is because the --initial option will
delete all existing data and log files (as well as all table
metadata) for this data node and create new ones. One exception
to this rule is when restarting the cluster and restoring from
backup after adding new data nodes.
By default, ndbd will look for the management
server at localhost on port 1186.
Note: If you have installed
MySQL from a binary tarball, you will need to specify the path
of the ndb_mgmd and ndbd
servers explicitly. (Normally, these will be found in
/usr/local/mysql/bin.)
Finally, go to the MySQL data directory (usually
/var/lib/mysql or
/usr/local/mysql/data), and make sure that
the my.cnf file contains the option
necessary to enable the NDB storage engine:
[mysqld] ndbcluster
You can now start the MySQL server as usual:
shell> mysqld_safe --user=mysql &
Wait a moment to make sure the MySQL server is running properly.
If you see the notice mysql ended, check the
server's .err file to find out what went
wrong.
If all has gone well so far, you now can start using the cluster:
shell>mysqlWelcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 to server version: 5.0.16-Max Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql>SHOW ENGINES\G... *************************** 12. row *************************** Engine: NDBCLUSTER Support: YES Comment: Clustered, fault-tolerant, memory-based tables *************************** 13. row *************************** Engine: NDB Support: YES Comment: Alias for NDBCLUSTER ...
(Note that the row numbers shown in the example output above may be different from those shown on your system, depending upon the MySQL version being used and how it is configured.)
shell>mysqlWelcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 to server version: 5.0.16-Max Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql>USE test;Database changed mysql>CREATE TABLE ctest (i INT) ENGINE=NDBCLUSTER;Query OK, 0 rows affected (0.09 sec) mysql>SHOW CREATE TABLE ctest \G*************************** 1. row *************************** Table: ctest Create Table: CREATE TABLE `ctest` ( `i` int(11) default NULL ) ENGINE=ndbcluster DEFAULT CHARSET=latin1 1 row in set (0.00 sec)
To check that your nodes were set up properly, start the management client as shown:
shell> ndb_mgm
You can then use the SHOW command from within the management client in order to obtain a report on the cluster's status:
NDB> SHOW Cluster Configuration --------------------- [ndbd(NDB)] 1 node(s) id=2 @127.0.0.1 (Version: 3.5.3, Nodegroup: 0, Master) [ndb_mgmd(MGM)] 1 node(s) id=1 @127.0.0.1 (Version: 3.5.3) [mysqld(API)] 3 node(s) id=3 @127.0.0.1 (Version: 3.5.3) id=4 (not connected, accepting connect from any host) id=5 (not connected, accepting connect from any host)
At this point, you have successfully set up a working MySQL
Cluster. You can now store data in the cluster by using any
table created with ENGINE=NDBCLUSTER or its
alias ENGINE=NDB.
connectstringConfiguring MySQL Cluster requires working with two files:
my.cnf: Specifies options for all MySQL
Cluster executables. This file, with which you should be
familiar with from previous work with MySQL, must be
accessible by each executable running in the cluster.
config.ini: This file is read only by
the MySQL Cluster management server, which then distributes
the information contained in this file to all processes
participating in the cluster.
config.ini contains a description of
each node involved in the cluster. This includes
configuration parameters for data nodes and configuration
parameters for connections between all nodes in the cluster.
We are continuously making improvements in Cluster configuration and attempting to simplify this process. While we strive to maintain backwards compatibility, there may be times when introduce an incompatible change. In such cases we will try to let Cluster users know in advance if a change is not backwards compatible. If you find such a change which we have not documented, please use our Bugs Database to report it.
In order to support MySQL Cluster, you will need to update
my.cnf as shown in the example below.
Note that the options shown here should not be confused with
those occurring in config.ini files. You
may also specify these parameters when invoking the
executables from the command line.
# my.cnf # example additions to my.cnf for MySQL Cluster # (valid in MySQL 5.0) # enable ndbcluster storage engine, and provide connectstring for # management server host (default port is 1186) [mysqld] ndbcluster ndb-connectstring=ndb_mgmd.mysql.com # provide connectstring for management server host (default port: 1186) [ndbd] connect-string=ndb_mgmd.mysql.com # provide connectstring for management server host (default port: 1186) [ndb_mgm] connect-string=ndb_mgmd.mysql.com # provide location of cluster configuration file [ndb_mgmd] config-file=/etc/config.ini
(For more information on connectstrings, see
Section 15.4.4.2, “The MySQL Cluster connectstring”.)
# my.cnf # example additions to my.cnf for MySQL Cluster # (will work on all versions) # enable ndbcluster storage engine, and provide connectstring for management # server host to the default port 1186 [mysqld] ndbcluster ndb-connectstring=ndb_mgmd.mysql.com:1186
You may also use a separate [mysql_cluster]
section in the cluster my.cnf for
settings to be read by and affecting all executables:
# cluster-specific settings [mysql_cluster] ndb-connectstring=ndb_mgmd.mysql.com:1186
Currently the configuration file is in INI format, and is
named config.ini by default. It is read
by ndb_mgmd at startup and can be placed
anywhere. Its location and name are specified by using
--config-file=[
on the command line with ndb_mgmd. If the
configuration file is not specified,
ndb_mgmd by default tries to read a
<path>]<filename>config.ini file located in the current
working directory.
Default values are defined for most parameters, and can also
be specified in config.ini. To create a
default value section, simply add the word
DEFAULT to the section name. For example,
data nodes are configured using [NDBD]
sections. If all data nodes use the same data memory size, and
this is not the same as the default size, create an
[NDBD DEFAULT] section containing a
DataMemory line to specify the default data
memory size for all data nodes.
The INI format consists of sections preceded by section
headings (surrounded by square brackets), followed by the
appropriate parameter names and values. One deviation from the
standard format is that the parameter name and value can be
separated by a colon (‘:’) as
well as the equals sign (‘=’);
another is that sections are not uniquely identified by name.
Instead, unique entries (such as two different nodes of the
same type) are identified by a unique ID.
At a minimum, the configuration file must define the computers and nodes involved in the cluster and on which computers these nodes are located. An example of a simple configuration file for a cluster consisting of one management server, two data nodes and two MySQL servers is shown below:
# file "config.ini" - 2 data nodes and 2 SQL nodes # This file is placed in the startup directory of ndb_mgmd (the management # server) # The first MySQL Server can be started from any host. The second can be started # only on the host mysqld_5.mysql.com [NDBD DEFAULT] NoOfReplicas= 2 DataDir= /var/lib/mysql-cluster [NDB_MGMD] Hostname= ndb_mgmd.mysql.com DataDir= /var/lib/mysql-cluster [NDBD] HostName= ndbd_2.mysql.com [NDBD] HostName= ndbd_3.mysql.com [MYSQLD] [MYSQLD] HostName= mysqld_5.mysql.com
There are six different sections in this configuration file:
[COMPUTER]: Defines the the cluster
hosts.
[NDBD]: Defines the cluster's data
nodes.
[MYSQLD]: Defines the cluster's MySQL
server nodes.
[MGM] or [NDB_MGMD]:
Defines the cluster's management server node.
[TCP]: Defines TCP/IP connections
between nodes in the cluster, with TCP/IP being the
default connection protocol.
[SHM]: Defines shared-memory
connections between nodes. Formerly, this type of
connection was available only in binaries that were built
using the --with-ndb-shm option. In MySQL
5.0-Max, it is enabled by default, but should
still be considered experimental.
Note that each node has its own section in the
config.ini. For instance, since this
cluster has two data nodes, the configuration file contains
two sections defining these nodes.
You can define DEFAULT values for each
section. All Cluster parameter names are case-insensitive.
With the exception of the MySQL Cluster management server (ndb_mgmd), each node making up a MySQL Cluster requires a connectstring which points to the management server's location. This is used in establishing a connection to the management server as well as in performing other tasks depending on the node's role in the cluster. The syntax for a connectstring is as follows:
<connectstring> :=
[<nodeid-specification>,]<host-specification>[,<host-specification>]
<nodeid-specification> := node_id
<host-specification> := host[:port]
node_id is an integer larger than 1 which
identifies a node in config.ini.
port is an integer referring to a
regular Unix port. host is a string
representing a valid Internet host address.
example 1 (long): "nodeid=2,myhost1:1100,myhost2:1100,192.168.0.3:1200" example 2 (short): "myhost1"
All nodes will use localhost:1186 as the
default connectstring value if none is provided. If
<port> is omitted from the
connectstring, the default port is 1186. This port should
always be available on the network, since it has been assigned
by IANA for this purpose (see
http://www.iana.org/assignments/port-numbers
for details).
By listing multiple
<host-specification> values, it is
possible to designate several redundant management servers. A
cluster node will attempt to contact successive management
servers on each host in the order specified, until a
successful connection has been established.
There are a number of different ways to specify the connectstring:
Each executable has its own command line option which enables specifying the management server at startup. (See the documentation for the respective executable.)
It is also possible to set the connectstring for all nodes
in the cluster at once by placing it in a
[mysql_cluster] section in the
management server's my.cnf file.
For backwards compatibility, two other options are available, using the same syntax:
Set the NDB_CONNECTSTRING environment
variable to contain the connectstring.
Write the connectstring for each executable into a
text file named Ndb.cfg and place
this file in the executable's startup directory.
However, these are now deprecated and should not be used for new installations.
The recommended method for specifying the connectstring is to
set it on the command line or in the
my.cnf file for each executable.
The [COMPUTER] section has no real
significance other than serving as a way to avoid the need of
defining host names for each node in the system. All
parameters mentioned here are required.
[COMPUTER]Id
This is an integer value, used to refer to the host computer elsewhere in the configuration file.
[COMPUTER]HostName
This is the computer's hostname or IP address.
The [NDB_MGMD] section (or its alias
[MGM]) is used to configure the behavior of
the management server. All parameters in the following list
can be omitted and, if so, will assume their default values.
Note: If neither the
ExecuteOnComputer nor the
HostName parameter is present, the default
value localhost will be assumed for both.
[NDB_MGMD]Id
Each node in the cluster has a unique identity, which is represented by an integer value in the range 1 to 63 inclusive. This ID is used by all internal cluster messages for addressing the node.
[NDB_MGMD]ExecuteOnComputer
This refers to one of the computers defined in the
[COMPUTER] section.
[NDB_MGMD]PortNumber
This is the port number on which the management server listens for configuration requests and management commands.
[NDB_MGMD]LogDestination
This parameter specifies where to send cluster logging
information. There are three options in this regard:
CONSOLE, SYSLOG, and
FILE:
CONSOLE outputs the log to
stdout:
CONSOLE
SYSLOG sends the log to a
syslog facility, possible values
being one of auth,
authpriv, cron,
daemon, ftp,
kern, lpr,
mail, news,
syslog, user,
uucp, local0,
local1, local2,
local3, local4,
local5, local6,
or local7.
Note: Not every facility is necessarily supported by every operating system.
SYSLOG:facility=syslog
FILE pipes the cluster log output
to a regular file on the same machine. The following
values can be specified:
filename: The name of the
logfile.
maxsize: The maximum size to
which the file can grow before logging rolls over
to a new file. When this occurs, the old logfile
is renamed by appending .x to
the filename, where x is the
next number not yet used with this name.
maxfiles: The maximum number of
logfiles.
FILE:filename=cluster.log,maxsize=1000000,maxfiles=6
It is possible to specify multiple log destinations as shown here, using a semicolon-delimited string:
CONSOLE;SYSLOG:facility=local0;FILE:filename=/var/log/mgmd
The default value for the FILE
parameter is
FILE:filename=ndb_,
where node_id_cluster.log,maxsize=1000000,maxfiles=6node_id is the ID of the
node.
[NDB_MGMD]ArbitrationRank
This parameter is used to define which nodes can act as
arbitrators. Only MGM nodes and SQL nodes can be
arbitrators. ArbitrationRank can take
one of the following values:
0: The node will never be used as
an arbitrator.
1: The node has high priority; that
is, it will be preferred as an arbitrator over
low-priority nodes.
2: Indicates a low-priority node
which be used as an arbtrator only if a node with a
higher priority is not available for that purpose.
Normally, the management server should be configured as an
arbitrator by setting its
ArbitrationRank to 1 (the default
value) and that of all SQL nodes to 0.
[NDB_MGMD]ArbitrationDelay
An integer value which causes the management server's responses to arbitration requests to be delayed by that number of milliseconds. By default, this value is 0; it is normally not necessary to change it.
[NDB_MGMD]DataDir
This sets the directory where output files from the
management server will be placed. These files include
cluster log files, process output files, and the daemon's
pid file. (For log files, this can be overridden by
setting the FILE parameter for
[NDB_MGMD]LogDestination as discussed
previously in this section.)
The [NDBD] section is used to configure the
behavior of the cluster's data nodes. There are many
parameters which control buffer sizes, pool sizes, timeouts,
and so forth. The only mandatory parameters are:
Either ExecuteOnComputer or
HostName.
The parameter NoOfReplicas
These need to be defined in the [NDBD
DEFAULT] section.
Most data node parameters are set in the [NDBD
DEFAULT] section. Only those parameters explicitly
stated as being able to set local values are allowed to be
changed in the [NDBD] section.
HostName, Id and
ExecuteOnComputer must
be defined in the local [NDBD] section.
Identifying Data Nodes
The Id value (that is, the the data node
identifier) can be allocated on the command line when the node
is started or in the configuration file.
For each parameter it is possible to use k,
M, or G as a suffix to
indicate units of 1024, 1024*1024, or 1024*1024*1024. (For
example, 100k means 100 * 1024 = 102400.)
Parameters and values are currently case-sensitive.
[NBDB]Id
This is the node ID used as the address of the node for all cluster internal messages. This is an integer between 1 and 63. Each node in the cluster has a unique identity.
[NDBD]ExecuteOnComputer
This refers to one of the computers (hosts) defined in the
COMPUTER section.
[NDBD]HostName
Specifying this parameter has an effect similar to
specifying ExecuteOnComputer. It
defines the hostname of the computer the storage node on
which is to reside. Either this parameter or
ExecuteOnComputer is required in order
to specify a hostname other than
localhost.
(OBSOLETE)
[NDBD]ServerPort
Each node in the cluster uses a port to connect to other nodes. This port is used also for non-TCP transporters in the connection setup phase. Since, the default port is allocated dynamically in such a way as to ensure that no two nodes on the same computer receive the same port number, it should not normally be necessary to specify a value for this parameter.
[NDBD]NoOfReplicas
This global parameter can be set only in the
[NDBD DEFAULT] section, and defines the
number of replicas for each table stored in the cluster.
This parameter also specifies the size of node groups. A
node group is a set of nodes all storing the same
information.
Node groups are formed implicitly. The first node group is
formed by the set of data nodes with the lowest node IDs,
the next node group by the set of the next lowest node
identities, and so on. By way of example, assume that we
have 4 data nodes and that NoOfReplicas
is set to 2. The four data nodes have node IDs 2, 3, 4 and
5. Then the first node group is formed from nodes 2 and 3,
and the second node group by nodes 4 and 5. It is
important to configure the cluster in such a manner that
nodes in the same node groups are not placed on the same
computer, as in this situation a single hardware failure
would cause the entire cluster to crash.
If no node IDs are provided then the order of the data
nodes will be the determining factor for the node group.
Whether or not explicit assignments are made, they can be
viewed in the output of the management client's
SHOW command.
There is no default value for
NoOfReplicas; the maximum possible
value is 4.
[NDBD]DataDir
This parameter specifies the directory where trace files, log files, pid files and error logs are placed.
[NDBD]FileSystemPath
This parameter specifies the directory where all files
created for metadata, REDO logs, UNDO logs and data files
are placed. The default is the directory specified by
DataDir.
Note: This directory must
exist before the ndbd process is
initiated.
The recommended directory hierarchy for MySQL Cluster
includes /var/lib/mysql-cluster,
under which a directory for the node's filesystem is
created. This subdirectory contains the node ID. For
example, if the node ID is 2, then this subdirectory is
named ndb_2_fs.
[NDBD]BackupDataDir
It is also possible to specify the directory in which
backups are placed. By default, this directory is
.
(See above.)
FileSystemPath/BACKUP
Data Memory and Index Memory
DataMemory and
IndexMemory are parameters specifying the
size of memory segments used to store the actual records and
their indexes. In setting values for these, it is important to
understand how DataMemory and
IndexMemory are used, as they usually need
to be updated in order to reflect actual usage by the cluster:
[NDBD]DataMemory
This parameter defines the amount of space available for storing database records. The entire amount is allocated in memory, so it is extremely important that the machine has sufficient physical memory to accomodate this value.
The memory allocated by DataMemory is
used to store both the the actual records and indexes.
Each record is currently of fixed size. (Even
VARCHAR columns are stored as
fixed-width columns.) There is a 16-byte overhead on each
record; an additional amount for each record is incurred
because it is stored in a 32KB page with 128 byte page
overhead (see below). There is also a small amount wasted
per page due to the fact that each record is stored in
only one page. The maximum record size is currently 8052
bytes.
The memory space defined by DataMemory
is also used to store ordered indexes, which use about 10
bytes per record. Each table row is represented in the
ordered index. A common error among users is to assume
that all indexes are stored in the memory allocated by
IndexMemory, but this is not the case:
only primary key and unique hash indexes use this memory;
ordered indexes use the memory allocated by
DataMemory. However, creating a primary
key or unique hash index also creates an ordered index on
the same keys, unless you specify USING
HASH in the index creation statement. This can
be verified by running ndb_desc -d
db_name
table_name in the
management client.
The memory space allocated by
DataMemory consists of 32KB pages,
which are allocated to table fragments. Each table is
normally partitioned into the same number of fragments as
there are data nodes in the cluster. Thus, for each node,
there are the same number of fragments as are set in
NoOfReplicas. Once a page has been
allocated, it is currently not possible to return it to
the pool of free pages, except by deleting the table.
Performing a node recovery also will compress the
partition because all records are inserted into empty
partitions from other live nodes.
The DataMemory memory space also
contains UNDO information: For each update, a copy of the
unaltered record is allocated in the
DataMemory. There is also a reference
to each copy in the ordered table indexes. Unique hash
indexes are updated only when the unique index columns are
updated, in which case a new entry in the index table is
inserted and the old entry is deleted upon commit. For
this reason, it is also necessary to allocate enough
memory to handle the largest transactions performed by
applications using the cluster. In any case, performing a
few large transactions holds no advantage over using many
smaller ones, for the following reasons:
Large transactions are not any faster than smaller ones
Large transactions increase the number of operations that are lost and must be repeated in event of transaction failure
Large transactions use more memory
The default value for DataMemory is
80MB; the minimum is 1MB. There is no maximum size, but in
reality the maximum size has to be adapted so that the
process does not start swapping when the limit is reached.
This limit is determined by the amount of physical RAM
available on the machine and by the amount of memory that
the operating system may commit to any one process. 32-bit
operating systems are generally limited to 2-4GB per
process; 64-bit operating systems can use more. For large
databases, it may be preferable to use a 64-bit operating
system for this reason. In addition, it is also possible
to run more than one ndbd process per
machine, and this may prove advantageous on machines with
multiple CPUs.
[NDBD]IndexMemory
This parameter controls the amount of storage used for hash indexes in MySQL Cluster. Hash indexes are always used for primary key indexes, unique indexes, and unique constraints. Note that when defining a primary key and a unique index, two indexes will be created, one of which is a hash index used for all tuple accesses as well as lock handling. It is also used to enforce unique constraints.
The size of the hash index is 25 bytes per record, plus the size of the primary key. For primary keys larger than 32 bytes another 8 bytes is added.
Consider a table defined by
CREATE TABLE example ( a INT NOT NULL, b INT NOT NULL, c INT NOT NULL, PRIMARY KEY(a), UNIQUE(b) ) ENGINE=NDBCLUSTER;
There are 12 bytes overhead (having no nullable columns
saves 4 bytes of overhead) plus 12 bytes of data per
record. In addition we have two ordered indexes on columns
a and b consuming
roughly 10 bytes each per record. There is a primary key
hash index on the base table using roughly 29 bytes per
record. The unique constraint is implemented by a separate
table with b as primary key and
a as a column. This table will consume
an additional 29 bytes of index memory per record in the
example table as well as 12 bytes of
overhead, plus 8 bytes of record data.
Thus, for one million records, we need 58 MB for index memory to handle the hash indexes for the primary key and the unique constraint. We also need 64 MB for the records of the base table and the unique index table, plus the two ordered index tables.
You can see that hash indexes takes up a fair amount of memory space; however, they provide very fast access to the data in return. They are also used in MySQL Cluster to handle uniqueness constraints.
Currently the only partitioning algorithm is hashing and ordered indexes are local to each node. Thus ordered indexes cannot be used to handle uniqueness constraints in the general case.
An important point for both IndexMemory
and DataMemory is that the total
database size is the sum of all data memory and all index
memory for each node group. Each node group is used to
store replicated information, so if there are four nodes
with 2 replicas, then there will be two node groups. Thus,
the total data memory available is
2*DataMemory for each data node.
It is highly recommended that
DataMemory and
IndexMemory be set to the same values
for all nodes. Since data is distributed evenly over all
nodes in the cluster the maximum amount of space available
for any node can be no greater than that of the smallest
node in the cluster.
DataMemory and
IndexMemory can be changed, but
decreasing either of these can be risky; doing so can
easily lead to a node or even an entire MySQL Cluster that
is unable to restart due to there being insufficient
memory space. Increasing these values should be
acceptable, but it is recommended that such upgrades are
performed in the same manner as a software upgrade,
beginning with an update of the configuration file, then
restarting the management server followed by restarting
each data node in turn.
Updates do not increase the amount of index memory used. Inserts take effect immediately; however, rows are not actually deleted until the transaction is committed.
The default value for IndexMemory is
18MB. The minimum is 1MB.
Transaction Parameters
The next three parameters which we discuss are important
because they affect the number of parallel transactions and
the sizes of transactions that can be handled by the system.
MaxNoOfConcurrentTransactions sets the
number of parallel transactions possible in a node;
MaxNoOfConcurrentOperations sets the number
of records that can be in update phase or locked
simultaneously.
Both of these parameters (especially
MaxNoOfConcurrentOperations) are likely
targets for users setting specific values and not using the
default value. The default value is set for systems using
small transactions, in order to ensure that these do not use
excessive memory.
[NDBD]MaxNoOfConcurrentTransactions
For each active transaction in the cluster there must be a record in one of the cluster nodes. The task of coordinating transactions is spread amongst the nodes: the total number of transaction records in the cluster is the number of transactions in any given node times the number of nodes in the cluster.
Transaction records are allocated to individual MySQL servers. Normally there is at least one transaction record allocated per connection that using any table in the cluster. For this reason, one should ensure that there are more transaction records in the cluster than there are concurrent connections to all MySQL servers in the cluster.
This parameter must be set to the same value for all cluster nodes.
Changing this parameter is never safe and doing so can cause a cluster to crash. When a node crashes one of the nodes (actually the oldest surviving node) will build up the transaction state of all transactions ongoing in the crashed node at the time of the crash. It is thus important that this node has as many transaction records as the failed node.
The default value for this parameter is 4096.
[NDBD]MaxNoOfConcurrentOperations
It is a good idea to adjust the value of this parameter according to the size and number of transactions. When performing transactions of only a few operations each and not involving a great many records, there is no need to set this parameter very high. When performing large transactions involving many records need to set this parameter higher.
Records are kept for each transaction updating cluster data, both in the transaction co-ordinator and in the nodes where the actual updates are performed. These records contain state information needed in order to find UNDO records for rollback, lock queues, and other purposes.
This parameter should be set to the number of records to be updated simultaneously in transactions, divided by the number of cluster data nodes. For example, in a cluster which has 4 data nodes and which is expected to handle 1,000,000 concurrent updates using transactions, you should set this value to 1000000 / 4 = 250000.
Read queries which set locks also cause operation records to be created. Some extra space is allocated within individual nodes to accomodate cases where the distribution is not perfect over the nodes.
When queries make use of the unique hash index, there are actually two operation records used per record in the transaction. The first record represents the read in the index table and the second handles the operation on the base table.
The default value for this parameter is 32768.
This parameter actually handles two values that can be configured separately. The first of these specifies how many operation records are to be placed with the transaction co-ordinator. The second part specifies how many operation records are to be local to the database.
A very large transaction performed on an 8-node cluster
requires as many operation records in the transaction
co-ordinator as there are reads, updates, and deletes
involved in the transaction. However, the operation
records of the are spread over all 8 nodes. Thus, if it is
necessary to configure the system for one very large
transaction, then it is a good idea to configure the two
parts separately.
MaxNoOfConcurrentOperations will always
be used to calculate the number of operation records in
the transaction co-ordinator portion of the node.
It is also important to have an idea of the memory requirements for operation records. These consume about 1KB per record.
[NDBD]MaxNoOfLocalOperations
By default, this parameter is calculated as 1.1 *
MaxNoOfConcurrentOperations which fits
systems with many simultaneous transactions, none of them
being very large. If there is a need to handle one very
large transaction at a time and there are many nodes, then
it is a good idea to override the default value by
explicitly specifying this parameter.
Transaction Temporary Storage
The next set of parameters is used to determine temporary storage when executing a query which is part of a Cluster transaction. All records are released when the query is completed and the cluster is waiting for the commit or rollback.
The default values for these parameters are adequate for most situations. However, users with a need to support transactions involving large numbers of rows or operations may need to increase these to enable better parallelism in the system, while users whose applications require relatively small transactions can decrease the values in order to save memory.
[NDBD]MaxNoOfConcurrentIndexOperations
For queries using a unique hash index another, temporary
set of operation records is used during a query's
execution phase. This parameter sets the size of that pool
of records. Thus this record is only allocated while
executing a part of a query, as soon as this part has been
executed the record is released. The state needed to
handle aborts and commits is handled by the normal
operation records where the pool size is set by the
parameter MaxNoOfConcurrentOperations.
The default value of this parameter is 8192. Only in rare cases of extremely high parallelism using unique hash indexes should it be necessary to increase this value. Using a smaller value is possible and can save memory if the DBA is certain that a high degree parallelism is not required for the cluster.
[NDBD]MaxNoOfFiredTriggers
The default value of
MaxNoOfFiredTriggers is 4000, which is
sufficient for most situations. In some cases it can even
be decreased if the DBA feels certain the need for
parallelism in the cluster is not high.
A record is created when an operation is performed that affects a unique hash index. Inserting or deleting a record in a table with unique hash indexes or updating a column that is part of a unique hash index fires an insert or a delete in the index table. The resulting record is used to represent this index table operation while waiting for the original operation that fired it to complete. This operation is short lived but can still require a large number of records in its pool for situations with many parallel write operations on a base table containing a set of unique hash indexes.
[NDBD]TransactionBufferMemory
The memory affected by this parameter is used for tracking operations fired when updating index tables and reading unique indexes. This memory is used to store the key and column information for these operations. It is only very rarely that the value for this parameter needs to be altered from the default.
Normal read and write operations use a similar buffer,
whose usage is even more short-lived. The compile-time
parameter ZATTRBUF_FILESIZE (found in
ndb/src/kernel/blocks/Dbtc/Dbtc.hpp)
set to 4000*128 bytes (500KB). A similar buffer for key
info, ZDATABUF_FILESIZE (also in
Dbtc.hpp) contains 4000 * 16 = 62.5KB
of buffer space. Dbtc is the module
which handles transaction co-ordination.
Scans and Buffering
There are additional parameters in the
Dblqh module (in
ndb/src/kernel/blocks/Dblqh/Dblqh.hpp)
which affect reads and updates. These include
ZATTRINBUF_FILESIZE, set by default to
10000*128 bytes (1250KB) and
ZDATABUF_FILE_SIZE, set by default to
10000*16 bytes (roughly 156KB) of buffer space. To date,
there have been neither any reports from users nor any
results from our own extensive tests suggesting that either
of these compile-time limits should be increased.
The default value for
TransactionBufferMemory is 1MB.
[NDBD]MaxNoOfConcurrentScans
This parameter is used to control the number of parallel
scans that can be performed in the cluster. Each
transaction co-ordinator can handle the number of parallel
scans defined for this parameter. Each scan query is
performed by scanning all partitions in parallel. Each
partition scan uses a scan record in the node where the
partition is located, the number of records being the
value of this parameter times the number of nodes. The
cluster should be able to sustain
MaxNoOfConcurrentScans scans
concurrently from all nodes in the cluster.
Scans are actually performed in two cases. The first of these cases occurs when no hash or ordered indexes exists to handle the query, in which case the query is executed by performing a full table scan. The second case is encountered when there is no hash index to support the query but there is an ordered index. Using the ordered index means executing a parallel range scan. Since the order is kept on the local partitions only, it is necessary to perform the index scan on all partitions.
The default value of
MaxNoOfConcurrentScans is 256. The
maximum value is 500.
This parameter specifies the number of scans possible in the transaction co-ordinator. If the number of local scan records is not provided, it is calculated as the