MySql database replication

The following procedure has been taken from the official MySQL web site: http://dev.mysql.com/doc/refman/4.1/en/replication.html.

First of all, the replication method here explained is also called “asynchronous”, because write queries are first executed by a primary database server called Master and then (even though almost immediatly in most cases) by a secondary database server called Slave. Synchronous replication instead is a characteristic of MySQL Cluster, which will be soon the subject of the next node… I hope…

The main goals are:

  • data availability – in case of main server failure you can easily switch all your clients to the slave
  • load balancingonly SELECT queries may be sent to the slave to reduce the query processing load of the master
  • backup – you can perform database backups using a slave server without disturbing the master. The master continues to process updates while the backup is being made.

Every point of the procedure is marked with a M if done on the Master and with a S if on the Slave:

  1. M: add a user with replication privileges

    GRANT REPLICATION SLAVE ON *.* TO ‘repl_user’@’%’ IDENTIFIED BY ‘repl_password’;

    you should use an host field without wildcards (%) as a more secure connection

  2. M: enable binary logging in my.cnf (optionally include ore exclude some DB) and be sure to use the server as Master (id = 1); the lines affected are the following:

    [mysqld]
    server-id = 1
    log-bin = /var/log/mysql/mysql-bin.log
    max_binlog_size = 104857600
    binlog-do-db = mydatabasename
    binlog-ignore-db = test

  3. M: lock write operations to avoid modification while taking the copy of the database server

    FLUSH TABLES WITH READ LOCK;

    Note: don’t disconnect the client that has sent the query or the write lock will be canceled.

  4. M: take a copy of the database with shell commands (e.g. tar zcvf snapshot.tgz) or using mysqldump (also from phpMyAdmin)
  5. M: send the following query and take note of the results

    SHOW MASTER STATUS;

    Now we know exactly the status of the Master log at the moment of the copy done: filename and position; these will be the starting point to be set on the slave in the next points

  6. M: now you may unlock write operations

    UNLOCK TABLES;

  7. S: import the whole DB server saved at point 4
  8. S: be sure to have set the server id in my.cnf

    server-id=2

    binary logging is not necessary here, unless you are setting up chain replicated servers or bi-directional replication

  9. S: execute the following queries as root, using parameters from point 1 and 5

    CHANGE MASTER TO
    -> MASTER_HOST=’hostname_or_IP_ADDRESS’,
    -> MASTER_USER=’repl_user’,
    -> MASTER_PASSWORD=’repl_password’,
    -> MASTER_LOG_FILE=’log_file_name’,
    -> MASTER_LOG_POS=log_position;

    START SLAVE;

    obviously replace the example parameters with your own ones; first we have pointed the slave to the master at a specified position with the specified user, than we have started the replication

CHECKLIST


If you are wondering if everithing is working correctly, you just have to execute

SHOW PROCESSLIST;

on the slave and among the lines you should find “Waiting for master to send event”, or in case di connection problems “Connecting to master”.
Then execute

SHOW SLAVE STATUS\G;

and check the value of “Slave_IO_State” (should be “Waiting for master to send event”) and of “Seconds_Behind_Master” (should be an integer > 0). This last value is the difference from the timestamp of the last query logged on the master and the timestamp of the last query executed on the slave taken from the master log, so usually is set to 0.

TROUBLESHOOTING

There are two main reasons of a replication stop: connection problems (wrong password or network problems) or differences between the two servers; to be as clear as possible: if a query executed on the Master gives a result that is different from the result obtained by the same query on the slave, the slave automatically stops. Note that the “different result” could be an error message on the Master and a success on the Slave.
In this case you can obtain information about the query and the error using again

SHOW SLAVE STATUS\G;

To get rid of the stop (of course without starting the whole procedure from the beginning) you may alternatively:

  • manually make the Slave identical to the Master (at the point that generated the error) and try

    START SLAVE;

  • ignore the error and make the slave skip the line

    SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
    START SLAVE;

    note that a query containing AUTO_INCREMENT or LAST_INSERT_ID() takes two log lines so you have to set SQL_SLAVE_SKIP_COUNTER = 2

EXAMPLE: IMPROVE SLAVE PERFORMANCES

If the main goal of replication is a real time backup, could happen (as happened to me!) to configure an heavy-load Master and a slower Slave, using as main engine InnoDB.

With default settings the slave was more and more behind the master (exactly the late grown 1 second per second).
The solution that let the progressive reduction of the seconds behind master has been the fine tuning of the following settings on the slave (a 2Gb memory server):

innodb_buffer_pool_size = 1500M
innodb_flush_log_at_trx_commit = 0
innodb_flush_method=O_DSYNC

More details about these settings at http://dev.mysql.com/doc/refman/4.1/en/innodb-tuning.html, anyway shortly:

  • the first one optimize memory utilization
  • the second modify the frequency of the write on disk (the risk is the loss of some of the latest committed transactions if a crash occurs, but as it is a backup slave, we can afford it)
  • the third modify the default command (fsync) used to write data on disk.

Leave a Reply

Your email address will not be published.