Tuesday, January 28, 2014

How To Convert a MySQL InnoDB Server to TokuDB



What is TokuDB?


TokuDB is a storage engine for MySQL and MariaDB that dramatically improves the performance of large databases (typically 50 GB or larger). By replacing 40-year-old B-tree technology with patented Fractal Tree® technology, TokuDB speeds indexing by 10x or more and reduces the size of the database by up to 90%. Because of its exceptional indexing performance, TokuDB is ideal for applications that must simultaneously query and update large volumes of rapidly arriving data (e.g., clickstream analytics). And you do not have to change any application code.
Our case
We wanted to back up our main MySQL database server, which holds about half a TB of data. Since TokuDB compresses data a great deal, I decided to use the TokuDB engine for our slave server. In the following article I'll cover how the migration went and how we set everything up.
In our current setup we have Master-Master (or rather Slave-Slave) replication between our two main MySQL databases (both hold the same data). We decided to shut one down, copy it, and create a third server on TokuDB, which would be a slave to one of them and get backed up on a regular basis. (If the data compresses as well as TokuDB claims, it should be a pretty neat solution.)
Procedure for converting MySQL with InnoDB to MySQL with TokuDB
  • Stopping slave
First of all we stopped the production slave on one of our MySQL servers with STOP SLAVE. We logged the replication state at this point with SHOW SLAVE STATUS\G. To make sure nobody could write to this MySQL instance while we copied it, we also shut it down.
mysql:admin@server001> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: server002
                  Master_User: replica
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.045177
          Read_Master_Log_Pos: 675772374
               Relay_Log_File: relay.035280
                Relay_Log_Pos: 675693021
        Relay_Master_Log_File: mysql-bin.045177
             Slave_IO_Running: No
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table: tba%.imported_%,tba%.tmp_%
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 675765524
              Relay_Log_Space: 675700060
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 2
1 row in set (0.00 sec)
The important fields are Relay_Master_Log_File and Exec_Master_Log_Pos: they mark the point from which replication must proceed once we set up the new slave.
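
The shutdown itself was then done from the shell (the exact service name depends on your distribution; this is just a sketch of what we ran on server001):

service mysql stop   # make sure nothing can write to the data directory while we copy it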
  • Copying the MySQL data with rsync
We copied the whole slave data folder with rsync, for which we set up rsyncd on the target server. On CentOS you need the xinetd package installed to enable the rsyncd daemon. You also need to edit /etc/xinetd.d/rsync and change the relevant line to disable = no, then start xinetd with service xinetd start. After that, all we had to do was set up /etc/rsyncd.conf. Our test config was:
[data]
path = /srv/mysql/data/
hosts allow = *
hosts deny = none
list = true
uid = root
gid = root
read only = false
Then all we had to do was copy the data with rsync --progress -rav /srv/mysql/data server003::data. This seemed like the best option for copying the data over the network as fast as possible.
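
For reference, the xinetd setup described above boils down to roughly the following on the target server (the sed one-liner is just one way to flip the disable flag; adjust to taste):

yum install -y xinetd rsync                                   # CentOS: rsyncd is run on demand by xinetd
sed -i 's/disable.*= yes/disable = no/' /etc/xinetd.d/rsync   # enable the rsync daemon
service xinetd start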
  • Starting up slave
Because we had already copied the data and logged the position from which the new server003 has to start replicating, we brought our production server back up and let it catch up with replication again.
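
In shell terms that was simply the reverse of the shutdown (again a sketch; the service name may vary):

service mysql start
mysql -e "start slave;"   # only needed if the slave threads do not start automatically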
  • Setting up TokuDB
Once we had our data on the target server and the production MySQL servers were back, we could have simply started the new server at the position logged in the previous step, but we also wanted to change the engine from InnoDB to TokuDB. We made sure to put skip-slave-start in /etc/my.cnf and removed any binary log and relay log files (we wanted to clean those up before bringing the server back into replication). We also removed the master.info and relay-log.info files (this might not be necessary if you simply want to start MySQL and have it know where to connect and what position to start from).
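A rough sketch of that cleanup on server003 (the /srv/mysql/data path and log file names are from our setup; adjust them to yours):

# add to the [mysqld] section of /etc/my.cnf:  skip-slave-start
cd /srv/mysql/data
rm -f mysql-bin.* relay.*            # binary logs and relay logs copied over from the source
rm -f master.info relay-log.info     # old replication state (see the note above)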

We followed the instructions for installing MySQL from a binary distribution and installed Percona-5.6.13-tokudb-7.0.4. Let me also mention that ALTER TABLE ... ENGINE=TokuDB won't work if you have foreign keys on your tables, because TokuDB does not implement them (yet). If you plan to drop any foreign keys, it's important to install a 5.6.x version of MySQL, because dropping constraints/foreign keys is an online (in-place) operation from 5.6 on. MySQL 5.5 does not support that, and dropping foreign keys there takes ages.

When you start MySQL for the first time, make sure you have default-storage-engine = innodb in your my.cnf or it won't start. Also make sure you run mysql_upgrade. Once MySQL is up, you have to register the TokuDB plugins with the following commands:
mysql> install plugin tokudb soname 'ha_tokudb.so';
mysql> install plugin tokudb_user_data soname 'ha_tokudb.so';
mysql> install plugin tokudb_user_data_exact soname 'ha_tokudb.so';
mysql> install plugin tokudb_file_map soname 'ha_tokudb.so';
mysql> install plugin tokudb_fractal_tree_info soname 'ha_tokudb.so';
mysql> install plugin tokudb_fractal_tree_block_map soname 'ha_tokudb.so';
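It is worth checking that the plugins actually registered before going any further; TokuDB should now show up in the plugin and engine lists:

mysql> show plugins;
mysql> select engine, support from information_schema.engines where engine = 'TokuDB';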
You can then optionally change your default storage engine to TokuDB:
mysql> set global default_storage_engine=TokuDB;
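Note that SET GLOBAL does not survive a restart; to keep TokuDB as the default permanently you would also put it into my.cnf once the plugins are registered, along these lines:

[mysqld]
default-storage-engine = tokudb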

Here is the bash script I used to find all foreign keys in my databases (replace prefix_of_my_tables with your own prefix):
 
#!/bin/bash
# Generate one ALTER TABLE ... DROP FOREIGN KEY statement per foreign key found in
# information_schema; grep -v CONCAT strips the column header row from the output.
echo "SELECT CONCAT('ALTER TABLE ',table_schema,'.',table_name,' DROP FOREIGN KEY ',constraint_name,';') FROM information_schema.table_constraints WHERE CONSTRAINT_TYPE = 'FOREIGN KEY' AND CONSTRAINT_SCHEMA LIKE 'prefix_of_my_tables_%'" | mysql -h localhost | grep -v CONCAT > drop_all_foreign.sql
echo "We have written drop_all_foreign.sql"

With drop_all_foreign.sql executed, we went on to the script that creates the .sql file to alter all existing InnoDB tables (replace my_table_prefix with your own prefix):
 
#!/bin/bash
# List all databases matching our prefix (grep -v Database strips the header row)
DATABASES=`echo "show databases" | mysql -h localhost | grep -v Database | grep "my_table_prefix_"`

echo "Generating alter_tables.sql"
echo "" > alter_tables.sql

for DB in ${DATABASES[@]}; do
        # Only tables that are still InnoDB need converting
        TABLES=`echo "SELECT TABLE_NAME FROM information_schema.TABLES WHERE TABLE_SCHEMA = '$DB' AND engine = 'InnoDB'" | mysql -h localhost $DB | grep -v Tables_in | grep -v TABLE_NAME`
        for TABLE in ${TABLES[@]}; do
                echo "ALTER TABLE $DB.$TABLE ENGINE=TokuDB;" >> alter_tables.sql
        done
done
echo "Generated!"

So now we had alter_tables.sql generated, and all we had to do at this point was execute it. We used screen before running cat alter_tables.sql | mysql, because we knew this would take quite a long time.
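In other words, roughly:

screen -S toku_alter                        # arbitrary session name; keeps the conversion running if SSH drops
cat alter_tables.sql | mysql -h localhost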

The command has to convert and compress every InnoDB table to TokuDB, and 500 GB of data takes quite some time to compress.
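A simple sanity check while the conversion runs (not part of the original procedure, just a convenient query) is to count tables per engine and watch the InnoDB count drop to zero:

mysql> select engine, count(*) from information_schema.tables where engine in ('InnoDB','TokuDB') group by engine;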
  • Getting the slave back into replication
After the ALTER script executed successfully, we just had to bring the slave back into replication with a CHANGE MASTER command using the master log file and position from the earlier step. We also had to add a new replication user for the new slave on the master. The command we used to point the MySQL slave at the position our data corresponds to on the master was:

CHANGE MASTER TO MASTER_HOST='master_server', MASTER_USER='slave_user', MASTER_PASSWORD='<some_password>', MASTER_LOG_FILE='mysql-bin.045177', MASTER_LOG_POS=675765524; 
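
After CHANGE MASTER the slave still has to be started; once it is running, Seconds_Behind_Master in SHOW SLAVE STATUS should start dropping towards zero:

mysql> start slave;
mysql> show slave status\G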
Done.