mydumper & myloader : fast backup and restore
Lately we gave a try to mydumper as an alternative to mysqldump, and results are quite promising.
We found that mydumper performs very fast exporting both small and large datasets!!
We also found that the with large datasets restore with myloader doesn't perform a lot better than simple restore from mysqldump SQL dump: this depends from the storage engine and not from the client used to restore.
On a box we run 2 tests:
1) with a dataset that was fitting in the InnoDB buffer pool;
2) with a dataset larger than the InnoDB buffer pool.
shell$ time ./sysbench --test=tests/db/parallel_prepare.lua --oltp-tables-count=128 --oltp-table-size=1000000 --mysql-table-engine=innodb --mysql-user=root --num-threads=12 run real 22m0.013s
user 204m22.054s
sys 0m37.430s
Doing the backup with mydumper:
shell$ time ./mydumper -t 8 -B sbtest
real 0m29.807s
user 2m35.111s
sys 0m26.102s
... and with mysqldump:
shell$ time mysqldump --single-transaction sbtest > sbtest.sql
real 6m24.607s
user 5m19.355s
sys 0m46.761s
Within this test, mydumper looks around 13 times faster than mysqldump.
We also tried compression, but I/O was fast enough to make compression only an unnecessary overhead: in other words, on that hardware and with this dataset, mydumper with compression was slower than mydumper without compression.
To complete the test, we tried recovery time, after deleting and recreating and empty database:
shell$ mysql -e "drop database sbtest ; create database sbtest"
shell$ time ./myloader -t 8 -d export-20110720-090954
real 9m12.548s
user 0m55.193s
sys 0m28.316s
shell$ mysql -e "drop database sbtest ; create database sbtest"
shell$ time ( echo "SET SQL_LOG_BIN=0;" ; cat sbtest.sql ) | mysql sbtest
real 46m46.140s
user 9m3.604s
sys 0m48.256s
With this dataset, restore time using myloader was 5 times faster than using the SQL dump from mysqldump.
48 tables instead of 128 tables;
10M rows on each table instead of 1M rows;
a total dataset on disk of 114GB instead of 31GB.
First, we created the tables with sysbench:
shell$ time ./sysbench --test=tests/db/parallel_prepare.lua --oltp-tables-count=48 --oltp-table-size=10000000 --mysql-table-engine=innodb --mysql-user=root --num-threads=12 run
real 107m24.657s
user 689m2.852s
sys 2m11.980s
Backup with mydumper:
shell$ time ./mydumper -t 8 -B sbtest
real 7m42.703s
user 15m14.873s
sys 2m20.203s
The size of the backup is quite big because not compressed: 91GB
On average, mydumper was writing on disk at around 200MB/s.
Backup with mysqldump:
shell$ time mysqldump --single-transaction sbtest > sbtest.sql
real 32m53.972s
user 20m29.853s
sys 2m47.674s
mydumper was again faster than mysqldump , but not as much as in the previous test: only 4 times faster.
Was now the time to measure recovery time:
shell$ mysql -e "drop database sbtest ; create database sbtest"
shell$ time ./myloader -t 6 -d export-20110720-171706
real 130m58.403s
user 4m5.209s
sys 1m51.801s
shell$ mysql -e "drop database sbtest ; create database sbtest"
shell$ time ( echo "SET SQL_LOG_BIN=0;" ; cat sbtest.sql ) | mysql sbtest
real 204m18.121s
user 34m33.520s
sys 3m43.826s
myloader is just a bit more than 50% times faster than importing the SQL dump from mysqdump
Conclusion from second test:
a) With larger dataset, mydumper slows down because the system does more I/O as the dataset doesn't fit in memory, but still way faster than mysqldump.
b) With larger dataset, load time with myloader slowed down a lot. Although, the root cause of the performance drop isn't mydumper , but:
- more I/O (dataset + dump don't fit in RAM);
- InnoDB inserts rate degenerates with bigger tables.
On the topic of InnoDB inserts rate degeneration with big tables , probably another blog post will follow.
Notes on hardware and configuration:
CPU: 2 x 6cores with HT enabled
96 GB of RAM
FusionIO
innodb_buffer_pool_size=64G
innodb_log_file_size=2047M
innodb_io_capacity=4000
innodb_flush_log_at_trx_commit=2
(binlog disabled)
We found that mydumper performs very fast exporting both small and large datasets!!
We also found that the with large datasets restore with myloader doesn't perform a lot better than simple restore from mysqldump SQL dump: this depends from the storage engine and not from the client used to restore.
On a box we run 2 tests:
1) with a dataset that was fitting in the InnoDB buffer pool;
2) with a dataset larger than the InnoDB buffer pool.
TEST #1
We created 128 tables of 1M rows each, for a total dataset of 31GB on disk:shell$ time ./sysbench --test=tests/db/parallel_prepare.lua --oltp-tables-count=128 --oltp-table-size=1000000 --mysql-table-engine=innodb --mysql-user=root --num-threads=12 run real 22m0.013s
user 204m22.054s
sys 0m37.430s
Doing the backup with mydumper:
shell$ time ./mydumper -t 8 -B sbtest
real 0m29.807s
user 2m35.111s
sys 0m26.102s
... and with mysqldump:
shell$ time mysqldump --single-transaction sbtest > sbtest.sql
real 6m24.607s
user 5m19.355s
sys 0m46.761s
Within this test, mydumper looks around 13 times faster than mysqldump.
We also tried compression, but I/O was fast enough to make compression only an unnecessary overhead: in other words, on that hardware and with this dataset, mydumper with compression was slower than mydumper without compression.
To complete the test, we tried recovery time, after deleting and recreating and empty database:
shell$ mysql -e "drop database sbtest ; create database sbtest"
shell$ time ./myloader -t 8 -d export-20110720-090954
real 9m12.548s
user 0m55.193s
sys 0m28.316s
shell$ mysql -e "drop database sbtest ; create database sbtest"
shell$ time ( echo "SET SQL_LOG_BIN=0;" ; cat sbtest.sql ) | mysql sbtest
real 46m46.140s
user 9m3.604s
sys 0m48.256s
With this dataset, restore time using myloader was 5 times faster than using the SQL dump from mysqldump.
TEST #2
Test #2 is very similar to test #1 , but with some different in the dataset:48 tables instead of 128 tables;
10M rows on each table instead of 1M rows;
a total dataset on disk of 114GB instead of 31GB.
First, we created the tables with sysbench:
shell$ time ./sysbench --test=tests/db/parallel_prepare.lua --oltp-tables-count=48 --oltp-table-size=10000000 --mysql-table-engine=innodb --mysql-user=root --num-threads=12 run
real 107m24.657s
user 689m2.852s
sys 2m11.980s
Backup with mydumper:
shell$ time ./mydumper -t 8 -B sbtest
real 7m42.703s
user 15m14.873s
sys 2m20.203s
The size of the backup is quite big because not compressed: 91GB
On average, mydumper was writing on disk at around 200MB/s.
Backup with mysqldump:
shell$ time mysqldump --single-transaction sbtest > sbtest.sql
real 32m53.972s
user 20m29.853s
sys 2m47.674s
mydumper was again faster than mysqldump , but not as much as in the previous test: only 4 times faster.
Was now the time to measure recovery time:
shell$ mysql -e "drop database sbtest ; create database sbtest"
shell$ time ./myloader -t 6 -d export-20110720-171706
real 130m58.403s
user 4m5.209s
sys 1m51.801s
shell$ mysql -e "drop database sbtest ; create database sbtest"
shell$ time ( echo "SET SQL_LOG_BIN=0;" ; cat sbtest.sql ) | mysql sbtest
real 204m18.121s
user 34m33.520s
sys 3m43.826s
myloader is just a bit more than 50% times faster than importing the SQL dump from mysqdump
Conclusion from second test:
a) With larger dataset, mydumper slows down because the system does more I/O as the dataset doesn't fit in memory, but still way faster than mysqldump.
b) With larger dataset, load time with myloader slowed down a lot. Although, the root cause of the performance drop isn't mydumper , but:
- more I/O (dataset + dump don't fit in RAM);
- InnoDB inserts rate degenerates with bigger tables.
On the topic of InnoDB inserts rate degeneration with big tables , probably another blog post will follow.
Notes on hardware and configuration:
CPU: 2 x 6cores with HT enabled
96 GB of RAM
FusionIO
innodb_buffer_pool_size=64G
innodb_log_file_size=2047M
innodb_io_capacity=4000
innodb_flush_log_at_trx_commit=2
(binlog disabled)
Comments