For the past two months, I have been running tests on TokuDB in my free time. TokuDB is a storage engine put out by Tokutek that uses fractal tree indexes instead of B-tree indexes; the performance improvement is dramatic with large tables (over 100 million rows).
For those who like the information “above the fold”, here is a table with results from a test comparing InnoDB and TokuDB. All the steps are explained in the post below if you want more details, but here is the table:
Action | InnoDB | TokuDB |
---|---|---|
Importing ~40 million rows | 119 min 20.596 sec | 69 min 1.982 sec |
INSERTing again, ~80 million rows total | 5 hours 13 min 52.58 sec | 56 min 44.56 sec |
INSERTing again, ~160 million rows total | 20 hours 10 min 32.35 sec | 2 hours 2 min 11.95 sec |
Size of table on disk | 42 GB | 15 GB |
COUNT(*) query with GROUP BY | 58 min 10.11 sec | 5 min 3.21 sec |
DELETE query | 2 hours 46 min 18.13 sec | 1 hour 14 min 57.75 sec |
Size of table on disk | 42 GB | 12 GB |
OPTIMIZE TABLE | 1 day 2 hours 19 min 21.96 sec | 21 min 4.41 sec |
Size of table on disk | 41 GB | 12 GB |
TRUNCATE TABLE | 1 min 0.13 sec | 0.27 sec |
Size of table on disk | 41 GB | 193 MB (after waiting 60 seconds before doing an ls -l) |
OPTIMIZE TABLE | 23.88 sec | 0.03 sec |
Size of table on disk | 176 KB | 193 MB |
Installing TokuDB is not quite as easy as plugging in a storage engine. TokuDB requires a patch to the MySQL source code, so you can either patch the source code yourself or download an already-patched version from Tokutek that contains TokuDB as well. I used the already-patched version of MySQL from Tokutek, and it was no different than setting up a regular MySQL install — install, configure and go.
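Once the patched server is running, it is easy to confirm that the engine is registered and usable. A minimal sketch (toku_smoke is a throwaway table name, not part of the tests below):

```sql
-- TokuDB should appear in the engine list with Support = YES
SHOW ENGINES;

-- Quick smoke test: create and drop a table using the engine
CREATE TABLE toku_smoke (id INT NOT NULL PRIMARY KEY) ENGINE=TokuDB;
DROP TABLE toku_smoke;
```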
On disk, a table using the TokuDB storage engine is different from both InnoDB and MyISAM. It has a .frm file, as all MySQL tables do. In addition, there is a directory which contains a main.tokudb file with the data, a status.tokudb file (I believe with an action queue), and a key-KEYNAME.tokudb file for each index:
```
# ls -1
testtoku.frm
testtoku.tokudb
# ls -1 */*
testtoku.tokudb/key-DATE_COLLECTED.tokudb
testtoku.tokudb/key-HASHCODE.tokudb
testtoku.tokudb/main.tokudb
testtoku.tokudb/status.tokudb
```
A bit of playing around shows that we cannot get much from these files. With MyISAM tables you can see the data by running the “strings” command on the table's files; with TokuDB, strings reveals almost nothing:
```
# cd testtoku.tokudb
# file *
key-DATE_COLLECTED.tokudb: data
key-HASHCODE.tokudb: data
main.tokudb: data
status.tokudb: data
# strings *
tokudata
tokuleaf
x^cd
fdac
tokudata
tokuleaf
x^cd
bN
fda
```
For a basic test, I compared bulk insertion, simple querying, and deletion with TokuDB and InnoDB; I did not use any special features of TokuDB. I started with a 2.0 GB SQL file produced by mysqldump, containing about 19 million rows, and performed some simple tests on it. The table has a signed INT as a primary key, and the goal of this test was to see how easy it would be to delete test data, where “test data” is defined as any rows whose HASHCODE field (a VARCHAR(32)) is shared by more than 10,000 rows. The steps, with a sketch of the schema after them, were:
0) imported 19,425,235 rows
1) SELECT COUNT(*), HASHCODE FROM test[engine] GROUP BY HASHCODE HAVING COUNT(*) > 10000;
2) DELETE FROM test[engine] WHERE HASHCODE IN ([list of HASHCODE values from step 1]);
This deleted about 3.3% of the records in the table (647,732 rows).
3) OPTIMIZE TABLE test[engine]; (to defragment)
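To make these steps concrete, here is a minimal sketch of the schema and the statements. It is an approximation, not the actual test table: the real table has more columns, the id column name and column types are guesses, and the IN-list values are placeholders. HASHCODE and DATE_COLLECTED are taken from the index files shown earlier.

```sql
-- Sketch of the TokuDB variant of the test table (hypothetical definition)
CREATE TABLE testtoku (
  id INT NOT NULL AUTO_INCREMENT,   -- signed INT primary key, as described
  HASHCODE VARCHAR(32) NOT NULL,
  DATE_COLLECTED DATETIME,          -- indexed, per key-DATE_COLLECTED.tokudb
  PRIMARY KEY (id),
  KEY HASHCODE (HASHCODE),
  KEY DATE_COLLECTED (DATE_COLLECTED)
) ENGINE=TokuDB;

-- Step 1: find HASHCODE values shared by more than 10,000 rows
SELECT COUNT(*), HASHCODE
  FROM testtoku
 GROUP BY HASHCODE
HAVING COUNT(*) > 10000;

-- Step 2: delete the rows with those HASHCODE values (placeholders shown)
DELETE FROM testtoku
 WHERE HASHCODE IN ('aaaa0000...', 'bbbb1111...');

-- Step 3: defragment
OPTIMIZE TABLE testtoku;
```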
Tests were done on an Amazon EC2 instance (AMI ID ami-2547a34c, a 64-bit Fedora machine) using the m1.xlarge size (15 GB RAM).
Action | InnoDB | TokuDB |
---|---|---|
Importing over 19 million rows | 33 min 2.107 sec | 31 min 24.793 sec |
Size of table on disk | 4.4 GB | 2.4 GB |
COUNT(*) query with GROUP BY | 8.64 sec | 29.28 sec |
DELETE query | 26.06 sec | 2 min 19.51 sec |
Size of table on disk | 4.4 GB | 1.9 GB |
OPTIMIZE TABLE | 35 min 15.04 sec | 1 min 20.42 sec |
Size of table on disk | 4.3 GB | 1.2 GB |
InnoDB performed exceedingly well because the InnoDB buffer pool was sized larger than the data (12 GB buffer pool vs. 4.4 GB table), and the data import left the buffer pool with all the data and indexes already cached when the queries were run. Even so, TokuDB fared only slightly worse than InnoDB in overall performance.
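If you want to verify that a table fits in the buffer pool, compare the configured pool size with the table's data-plus-index footprint. A minimal sketch, assuming the InnoDB table is named testinnodb following the test[engine] naming convention:

```sql
-- Configured buffer pool size, in bytes
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Data + index footprint of the (assumed) InnoDB test table, in GB
SELECT table_name,
       (data_length + index_length) / 1024 / 1024 / 1024 AS size_gb
  FROM information_schema.TABLES
 WHERE table_schema = 'test'
   AND table_name = 'testinnodb';
```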
The most interesting part of the table, for me, is that there is no need to defragment the table. Even though the size on disk does decrease after the OPTIMIZE TABLE, the Tokutek folks explained that there is a queue of work to be done (such as defragmentation) that is processed automatically, and OPTIMIZE TABLE simply processes the rest of the queue. This is why the size of the table on disk was already reduced even before the OPTIMIZE TABLE was run; if I had waited a minute or so before performing the OPTIMIZE TABLE, the work would have been done automatically and the OPTIMIZE TABLE would have had no visible effect.
(Specifically, I was told: “The fractal tree is a dynamic data structure which may rearrange itself when queries run. In addition, since the fractal tree is periodically checkpointed, there may be more than one version of the data changed since the last checkpoint was taken in the underlying file.” I was also pointed to a blog post about quantifying fragmentation effects.)
This table shows that for smaller amounts of data (fewer than 100 million rows), TokuDB imports about 5% faster, but is somewhat slower for even simple queries and deletes. On the other hand, there is no need to defragment TokuDB, which saves a lot of time in the long run.
As TokuDB is recommended for tables larger than 100 million rows, let's see the same test with a larger amount of data. This time I started with an import of 39,334,901 rows from a 4.0 GB file produced by mysqldump. However, since we want more than 100 million rows, after the import I ran two INSERT ... SELECT statements, each doubling the table, to produce almost 160 million records:
```sql
INSERT INTO test[engine] (HASHCODE, [other non-primary key fields])
  SELECT HASHCODE, [other non-primary key fields] FROM test[engine];
# after this there are almost 80 million records (78,669,802)

INSERT INTO test[engine] (HASHCODE, [other non-primary key fields])
  SELECT HASHCODE, [other non-primary key fields] FROM test[engine];
# after this there are almost 160 million records (157,339,604)
```
Action | InnoDB | TokuDB |
---|---|---|
Importing ~40 million rows | 119 min 20.596 sec | 69 min 1.982 sec |
INSERTing again, ~80 million rows total | 5 hours 13 min 52.58 sec | 56 min 44.56 sec |
INSERTing again, ~160 million rows total | 20 hours 10 min 32.35 sec | 2 hours 2 min 11.95 sec |
Size of table on disk | 42 GB | 15 GB |
COUNT(*) query with GROUP BY | 58 min 10.11 sec | 5 min 3.21 sec |
DELETE query | 2 hours 46 min 18.13 sec | 1 hour 14 min 57.75 sec |
Size of table on disk | 42 GB | 12 GB |
OPTIMIZE TABLE | 1 day 2 hours 19 min 21.96 sec | 21 min 4.41 sec |
Size of table on disk | 41 GB | 12 GB |
TRUNCATE TABLE | 1 min 0.13 sec | 0.27 sec |
Size of table on disk | 41 GB | 193 MB (after waiting 60 seconds before doing an ls -l) |
OPTIMIZE TABLE | 23.88 sec | 0.03 sec |
Size of table on disk | 176 KB | 193 MB |
Clearly, TokuDB beat InnoDB on every one of these measures. And I did not even use any of TokuDB's special features: no extra indexes were added!
One great feature of TokuDB is that it gives you approximate statistics on how many rows have been inserted:
```
mysql> show processlist\G
*************************** 1. row ***************************
     Id: 3
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 1
  State: Inserted about 4000 rows
   Info: INSERT INTO `testtoku` VALUES (14600817,NULL,'c40325fb0406ccf2ad3e3c91aa95a6f2','000bxi504',
```
The “State” value is per query, so in doing a bulk insert with many rows, I saw this number go up and down. However, it’s very useful nonetheless.
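Because the State value is exposed through the process list, a second session can poll a long-running load. A small sketch; note that the information_schema.PROCESSLIST table requires MySQL 5.1 or later:

```sql
-- From another connection, watch the bulk load's progress
SELECT id, time, state
  FROM information_schema.PROCESSLIST
 WHERE info LIKE 'INSERT INTO `testtoku`%';
```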
SHOW TABLE STATUS shows that the statistics are exact (like MyISAM; InnoDB's metadata is approximate):
```
mysql> select count(*) from testtoku;
+----------+
| count(*) |
+----------+
| 18431319 |
+----------+
1 row in set (15.01 sec)

mysql> SHOW TABLE STATUS\G
*************************** 1. row ***************************
           Name: testtoku
         Engine: TokuDB
        Version: 10
     Row_format: Dynamic
           Rows: 18431319
 Avg_row_length: 0
    Data_length: 0
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: 19425236
    Create_time: NULL
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)
```
TokuDB does have that “action queue” I mentioned before: the statistics are exact, but they may not be up-to-date if there are still actions to be performed. However, any statement that touches every record will perform all the actions left in the queue, so after statements like OPTIMIZE TABLE, and even SELECT COUNT(*) FROM tbl, the statistics are up-to-date.
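So if you need current statistics, any full-table statement will drain the queue before you read the metadata; a minimal example:

```sql
-- Touch every row to apply any queued work, then read the metadata
SELECT COUNT(*) FROM testtoku;
SHOW TABLE STATUS LIKE 'testtoku';
```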
Just in case anyone wants it, here is the my.cnf used for both InnoDB and TokuDB tests:
```ini
[mysqld]
datadir = /mnt/mysql/data
port = 3306
socket = /tmp/mysql.sock
skip-locking
key_buffer_size = 384M
max_allowed_packet = 1M
table_open_cache = 512
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 32M
thread_concurrency = 8
server-id = 1
innodb_data_home_dir = /mnt/mysql/data/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /mnt/mysql/data/
innodb_buffer_pool_size = 12G
innodb_additional_mem_pool_size = 20M
innodb_log_file_size = 100M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 2
innodb_lock_wait_timeout = 50
innodb_file_per_table
```