Fun With MySQL: 2013

Tuesday, October 22, 2013

3 Ways to Make Tab Delimited Files From Your MySQL Table.

3 Ways to Make Tab Delimited Files From Your MySQL Table

Method One:

Using MySQL’s SELECT INTO OUTFILE feature, you can direct your query’s results to a file using some additional parameters to format the content. I needed to do this in two steps in order to get the column headers at the top of the file.

Command:
mysql --user=root --password='' -e "SELECT GROUP_CONCAT(COLUMN_NAME SEPARATOR '\t') FROM INFORMATION_SCHEMA.COLUMNS WHERE table_schema='phineas_and_ferb' and table_name='characters' INTO OUTFILE '~/tmp/output.txt' FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '' ESCAPED BY '' LINES TERMINATED BY '\n';"

mysql --user=root --password='' phineas_and_ferb -e "SELECT * FROM characters INTO OUTFILE '~/tmp/data.txt' FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '\"' LINES TERMINATED BY '\n';"

cat ~/tmp/data.txt >> ~/tmp/output.txt

Advantage: Optional quoting of output fields allows integers to be interpreted correctly by applications importing the data.
Disadvantages: Adding column headers requires an extra command and the use of a temp file. The queries are more complicated than other methods.

Method Two: Redirect query results to file

Execute a simple query against the database table and redirect it to an output file.
Command:
mysql --user=root --password='' --column-names=TRUE phineas_and_ferb -e "SELECT * from characters;" > ~/tmp/output.txt

Advantages: Column headers are automatically included in the output. Results are automatically tab-delimited.
Disadvantage: None of the output fields are quoted.

Method Three: mysqldump

Run mysqldump to directly write the data to a file. Again, I included an additional command to get the column headers at that top of the output file.

Command:
mysql --user=root --password='' -e "SELECT GROUP_CONCAT(COLUMN_NAME SEPARATOR '\t') FROM INFORMATION_SCHEMA.COLUMNS WHERE table_schema='phineas_and_ferb' and table_name='characters' INTO OUTFILE '~/tmp/output.txt' FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '' ESCAPED BY '' LINES TERMINATED BY '\n';"

mysqldump --single-transaction --user=root --password='' -T ~/tmp/ phineas_and_ferb --fields-enclosed-by=\"

cat ~/tmp/characters.txt >> ~/tmp/output.txt

Advantage: Simplified method to get quoting around output fields.
Disadvantages: All output fields are quoted. Adding column headers requires an extra command and the use of a temp file.

TokuDB - Compression Test — InnoDB vs. TokuDB

Compression — Highest Compression

Compression is an always-on feature of TokuDB. We tested InnoDB compression with two values of key_block_size (4k and 8k) and with compression disabled. To find the maximum compression, we loaded some web application performance data (log style data with stored procedure names, database instance names, begin and ending execution timestamps, duration row counts, and parameter values). TokuDB achieved 29x compression, far more than InnoDB.

Platform: Ubuntu 11.04; Intel Corei7/920 @ 3.6Ghz; 12GB RAM; 2x 7.2k SATA.

TokuDB - Replication — Eliminate Slave Lag

Replication — Eliminate Slave Lag

MySQL’s single threaded design often leads to slave lag. With TokuDB, slave lag is eliminated. This insures replication can be used for read scaling, backups, and disaster recovery, without resorting to sharding, expensive hardware, or placing limits on what can be replicated.The graph below shows the slave trying to keep up with new orders in a TPCC-like environment. At 1,000 TPS there is no lag for InnoDB or TokuDB. Beyond that, MySQL with InnoDB begins to fall behind.

Platform: Master – Centos 5.6; 2x Xeon L5520; 72GB RAM; 8x 300GB 10k SAS in RAID10. Slave – Centos 5.7; 2x Xeon E5310; 8GB RAM; 6x 1TB SATA with 2 in RAID1 and 4 in RAID0.

With TokuDB - Schema Changes in Seconds, not Hours

Hot Schema — Schema Changes in Seconds, not Hours back to top

TokuDB v5.0 introduced Hot Column Addition (HCAD). You can add or delete columns from an existing table with minimal downtime — just the time for MySQL itself to close and reopen the table. The total downtime is seconds to minutes. We detailed an experiment that showed this in this blog. TokuDB v5.0 also introduced Hot Indexing. You can add an index to an existing table with minimal downtime. The total downtime is seconds to a few minutes, because when the index is finished being built, MySQL closes and reopens the table. This means that the downtime occurs not when the command is issued, but later on. Still, it is quite minimal, as we showed in this blog.

Platform: CentOS 5.5; 2x Xeon E5310; 4GB RAM; 4x 1TB 7.2k SATA in RAID0.

Monday, October 21, 2013

Database Backup Strategy

So how do you make backups of huge MySQL databases that are in your production environment without affecting your customers? The answer is with Percona’s Xtrabackup tool. It performs binary backups of heavily loaded MySQL servers amazingly fast. It even supports incremental backups so that you don’t have to backup your entire database every single time. However, even it requires a table lock at end of its procedure if you want the binary log position of the backup. Luckily, there’s a “–no-lock” option and a little trick you can use to get the binary log position when you use it.
Now that we’re using Xtrabackup to backup our live data and we know how to get the binary log position, we just have to automate the procedure. This is harder than you think, because for every incremental backup you need information on the last one you did so that it knows where to start the new one. If you store your backups as compressed data (which you should to save space), this information must be stored separately which means you have to parse it out yourself. Also, in order to restore a backup, you need a list of all the incremental backups so that you can restore them in order.
I spent a long time creating the perfect automation script for all this. For a full backup, the procedure is as such:

Run ‘innobackupex’ with –no-lock, –stream, and –compress to create a compressed backup.
Use ‘sed’ to parse the log sequence number from the output, which is used for incremental backups.
Save the LSN in a separate file so you can refer to it later.
Save the filename of the backup in it’s own file, so that you can easily keep track of the all the backups you’ve done in case you need to restore them in order.
Upload the final compressed backup and the files from step 3 and 4 to Amazon’s S3. To do this, it’s best to split the backup up into smaller files and upload them all in parallel.

For an incremental backup, the procedure is very similar:

Grab the LSN from the file that was created during the full backup
Run ‘innobackupex’ with the same options as before, but add –incremental and –incremental-lsn=<LSN>
Use ‘sed’ to parse the new log sequence number from the output.
Overwrite the LSN file with the new one.
Append the incremental backup’s filename to the backup list file.
Upload everything to S3.

Restoring the backups is perhaps the trickiest part:

Grab the list of all the backups that have happened from the backup list file.
Loop through them, and for each one:

Uncompress the backup
Run ‘innobackupex’ with –redo-only, –apply-log, and –incremental-dir=<full backup directory> on the uncompressed backup. If it’s the original full backup then leave out the –incremental-dir part.

Now that all the incremental backups have been applied to the full backup (now called the base), finish up the process by running ‘innobackupex’ with –apply-log on the base directory.
chown -R mysql:mysql <base directory>
Start MySQL on the base directory

We’ve been running this script regularly for weeks now, and it has been working great. We do one full backup per day and an incremental backup each hour. Also, since the backups contain the binary log position, we also have the ability to do point-in-time recovery by replaying the bin-logs. It’s important to note that creating these backups uses a lot of disk IOPS, so it’s wise to do them on a separate drive.

Thursday, October 17, 2013

Testing TokuDB – Faster and smaller for large tables

For the past two months, I have been running tests on TokuDB in my free time. TokuDB is a storage engine put out by Tokutek. TokuDB uses fractal tree indexes instead of B-tree indexes to improve performance, which is dramatically noticeable when dealing with large tables (over 100 million rows).

For those that like the information “above the fold”, here is a table with results from a test comparing InnoDB and TokuDB. All the steps are explained in the post below, if you want more details, but here’s the table:

Action	InnoDB	TokuDB
Importing ~40 million rows	119 min 20.596 sec	69 min 1.982 sec
INSERTing again, ~80 million rows total	5 hours 13 min 52.58 sec	56 min 44.56 sec
INSERTing again, ~160 million rows total	20 hours 10 min 32.35 sec	2 hours 2 min 11.95 sec
Size of table on disk	42 Gb	15 Gb
COUNT(*) query with GROUP BY	58 min 10.11 sec	5 min 3.21 sec
DELETE query	2 hours 46 min 18.13 sec	1 hour 14 min 57.75 sec
Size of table on disk	42 Gb	12 Gb
OPTIMIZE TABLE	1 day 2 hours 19 min 21.96 sec	21 min 4.41 sec
Size of table on disk	41 Gb	12 Gb
TRUNCATE TABLE	1 min 0.13 sec	0.27 sec
Size of table on disk	41 Gb	193 Mb (after waiting 60 seconds before doing an ls -l)
OPTIMIZE TABLE	23.88 sec	0.03 sec
Size of table on disk	176 Kb	193 Mb

Installing TokuDB is not quite as easy as plugging in a storage engine. TokuDB requires a patch to the MySQL source code, so you can either patch the source code yourself or download an already-patched version from Tokutek that contains TokuDB as well. I used the already-patched version of MySQL from Tokutek, and it was no different than setting up a regular MySQL install — install, configure and go.

On disk, a table using the TokuDB storage engine is different from both InnoDB and MyISAM. It has a .frm file, as all MySQL tables do. In addition, there is a directory which contains a main.tokudb file with the data, a status.tokudb file (I believe with an action queue), and a key-KEYNAME.tokudb file for each index:

# ls -1
testtoku.frm
testtoku.tokudb


# ls -1 */*
testtoku.tokudb/key-DATE_COLLECTED.tokudb
testtoku.tokudb/key-HASHCODE.tokudb
testtoku.tokudb/main.tokudb
testtoku.tokudb/status.tokudb

A bit of playing around, and we see that we cannot get much from the file — with MyISAM tables, you can see the data in the table by doing a “strings” command on it:

# cd testtoku.tokudb
# file *
key-DATE_COLLECTED.tokudb data
key-HASHCODE.tokudb data
main.tokudb data
status.tokudb data


# strings *
tokudata
tokuleaf
x^cd
 fdac
tokudata
tokuleaf
x^cd
bN fda

For a basic test I compared bulk insertion, simple querying and deletes with TokuDB and InnoDB. I did not use any special features of TokuDB. I started with an sql file produced by mysqldump that was 2.0 Gb in size, which had 19 million rows, and performed some simple tests on it. The table has a signed INT as a primary key, and the goal of this test was to see how easy it would be to delete test data. “Test data” is defined as anything that had a particular field (HASHCODE, defined as VARCHAR(32)) in common with more than 10,000 rows.

0) imported 19,425,235 rows

1) SELECT COUNT(*),HASHCODE FROM test[engine] GROUP BY HASHCODE HAVING COUNT(*)>10000;

2) DELETE FROM PRIMARY_KEY_HASH WHERE HASHCODE IN ([list of id's]); This deleted about 3.3% of the records in the table (647,732 rows)

3) OPTIMIZE TABLE test[engine] – to defragment

Tests were done on an Amazon EC2 instance — AMI ID:ami-2547a34c which is a Fedora 64-bit machine, using the m1.xlarge size (16 Gb RAM).

Action	InnoDB	TokuDB
Importing over 19 million rows	33 min 2.107 sec	31 min 24.793 sec
Size of table on disk	4.4 Gb	2.4 Gb
COUNT(*) query with GROUP BY	8.64 sec	29.28 sec
DELETE query	26.06 sec	2 min 19.51 sec
Size of table on disk	4.4 Gb	1.9 Gb
OPTIMIZE TABLE	35 min 15.04 sec	1 min 20.42 sec
Size of table on disk	4.3 Gb	1.2 Gb

InnoDB performed exceedingly well because the InnoDB buffer pool was sized larger than the data (12 Gb buffer pool vs. 4.4 Gb table), and the data import caused the buffer pool to have all the data and indexes already cached when the queries were run. Even so, TokuDB only fared slightly worse than InnoDB in overall performance.

The most interesting part of the table, for me, is the fact that there is no need to defragment the table — Even though the size on disk does decrease after the OPTIMIZE TABLE, the Tokutek folks explained that there’s a queue of work to be done (such as defragmentation) that is done automatically, and OPTIMIZE TABLE processes the rest of the queue. This is why the size of the table on disk was already reduced even before teh OPTIMIZE TABLE was done, and if I had waited a minute or so before performing the OPTIMIZE TABLE it would have automatically been done and I would have seen no results with the OPTIMIZE TABLE.

(specifically, I was told “The fractal tree is a dynamic data structure which may rearrange itself when queries run. In addition, since the fractal tree is periodically checkpointed, there may be more than one version of the data changed since the last check point was taken in the underlying file.” and pointed to a blog post this post about quantifying fragmentation effects.

The table shows that for smaller amounts of data (fewer than 100 million rows), TokuDB is about 9% faster for inserts, but somewhat slower for even simple queries and deletes. There is no need to defragment TokuDB, which saves a lot of time in the long run.

As TokuDB is recommended for tables larger than 100 million rows, let’s see this same test with a large amount of data. This time we started with an import of 39,334,901 rows, a 4.0 Gb file produced by mysqldump. However, since we want more than 100 million rows, after the import we did 2 inserts to produce almost 160 million records:

INSERT INTO test[engine] (HASHCODE, [other non-primary key fields]) SELECT (HASHCODE, [other non-primary key fields]) FROM test[engine];
# after this there are almost 80 million records (78,669,802)

INSERT INTO test[engine] (HASHCODE, [other non-primary key fields]) SELECT (HASHCODE, [other non-primary key fields]) FROM test[engine];
# after this there are almost 160 million records (157,339,604)

Action	InnoDB	TokuDB
Importing ~40 million rows	119 min 20.596 sec	69 min 1.982 sec
INSERTing again, ~80 million rows total	5 hours 13 min 52.58 sec	56 min 44.56 sec
INSERTing again, ~160 million rows total	20 hours 10 min 32.35 sec	2 hours 2 min 11.95 sec
Size of table on disk	42 Gb	15 Gb
COUNT(*) query with GROUP BY	58 min 10.11 sec	5 min 3.21 sec
DELETE query	2 hours 46 min 18.13 sec	1 hour 14 min 57.75 sec
Size of table on disk	42 Gb	12 Gb
OPTIMIZE TABLE	1 day 2 hours 19 min 21.96 sec	21 min 4.41 sec
Size of table on disk	41 Gb	12 Gb
TRUNCATE TABLE	1 min 0.13 sec	0.27 sec
Size of table on disk	41 Gb	193 Mb (after waiting 60 seconds before doing an ls -l)
OPTIMIZE TABLE	23.88 sec	0.03 sec
Size of table on disk	176 Kb	193 Mb

Clearly, TokuDB is better than InnoDB for all these values. And I did not even use any of the special features of TokuDB — no extra indexes were added!

One great aspect about TokuDB is that it gives you approximate statistics on how many rows have been inserted:

mysql> show processlistG
*************************** 1. row ***************************
     Id: 3
   User: root
   Host: localhost
     db: test
Command: Query
   Time: 1
  State: Inserted about 4000 rows
   Info: INSERT INTO `testtoku` VALUES (14600817,NULL,'c40325fb0406ccf2ad3e3c91aa95a6f2','000bxi504',

The “State” value is per query, so in doing a bulk insert with many rows, I saw this number go up and down. However, it’s very useful nonetheless.

The SHOW TABLE STATUS shows that the statistics are exact (like MyISAM — InnoDB metadata is approximate):

mysql> select count(*) from testtoku;
+----------+
| count(*) |
+----------+
| 18431319 |
+----------+
1 row in set (15.01 sec)


mysql> SHOW TABLE STATUSG
*************************** 1. row ***************************
           Name: testtoku
         Engine: TokuDB
        Version: 10
     Row_format: Dynamic
           Rows: 18431319
 Avg_row_length: 0
    Data_length: 0
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: 19425236
    Create_time: NULL
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)

TokuDB does have that “action queue” I mentioned before — the statistics are exact, but may not be up-to-date if there are still actions to be performed. However, any statement that touches every record will perform all the actions left in the queue — so after statements like OPTIMIZE TABLE and even SELECT COUNT(*) FROM tbl, the statistics are up-to-date.

Just in case anyone wants it, here is the my.cnf used for both InnoDB and TokuDB tests:

[mysqld]
datadir = /mnt/mysql/data
port            = 3306
socket          = /tmp/mysql.sock
skip-locking
key_buffer_size = 384M
max_allowed_packet = 1M
table_open_cache = 512
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 8M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size = 32M
thread_concurrency = 8
server-id       = 1
innodb_data_home_dir = /mnt/mysql/data/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /mnt/mysql/data/
innodb_buffer_pool_size = 12G
innodb_additional_mem_pool_size = 20M
innodb_log_file_size = 100M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 2
innodb_lock_wait_timeout = 50
innodb_file_per_table

Saturday, June 8, 2013

MySQL backups with Percona’s XtraBackup

MySQL backup is sometimes very hard to do effectively. MySQL provides various options for backup, but many of them are simply unsuitable for large systems, particularly if they need to remain active during backups. Percona’s XtraBackup is an open-source clone of InnoBase’s InnoDB Hot Backup utility. So what makes XtraBackup a better solution, and how does it work?
Update: on December 10th 2009, Percona released Xtrabackup 1.0.
Most importantly, XtraBackup can do a backup that maintains transactional integrity without having to stop or lock the database. XtraBackup is mainly concerned with InnoDB tables because other tables don’t offer transactional integrity anyway, and they are already served to an extent by mysqldump and the misleadingly named mysqlhotcopy which isn’t actually very ‘hot’ at all!

Many people use LVM snapshots for backups as they offer many of the same advantages, however there are two downsides. Firstly they can adversely affect performance – while a snapshot is active, disk performance can take a big hit (though that may not be a problem on low-traffic sites). Secondly, although the captured snapshot may be in a known state as far as the file system is concerned, it may not contain information that was in memory at the time (or is held in as-yet-unapplied logs), and thus may require a long crash recovery operation at restore time (just when you’re having a little panic that your database died!) instead of at backup time (probably the middle of the night, while you’re sleeping).
Also, you can’t use LVM unless you started with it – adding LVM to a running server is extremely fiddly and difficult, in some cases impossible. LVM also imposes a whole new layer of acronyms and concepts to the process of disk access.

Another advantage of Xtrabackup is that it backs up the actual database files, not an SQL dump, meaning that it takes a similar amount of space, but restores are much faster (it can just use the files, and doesn’t have to rebuild them from scratch). Generally you’ll use XtraBackup on a whole database server, not just a subset of its contents (the latest version now supports single-table backups, which I’ve not tried yet). Make sure you grab the latest version as though it works for me, it’s still very much in development.
In practice you don’t use XtraBackup directly, but via a wrapper script called InnoBackupEx, which takes care of various matters that are fiddly to handle manually. XtraBackup only handles InnoDB tables, but InnoBackupEx can also handle MyISAM tables, so it’s ok to have a mix, though MyISAM offers neither transactional integrity or crash recovery (or at least nowhere near as far as InnoDB does), and requires locks during backups.
The essentials of InnoBackupEx’ operation are these steps:

Copy the active database files (which may be in an unknown state)
Apply the current InnoDB transactional logs to them to bring into a consistent state at a known checkpoint
Save them as a snapshot, or into a tar stream, possibly on a remote machine

Step 2 requires intimate knowledge of database and log file internals, so rather than having code that replicates MySQL’s functionality, it actually fires up another instance of mysqld to do it. This can be a little confusing if you happen to look at your processes at the time! What it’s doing here is effectively a crash recovery process, and it can take a long time; how long is at least partly dependent on your log files size – big log files give better performance at the price of longer crash recovery, which here means slower backups. In my case, with a 140Gb database, it takes a bit under 4 hours.

Because the content of the backups may be dependent on the settings in use on the db server at the time, the backup process needs to use your my.cnf file, usually found at /etc/my.cnf or /etc/mysql/my.cnf(the latter is common on many distros, and if it’s there, symlink it to /etc/my.cnf so that xtrabackup can find it), and in fact it embeds a copy of that file in the backup stream so that any new server can be made to match the original server’s config exactly – this makes XtraBackup particularly useful for setting up replication slaves. It’s generally a good idea to have innodb_file_per_table enabled as it keeps backup sizes down, reduces risk, lets you recover wasted space more easily, and keeps your options open for per-table disk layout optimisations.

If the backup is destined for a replication slave, you can add the --slave-info switch which will embed the log file name and position corresponding to the time the backup was taken. When you start up the slave, it can then start pulling events that have happened since then from the master.
After all this, the command that I use to backup a whole mysql server to a gzip compressed tar stream is this:

/usr/bin/innobackupex-1.5.1 --slave-info --stream=tar /var/backups/xtrabackup | gzip -c -9 > /var/backups/mysql/backup.tar.gz

Notice that the command dos not specify where the database files are – this is obtained from your my.cnf file. When this runs you’ll get output looking like this:

InnoDB Backup Utility v1.5.1-xtrabackup; Copyright 2003, 2009 Innobase Oy.
All Rights Reserved.

This software is published under
the GNU GENERAL PUBLIC LICENSE Version 2, June 1991.

IMPORTANT: Please check that the backup run completes successfully.
           At the end of a successful backup run innobackup
           prints "innobackup completed OK!".

innobackupex: Using mysql  Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (x86_64) using readline 5.2
innobackupex: Using mysql server version 5.0.51a-3ubuntu5.4-log

innobackupex: Created backup directory /var/backups/mysql
090911 00:59:28  innobackupex: Starting mysql with options: --unbuffered
090911 00:59:28  innobackupex: Connected to database with mysql child process (pid=15052)
090911 00:59:32  innobackupex: Connection to database server closed

090911 00:59:32  innobackupex: Starting ibbackup with command: xtrabackup --backup --suspend-at-end --log-stream --target-dir=./ --throttle=150
innobackupex: Waiting for ibbackup (pid=15086) to suspend
innobackupex: Suspend file '/var/lib/mysql/xtrabackup_suspended'

xtrabackup: suspend-at-end is enabled.
xtrabackup: uses posix_fadvise().
xtrabackup: uses posix_fadvise().
xtrabackup: cd to /var/lib/mysql
xtrabackup: Target instance is assumed as followings.
xtrabackup:   innodb_data_home_dir = ./
xtrabackup:   innodb_data_file_path = ibdata1:10M:autoextend
xtrabackup:   innodb_log_group_home_dir = ./
xtrabackup:   innodb_log_files_in_group = 2
xtrabackup:   innodb_log_file_size = 67108864
xtrabackup: use O_DIRECT
xtrabackup: Stream mode.
>> log scanned up to (185 2443877757)

090911 00:59:34  innobackupex: Continuing after ibbackup has suspended

innobackupex: Starting to backup InnoDB tables and indexes
innobackupex: from original InnoDB data directory '/var/lib/mysql'
innobackupex: Backing up as tar stream 'ibdata1'
>> log scanned up to (185 2443879075)
...
>> log scanned up to (185 2443879075)
tar: ibdata1: file changed as we read it
innobackupex: If you use GNU tar, this warning can be ignored.
innobackupex: Backing up files '/var/lib/mysql/mydb1/*.ibd' (39 files)
>> log scanned up to (185 2447627905)
...
>> log scanned up to (185 2455853862)
090911 04:44:19  innobackupex: Starting mysql with options: --unbuffered
090911 04:44:19  innobackupex: Connected to database with mysql child process (pid=9067)
>> log scanned up to (185 2455853862)
090911 04:44:23  innobackupex: Starting to lock all tables...
>> log scanned up to (185 2455854380)
...
>> log scanned up to (185 2455858345)
090911 04:44:53  innobackupex: All tables locked and flushed to disk

090911 04:44:53  innobackupex: Starting to backup .frm, .MRG, .MYD, .MYI,
innobackupex: .TRG, .TRN, and .opt files in
innobackupex: subdirectories of '/var/lib/mysql'
innobackupex: Backing up file '/var/lib/mysql/mydb2_survey/prefs.frm'
innobackupex: Backing up file '/var/lib/mysql/mydb2_survey/prefs.MYD'
>> log scanned up to (185 2455858345)
>> log scanned up to (185 2455858345)
innobackupex: Backing up file '/var/lib/mysql/mydb2_survey/prefs.MYI'
>> log scanned up to (185 2455858345)
...
>> log scanned up to (185 2455858355)
innobackupex: Backing up file '/var/lib/mysql/mydb2_survey/db.opt'
innobackupex: Backing up files '/var/lib/mysql/mysql/*.{frm,MYD,MYI,MRG,TRG,TRN,opt}' (52 files)
innobackupex: Backing up files '/var/lib/mysql/phpmyadmin/*.{frm,MYD,MYI,MRG,TRG,TRN,opt}' (25 files)
innobackupex: Backing up files '/var/lib/mysql/mydb1/*.{frm,MYD,MYI,MRG,TRG,TRN,opt}' (40 files)
innobackupex: Backing up file '/var/lib/mysql/mydb3_prefs/prefs.frm'
innobackupex: Backing up file '/var/lib/mysql/mydb3_prefs/prefs.MYD'
>> log scanned up to (185 2455858355)
...
>> log scanned up to (185 2455858355)
innobackupex: Backing up file '/var/lib/mysql/mydb3_prefs/prefs.MYI'
>> log scanned up to (185 2455858355)
...
>> log scanned up to (185 2455858355)
innobackupex: Backing up file '/var/lib/mysql/mydb3_prefs/db.opt'
innobackupex: Backing up file '/var/lib/mysql/work/jobs.frm'
innobackupex: Backing up file '/var/lib/mysql/work/jobs.MYD'
innobackupex: Backing up file '/var/lib/mysql/work/jobs.MYI'
innobackupex: Backing up file '/var/lib/mysql/work/db.opt'
090911 04:48:21  innobackupex: Finished backing up .frm, .MRG, .MYD, .MYI, .TRG, .TRN, and .opt files

innobackupex: Resuming ibbackup

xtrabackup: The latest check point (for incremental): '185:2455858355'
>> log scanned up to (185 2455858365)
xtrabackup: Transaction log of lsn (185 2443876945) to (185 2455858365) was copied.
090911 04:48:25  innobackupex: All tables unlocked
090911 04:48:25  innobackupex: Connection to database server closed

innobackupex: Backup created in directory '/var/backups/mysql'
innobackupex: MySQL binlog position: filename 'mysql-bin.000684', position 7411270
innobackupex: MySQL slave binlog position: master host '192.168.1.2', filename 'mysql-bin.000107', position 98
090911 04:48:26  innobackupex: innobackup completed OK!
innobackupex: You must use -i (--ignore-zeros) option for extraction of the tar stream.

You can see from this log that the scanning of the log files is what takes all the time – be thankful that this is happening while you back up as it means you won’t have to wait for it on restore! The process is that it runs the ibbackup program, which gets suspended while the logs are scanned, and when that’s finished it resumes and copies the files to make the backup. If you’re watching this output, you may find that it sometimes looks like it’s stalled during log scanning. Don’t worry, it hasn’t, it just doesn’t prioritise the output of the log info, and it will turn up eventually.

An alternative command to backup directly to another server without file compression (assuming that the other server is on the same LAN), which is the kind of thing you’d want to do to when setting up a slave:

innobackupex-1.5.1 --stream=tar /tmp/ --slave-info | ssh -C user@remotebox "tar xfi - -C /var/backups/mysql"

The backup operation can impose quite high disk loads itself, but you can throttle it using the --throttle=<iops> switch, which is in IOs per second, and depends on your disk system – if you’re on simple RAID-1, around 100 should be a fair balance. If you’re running replication, it makes a lot of sense to use one of your slaves to run the backups on as it then imposes no load on your primary master.

From reading the docs it’s not immediately obvious what you end up with after a backup; it’s a complete MySQL data directory, i.e. what you’d typically find in /var/lib/mysql on most Linux distros. This makes restores really simple – it’s a matter of moving any existing data directory out of the way and restoring the backup in its place, and again this is very handy for replication slaves. Although (if you use stream output) the result is a tar file, it doesn’t act quite like “normal” tar archives, and if you do a tar tf to list the contents, it will seem to be empty. This is because it’s not really a file collection, but a stream of raw data, and as such may contain lots of null bytes (char 0), so you need to specify the `i` switch to tar as well, and there is no point in trying to untar it yourself.

Restores

So your daily backups are running nicely and one of your servers dies or falls irreparably out of sync with its master (it happened to me!) and you need to restore a backup. Firstly, don’t panic – you have another master that handles transparent failover using mmm, right?
You use the same innobackupex script as you used for the backup to do the restore, just with different options. It helps to give the restore process lots of memory with the --use-memory=nn, and mysqld isn’t running during the restore process so you don’t need to leave space for it. Firstly move your existing data directory out of the way (or delete it if you’re low on space and sure your backup is OK – you’ve tested this procedure, right?!), then move the backup into place before running the restore operation. So here’s a restore command (on a server with 32Gb RAM):

rm -rf /var/lib/mysql/*
mv /var/backups/mysql/* /var/lib/mysql
innobackupex-1.5.1 --apply-log --use-memory=28G  /var/lib/mysql

The machine is offline anyway, so you don’t need to worry about it using up all available i/o capacity. Because the backup is already in a good and consistent state, the restore actually happens pretty fast – in my case only a few seconds – also it’s not really a restore in the usual sense, it’s really a “Confirm the correctness of the backup” step and it’s still possible that the backup could fail at this point (for example if you backed up corrupted data), so it’s a good idea to keep multiple backup generations.
To recap: XtraBackup gives you:

Reliable (but possibly slow), consistent backups without disrupting a running service
A simple method for setting up MyQSL replication slaves very quickly
Fast, simple restores

May your backups go smoothly and never be needed!
Thanks to all at Percona for their excellent services, reviews of this article, the MySQL book
, and of course, XtraBackup.

Update Oct 9th 2009

Don’t set the xtrabackup backup directory to the same location as your compressed tar files as xtrabackup clears this directory before each run. I’ve amended the command line to reflect this. I also just spotted that there’s an article on this exact same subject in the first issue (July ’09) of osdbzine

Wednesday, April 10, 2013

Top 100 SQL Performance Tips

Specific Query Performance Tips (see also database design tips for tips on indexes):

Use EXPLAIN to profile the query execution plan
Use Slow Query Log (always have it on!)
Don't use DISTINCT when you have or could use GROUP BY
Insert performance

Batch INSERT and REPLACE
Use LOAD DATA instead of INSERT

LIMIT m,n may not be as fast as it sounds. Learn how to improve it and read more about Efficient Pagination Using MySQL
Don't use ORDER BY RAND() if you have > ~2K records
Use SQL_NO_CACHE when you are SELECTing frequently updated data or large sets of data
Avoid wildcards at the start of LIKE queries
Avoid correlated subqueries and in select and where clause (try to avoid in)
No calculated comparisons -- isolate indexed columns
ORDER BY and LIMIT work best with equalities and covered indexes
Separate text/blobs from metadata, don't put text/blobs in results if you don't need them
Derived tables (subqueries in the FROM clause) can be useful for retrieving BLOBs without sorting them. (Self-join can speed up a query if 1st part finds the IDs and uses then to fetch the rest)
ALTER TABLE...ORDER BY can take data sorted chronologically and re-order it by a different field -- this can make queries on that field run faster (maybe this goes in indexing?)
Know when to split a complex query and join smaller ones
Delete small amounts at a time if you can
Make similar queries consistent so cache is used
Have good SQL query standards
Don't use deprecated features
Turning OR on multiple index fields (<5.0) into UNION may speed things up (with LIMIT), after 5.0 the index_merge should pick stuff up.
Don't use COUNT * on Innodb tables for every search, do it a few times and/or summary tables, or if you need it for the total # of rows, use SQL_CALC_FOUND_ROWS and SELECT FOUND_ROWS()
Use INSERT ... ON DUPLICATE KEY update (INSERT IGNORE) to avoid having to SELECT
use groupwise maximum instead of subqueries
Avoid using IN(...) when selecting on indexed fields, It will kill the performance of SELECT query.

Scaling Performance Tips:

Use benchmarking
isolate workloads don't let administrative work interfere with customer performance. (ie backups)
Debugging sucks, testing rocks!
As your data grows, indexing may change (cardinality and selectivity change). Structuring may want to change. Make your schema as modular as your code. Make your code able to scale. Plan and embrace change, and get developers to do the same.

Network Performance Tips:

Minimize traffic by fetching only what you need.

Paging/chunked data retrieval to limit
Don't use SELECT *
Be wary of lots of small quick queries if a longer query can be more efficient

Use multi_query if appropriate to reduce round-trips
Use stored procedures to avoid bandwidth wastage

OS Performance Tips:

Use proper data partitions

For Cluster. Start thinking about Cluster *before* you need them

Keep the database host as clean as possible. Do you really need a windowing system on that server?
Utilize the strengths of the OS
pare down cron scripts
create a test environment
source control schema and config files
for LVM innodb backups, restore to a different instance of MySQL so Innodb can roll forward
partition appropriately
partition your database when you have real data -- do not assume you know your dataset until you have real data

MySQL Server Overall Tips:

innodb_flush_commit=0 can help slave lag
Optimize for data types, use consistent data types. Use PROCEDURE ANALYSE() to help determine the smallest data type for your needs.
use optimistic locking, not pessimistic locking. try to use shared lock, not exclusive lock. share mode vs. FOR UPDATE
if you can, compress text/blobs
compress static data
don't back up static data as often
enable and increase the query and buffer caches if appropriate
config params -- http://docs.cellblue.nl/2007/03/17/easy-mysql-performance-tweaks/ is a good reference
Config variables & tips:

use one of the supplied config files
key_buffer, unix cache (leave some RAM free), per-connection variables, innodb memory variables
be aware of global vs. per-connection variables
check SHOW STATUS and SHOW VARIABLES (GLOBAL|SESSION in 5.0 and up)
be aware of swapping esp. with Linux, "swappiness" (bypass OS filecache for innodb data files, innodb_flush_method=O_DIRECT if possible (this is also OS specific))
defragment tables, rebuild indexes, do table maintenance
If you use innodb_flush_txn_commit=1, use a battery-backed hardware cache write controller
more RAM is good so faster disk speed
use 64-bit architectures

--skip-name-resolve
increase myisam_sort_buffer_size to optimize large inserts (this is a per-connection variable)
look up memory tuning parameter for on-insert caching
increase temp table size in a data warehousing environment (default is 32Mb) so it doesn't write to disk (also constrained by max_heap_table_size, default 16Mb)
Run in SQL_MODE=STRICT to help identify warnings
/tmp dir on battery-backed write cache
consider battery-backed RAM for innodb logfiles
use --safe-updates for client
Redundant data is redundant

Storage Engine Performance Tips:

InnoDB ALWAYS keeps the primary key as part of each index, so do not make the primary key very large
Utilize different storage engines on master/slave ie, if you need fulltext indexing on a table.
BLACKHOLE engine and replication is much faster than FEDERATED tables for things like logs.
Know your storage engines and what performs best for your needs, know that different ones exist.

ie, use MERGE tables ARCHIVE tables for logs
Archive old data -- don't be a pack-rat! 2 common engines for this are ARCHIVE tables and MERGE tables

use row-level instead of table-level locking for OLTP workloads
try out a few schemas and storage engines in your test environment before picking one.

Database Design Performance Tips:

Design sane query schemas. don't be afraid of table joins, often they are faster than denormalization
Don't use boolean flags
Use Indexes
Don't Index Everything
Do not duplicate indexes
Do not use large columns in indexes if the ratio of SELECTs:INSERTs is low.
be careful of redundant columns in an index or across indexes
Use a clever key and ORDER BY instead of MAX
Normalize first, and denormalize where appropriate.
Databases are not spreadsheets, even though Access really really looks like one. Then again, Access isn't a real database
use INET_ATON and INET_NTOA for IP addresses, not char or varchar
make it a habit to REVERSE() email addresses, so you can easily search domains (this will help avoid wildcards at the start of LIKE queries if you want to find everyone whose e-mail is in a certain domain)
A NULL data type can take more room to store than NOT NULL
Choose appropriate character sets & collations -- UTF16 will store each character in 2 bytes, whether it needs it or not, latin1 is faster than UTF8.
Use Triggers wisely
use min_rows and max_rows to specify approximate data size so space can be pre-allocated and reference points can be calculated.
Use HASH indexing for indexing across columns with similar data prefixes
Use myisam_pack_keys for int data
be able to change your schema without ruining functionality of your code
segregate tables/databases that benefit from different configuration variables

Other:

Hire a MySQL (tm) Certified DBA
Know that there are many consulting companies out there that can help, as well as MySQL's Professional Services.
Read and post to MySQL Planet at http://www.planetmysql.org
Attend the yearly MySQL Conference and Expo or other conferences with MySQL tracks (link to the conference here)
Support your local User Group (link to forge page w/user groups here)