Tuesday, May 16, 2017

xtrabackup — Percona XtraBackup 2.3

xtrabackup man page

xtrabackup — Percona XtraBackup 2.3 Documentation
The xtrabackup binary is a compiled C program that is linked with the InnoDB libraries and the standard MySQL client libraries. The InnoDB libraries provide functionality necessary to apply a log to data files, and the MySQL client libraries provide command-line option parsing, configuration file parsing, and so on to give the binary a familiar look and feel.
The tool runs in either --backup or --prepare mode, corresponding to the two main functions it performs. There are several variations on these functions to accomplish different tasks, and there are two less commonly used modes, --stats and --print-param.

The Backup Cycle - Full Backups

Creating a Backup

To create a backup, run xtrabackup with the --backup option. You also need to specify a --target_dir option, which is where the backup will be stored, and a --datadir option, which is where the MySQL data is stored. If the InnoDB data or log files aren't stored in the same directory, you might need to specify the location of those, too. If the target directory does not exist, xtrabackup creates it. If the directory does exist and is empty, xtrabackup will succeed. xtrabackup will not overwrite existing files, it will fail with operating system error 17,  file exists.
The tool changes its working directory to the data directory and performs two primary tasks to complete the backup:
  • It starts a log-copying thread in the background. This thread watches the InnoDB log files, and when they change, it copies the changed blocks to a file called xtrabackup_logfile in the backup target directory. This is necessary because the backup might take a long time, and the recovery process needs all of the log file entries from the beginning to the end of the backup.
  • It copies the InnoDB data files to the target directory. This is not a simple file copy; it opens and reads the files similarly to the way InnoDB does, by reading the data dictionary and copying them a page at a time.
When the data files are finished copying, xtrabackup stops the log-copying thread, and creates a files in the target directory called xtrabackup_checkpoints, which contains the type of backup performed, the log sequence number at the beginning, and the log sequence number at the end.
An example command to perform a backup follows:
$ xtrabackup --backup --datadir=/var/lib/mysql/ --target-dir=/data/backups/mysql/
This takes a backup of /var/lib/mysql and stores it at /data/backups/mysql/. If you specify a relative path, the target directory will be relative to the current directory.
During the backup process, you should see a lot of output showing the data files being copied, as well as the log file thread repeatedly scanning the log files and copying from it. Here is an example that shows the log thread scanning the log in the background, and a file copying thread working on the ibdata1 file:
>> log scanned up to (3646475465483)
>> log scanned up to (3646475517369)
>> log scanned up to (3646475581716)
>> log scanned up to (3646475636841)
>> log scanned up to (3646475718082)
>> log scanned up to (3646475988095)
>> log scanned up to (3646476048286)
>> log scanned up to (3646476102877)
>> log scanned up to (3646476140854)
[01] Copying /usr/local/mysql/var/ibdata1
     to /usr/local/mysql/Backups/2011-04-18_21-11-15/ibdata1
[01]        ...done
The last thing you should see is something like the following, where the value of the <LSN> will be a number that depends on your system:
xtrabackup: Transaction log of lsn (<SLN>) to (<LSN>) was copied.
NOTE:
Log copying thread checks the transactional log every second to see if there were any new log records written that need to be copied, but there is a chance that the log copying thread might not be able to keep up with the amount of writes that go to the transactional logs, and will hit an error when the log records are overwritten before they could be read.
After the backup is finished, the target directory will contain files such as the following, assuming you have a single InnoDB table test.tbl1 and you are using MySQL's innodb_file_per_table option:
/data/backups/mysql/ibdata1
/data/backups/mysql/test
/data/backups/mysql/test/tbl1.ibd
/data/backups/mysql/xtrabackup_checkpoints
/data/backups/mysql/xtrabackup_logfile
The backup can take a long time, depending on how large the database is. It is safe to cancel at any time, because it does not modify the database.
The next step is getting your backup ready to restored: preparing_the_backup.

Preparing the backup

After you make a backup with --backup, the next step is to prepare it. The data files are not point-in-time consistent until they've been prepared, because they were copied at different times as the program ran, and they might have been changed while this was happening. If you try to start InnoDB with these data files, it will detect corruption and crash itself to prevent you from running on damaged data. The --prepare step makes the files perfectly consistent at a single instant in time, so you can run InnoDB on them.
NOTE:
For prepare innobackupex --apply-log should be used which will read InnoDB configuration from backup-my.cnf automatically, or --defaults-file=backup-my.cnf option should be passed to the xtrabackup binary if it is used for preparing the backup. Otherwise it could lead to incorrect restore because xtrabackup could use wrong configuration options.
You can run the prepare operation on any machine; it does not need to be on the originating server or the server to which you intend to restore. You can copy the backup to a utility server and prepare it there, for example.
NOTE:
You can prepare a backup created with older Percona XtraBackup version with a newer one, but not vice versa. Preparing a backup on an unsupported server version should be done with the latest Percona XtraBackup release which supports that server version. For example, if one has a backup of MySQL 5.0 created with Percona XtraBackup 1.6, then preparing the backup with Percona XtraBackup 2.2 is not supported, because support for MySQL 5.0 was removed in 2.1. Instead, the latest release in the 2.0 series should be used.
During the prepare operation, xtrabackup boots up a kind of modified InnoDB that's embedded inside it (the libraries it was linked against). The modifications are necessary to disable InnoDB's standard safety checks, such as complaining that the log file isn't the right size, which aren't appropriate for working with backups. These modifications are only for the xtrabackup binary; you don't need a modified InnoDB to use xtrabackup for your backups.
The prepare step uses this "embedded InnoDB" to perform crash recovery on the copied datafiles, using the copied log file. The prepare step is very simple to use: you simply run xtrabackup with the --prepare option and tell it which directory to prepare, for example, to prepare the backup previously taken,
xtrabackup --prepare --target-dir=/data/backups/mysql/
When this finishes, you should see an "InnoDB shutdown" with a message such as the following, where again the value of LSN will depend on your system:
101107 16:40:15  InnoDB: Shutdown completed; log sequence number <LSN>
Your backup is now clean and consistent, and ready to restore. However, you might want to take an extra step to make restores as quick as possible. This is to prepare the backup a second time. The first time makes the data files perfectly self-consistent, but it doesn't create fresh InnoDB log files. If you restore the backup at this point and start MySQL, it will have to create new log files, which could take a little while, and you might not want to wait for that. If you run --prepare a second time, xtrabackup will create the log files for you, and output status text such as the following, which is abbreviated for clarity. The value of <SIZE> will depend on your MySQL configuration.
$ xtrabackup --prepare --target-dir=/data/backups/mysql/
xtrabackup: This target seems to be already prepared.
xtrabackup: notice: xtrabackup_logfile was already used to '--prepare'.
101107 16:54:10  InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to <SIZE> MB
InnoDB: Database physically writes the file full: wait...
101107 16:54:10  InnoDB: Log file ./ib_logfile1 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile1 size to <SIZE> MB
InnoDB: Database physically writes the file full: wait...
101107 16:54:15  InnoDB: Shutdown completed; log sequence number 1284108
All following prepares (third and following) will not change the already prepared data files, you can only see that output says
xtrabackup: This target seems to be already prepared.
xtrabackup: notice: xtrabackup_logfile was already used to '--prepare'.
It is not recommended to interrupt xtrabackup process while preparing backup - it may cause data files corruption and backup will become not usable. Backup validity is not guaranteed if prepare process was interrupted.
If you intend the backup to be the basis for further incremental backups, you should use the --apply-log-only option when preparing the backup, or you will not be able to apply incremental backups to it. See the documentation on preparing incremental backups for more details.

Restoring a Backup

The xtrabackup binary does not have any functionality for restoring a backup. That is up to the user to do. You might use rsync or cp to restore the files. You should check that the restored files have the correct ownership and permissions.
NOTE:
The datadir must be empty before restoring the backup. Also it's important to note that MySQL server needs to be shut down before restore is performed. You can't restore to a datadir of a running mysqld instance (except when importing a partial backup).
Example of the rsync command that can be used to restore the backup can look like this:
$ rsync -avrP /data/backup/ /var/lib/mysql/
As files' attributes will be preserved, in most cases you will need to change the files' ownership to mysql before starting the database server, as they will be owned by the user who created the backup:
$ chown -R mysql:mysql /var/lib/mysql
Note that xtrabackup backs up only the InnoDB data. You must separately restore the MySQL system database, MyISAM data, table definition files (.frm files), and everything else necessary to make your database functional -- or innobackupex can do it for you.

Other Types of Backups

Incremental Backups

Both xtrabackup and innobackupex tools supports incremental backups, which means that it can copy only the data that has changed since the last full backup. You can perform many incremental backups between each full backup, so you can set up a backup process such as a full backup once a week and an incremental backup every day, or full backups every day and incremental backups every hour.
Incremental backups work because each InnoDB page (usually 16kb in size) contains a log sequence number, or LSN. The LSN is the system version number for the entire database. Each page's LSN shows how recently it was changed. An incremental backup copies each page whose LSN is newer than the previous incremental or full backup's LSN. There are two algorithms in use to find the set of such pages to be copied. The first one, available with all the server types and versions, is to check the page LSN directly by reading all the data pages. The second one, available with Percona Server, is to enable the changed page tracking feature on the server, which will note the pages as they are being changed. This information will be then written out in a compact separate so-called bitmap file. The xtrabackup binary will use that file to read only the data pages it needs for the incremental backup, potentially saving many read requests. The latter algorithm is enabled by default if the xtrabackup binary finds the bitmap file. It is possible to specify --incremental-force-scan to read all the pages even if the bitmap data is available.
Incremental backups do not actually compare the data files to the previous backup's data files. In fact, you can use --incremental-lsn to perform an incremental backup without even having the previous backup, if you know its LSN. Incremental backups simply read the pages and compare their LSN to the last backup's LSN. You still need a full backup to recover the incremental changes, however; without a full backup to act as a base, the incremental backups are useless.

Creating an Incremental Backup

To make an incremental backup, begin with a full backup as usual. The xtrabackup binary writes a file called xtrabackup_checkpoints into the backup's target directory. This file contains a line showing the to_lsn, which is the database's LSN at the end of the backup. Create the full backup with a command such as the following:
xtrabackup --backup --target-dir=/data/backups/base --datadir=/var/lib/mysql/
If you want a usable full backup, use innobackupex since xtrabackup itself won't copy table definitions, triggers, or anything else that's not .ibd.
If you look at the xtrabackup_checkpoints file, you should see some contents similar to the following:
backup_type = full-backuped
from_lsn = 0
to_lsn = 1291135
Now that you have a full backup, you can make an incremental backup based on it. Use a command such as the following:
xtrabackup --backup --target-dir=/data/backups/inc1 \
--incremental-basedir=/data/backups/base --datadir=/var/lib/mysql/
The /data/backups/inc1/ directory should now contain delta files, such as ibdata1.delta and test/table1.ibd.delta. These represent the changes since the LSN 1291135. If you examine the xtrabackup_checkpoints file in this directory, you should see something similar to the following:
backup_type = incremental
from_lsn = 1291135
to_lsn = 1291340
The meaning should be self-evident. It's now possible to use this directory as the base for yet another incremental backup:
xtrabackup --backup --target-dir=/data/backups/inc2 \
--incremental-basedir=/data/backups/inc1 --datadir=/var/lib/mysql/

Preparing the Incremental Backups

The --prepare step for incremental backups is not the same as for normal backups. In normal backups, two types of operations are performed to make the database consistent: committed transactions are replayed from the log file against the data files, and uncommitted transactions are rolled back. You must skip the rollback of uncommitted transactions when preparing a backup, because transactions that were uncommitted at the time of your backup may be in progress, and it's likely that they will be committed in the next incremental backup. You should use the --apply-log-only option to prevent the rollback phase.
If you do not use the --apply-log-only option to prevent the rollback phase, then your incremental backups will be useless. After transactions have been rolled back, further incremental backups cannot be applied.
Beginning with the full backup you created, you can prepare it, and then apply the incremental differences to it. Recall that you have the following backups:
/data/backups/base
/data/backups/inc1
/data/backups/inc2
To prepare the base backup, you need to run --prepare as usual, but prevent the rollback phase:
xtrabackup --prepare --apply-log-only --target-dir=/data/backups/base
The output should end with some text such as the following:
101107 20:49:43  InnoDB: Shutdown completed; log sequence number 1291135
The log sequence number should match the to_lsn of the base backup, which you saw previously.
This backup is actually safe to restore as-is now, even though the rollback phase has been skipped. If you restore it and start MySQLInnoDB will detect that the rollback phase was not performed, and it will do that in the background, as it usually does for a crash recovery upon start. It will notify you that the database was not shut down normally.
To apply the first incremental backup to the full backup, you should use the following command:
xtrabackup --prepare --apply-log-only --target-dir=/data/backups/base \
--incremental-dir=/data/backups/inc1
This applies the delta files to the files in /data/backups/base, which rolls them forward in time to the time of the incremental backup. It then applies the redo log as usual to the result. The final data is in /data/backups/base, not in the incremental directory. You should see some output such as the following:
incremental backup from 1291135 is enabled.
xtrabackup: cd to /data/backups/base/
xtrabackup: This target seems to be already prepared.
xtrabackup: xtrabackup_logfile detected: size=2097152, start_lsn=(1291340)
Applying /data/backups/inc1/ibdata1.delta ...
Applying /data/backups/inc1/test/table1.ibd.delta ...
.... snip
101107 20:56:30  InnoDB: Shutdown completed; log sequence number 1291340
Again, the LSN should match what you saw from your earlier inspection of the first incremental backup. If you restore the files from /data/backups/base, you should see the state of the database as of the first incremental backup.
Preparing the second incremental backup is a similar process: apply the deltas to the (modified) base backup, and you will roll its data forward in time to the point of the second incremental backup:
xtrabackup --prepare --target-dir=/data/backups/base \
--incremental-dir=/data/backups/inc2
NOTE:
--apply-log-only should be used when merging all incrementals except the last one. That's why the previous line doesn't contain the --apply-log-only option. Even if the --apply-log-only was used on the last step, backup would still be consistent but in that case server would perform the rollback phase.
If you wish to avoid the notice that InnoDB was not shut down normally, when you have applied the desired deltas to the base backup, you can run --prepare again without disabling the rollback phase.

Partial Backups

xtrabackup supports taking partial backups when the innodb_file_per_table option is enabled. There are three ways to create partial backups: matching the tables' names with a regular expression, providing a list of them in a file or providing a list of databases.
WARNING:
If any of the matched or listed tables is deleted during the backup, xtrabackup will fail.
For the purposes of this manual page, we will assume that there is a database named test which contains tables named t1 and t2.

Using the --tables Option

The first method is with the --tables option. The option's value is a regular expression that is matched against the fully qualified tablename, including the database name, in the form databasename.tablename.
To back up only tables in the test database, you can use the following command:
$ xtrabackup --backup --datadir=/var/lib/mysql --target-dir=/data/backups/ \
--tables="^test[.].*"
To back up only the table test.t1, you can use the following command:
$ xtrabackup --backup --datadir=/var/lib/mysql --target-dir=/data/backups/ \
--tables="^test[.]t1"

Using the --tables-file Option

The --tables-file option specifies a file that can contain multiple table names, one table name per line in the file. Only the tables named in the file will be backed up. Names are matched exactly, case-sensitive, with no pattern or regular expression matching. The table names must be fully qualified, in databasename.tablename format.
$ echo "mydatabase.mytable" > /tmp/tables.txt
$ xtrabackup --backup --tables-file=/tmp/tables.txt

Using the --databases and --databases-file options

The --databases option accepts a space-separated list of the databases and tables to backup - in the databasename[.tablename] form. The --databases-file option specifies a file that can contain multiple databases and tables in the databasename[.tablename] form, one element name per line in the file. Only named databases and tables will be backed up. Names are matched exactly, case-sensitive, with no pattern or regular expression matching.

Preparing the Backup

When you use the --prepare option on a partial backup, you will see warnings about tables that don't exist. That is because these tables exist in the data dictionary inside InnoDB, but the corresponding .ibd files don't exist. They were not copied into the backup directory. These tables will be removed from the data dictionary, and when you restore the backup and start InnoDB, they will no longer exist and will not cause any errors or warnings to be printed to the log file.
An example of the error message you will see during the prepare phase follows.
InnoDB: Reading tablespace information from the .ibd files...
101107 22:31:30  InnoDB: Error: table 'test1/t'
InnoDB: in InnoDB data dictionary has tablespace id 6,
InnoDB: but tablespace with that id or name does not exist. It will be removed from data dictionary.

Compact Backups

When doing the backup of InnoDB tables it's possible to omit the secondary index pages. This will make the backups more compact and this way they will take less space on disk. The downside of this is that the backup prepare process takes longer as those secondary indexes need to be recreated. Difference in backup size depends on the size of the secondary indexes.
For example full backup taken without and with the --compact option:
#backup size without --compact
2.0G  xb_backup

#backup size taken with --compact option
1.4G  xb_compact_backup
NOTE:
Compact backups are not supported for system table space, so in order to work correctly innodb-file-per-table option should be enabled.
This feature was introduced in Percona XtraBackup 2.1.

Creating Compact Backups

To make a compact backup innobackupex needs to be started with the --compact option:
$ xtrabackup --backup --compact --target-dir=/data/backups
This will create a compact backup in the /data/backups.
If you check at the xtrabackup-checkpoints file in the target-dir folder, you should see something like:
backup_type = full-backuped
from_lsn = 0
to_lsn = 2888984349
last_lsn = 2888984349
compact = 1
When --compact wasn't used compact value will be 0. This way it's easy to check if the backup contains the secondary index pages or not.

Preparing Compact Backups

Preparing the compact require rebuilding the indexes as well. In order to prepare the backup a new option --rebuild-indexes should be used with --apply-logs:
$ xtrabackup --prepare --rebuild-indexes /data/backups/
Output, beside the standard innobackupex output, should contain the information about indexes being rebuilt, like:
[01] Checking if there are indexes to rebuild in table sakila/city (space id: 9)
[01]   Found index idx_fk_country_id
[01]   Rebuilding 1 index(es).
[01] Checking if there are indexes to rebuild in table sakila/country (space id: 10)
[01] Checking if there are indexes to rebuild in table sakila/customer (space id: 11)
[01]   Found index idx_fk_store_id
[01]   Found index idx_fk_address_id
[01]   Found index idx_last_name
[01]   Rebuilding 3 index(es).
Additionally, you can use the --rebuild-threads option to process tables in multiple threads when rebuilding indexes, e.g.:
$ xtrabackup --prepare --rebuild-indexes --rebuild-threads=16 /data/backups/
In this case Percona XtraBackup will create 16 worker threads with each thread rebuilding indexes for one table at a time. It will also show thread IDs for each message
Starting 16 threads to rebuild indexes.

[09] Checking if there are indexes to rebuild in table sakila/city (space id: 9)
[09]   Found index idx_fk_country_id
[10] Checking if there are indexes to rebuild in table sakila/country (space id: 10)
[11] Checking if there are indexes to rebuild in table sakila/customer (space id: 11)
[11]   Found index idx_fk_store_id
[11]   Found index idx_fk_address_id
[11]   Found index idx_last_name
[11]   Rebuilding 3 index(es).
Since Percona XtraBackup has no information when applying an incremental backup to a compact full one, on whether there will be more incremental backups applied to it later or not, rebuilding indexes needs to be explicitly requested by a user whenever a full backup with some incremental backups merged is ready to be restored. Rebuilding indexes unconditionally on every incremental backup merge is not an option, since it is an expensive operation.

Restoring Compact Backups

The xtrabackup binary does not have any functionality for restoring a backup. That is up to the user to do. You might use rsync or cp to restore the files. You should check that the restored files have the correct ownership and permissions.

Other Reading

  • Feature preview: Compact backups in Percona XtraBackup

Advanced Features

Throttling Backups

Although xtrabackup does not block your database's operation, any backup can add load to the system being backed up. On systems that do not have much spare I/O capacity, it might be helpful to throttle the rate at which xtrabackup reads and writes data. You can do this with the --throttle option, this option limits the number of I/O operations per second in 1 MB units.
Image below shows how throttling works when --throttle =1. [image]
In --backup mode, this option limits the number of pairs of read-and-write operations per second that xtrabackup will perform. If you are creating an incremental backup, then the limit is the number of read IO operations per second.
By default, there is no throttling, and xtrabackup reads and writes data as quickly as it can. If you set too strict of a limit on the I/O operations, the backup might be so slow that it will never catch up with the transaction logs that InnoDB is writing, so the backup might never complete.

Scripting Backups With xtrabackup

The xtrabackup tool has several features to enable scripts to control it while they perform related tasks. The innobackupex script is one example, but xtrabackup is easy to control with your own command-line scripts too.

Suspending After Copying

In backup mode, xtrabackup normally copies the log files in a background thread, copies the data files in a foreground thread, and then stops the log copying thread and finishes. If you use the --suspend-at-end option, instead of stopping the log thread and finishing, xtrabackup continues to copy the log files, and creates a file in the target directory called xtrabackup_suspended. As long as that file exists, xtrabackup will continue to watch the log files and copy them into the xtrabackup_logfile in the target directory. When the file is removed, xtrabackup will finish copying the log file and exit.
This functionality is useful for coordinating the InnoDB data backups with other actions. Perhaps the most obvious is copying the table definitions (the .frm files) so that the backup can be restored. You can start xtrabackup in the  background, wait for the xtrabackup_suspended file to be created, and then copy any other files you need to complete the backup. This is exactly what the innobackupex tool does (it also copies MyISAM data and other files).

Generating Meta-Data

It is a good idea for the backup to include all the information you need to restore the backup. The xtrabackup tool can print out the contents of a my.cnf file that are needed to restore the data and log files. If you add the --print-param option, it will print out something like the following:
# This MySQL options file was generated by XtraBackup.
[mysqld]
datadir = /data/mysql/
innodb_data_home_dir = /data/innodb/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /data/innodb-logs/
You can redirect this output into a file in the target directory of the backup.

Agreeing on the Source Directory

It's possible that the presence of a defaults file or other factors could cause xtrabackup to back up data from a different location than you expected. To prevent this, you can use --print-param to ask it where it will be copying data from. You can use the output to ensure that xtrabackup and your script are working on the same dataset.

Log Streaming

You can instruct xtrabackup to omit copying data files, and simply stream the log file to its standard output instead with --log-stream. This automatically adds the --suspend-at-end option. Your script can then perform tasks such as streaming remote backups by piping the log files into an SSH connection and copying the data files to another server with a tool such as rsync or the xbstream binary.

Analyzing Table Statistics

The xtrabackup binary can analyze InnoDB data files in read-only mode to give statistics about them. To do this, you should use the --stats option. You can combine this with the --tables option to limit the files to examine. It also uses the --use-memory option.
You can perform the analysis on a running server, with some chance of errors due to the data being changed during analysis. Or, you can analyze a backup copy of the database. Either way, to use the statistics feature, you need a clean copy of the database including correctly sized log files, so you need to execute with --prepare twice to use this functionality on a backup.
The result of running on a backup might look like the following:
<INDEX STATISTICS>
  table: test/table1, index: PRIMARY, space id: 12, root page 3
  estimated statistics in dictionary:
    key vals: 25265338, leaf pages 497839, size pages 498304
  real statistics:
     level 2 pages: pages=1, data=5395 bytes, data/pages=32%
     level 1 pages: pages=415, data=6471907 bytes, data/pages=95%
        leaf pages: recs=25958413, pages=497839, data=7492026403 bytes, data/pages=91%
This can be interpreted as follows:
  • The first line simply shows the table and index name and its internal identifiers. If you see an index named GEN_CLUST_INDEX, that is the table's clustered index, automatically created because you did not explicitly create a PRIMARY KEY.
  • The estimated statistics in dictionary information is similar to the data that's gathered through ANALYZE TABLE inside of InnoDB to be stored as estimated cardinality statistics and passed to the query optimizer.
  • The real statistics information is the result of scanning the data pages and computing exact information about the index.
  • The level <X> pages: output means that the line shows information about pages at that level in the index tree. The larger <X> is, the farther it is from the leaf pages, which are level 0. The first line is the root page.
  • The leaf pages output shows the leaf pages, of course. This is where the table's data is stored.
  • The external pages: output (not shown) shows large external pages that hold values too long to fit in the row itself, such as long BLOB and TEXT values.
  • The recs is the real number of records (rows) in leaf pages.
  • The pages is the page count.
  • The data is the total size of the data in the pages, in bytes.
  • The data/pages is calculated as (data / (pages * PAGE_SIZE)) * 100%. It will never reach 100% because of space reserved for page headers and footers.
A more detailed example is posted as a MySQL Performance Blog post.

Script to Format Output

The following script can be used to summarize and tabulate the output of the statistics information:
tabulate-xtrabackup-stats.pl

#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
my $script_version = "0.1";

my $PG_SIZE = 16_384; # InnoDB defaults to 16k pages, change if needed.
my ($cur_idx, $cur_tbl);
my (%idx_stats, %tbl_stats);
my ($max_tbl_len, $max_idx_len) = (0, 0);
while ( my $line = <> ) {
   if ( my ($t, $i) = $line =~ m/table: (.*), index: (.*), space id:/ ) {
      $t =~ s!/!.!;
      $cur_tbl = $t;
      $cur_idx = $i;
      if ( length($i) > $max_idx_len ) {
         $max_idx_len = length($i);
      }
      if ( length($t) > $max_tbl_len ) {
         $max_tbl_len = length($t);
      }
   }
   elsif ( my ($kv, $lp, $sp) = $line =~ m/key vals: (\d+), \D*(\d+), \D*(\d+)/ ) {
      @{$idx_stats{$cur_tbl}->{$cur_idx}}{qw(est_kv est_lp est_sp)} = ($kv, $lp, $sp);
      $tbl_stats{$cur_tbl}->{est_kv} += $kv;
      $tbl_stats{$cur_tbl}->{est_lp} += $lp;
      $tbl_stats{$cur_tbl}->{est_sp} += $sp;
   }
   elsif ( my ($l, $pages, $bytes) = $line =~ m/(?:level (\d+)|leaf) pages:.*pages=(\d+), data=(\d+) bytes/ ) {
      $l ||= 0;
      $idx_stats{$cur_tbl}->{$cur_idx}->{real_pages} += $pages;
      $idx_stats{$cur_tbl}->{$cur_idx}->{real_bytes} += $bytes;
      $tbl_stats{$cur_tbl}->{real_pages} += $pages;
      $tbl_stats{$cur_tbl}->{real_bytes} += $bytes;
   }
}

my $hdr_fmt = "%${max_tbl_len}s %${max_idx_len}s %9s %10s %10s\n";
my @headers = qw(TABLE INDEX TOT_PAGES FREE_PAGES PCT_FULL);
printf $hdr_fmt, @headers;

my $row_fmt = "%${max_tbl_len}s %${max_idx_len}s %9d %10d %9.1f%%\n";
foreach my $t ( sort keys %tbl_stats ) {
   my $tbl = $tbl_stats{$t};
   printf $row_fmt, $t, "", $tbl->{est_sp}, $tbl->{est_sp} - $tbl->{real_pages},
      $tbl->{real_bytes} / ($tbl->{real_pages} * $PG_SIZE) * 100;
   foreach my $i ( sort keys %{$idx_stats{$t}} ) {
      my $idx = $idx_stats{$t}->{$i};
      printf $row_fmt, $t, $i, $idx->{est_sp}, $idx->{est_sp} - $idx->{real_pages},
         $idx->{real_bytes} / ($idx->{real_pages} * $PG_SIZE) * 100;
   }
}

Sample Script Output

The output of the above Perl script, when run against the sample shown in the previously mentioned blog post, will appear as follows:
          TABLE           INDEX TOT_PAGES FREE_PAGES   PCT_FULL
art.link_out104                    832383      38561      86.8%
art.link_out104         PRIMARY    498304         49      91.9%
art.link_out104       domain_id     49600       6230      76.9%
art.link_out104     domain_id_2     26495       3339      89.1%
art.link_out104 from_message_id     28160        142      96.3%
art.link_out104    from_site_id     38848       4874      79.4%
art.link_out104   revert_domain    153984      19276      71.4%
art.link_out104    site_message     36992       4651      83.4%
The columns are the table and index, followed by the total number of pages in that index, the number of pages not actually occupied by data, and the number of bytes of real data as a percentage of the total size of the pages of real data. The first line in the above output, in which the INDEX column is empty, is a summary of the entire table.

Working with Binary Logs

The xtrabackup binary integrates with information that InnoDB stores in its transaction log about the corresponding binary log position for committed transactions. This enables it to print out the binary log position to which a backup corresponds, so you can use it to set up new replication slaves or perform point-in-time recovery.

Finding the Binary Log Position

You can find the binary log position corresponding to a backup once the backup has been prepared. This can be done by either running the xtrabackup with --prepare or innobackupex with --apply-log option. If your backup is from a server with binary logging enabled, xtrabackup will create a file named xtrabackup_binlog_info in the target directory. This file contains the binary log file name and position of the exact point in the binary log to which the prepared backup corresponds.
You will also see output similar to the following during the prepare stage:
InnoDB: Last MySQL binlog file position 0 3252710, file name ./mysql-bin.000001
... snip ...
[notice (again)]
  If you use binary log and don't use any hack of group commit,
  the binary log position seems to be:
InnoDB: Last MySQL binlog file position 0 3252710, file name ./mysql-bin.000001
This output can also be found in the xtrabackup_binlog_pos_innodb file, but it is only correct when no other than XtraDB or InnoDB are used as storage engines.
If other storage engines are used (i.e. MyISAM), you should use the xtrabackup_binlog_info file to retrieve the position.
The message about hacking group commit refers to an early implementation of emulated group commit in Percona Server.

Point-In-Time Recovery

To perform a point-in-time recovery from an xtrabackup backup, you should prepare and restore the backup, and then replay binary logs from the point shown in the xtrabackup_binlog_info file.
A more detailed procedure is found here (with innobackupex).

Setting Up a New Replication Slave

To set up a new replica, you should prepare the backup, and restore it to the data directory of your new replication slave. Then in your CHANGE MASTER TO command, use the binary log filename and position shown in the xtrabackup_binlog_info file to start replication.
A more detailed procedure is found in  ../howtos/setting_up_replication.

Restoring Individual Tables

In server versions prior to 5.6, it is not possible to copy tables between servers by copying the files, even with innodb_file_per_table. However, with Percona XtraBackup, you can export individual tables from any InnoDB database, and import them into Percona Server with XtraDB or MySQL 5.6. (The source doesn't have to be XtraDB or or MySQL 5.6, but the destination does.) This only works on individual .ibd files, and cannot export a table that is not contained in its own .ibd file.
Let's see how to export and import the following table:
CREATE TABLE export_test (
  a int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
NOTE:
If you're running Percona Server version older than 5.5.10-20.1, variable innodb_expand_import should be used instead of innodb_import_table_from_xtrabackup.

Exporting the Table

This table should have been created in innodb_file_per_table mode, so after taking a backup as usual with --backup, the .ibd file should exist in the target directory:
$ find /data/backups/mysql/ -name export_test.*
/data/backups/mysql/test/export_test.ibd
when you prepare the backup, add the extra parameter --export to the command. Here is an example:
$ xtrabackup --prepare --export --target-dir=/data/backups/mysql/
Now you should see a .exp file in the target directory:
$ find /data/backups/mysql/ -name export_test.*
/data/backups/mysql/test/export_test.exp
/data/backups/mysql/test/export_test.ibd
/data/backups/mysql/test/export_test.cfg
These three files are all you need to import the table into a server running Percona Server with XtraDB or MySQL 5.6.
NOTE:
MySQL uses .cfg file which contains InnoDB dictionary dump in special format. This format is different from the .exp` one which is used in XtraDB for the same purpose. Strictly speaking, a .cfg` file is not required to import a tablespace to MySQL 5.6 or Percona Server 5.6. A tablespace will be imported successfully even if it is from another server, but InnoDB will do schema validation if the corresponding .cfg file is present in the same directory.

Importing the Table

On the destination server running Percona Server with XtraDB and innodb_import_table_from_xtrabackup option enabled, or MySQL 5.6, create a table with the same structure, and then perform the following steps:
  • Execute ALTER TABLE test.export_test DISCARD TABLESPACE;
    • If you see the following message, then you must enable innodb_file_per_table and create the table again: ERROR 1030 (HY000): Got error -1 from storage engine
  • Copy the exported files to the test/ subdirectory of the destination server's data directory
  • Execute ALTER TABLE test.export_test IMPORT TABLESPACE;
The table should now be imported, and you should be able to SELECT from it and see the imported data.
NOTE:
Persistent statistics for imported tablespace will be empty until you run the ANALYZE TABLE on the imported table. They will be empty because they are stored in the system tables mysql.innodb_table_stats and mysql.innodb_index_stats and they aren't updated by server during the import. This is due to upstream bug #72368.

LRU dump backup

This feature reduces the warm up time by restoring buffer pool state from ib_lru_dump file after restart. Percona XtraBackup discovers ib_lru_dump and backs it up automatically. [image]
If the buffer restore option is enabled in my.cnf buffer pool will be in the warm state after backup is restored. To enable this set the variable innodb_buffer_pool_restore_at_startup =1 in Percona Server 5.5 or innodb_auto_lru_dump =1 in Percona Server 5.1.

Implementation

Implementation Details

This page contains notes on various internal aspects of the xtrabackup tool's operation.

File Permissions

xtrabackup opens the source data files in read-write mode, although it does not modify the files. This means that you must run xtrabackup as a user who has permission to write the data files. The reason for opening the files in read-write mode is that xtrabackup uses the embedded InnoDB libraries to open and read the files, and InnoDB opens them in read-write mode because it normally assumes it is going to write to them.

Tuning the OS Buffers

Because xtrabackup reads large amounts of data from the filesystem, it uses posix_fadvise() where possible, to instruct the operating system not to try to cache the blocks it reads from disk. Without this hint, the operating system would prefer to cache the blocks, assuming that xtrabackup is likely to need them again, which is not the case. Caching such large files can place pressure on the operating system's virtual memory and cause other processes, such as the database server, to be swapped out. The xtrabackup tool avoids this with the following hint on both the source and destination files:
posix_fadvise(file, 0, 0, POSIX_FADV_DONTNEED)
In addition, xtrabackup asks the operating system to perform more aggressive read-ahead optimizations on the source files:
posix_fadvise(file, 0, 0, POSIX_FADV_SEQUENTIAL)

Copying Data Files

When copying the data files to the target directory, xtrabackup reads and writes 1MB of data at a time. This is not configurable. When copying the log file, xtrabackup reads and writes 512 bytes at a time. This is also not possible to configure, and matches InnoDB's behavior (workaround exists in Percona Server because it has an option to tune innodb_log_block_size for XtraDB, and in that case Percona XtraBackup will match the tuning).
After reading from the files, xtrabackup iterates over the 1MB buffer a page at a time, and checks for page corruption on each page with InnoDB's buf_page_is_corrupted() function. If the page is corrupt, it re-reads and retries up to 10 times for each page. It skips this check on the doublewrite buffer.

xtrabackup Exit Codes

The xtrabackup binary exits with the traditional success value of 0 after a backup when no error occurs. If an error occurs during the backup, the exit value is 1.
In certain cases, the exit value can be something other than 0 or 1, due to the command-line option code included from the MySQL libraries. An unknown command-line option, for example, will cause an exit code of 255.

References

The xtrabackup Option Reference

This page documents all of the command-line options for the xtrabackup binary.

Options

--apply-log-only
This option causes only the redo stage to be performed when preparing a backup. It is very important for incremental backups.
--backup
Make a backup and place it in --target-dir. See Creating a backup.
--binlog-info
This option controls how Percona XtraBackup should retrieve server's binary log coordinates corresponding to the backup. Possible values are OFFONLOCKLESS and AUTO. See the Percona XtraBackup lockless_bin-log  manual page for more information
--close-files
Do not keep files opened. When xtrabackup opens tablespace it normally doesn't close its file handle in order to handle the DDL operations correctly. However, if the number of tablespaces is really huge and can not fit into any limit, there is an option to close file handles once they are no longer accessed. Percona XtraBackup can produce inconsistent backups with this option enabled. Use at your own risk.
--compact
Create a compact backup by skipping secondary index pages.
--compress
This option tells xtrabackup to compress all output data, including the transaction log file and meta data files, using the specified compression algorithm. The only currently supported algorithm is 'quicklz'. The resulting files have the qpress archive format, i.e. every *.qp file produced by xtrabackup is essentially a one-file qpress archive and can be extracted and uncompressed by the qpress  file archiver.
--compress-chunk-size=#
Size of working buffer(s) for compression threads in bytes. The default value is 64K.
--compress-threads=#
This option specifies the number of worker threads used by xtrabackup for parallel data compression. This option defaults to 1. Parallel compression ('--compress-threads') can be used together with parallel file copying ('--parallel'). For example, '--parallel=4 --compress --compress-threads=2' will create 4 IO threads that will read the data and pipe it to 2 compression threads.
--create-ib-logfile
This option is not currently implemented. To create the InnoDB log files, you must prepare the backup twice at present.
--datadir=DIRECTORY
The source directory for the backup. This should be the same as the datadir for your MySQL server, so it should be read from my.cnf if that exists; otherwise you must specify it on the command line.
--defaults-extra-file=[MY.CNF]
Read this file after the global files are read. Must be given as the first option on the command-line.
--defaults-file=[MY.CNF]
Only read default options from the given file. Must be given as the first option on the command-line. Must be a real file; it cannot be a symbolic link.
--defaults-group=GROUP-NAME
This option is to set the group which should be read from the configuration file. This is used by innobackupex if you use the --defaults-group option. It is needed for mysqld_multi deployments.
--export
Create files necessary for exporting tables. See Restoring Individual Tables.
--extra-lsndir=DIRECTORY
(for --backup): save an extra copy of the xtrabackup_checkpoints file in this directory.
--incremental-basedir=DIRECTORY
When creating an incremental backup, this is the directory containing the full backup that is the base dataset for the incremental backups.
--incremental-dir=DIRECTORY
When preparing an incremental backup, this is the directory where the incremental backup is combined with the full backup to make a new full backup.
--incremental-force-scan
When creating an incremental backup, force a full scan of the data pages in the instance being backuped even if the complete changed page bitmap data is available.
--incremental-lsn=LSN
When creating an incremental backup, you can specify the log sequence number (LSN) instead of specifying --incremental-basedir. For databases created by MySQL and Percona Server 5.0-series versions, specify the LSN as two 32-bit integers in high:low format. For databases created in 5.1 and later, specify the LSN as a single 64-bit integer.  ##ATTENTION##: If a wrong LSN value is specified (a user error which XtraBackup is unable to detect), the backup will be unusable. Be careful!
--innodb-log-arch-dir=DIRECTORY
This option is used to specify the directory containing the archived logs. It can only be used with the xtrabackup --prepare option.
--innodb-miscellaneous
There is a large group of InnoDB options that are normally read from the my.cnf configuration file, so that xtrabackup boots up its embedded InnoDB in the same configuration as your current server. You normally do not need to specify these explicitly. These options have the same behavior that they have in InnoDB or XtraDB. They are as follows:
--innodb-adaptive-hash-index
--innodb-additional-mem-pool-size
--innodb-autoextend-increment
--innodb-buffer-pool-size
--innodb-checksums
--innodb-data-file-path
--innodb-data-home-dir
--innodb-doublewrite-file
--innodb-doublewrite
--innodb-extra-undoslots
--innodb-fast-checksum
--innodb-file-io-threads
--innodb-file-per-table
--innodb-flush-log-at-trx-commit
--innodb-flush-method
--innodb-force-recovery
--innodb-io-capacity
--innodb-lock-wait-timeout
--innodb-log-buffer-size
--innodb-log-files-in-group
--innodb-log-file-size
--innodb-log-group-home-dir
--innodb-max-dirty-pages-pct
--innodb-open-files
--innodb-page-size
--innodb-read-io-threads
--innodb-write-io-threads
--log-copy-interval=#
This option specifies time interval between checks done by log copying thread in milliseconds (default is 1 second).
--log-stream
Makes xtrabackup not copy data files, and output the contents of the InnoDB log files to STDOUT until the --suspend-at-end file is deleted. This option enables --suspend-at-endautomatically.
--no-defaults
Don't read default options from any option file. Must be given as the first option on the command-line.
--databases=#
This option specifies the list of databases and tables that should be backed up. The option accepts the list of the form "databasename1[.table_name1] databasename2[.table_name2] . . .".
--databases-file=#
This option specifies the path to the file containing the list of databases and tables that should be backed up. The file can contain the list elements of the form databasename1[.table_name1], one element per line.
--parallel=#
This option specifies the number of threads to use to copy multiple data files concurrently when creating a backup. The default value is 1 (i.e., no concurrent transfer).
--prepare
Makes xtrabackup perform recovery on a backup created with --backup, so that it is ready to use. See preparing a backup.
--print-defaults
Print the program argument list and exit. Must be given as the first option on the command-line.
--print-param
Makes xtrabackup print out parameters that can be used for copying the data files back to their original locations to restore them. See scripting-xtrabackup.
--rebuild_indexes
Rebuild secondary indexes in InnoDB tables after applying the log. Only has effect with --prepare.
--rebuild_threads=#
Use this number of threads to rebuild indexes in a compact backup. Only has effect with --prepare and --rebuild-indexes.
--secure-auth
Refuse client connecting to server if it uses old (pre-4.1.1) protocol. (Enabled by default; use --skip-secure-auth to disable.)
--stats
Causes xtrabackup to scan the specified data files and print out index statistics.
--stream=name
Stream all backup files to the standard output in the specified format. Currently supported formats are 'xbstream' and 'tar'.
--suspend-at-end
Causes xtrabackup to create a file called xtrabackup_suspended in the --target-dir. Instead of exiting after copying data files, xtrabackup continues to copy the log file, and waits until the xtrabackup_suspended file is deleted. This enables xtrabackup and other programs to coordinate their work. See scripting-xtrabackup.
--tables=name
A regular expression against which the full tablename, in databasename.tablename format, is matched. If the name matches, the table is backed up. See partial backups.
--tables-file=name
A file containing one table name per line, in databasename.tablename format. The backup will be limited to the specified tables. See scripting-xtrabackup.
--target-dir=DIRECTORY
This option specifies the destination directory for the backup. If the directory does not exist, xtrabackup creates it. If the directory does exist and is empty, xtrabackup will succeed. xtrabackup will not overwrite existing files, however; it will fail with operating system error 17, file exists.
If this option is a relative path, it is interpreted as being relative to the current working directory from which xtrabackup is executed.
--throttle=#
This option limits --backup to the specified number of read+write pairs of operations per second. See throttling a backup.
--tmpdir=name
This option is currently not used for anything except printing out the correct tmpdir parameter when --print-param is used.
--to-archived-lsn=LSN
This option is used to specify the LSN to which the logs should be applied when backups are being prepared. It can only be used with the xtrabackup --prepare option.
--use-memory=#
This option affects how much memory is allocated for preparing a backup with --prepare, or analyzing statistics with --stats. Its purpose is similar to innodb_buffer_pool_size. It does not do the same thing as the similarly named option in Oracle's InnoDB Hot Backup tool. The default value is 100MB, and if you have enough available memory, 1GB to 2GB is a good recommended value. Multiples are supported providing the unit (e.g. 1MB, 1M, 1GB, 1G).
--version
This option prints xtrabackup version and exits.

No comments:

Post a Comment