Thursday, November 30, 2017

How Percona XtraBackup Works

How Percona XtraBackup Works

Percona XtraBackup is based on InnoDB‘s crash-recovery functionality. It copies your InnoDB data files, which results in data that is internally inconsistent; but then it performs crash recovery on the files to make them a consistent, usable database again.
This works because InnoDB maintains a redo log, also called the transaction log. This contains a record of every change to InnoDB data. When InnoDB starts, it inspects the data files and the transaction log, and performs two steps. It applies committed transaction log entries to the data files, and it performs an undo operation on any transactions that modified data but did not commit.
Percona XtraBackup works by remembering the log sequence number (LSN) when it starts, and then copying away the data files. It takes some time to do this, so if the files are changing, then they reflect the state of the database at different points in time. At the same time, Percona XtraBackup runs a background process that watches the transaction log files, and copies changes from it. Percona XtraBackup needs to do this continually because the transaction logs are written in a round-robin fashion, and can be reused after a while. Percona XtraBackupneeds the transaction log records for every change to the data files since it began execution.
Percona XtraBackup will use Backup lockswhere available as a lightweight alternative to FLUSH TABLES WITH READLOCK. This feature is available in Percona Server 5.6+. Percona XtraBackup uses this automatically to copy non-InnoDB data to avoid blocking DML queries that modify InnoDB tables. When backup locks are supported by the server, xtrabackup will first copy InnoDB data, run the LOCKTABLES FOR BACKUP and copy the MyISAM tables and .frm files. Once this is done, the backup of the files will begin. It will backup .frm.MRG.MYD.MYI.TRG.TRN.ARM.ARZ.CSM.CSV.par, and.opt files.
Note
Locking is done only for MyISAMand other non-InnoDB tables, and only after Percona XtraBackup is finished backing up all InnoDB/XtraDB data and logs. Percona XtraBackup will use Backup locks where available as a lightweight alternative to FLUSHTABLES WITH READ LOCK. This feature is available in Percona Server 5.6+. Percona XtraBackupuses this automatically to copy non-InnoDB data to avoid blocking DML queries that modify InnoDBtables.
After that xtrabackup will use LOCKBINLOG FOR BACKUP to block all operations that might change either binary log position orExec_Master_Log_Pos or Exec_Gtid_Set (i.e. master binary log coordinates corresponding to the current SQL thread state on a replication slave) as reported by SHOW MASTER/SLAVESTATUSxtrabackup will then finish copying the REDO log files and fetch the binary log coordinates. After this is completed xtrabackup will unlock the binary log and tables.
Finally, the binary log position will be printed to STDERR and xtrabackup will exit returning 0 if all went OK.
Note that the STDERR of xtrabackup is not written in any file. You will have to redirect it to a file, e.g., xtrabackupOPTIONS 2> backupout.log.
It will also create the following files in the directory of the backup.
During the prepare phase, Percona XtraBackup performs crash recovery against the copied data files, using the copied transaction log file. After this is done, the database is ready to restore and use.
The backed-up MyISAM and InnoDBtables will be eventually consistent with each other, because after the prepare (recovery) process, InnoDB‘s data is rolled forward to the point at which the backup completed, not rolled back to the point at which it started. This point in time matches where the FLUSH TABLES WITHREAD LOCK was taken, so the MyISAMdata and the prepared InnoDB data are in sync.
The xtrabackup and innobackupextools both offer many features not mentioned in the preceding explanation. Each tool’s functionality is explained in more detail further in the manual. In brief, though, the tools permit you to do operations such as streaming and incremental backups with various combinations of copying the data files, copying the log files, and applying the logs to the data.

Restoring a backup

To restore a backup with xtrabackupyou can use the xtrabackup --copy-back or xtrabackup --move-backoptions.
xtrabackup will read from the my.cnfthe variables datadir,innodb_data_home_dirinnodb_data_file_path,innodb_log_group_home_dir and check that the directories exist.
It will copy the MyISAM tables, indexes, etc. (.frm.MRG.MYD.MYI.TRG.TRN.ARM.ARZ.CSM.CSVpar and .optfiles) first, InnoDB tables and indexes next and the log files at last. It will preserve file’s attributes when copying them, you may have to change the files’ ownership to mysql before starting the database server, as they will be owned by the user who created the backup.
Alternatively, the xtrabackup --move-back option may be used to restore a backup. This option is similar to xtrabackup --copy-back with the only difference that instead of copying files it moves them to their target locations. As this option removes backup files, it must be used with caution. It is useful in cases when there is not enough free disk space to hold both data files and their backup copies.

Percona TokuBackup in Percona Sever 5.7

Percona TokuBackup

Percona TokuBackup is an open-source hot backup utility for MySQL servers running the TokuDB storage engine (including Percona Server and MariaDB). It does not lock your database during backup. The TokuBackup library intercepts system calls that write files and duplicates the writes to the backup directory.
Note
This feature is currently considered Experimental

Installing From Binaries

TokuBackup is included with Percona Server 5.7.10-1 and later versions. Installation can be performed with the ps_tokudb_admin script.
To install Percona TokuBackup:
  1. Run ps_tokudb_admin --enable-backup to add the preload-hotbackup option into [mysqld_safe] section of my.cnf.
$ sudo ps_tokudb_admin --enable-backup
Checking SELinux status...
INFO: SELinux is disabled.

Checking if preload-hotbackup option is already set in config file...
INFO: Option preload-hotbackup is not set in the config file.

Checking TokuBackup plugin status...
INFO: TokuBackup plugin is not installed.

Adding preload-hotbackup option into /etc/my.cnf
INFO: Successfully added preload-hotbackup option into /etc/my.cnf
PLEASE RESTART MYSQL SERVICE AND RUN THIS SCRIPT AGAIN TO FINISH INSTALLATION!
  1. Restart mysql service
$ sudo service mysql restart
  1. Run ps_tokudb_admin --enable-backup again to finish installation of TokuBackup plugin
$ sudo ps_tokudb_admin --enable-backup
Checking SELinux status...
INFO: SELinux is disabled.

Checking if preload-hotbackup option is already set in config file...
INFO: Option preload-hotbackup is set in the config file.

Checking TokuBackup plugin status...
INFO: TokuBackup plugin is not installed.

Checking if Percona Server is running with libHotBackup.so preloaded...
INFO: Percona Server is running with libHotBackup.so preloaded.

Installing TokuBackup plugin...
INFO: Successfully installed TokuBackup plugin.

Making a Backup

To run Percona TokuBackup, the backup destination directory must exist, be writable and owned by the same user under which MySQL server is running (usually mysql) and empty. Once this directory is created, the backup can be run using the following command:
mysql> set tokudb_backup_dir='/path_to_empty_directory';
Note
Setting the tokudb_backup_dir variable automatically starts the backup process to the specified directory. Percona TokuBackup will take full backup each time, currently there is no incremental backup option

Restoring From Backup

Percona TokuBackup does not have any functionality for restoring a backup. You can use rsync or cp to restore the files. You should check that the restored files have the correct ownership and permissions.
Note
Make sure that the datadir is empty and that MySQL server is shut down before restoring from backup. You can’t restore to a datadir of a running mysqld instance (except when importing a partial backup).
The following example shows how you might use the rsync command to restore the backup:
$ rsync -avrP /data/backup/ /var/lib/mysql/
Since attributes of files are preserved, in most cases you will need to change their ownership to mysql before starting the database server. Otherwise, the files will be owned by the user who created the backup.
$ chown -R mysql:mysql /var/lib/mysql
If you have changed default TokuDB data directory (tokudb_data_dir) or TokuDB log directory (tokudb_log_dir) or both of them, you will see separate folders for each setting in backup directory after taking backup. You’ll need to restore each folder separately:
$ rsync -avrP /data/backup/mysql_data_dir/ /var/lib/mysql/
$ rsync -avrP /data/backup/tokudb_data_dir/ /path/to/original/tokudb_data_dir/
$ rsync -avrP /data/backup/tokudb_log_dir/ /path/to/original/tokudb_log_dir/
$ chown -R mysql:mysql /var/lib/mysql
$ chown -R mysql:mysql /path/to/original/tokudb_data_dir
$ chown -R mysql:mysql /path/to/original/tokudb_log_dir

Advanced Configuration

Monitoring Progress

TokuBackup updates the PROCESSLIST state while the backup is in progress. You can see the output by running SHOW PROCESSLIST or SHOW FULL PROCESSLIST.

Excluding Source Files

You can exclude certain files and directories based on a regular expression set in the tokudb_backup_exclude session variable. If the source file name matches the excluded regular expression, then the source file is excluded from backup.
For example, to exclude all lost+found directories from backup, use the following command:
mysql> SET tokudb_backup_exclude='/lost\\+found($|/)';
Note
In Percona Server 5.7.10-3 to address bug #125, server pid file is excluded by default. If you’re providing your own additions to the exclusions and have the pid file in the default location, you will need to add the mysqld_safe.pid entry.

Throttling Backup Rate

You can throttle the backup rate using the tokudb_backup_throttle session-level variable. This variable throttles the write rate in bytes per second of the backup to prevent TokuBackup from crowding out other jobs in the system. The default and max value is 18446744073709551615.
mysql> SET tokudb_backup_throttle=1000000;

Restricting Backup Target

You can restrict the location of the destination directory where the backups can be located using the tokudb_backup_allowed_prefix system-level variable. Attempts to backup to a location outside of the specified directory or its children will result in an error.
The default is null, backups have no restricted locations. This read-only variable can be set in the my.cnf configuration file and displayed with the SHOW VARIABLES command:
mysql> SHOW VARIABLES LIKE 'tokudb_backup_allowed_prefix';
+------------------------------+-----------+
| Variable_name                | Value     |
+------------------------------+-----------+
| tokudb_backup_allowed_prefix | /dumpdir  |
+------------------------------+-----------+

Reporting Errors

Percona TokuBackup uses two variables to capture errors. They are tokudb_backup_last_error and tokudb_backup_last_error_string. When TokuBackup encounters an error, these will report on the error number and the error string respectively. For example, the following output shows these parameters following an attempted backup to a directory that was not empty:
mysql> SET tokudb_backup_dir='/tmp/backupdir';
ERROR 1231 (42000): Variable 'tokudb_backup_dir' can't be set to the value of '/tmp/backupdir'

mysql> SELECT @@tokudb_backup_last_error;
+----------------------------+
| @@tokudb_backup_last_error |
+----------------------------+
|                         17 |
+----------------------------+

mysql> SELECT @@tokudb_backup_last_error_string;
+---------------------------------------------------+
| @@tokudb_backup_last_error_string                 |
+---------------------------------------------------+
| tokudb backup couldn't create needed directories. |
+---------------------------------------------------+

Limitations and known issues

  • You must disable InnoDB asynchronous IO if backing up InnoDB tables with TokuBackup. Otherwise you will have inconsistent, unrecoverable backups. The appropriate setting is innodb_use_native_aio=0.
  • To be able to run Point-In-Time-Recovery you’ll need to manually get the binary log position.
  • Transactional storage engines (TokuDB and InnoDB) will perform recovery on the backup copy of the database when it is first started.
  • Tables using non-transactional storage engines (MyISAM) are not locked during the copy and may report issues when starting up the backup. It is best to avoid operations that modify these tables at the end of a hot backup operation (adding/changing users, stored procedures, etc.).
  • The database is copied locally to the path specified in /path/to/backup. This folder must exist, be writable, be empty, and contain enough space for a full copy of the database.
  • TokuBackup always makes a backup of the MySQL datadir and optionally the tokudb_data_dir, tokudb_log_dir, and the binary log folder. The latter three are only backed up separately if they are not the same as or contained in the MySQL datadir. None of these three folders can be a parent of the MySQL datadir.
  • No other directory structures are supported. All InnoDB, MyISAM, and other storage engine files must be within the MySQL datadir.
  • TokuBackup does not follow symbolic links.
  • TokuBackup does not backup MySQL configuration file(s).
  • TokuBackup does not backup tablespaces if they are out of datadir.
  • Due to upstream bug #80183, TokuBackup can’t recover backed-up table data if backup was taken while running OPTIMIZE TABLE or ALTER TABLE ... TABLESPACE.
  • TokuBackup doesn’t support incremental backups.

TokuDB backup : use mylvmbackup for Hotbackups

mylvmbackup is a tool for quickly creating full physical backups of a MySQL server's data files. To perform a backup, mylvmbackup obtains a read lock on all tables and flushes all server caches to disk, makes an LVM snapshot of the volume containing the MySQL data directory, and unlocks the tables again. The snapshot process takes only a small amount of time. When it is done, the server can continue normal operations, while the actual file backup proceeds.

The LVM snapshot is mounted to a temporary directory and all data is backed up using the tar program by default. The archive files are created using names in the form of backup-YYYYMMDD_hhmmss_mysql.tar.gz, where YYYY, MM, DD, hh, mm and ss represent the year, month, day, hour, minute, and second of the time at which the backup occurred. The default prefix backup, date format and file suffix may be modified. The use of timestamped archive names allows you to run mylvmbackup many times without risking to overwrite old archives. It is possible to preserve only a defined number of last backups, to avoid running out of disk space.

Alternatively, instead of tar, you can use rsync, rsnap or zbackup to perform the archiving.
The rsync backup can perform both local backups as well as backing up to a remote server using rsyncd or rsync via SSH.

rsnap is a wrapper around rsync to automatically maintain and rotate a given number of last backups (7 by default). It utilizes hard links to link to unchanged files for saving disk space.
zbackup is a globally-deduplicating backup tool. Feed a large .tar into it, and it will store duplicate regions of it only once, then compress and optionally encrypt the result. Feed another .tar file, and it will also re-use any data found in any previous backups. This way only new changes are stored, and as long as the files are not very different, the amount of storage required is very low. Any of the backup files stored previously can be read back in full at any time.
Moreover, a backup type none is provided for cases where the user wants to use mylvmbackup only for creating the snapshots and intends to perform the actual backup by using the appropriate hooks. (Or for cases where the snapshot itself is considered to be the backup).

mylvmbackup also provides several methods for logging and reporting the progress and success of a backup run. The log messages can either be printed to the console (STDOUT) or logged via syslog.
Additionally, a report can be sent to you via email, and you can receive notifications about failures or successes via SNMP.

It is required to run mylvmbackup on the same host where the MySQL server runs. If your MySQL daemon is not listening on localhost or using the default socket location, you must specify --host or --socket. Even though mylvmbackup communicates with the server through a normal client connection to obtain the read lock and flush data, it performs the actual backup by accessing the file system directly. It is also a requirement that the MySQL server's data directory resides on an LVM volume. (It is, however, a good idea to do the LVM backup to a different partition than the one where the data directory resides. Otherwise, there is a good chance that LVM will run out of undo space for LVM snapshot maintenance and the backup will fail.)

TokuDB tips: MySQL backups

TokuDB tips: MySQL backups

Using TokuDB in production. 

We are using Percona Server + TokuDB as a main storage engine in Percona Cloud Tools to store timeseries data.
And, yes, Percona Server + TokuDB is available GA Percona Server 5.6.19-67.0 with TokuDB (GA).
Just having good performance is not enough to make it into production; there are also operational questions and one such question is about backups. I want to explain how we do backups for Percona Server + TokuDB in Percona Cloud Tools.
I should say up front, that we DO NOT have support for TokuDB in Percona XtraBackup. TokuDB internals are significantly different from InnoDB/XtraDB, so it will be a major project to add this to Percona XtraBackup and we do not have any plans at the moment to work on this.
It does not mean that TokuDB users do not have options for backups. There is Tokutek Hot back-up, included in the Tokutek Enterpise Subscription. And there is a method we use in Percona Cloud Tools: LVM Backups. We use mylvmbackup scripts for this task and it works fairly well for us.
There is however some gotchas to be aware. If you understand an LVM backups mechanic, this is basically a managed crash recovery process when you restore from a backup.
Now we need to go in a little detail for TokuDB. To support transactions that involve both TokuDB and InnoDB engines, TokuDB uses a two-phase commit mechanism in MySQL. When involved, the two-phase commit requires binary logs presented for a proper recovery procedures.
But now we need to take a look at how we setup a binary log in Percona Cloud Tools. We used SSD for the main data storage (LVM partition is here) and we use a Hardware RAID1 over two hard-drives for binary logs. We choose this setup as we care about SSD lifetime. In write-intensive workloads, binary logs will produce a lot of write operations and in our calculation we will just burn these SSDs, so we have to store them on something less expensive.
So the problem there is that when we take an LVM snapshot over main storage, we do not have a consistent view of binary logs (although it is possible to modify backup scripts to copy the current binary log under FLUSH TABLES WITH READ LOCK operation, this is probably what we will do next). But binary logs are needed for recovery, without them we face these kind of errors during restoring from backup:
The error message actually hints a way out. Unfortunately it seems that we are the first ones to have ever tried this option, as tc-heuristic-recover is totally broken in current MySQL and not supposed to work… and it would be noticed if someone really tried it before us (which gives me an impression that Oracle/MySQL never properly tested it, but that is a different story).
We will fix this in Percona Server soon.
So the way to handle a recovery from LVM backup without binary logs is to start mysqld with –tc-heuristic-recover switch (unfortunately I did not figure out yet, should it be COMMIT or ROLLBACK value, hehe).
The proper way to use LVM backup is to have a corresponding binary log file, like I said it will require a modification to mylvmbackup script.
I should say this is not the only way we do backups in Percona Cloud Tools. In this project we use Percona Backup Service provided by the Percona Managed Services team, and our team also uses mydumper to perform a logical backup of data.
While it works acceptably to backup hundreds of gigabytes worth of data (it is just a sequential scan, which should be easy for TokuDB), the full recovery is painful and takes unacceptably long. So mydumper backup (recovery) will be used if we ever need to perform a fine-grained recovery (i.e only small amount of specific tables).

So I hope this tip is useful if you are looking for info about how to do backups for TokuDB.

See also :

Improving TokuDB Hot Backup Usage with the autotokubackup Command Line Tool

Improving TokuDB Hot Backup Usage with the autotokubackup Command Line Tool


In this blog post, we’ll look at how the command line tool autotokubackup can make TokuDB hot backups easier.
I would like to share an experimental tool named autotokubackup, for TokuBackup. This tool is aimed at helping simplify the life of TokuDB system administrators. autotokubackup is written in the Python language.
General information:
So why would you need this tool? Let’s clarify a bit what you might face while using tokubackup. You have a backup solution that you can use from the MySQL shell:
Now you want to automate this process. The first problem is that the second backup will fail, because it’s required that the backup directory is empty before starting a backup process. One solution is to create time-stamped directories and for the backups.
Further, you have a backup policy that requires some other necessary files are copied as part of your backup process. You need to write a script to put those files into a separate folder under backup directory.
Another issue you will face is the lack of any clear output on backup progress. The shell just pauses until the backup completes. The one possible way to obtain information about the backup process is displaying the MySQL processlist in a separate MySQL shell. But it isn’t the best way, and there are some issues, as reported here: Unclear status information of backup state while taking backups using TokuBackup.
Generally, we need to know which files are backed up during the backup process. There should also be a clear message indicating the end of the backup process.
To make your life easier, the autotokubackup tool:
  • Automates the TokuDB database backup procedures
  • Creates timestamped backups inside the backup directory, overcoming the need for copy/remove old backups to empty the backup directory
  • Copies all necessary files for your backup policy (you can specify up to ten supplementary files to be in the backup directory as part of backup process)
  • Clearly describes what is going to be in the backup director, by showing newly created files inside backup directory
  • Clearly shows the end of backup process
To start, we only need two things:
  • Installed Percona Server with TokuDB engine + TokuBackup plugin
  • Installed Python3
To install the tool you can use following methods:
* From source:
* or via pip3:
The result will be something like:
After that there should be a configuration file for this tool located -> /etc/tokubackup.conf.
The structure of the config file is:
You can change options to reflect your environment and start to use. Available command line options for the tool can be displayed using --help
You can prepare different config files. For example, one for the slave. Specify using the –defaults_file option, and the overall result of the run should be something like the below:
The backup directory will store the following:
That’s it.