How to set up and maintain Percona backup using XtraBackup
Backups are your insurance against server failures. A well-designed backup helps you get back online quickly even if the whole server is lost. For modern websites running WordPress, Magento, etc., almost all site data is stored in databases. So, a sound backup design for such sites should include reliable database backups.
Website owners use various strategies, such as hot copies (aka physical backups) and database dumps (aka logical backups), to back up databases. But traditional tools like mysqlhotcopy or mysqldump often lead to performance or availability issues. Percona’s XtraBackup tool avoids these issues while ensuring reliable backups.
However, like any other server system, improper configuration or lack of periodic maintenance can cause an XtraBackup system to fail and result in inconsistent backups.
Today we’ll go through how to set up a reliable Percona backup system and how to maintain the XtraBackup process.
Why use XtraBackup for Percona backup?
Percona is essentially MySQL server on steroids. So, all backup tools used with MySQL can be used on Percona as well, including mysqldump, InnoDB Hot Backup, etc. However, all these tools impose performance or uptime penalties that busy websites cannot afford. Let’s see how.
Mysqldump blocks new site updates – causing downtime
The most popular backup tool for MySQL-compatible databases is “mysqldump”. Even popular backup tools such as phpMyAdmin, AutoMySQLBackup, phpMyBackup, etc. use mysqldump as their core component.
However, mysqldump has a critical flaw: it makes the entire database read-only during the backup process to prevent corruption. So, when backing up large databases, new data (such as comments, orders, contacts, etc.) cannot be added for a long time, affecting a significant number of users. The same holds true for very busy websites: even if the backup takes only a few minutes, several users may be unable to use features that need a database update.
Additionally, mysqldump takes up a lot of system memory to build the SQL statements it writes into the backup file. This results in high server load and impacts website performance.
For all these reasons, mysqldump (and other such logical backup tools) cannot be used on large or mission-critical websites.
LVM snapshots and other “hot backups” require the database to be paused
As explained above, mysqldump takes a long time to complete and degrades server performance. As an alternative, website owners use tools such as mysqlhotcopy, LVM snapshots, or Zmanda, which take a snapshot of the database files at the file system level.
The problem with these tools is that the database must be paused for a short period while the snapshot is taken. If the database is not stopped, the tool could capture an incomplete database transaction, leaving the entire table unreliable. For busy websites, stopping the database service, for however short a time, is not an option.
So, none of these “physical backup” tools are reliable for websites that need high uptime.
Percona XtraBackup does not block site updates and minimizes resource usage
What websites with large or busy databases needed was a backup tool that minimizes resource usage and does not block database updates. MySQL produced such a tool, called “MySQL Enterprise Backup”, but it is a commercial product that costs $5000/server. For many website owners that is a costly proposition.
Percona came up with an open source alternative with the same functionality as MySQL Enterprise Backup, called “XtraBackup”.
XtraBackup copies the data files (without pausing the server), and then uses the transaction log (aka redo log) to fill in any transactions that were still incomplete when the backup was taken. Since there’s no large data set manipulation, XtraBackup’s memory and I/O usage is low, which means website visitors won’t be affected by the backup process.
How to set up Percona backup
Percona servers can use either InnoDB or XtraDB engines based on whether the database service is configured as a stand-alone server or as a cluster. XtraBackup supports non-blocking backups for both these engines.
Factors to consider in designing a backup system
The installation and configuration of XtraBackup is straightforward (which we’ll cover in a minute), but the main question is how to set up the whole backup process. These are the points you need to consider:
- “Freshness” of your backups – In case of a server crash, what is the oldest data you can accept? For example, can you continue business with a 1-day-old backup without serious impact? Note that the fresher your backups are, the more disk space they take and the more performance impact they cause.
- Acceptable recovery time – What is the maximum acceptable time for your site to come back online after a server crash? For example, can you afford for your site to be offline for 1 hour? Note that the lower the recovery time, the more frequent your full database backups should be, leading to more disk space usage and performance impact.
- Retention time of your backups – How long do you need to keep your backups? For statutory or business purposes, some sites must store their data for a certain period. Define your retention time. Is it 3 months? 6 months? 1 year? 3 years?
- Safety of your backups – How destruction-proof do you want your backups to be? If you store your backups on the same server, they’ll be lost along with the server in a crash. But if they are in a remote data center, they’ll survive even a data-center-wide outage. With a remote server, though, you incur more costs in hardware, bandwidth, etc.
Designing the backup system for an eCommerce website
To explain how we decide on a backup strategy, I’ll use the example of a Percona backup system we implemented for an eCommerce website. The site updated its products every day, and needed a way to quickly restore the site in case the database crashed.
So, the backups needed to be at most 1 day old. For this, we set up daily “incremental” backups, that is, backups that record only the changes since the previous day. By writing only a small amount of data, we minimized the system memory and disk I/O needed for the operation.
In case of a crash, we wanted the website to be recoverable within 30 minutes. To enable that, we took weekly full backups, so that fewer than 7 incrementals would need to be applied at any point, thereby minimizing the recovery time.
We then synced these backup files to a remote data center, so that even if the whole data center went down for some reason, we’d be able to restore the site in a remote location within 30 minutes.
Steps to set up the backup system
Once the backup design is finalized, XtraBackup setup can be done as follows:
1. Install XtraBackup
On CentOS/RedHat servers, first install the Percona repository, and then install XtraBackup from it.
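The commands below are a minimal sketch, assuming the current percona-release repository package and the XtraBackup 2.4 series; adjust the package name to match your Percona/MySQL version:

    # Install the Percona yum repository definition
    yum install https://repo.percona.com/yum/percona-release-latest.noarch.rpm

    # Install XtraBackup (2.4 series shown; pick the build matching your server)
    yum install percona-xtrabackup-24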
2. Configure XtraBackup
XtraBackup configuration variables are defined in my.cnf (usually /etc/my.cnf). The only mandatory setting is the backup directory, which can be specified as shown below.
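A minimal sketch, using the [xtrabackup] option group and an illustrative backup path (use whatever directory your backup design calls for):

    [xtrabackup]
    # Directory where backup files will be written
    target_dir = /data/backups/mysql/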
Other settings can be added as per backup design considerations.
3. Configure cron to run backups automatically
The basic command to take an XtraBackup backup is shown below.
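This is a sketch with placeholder credentials and an illustrative target directory; on older XtraBackup releases the same functionality is exposed through the innobackupex wrapper script:

    # Take a full backup; --target-dir overrides any value set in my.cnf
    xtrabackup --backup --user=backupuser --password=secret \
        --target-dir=/data/backups/mysql/full/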
Depending on the periodicity (daily, weekly, or monthly), set up incremental or full backups using a Bash script, as sketched below.
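The following is a simplified sketch of that approach, assuming a weekly full backup on Sundays and daily incrementals against the most recent backup; paths and schedule are illustrative, and a production script would add error checking and logging:

    #!/bin/bash
    # backup.sh: weekly full backup on Sunday, daily incremental otherwise.
    # Connection credentials are assumed to come from my.cnf or from
    # --user/--password flags, as in the basic command above.
    BASE=/data/backups/mysql
    TODAY=$(date +%F)

    if [ "$(date +%u)" -eq 7 ]; then
        # Sunday: take a fresh full backup
        xtrabackup --backup --target-dir="$BASE/full-$TODAY"
        echo "$BASE/full-$TODAY" > "$BASE/last_backup"
    else
        # Other days: incremental backup based on the most recent backup
        LAST=$(cat "$BASE/last_backup")
        xtrabackup --backup --target-dir="$BASE/inc-$TODAY" \
            --incremental-basedir="$LAST"
        echo "$BASE/inc-$TODAY" > "$BASE/last_backup"
    fi

A cron entry (here in /etc/cron.d format) can then run the script every night:

    # Run the backup script at 2 AM every day
    0 2 * * * root /usr/local/bin/backup.sh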
4. Sync the backups to a remote location
We recommend that our customers (high traffic websites, managed hosting providers, etc.) sync backups to a remote location. For this we use customized Bash scripts that use rsync over SSH.
The remote server is secured so that only the script can transfer data into it. This is done so that the data cannot be modified inadvertently by other programs or tampered with by attackers.
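As an illustration, a minimal rsync-over-SSH transfer with a dedicated key might look like this (hostname, key path, and directories are placeholders; our production scripts add locking, logging, and retries):

    # Push the local backup directory to the remote backup server over SSH,
    # using a dedicated key that the remote side accepts only for rsync
    rsync -az --delete -e "ssh -i /root/.ssh/backup_key" \
        /data/backups/mysql/ backup@remote.example.com:/backups/mysql/

On the remote server, the matching authorized_keys entry can be locked down with a command= prefix so that the key cannot be used for anything other than the expected rsync invocation.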
How to maintain Percona backup
Backups are too critical to set up and forget. A variety of issues like file system errors, disk space shortages, high server load, etc. can cause a backup to fail. So, it is important to monitor the backup process closely and quickly fix any issues that are found.
In the Percona databases that we maintain, we monitor server parameters like disk I/O, memory usage, and disk space to make sure server conditions do not affect the backup process in any way. In growing websites, database size is monitored, and additional disk space is added as and when backup size increases.
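As a simple illustration of that kind of monitoring, a check like the following can run from cron and send an alert when the backup filesystem runs low on space (the threshold, path, and address are illustrative):

    #!/bin/bash
    # Warn if the backup filesystem is more than 80% full
    USED=$(df --output=pcent /data/backups | tail -1 | tr -dc '0-9')
    if [ "$USED" -gt 80 ]; then
        echo "Backup disk ${USED}% full on $(hostname)" \
            | mail -s "Backup disk space warning" admin@example.com
    fi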
Random restore tests are done once a week on test databases to confirm that all backups are working fine. This procedure serves two purposes: (1) we know the backups are reliable, and (2) we get a way to evaluate our disaster recovery plans. On many occasions these weekly tests gave us an opportunity to improve our disaster recovery processes.
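The core of such a restore test is sketched here for a full backup directory and a scratch MySQL instance (paths and dates are illustrative; restoring an incremental chain additionally requires applying each incremental during the prepare step):

    # 1. Prepare the backup: replay the redo log so the data files are consistent
    xtrabackup --prepare --target-dir=/data/backups/mysql/full-2024-01-07

    # 2. Copy the prepared files into the (stopped, empty) test instance's
    #    datadir; --defaults-file points at the test instance's my.cnf
    xtrabackup --defaults-file=/etc/mysql-test/my.cnf --copy-back \
        --target-dir=/data/backups/mysql/full-2024-01-07
    chown -R mysql:mysql /var/lib/mysql-test

    # 3. Start the test instance and run sanity queries against key tables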