General Overview of Multimaster Replication
Multimaster replication is a feature that allows data in multiple databases to be automatically kept
in sync. For example, in a multimaster replication system, if a row is inserted into one of the
databases in the system, that row will be automatically propagated to all of the other databases in
that system. Updates and deletes to the data in any of the databases are propagated in the
same way.
A multimaster replication environment is set up by configuring databases to be part of a
“replication group”. One of the databases in the group is defined as the “master definition site,”
and all of the other databases in the group are classified as “master sites.” The main difference
between the two types of sites is that most of the replication administration commands must be
invoked from the master definition site.
There are two basic ways that transactions get propagated to remote databases: "synchronously"
and "asynchronously." Synchronous replication causes each transaction to be applied
to all the master sites in a group immediately. This is achieved by using Oracle's two-phase
commit functionality to ensure that all of the databases in question can apply a given
transaction. If any of the sites in the group cannot accept the transaction (for example, because the
site's database has crashed, or the network connection to a database is down), then none of the
master sites in the replication group will be able to accept the transaction; the transaction will
not take place.
The way asynchronous replication works is that all the transactions that occur on a site are
temporarily placed in a buffer, called the “deferred transaction queue,” or deftran queue.
Periodically, such as once per minute, all of the transactions in a site’s deftran queue get sent to
all of the other sites, by “push” jobs. These jobs get created by calling the “schedule_push”
procedure. Finally, the transactions in a deftran queue that have already been sent to other sites
must be periodically purged, to prevent the deftran queue from growing too large.
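The push and purge jobs described above are created with Oracle's DBMS_DEFER_SYS package. As a rough sketch (the destination name and intervals here are illustrative examples, not values from this document):

```sql
-- Schedule a push of this site's deftran queue to one remote master
-- roughly once per minute, and purge already-pushed transactions hourly.
BEGIN
  DBMS_DEFER_SYS.SCHEDULE_PUSH(
    destination => 'nyc.example.com',       -- hypothetical remote master site
    interval    => 'SYSDATE + 1/(24*60)',   -- about once per minute
    next_date   => SYSDATE);
  DBMS_DEFER_SYS.SCHEDULE_PURGE(
    next_date   => SYSDATE,
    interval    => 'SYSDATE + 1/24');       -- about once per hour
END;
/
```

A call like this is made at each master site, once per destination, so that every site pushes its queued transactions to every other site in the replication group.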
The vast majority of customer sites that use multimaster replication use asynchronous replication
rather than synchronous. One of the reasons for this is that asynchronous replication has been
available for a much longer time; the initial versions of multimaster replication only allowed for
asynchronous propagation. The main reason that asynchronous is used, though, is because it has
many advantages over synchronous.
First of all, asynchronous replication uses much less network bandwidth and provides higher
performance than synchronous replication. The primary reason for this is that it is more efficient
to store multiple transactions and then propagate them all as a group, rather than to propagate
each transaction separately.
This is particularly important when the sites in question are very far apart geographically (such as
having one site in San Francisco and another in New York). Another reason for these bandwidth
and performance improvements is that there is much more overhead associated with synchronous
replication because each and every transaction requires that separate connections be established to
all of the other sites in the replication group. With asynchronous replication, fewer connections
need to be established, since transactions are propagated as a group.
The biggest advantage of asynchronous replication, though, is that it provides for high availability
of the replication group. With asynchronous replication, if one of the sites in the replication group
crashes, all of the other sites will still be able to accept updates—the transactions that are made on
the remaining sites will just “stack up” in those sites’ deftran queues until the down site becomes
available.
On the other hand, with synchronous replication, if any one of the sites becomes unavailable
(such as because of a database crash or a network failure) then none of the sites will be updatable.
This is because with synchronous replication, each and every transaction must be able to be
immediately applied to all of the sites in the replication group, and of course if a site is
unreachable no transactions will be able to be applied to it. This means that not only does
synchronous replication not provide any higher database availability, it can actually provide
lower availability than using a single database!
Monday, August 15, 2016
Multi-Master Replication
If You *Must* Deploy Multi-Master Replication, Read This First
An increasing number of organizations run applications that depend on
MySQL multi-master replication between remote sites. I have worked on
several such implementations recently. This article summarizes the
lessons from those experiences that seem most useful when deploying
multi-master on existing as well as new applications.
Posted by Robert Hodges at 12:57 AM
Let's start by defining terms. Multi-master replication means that applications update the same tables on different masters, and the changes replicate automatically between those masters. Remote sites mean that the masters are separated by a wide area network (WAN), which implies high average network latency of 100ms or more. WAN network latency is also characterized by a long tail, ranging from seconds due to congestion to hours or even days if a ship runs over the wrong undersea cable.
With the definitions in mind we can proceed to the lessons. The list is not exhaustive but includes a few insights that may not be obvious if you are new to multi-master topologies. Also, I have omitted issues like monitoring replication, using InnoDB to make slaves crash-safe, or provisioning new nodes. If you use master/slave replication, you are likely familiar with these topics already.
1. Use the Right Replication Technology and Configure It Properly
The best overall tool for MySQL multi-master replication between sites is Tungsten. The main reason for this assertion is that Tungsten uses a flexible, asynchronous, point-to-point, master/slave replication model that handles a wide variety of topologies such as star replication or all-to-all. Even so, you have to configure Tungsten properly. The following topology is currently my favorite:
- All-to-all topology. Each master replicates directly to every other master. This handles prolonged network outages or replication failures well, because one or more masters can drop out without breaking replication between the remaining masters or requiring reconfiguration. When the broken master(s) return, replication just resumes on all sides. All-to-all does not work well if you have a large number of masters.
- Updates are not logged on slaves. This keeps master binlogs simple, which is helpful for debugging, and eliminates the possibility of loops. It also requires some extra configuration if the masters have their own slaves, as would be the case in a Tungsten Enterprise cluster.
One approach you should treat with special caution is MySQL circular replication. In topologies of three or more nodes, circular replication results in broken systems if one of the masters fails. Also, you should be wary of any kind of synchronous multi-master replication across sites that are separated by more than 50 kilometers (i.e. 1-2ms latency). Synchronous replication makes a siren-like promise of consistency but the price you pay is slow performance under normal conditions and broken replication when WAN links go down.
2. Use Row-Based Replication to Avoid Data Drift
Replication depends on deterministic updates--a transaction that changes 10 rows on the original master should change exactly the same rows when it executes against a replica. Unfortunately many SQL statements that are deterministic in master/slave replication are non-deterministic in multi-master topologies. Consider the following example, which gives a 10% raise to employees in department #35.
UPDATE emp SET salary = salary * 1.1 WHERE dep_id = 35;
If all masters add employees, then the number of employees who actually get the raise will vary depending on whether such additions have replicated to all masters. Your servers will very likely become inconsistent with statement replication. The fix is to enable row-based replication using binlog-format=row in my.cnf. Row replication transfers the exact row updates from each master to the others and eliminates ambiguity.
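The my.cnf change itself is a one-liner on every master:

```ini
[mysqld]
# Log full row images instead of SQL statements so that multi-master
# updates apply deterministically on every master.
binlog-format = ROW
```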
3. Prevent Key Collisions on INSERTs
For applications that use auto-increment keys, MySQL offers a useful trick to ensure that such keys do not collide between masters using the auto-increment-increment and auto-increment-offset parameters in my.cnf. The following example ensures that auto-increment keys start at 1 and increment by 4 to give values like 1, 5, 9, etc. on this server.
server-id=1
auto-increment-offset = 1
auto-increment-increment = 4
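For the scheme to work, every master in a (for example) four-master group gets the same auto-increment-increment but a distinct offset. A hypothetical second master would be configured as:

```ini
# Second master: generates keys 2, 6, 10, 14, ...
server-id=2
auto-increment-offset = 2
auto-increment-increment = 4
```

With offsets 1 through 4 across four masters, the key spaces never overlap.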
This works so long as your applications use auto-increment keys
faithfully. However, any table that either does not have a primary key
or where the key is not an auto-increment field is suspect. You need to
hunt them down and ensure the application generates a proper key that
does not collide across masters, for example using UUIDs or by putting
the server ID into the key. Here is a query on the MySQL information
schema to help locate tables that do not have an auto-increment primary
key.
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
WHERE NOT EXISTS
(SELECT * FROM information_schema.columns c
WHERE t.table_schema = c.table_schema
AND t.table_name = c.table_name
AND c.column_key = 'PRI'
AND c.extra = 'auto_increment')
4. Beware of Semantic Conflicts in Applications
Neither Tungsten nor MySQL native replication can resolve conflicts, though we are starting to design this capability for Tungsten. You need to avoid them in your applications. Here are a few tips as you go about this.
First, avoid obvious conflicts. These include inserting data with the
same keys on different masters (described above), updating rows in two
places at once, or deleting rows that are updated elsewhere. Any of
these can cause errors that will break replication or cause your masters
to become out of sync. The good news is that many of these problems
are not hard to detect and eliminate using properly formatted
transactions. The bad news is that these are the easy conflicts. There
are others that are much harder to address.
For example, accounting systems need to generate unbroken sequences of
numbers for invoices. A common approach is to use a table that holds
the next invoice number and increment it in the same transaction that
creates a new invoice. Another accounting example is reports that need
to read the value of accounts consistently, for example at monthly
close. Neither example works off-the-shelf in a multi-master system
with asynchronous replication, as they both require some form of
synchronization to ensure global consistency across masters. These and
other such cases may force substantial application changes. Some
applications simply do not work with multi-master topologies for this
reason.
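The invoice-number pattern described above looks roughly like this in SQL (table and column names are made up for illustration); under asynchronous multi-master, two sites can run the same transaction concurrently and both hand out the same number:

```sql
-- Classic single-master invoice numbering (breaks under async multi-master):
-- the FOR UPDATE lock is only taken on the local master, so two sites can
-- read the same next_val before either update has replicated to the other.
START TRANSACTION;
SELECT next_val INTO @invoice_no FROM invoice_seq FOR UPDATE;
UPDATE invoice_seq SET next_val = next_val + 1;
INSERT INTO invoice (invoice_no, customer_id) VALUES (@invoice_no, 42);
COMMIT;
```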
5. Remove Triggers or Make Them Harmless
Triggers are a bane of replication. They conflict with row replication
if they run by accident on the slave. They can also create strange
conflicts due to weird behavior or bugs, or other problems like needing definer accounts present. MySQL native replication turns triggers off on slaves when using row replication, which is a very nice feature that prevents a lot of problems.
Tungsten on the other hand cannot suppress slave-side triggers. You
must instead alter each trigger to add an IF statement that prevents the
trigger from running on the slave. The technique is described in the Tungsten Cookbook.
It is actually quite flexible and has some advantages for cleaning up
data because you can also suppress trigger execution on the master.
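The guard that gets added looks roughly like the following sketch. The 'tungsten' account name is an assumption for illustration; whatever user your replicator connects as is what you test for:

```sql
-- Fire the trigger body only for ordinary application sessions, not when
-- the row change arrives via the replicator's own connection.
DELIMITER //
CREATE TRIGGER emp_before_update BEFORE UPDATE ON emp
FOR EACH ROW
BEGIN
  IF SUBSTRING_INDEX(CURRENT_USER(), '@', 1) <> 'tungsten' THEN
    SET NEW.updated_at = NOW();
  END IF;
END//
DELIMITER ;
```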
You should regard all triggers with suspicion when moving to
multi-master. If you cannot eliminate triggers, at least find them,
look at them carefully to ensure they do not generate conflicts, and
test them very thoroughly before deployment. Here's a query to help you
hunt them down:
SELECT trigger_schema, trigger_name
FROM information_schema.triggers;
6. Have a Plan for Sorting Out Mixed Up Data
Master/slave replication has its discontents, but at least sorting out messed up replicas is simple: re-provision from another slave or the master. Not so with multi-master topologies--you can easily get into a situation where all masters have transactions you need to preserve and the only way to sort things out is to track down differences and update masters directly. Here are some thoughts on how to do this.
- Ensure you have tools to detect inconsistencies. Tungsten has built-in consistency checking with the 'trepctl check' command. You can also use the Percona Toolkit pt-table-checksum to find differences. Be forewarned that neither of these works especially well on large tables and may give false results if more than one master is active when you run them.
- Consider relaxing foreign key constraints. I love foreign keys because they keep data in sync. However, they can also create problems for fixing messed up data, because the constraints may break replication or make it difficult to go table-by-table when synchronizing across masters. There is an argument for being a little more relaxed in multi-master settings.
- Switch masters off if possible. Fixing problems is a lot easier if you can quiesce applications on all but one master.
- Know how to fix data. Being handy with SQL is very helpful for fixing up problems. I find SELECT INTO OUTFILE and LOAD DATA INFILE quite handy for moving changes between masters. Don't forget SET SESSION sql_log_bin=0 to keep changes from being logged and breaking replication elsewhere. There are also various synchronization tools like pt-table-sync, but I do not know enough about them to make recommendations.
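A typical fix session on one master then looks something like this sketch (the file and table names are illustrative):

```sql
-- Apply corrected rows on this master only: disabling binary logging for
-- the session keeps the fix out of the binlog, so it does not replicate
-- and conflict with the data already on the other masters.
SET SESSION sql_log_bin = 0;
LOAD DATA INFILE '/tmp/emp_fixes.csv' INTO TABLE emp;
SET SESSION sql_log_bin = 1;
```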
(Disclaimer: My company sells support for Tungsten so I'm not unbiased. That said, commercial outfits really earn their keep on problems like this.)
7. Test Everything
Cutting corners on testing for multi-master can really hurt. This article has described a lot of things to look for, so put together a test plan and check for them. Here are a few tips on procedure:
- Set up a realistic pre-prod test with production data snapshots.
- Have a way to reset your test environment quickly from a single master, so you can get back to a consistent state to restart testing.
- Run tests on all masters, not just one. You never know if things are properly configured everywhere until you try.
- Check data consistency after tests. Quiesce your applications and run a consistency check to compare tables across masters.
Summary
Before moving to a multi-master replication topology you should ask yourself whether the trouble is justified. You can get many of the benefits of multi-master with system-of-record architectures with a lot less heartburn. That said, an increasing number of applications do require full multi-master across multiple sites. If you operate one of them, I hope this article is helpful in getting you deployed or improving what you already have.
Tungsten does a pretty good job of multi-master replication already, but I am optimistic we can make it much better. There is a wealth of obvious features around conflict resolution, data repair, and up-front detection of problems that will make life better for Tungsten users and reduce our support load. Plus I believe we can make it easier for developers to write applications that run on multi-master DBMS topologies. You will see more about how we do this in future articles on this blog.
Sunday, August 7, 2016
MyRocks vs InnoDB
- InnoDB writes between 8X and 14X more data to SSD per transaction than RocksDB
- RocksDB sustains about 1.5X more QPS
- Compressed/uncompressed InnoDB uses 2X/3X more SSD space than RocksDB
Configuration
I used the same configuration as described in the previous post with one difference. For this test I ran 168 iterations of the query step and each step ran for 1 hour. The test ran for 7 days while the previous test ran for 1 day. What I describe as QPS below is TPS (transactions/second) and when I use per query below I mean per transaction. The IO efficiency metrics are measured by iostat. I report the database size in GB at the end of each day - hours 1, 24, 48, 72, 96, 120, 144 and 168. For each one hour interval I collect:
- average QPS
- iostat reads per query (r/q)
- iostat KB read per query(rKB/q)
- iostat KB written per query (wKB/q)
- iostat reads per second (r/s)
- iostat KB read per second (rKB/s)
- iostat KB written per second (wKB/s)
Tests were run for the following configurations:
- myrocks.zlib - Facebook MySQL 5.6, RocksDB with zlib compression
- innodb56.none - upstream MySQL 5.6.26, InnoDB without compression
- innodb57.none - upstream MySQL 5.7.10, InnoDB without compression
- innodb56.zlib - upstream MySQL 5.6.26, InnoDB with zlib compression
- innodb57.zlib - upstream MySQL 5.7.10, InnoDB with zlib compression
Better Compression
Compressed InnoDB uses about 2X more SSD space than MyRocks. Uncompressed InnoDB uses about 3.1X more SSD space than MyRocks. This graph shows the database size every 24 hours. Note that the database gets more data as a function of the QPS rate and MyRocks has more data than InnoDB after 168 hours -- myrocks.zlib has 6.4% more rows than innodb57.none and 7.2% more rows than innodb57.zlib after 7 days.

Better Performance
I'd be happy if MyRocks matched the QPS from InnoDB and only beat it on IO efficiency. But it wins on QPS and IO efficiency. The data below is QPS over time. MyRocks gets at least 1.5X more QPS than compressed InnoDB. It also does a lot better than uncompressed InnoDB, but who wants to use 3X more SSD space? The QPS growth at test start for InnoDB with zlib happens because there are stalls until the compressed b-tree pages fragment. I think this is a problem with mutexes in the pessimistic insert/update code for compressed InnoDB.
The graph below shows the average QPS from each 1-hour interval.

Better IO Efficiency
I present IO efficiency metrics here using data from iostat normalized by the QPS rate to show the amount of IO done per transaction.
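The normalization itself is simple division; as a sketch (the numbers below are invented, not measurements from this test):

```python
# Convert an iostat per-second rate into a per-transaction rate by dividing
# by QPS over the same interval: this is how r/q, rKB/q and wKB/q are derived.
def per_query(rate_per_sec: float, qps: float) -> float:
    return rate_per_sec / qps

# e.g. 150,000 KB/s written while sustaining 5,000 transactions/s
# means 30 KB written per transaction
wkb_per_query = per_query(150_000, 5_000)
print(wkb_per_query)  # 30.0
```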
This result is remarkable. InnoDB writes between 8X and 14X more to storage per transaction than MyRocks. This means that workloads can use TLC SSD with MyRocks when InnoDB requires MLC and that workloads can use SSD with MyRocks when SSD doesn't have sufficient endurance for InnoDB. Running a busy database on SSD is so much easier than using disk.
This result doesn't include the additional write-amplification from flash GC that occurs with InnoDB compared to MyRocks because the MyRocks write pattern generates less work for flash GC. I previously described how that increased the flash write-rate by about 1.5X for InnoDB compared to RocksDB for the device I use. This means that the real difference in write rates on SSD might mean that InnoDB writes 12X to 21X more to storage than MyRocks.
Using iostat metrics from hour 168 InnoDB writes 8.7, 10.5, 8.9 and 10.6 times more to storage per transaction compared to RocksDB. Using iostat data from hour 167 the difference is 11.5, 13.9, 11.7 and 14.0 times more data written.
The graph below shows the number of KB written to SSD per transaction from each 1-hour interval.

This graph shows the number of SSD reads per transaction. MyRocks has the smallest rate. While an LSM can have an IO penalty for range queries and range queries are the most frequent operation in Linkbench, that isn't a problem for this workload.


Absolute iostat results
These graphs show absolute rates from iostat. The data is the average rate per 1-hour interval.
The first graph is the rate for iostat r/s. The rate is larger for MyRocks because it sustains the most QPS.

The next graph shows the rate for iostat wKB/s. Note that InnoDB sustains between 100 and 200 MB/s of writes. The SSD device is much busier doing writes for InnoDB than for MyRocks. IO capacity used for writes isn't available for reads even when endurance isn't an issue. More writes means more erases from flash GC and erases are a source of IO stalls when a read gets stuck behind the erase on the same channel.


Monday, August 11, 2014
TokuDB : Hot Index Creation
TokuDB allows you to add indexes to an existing table and still perform inserts and queries on that table while the index is being created. The ONLINE keyword is not used. Instead, the value of the tokudb_create_index_online client session variable is examined. More information is available in TokuDB Variables.
Hot index creation is invoked using the CREATE INDEX command after setting tokudb_create_index_online=on. Here's an example:
mysql> SET tokudb_create_index_online=ON;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE INDEX index ON table (field_name);
Alternatively, using the ALTER TABLE command for creating an index will create the index offline (with the table unavailable for inserts or queries), regardless of the value of tokudb_create_index_online. The only way to hot create an index is to use the CREATE INDEX command.
Hot creating an index will be slower than creating the index offline, and progress depends on how busy the mysqld server is with other tasks. Progress of the index creation can be seen by using the SHOW PROCESSLIST command (in another client). Once the index creation completes, the new index will be used in future query plans.
If more than one hot CREATE INDEX is issued for a particular table, the indexes will be created serially. An index creation that is waiting for another to complete will be shown as Locked in SHOW PROCESSLIST. We recommend that each CREATE INDEX be allowed to complete before the next one is started.
TokuDB : Hot Column Addition and Deletion
From 18 hours to 3 seconds!
Hot Column Addition and Deletion (HCAD) Overview
TokuDB v5.0 introduces several features that are new to the MySQL world. In this series of posts, we’re going to present some information on these features: what’s the feature, how does it work under the hood, and how do you get the most out of this feature in your MySQL setup. Today we start with HCAD: Hot Column Addition and Deletion. Many users have had the experience of loading a bunch of data into a table and associated indexes, only to find that adding some columns or removing them would be useful.
alter table X add column Y int default 0;
or the like takes a long time — hours or more — during which time the table is write locked, meaning no insertions/deletions/updates and no queries on the new column until the alter table is done.
Mark Callaghan points out that changing the row format in InnoDB is a “significant project”, so it looked like slow alter tables were going to be a challenge for MySQL for the foreseeable future. Slow alter tables are one reason MySQL struggles to scale to large tables.
TokuDB v5.0 changes all that with the introduction of HCAD. You can add or delete columns from an existing table with minimal downtime — just the time for MySQL itself to close and reopen the table. The total downtime is seconds to minutes.
Here we present an example of HCAD in action. See this page for details of the experiment. Drum roll…
TokuDB:
mysql> alter table ontime add column totalTime int default 0;
Query OK, 0 rows affected (3.33 sec)
InnoDB:
mysql> alter table ontime add column totalTime int default 0;
Query OK, 122225386 rows affected (17 hours 44 min 40.85 sec)
That’s 19,000x faster! Goodbye long downtimes.
As a note, the “0 rows affected” for TokuDB means that the column addition work happens in the background. All queries on the table, however, will see the new column as soon as the alter table returns, in this case after 3.33 sec.
Friday, April 18, 2014
How can I optimize a mysqldump of a large database?

I have a symfony application with an InnoDB database that is
~2GB with 57 tables. The majority of the size of the database resides
in a single table (~1.2GB). I am currently using mysqldump to backup
the database nightly.
Due to my comcast connection, oftentimes if I am running a dump manually my connection to the server will timeout before the dump is complete causing me to have to rerun the dump. [I currently run a cron that does the dump nightly, this is just for dumps that I run manually.] Is there a way to speed up the dumps for the connection timeout issue, but also to limit the time the server is occupied with this process? BTW, I am currently working on reducing the size of the overall database to resolve this issue.
The main bottleneck in a dump like this is drive I/O.
You are reading a load of data and writing it again. You can speed this
up in a number of ways:
Run the dump from within a screen session on the server; if your connection drops you can reconnect to the screen session without any processes getting interrupted.

If you are sending the data directly over the connection (i.e. you are running mysqldump on your local machine against a remote database, so the dump appears locally) you might be better off running the dump on the server first, compressing as needed, then transferring the data over the network using a tool (such as rsync) which supports partial transfers so you can resume the transfer (instead of restarting) if a connection drop interrupts it.

As part of your "reducing the size of the overall database to resolve this issue" I would guess that a large chunk of your data does not change. You might be able to move a large chunk of the 1.2Gb from that main table off into another and remove that from those that are copied by the mysqldump call. You don't need to backup this data every time if it never changes. Splitting data between tables and databases this way is usually referred to as data partitioning and can also allow you to spread the data and I/O load over multiple drives. High-end databases have built-in support for automatic partitioning, though in mysql you will probably have to do it manually and alter your data access layer to account for it.

Straying off-topic for this site (so you should probably nip over to ServerFault or SuperUser to ask if you need more detail): if you seem to be losing connections due to inactivity, check the options in your SSH server and SSH client to make sure keep-alive packets are enabled and being sent often enough. If seeing drops even if the connection is active you could also try using OpenVPN or similar to wrap the connection - it should handle a short drop, even a complete drop if your entire connection is down for a few seconds, such that the SSH client and server don't notice.

INSIGHT INTO DOING BACKUPS WITH mysqldump
IMHO Doing backups has become more of an art form if you know just how to approach it You have options Option 1 : mysqldump an entire mysql instance This is the easiest one, the no-brainer !!!
Everything is written in one file: table structures, indexes, triggers, stored procedures, users, encrypted passwords. Other mysqldump options can also export different styles of INSERT commands, log file and position coordinates from binary logs, database creation options, partial data (--where option), and so forth.

Option 2: mysqldump separate databases into separate data files

Start by creating a list of databases (there are 2 techniques to do this):

Technique 1
Technique 2
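Both techniques are usually written as queries against information_schema; here is a sketch of each (the excluded-schema list and the stub function, which stands in for the real mysql client so the block runs without a server, are assumptions to adapt):

```shell
# Stand-in for the mysql client so this sketch is runnable anywhere;
# remove this function to run the queries for real.
mysql() { printf 'shop\nblog\n'; }

# Technique 1: list every schema registered in information_schema.schemata.
mysql -AN -e "SELECT schema_name FROM information_schema.schemata
  WHERE schema_name NOT IN ('information_schema','mysql','performance_schema')" > dblist.txt

# Technique 2: list only schemas that actually contain tables, which skips
# stray non-database folders living in the datadir.
mysql -AN -e "SELECT DISTINCT table_schema FROM information_schema.tables
  WHERE table_schema NOT IN ('information_schema','mysql','performance_schema')" > dblist.txt
```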
Technique 1 is the fastest way. Technique 2 is the surest and safest. Technique 2 is better because, sometimes, users create folders for general purposes in /var/lib/mysql (the datadir) which are not database related. The information_schema would register such a folder as a database in the information_schema.schemata table. Technique 2 bypasses folders that do not contain mysql data.

Once you compile the list of databases, you can proceed to loop through the list and mysqldump them, even in parallel if so desired.
If there are too many databases to launch at one time, parallel dump them 10 at a time:
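A sketch of that batching loop follows. The database names and the stub function (standing in for the real mysqldump so the block runs without a server) are placeholders; in practice dblist.txt comes from the list compiled above.

```shell
# Stand-in for mysqldump; it echoes its last argument (the database name,
# here always $3 given the two fixed options). Remove for real use.
mysqldump() { echo "-- dump of: $3"; }
printf 'shop\nblog\nwiki\n' > dblist.txt    # the compiled database list

N=0
while read -r DB; do
    mysqldump --single-transaction --routines "$DB" > "$DB.sql" &
    N=$((N + 1))
    # after every 10 background dumps, wait for the batch to drain
    if [ "$N" -eq 10 ]; then wait; N=0; fi
done < dblist.txt
wait    # wait for the final partial batch
```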
Option 3: mysqldump separate tables into separate data files

Start by creating a list of tables, then dump all tables in groups of 10:
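A sketch of both steps (the stub functions stand in for the real mysql and mysqldump binaries so the block runs without a server; remove them for real use):

```shell
# Stand-ins so the sketch is runnable anywhere; remove for real use.
mysql() { printf 'shop orders\nshop customers\n'; }
mysqldump() { echo "-- dump of: $1.$2"; }

# Build a "db table" list from information_schema.
mysql -AN -e "SELECT table_schema, table_name FROM information_schema.tables
  WHERE table_schema NOT IN ('information_schema','mysql','performance_schema')" > tablelist.txt

# Dump each table to its own file, 10 background jobs at a time.
N=0
while read -r DB TB; do
    mysqldump "$DB" "$TB" > "$DB.$TB.sql" &
    N=$((N + 1))
    if [ "$N" -eq 10 ]; then wait; N=0; fi
done < tablelist.txt
wait
```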
Option 4: USE YOUR IMAGINATION

Try variations of the aforementioned Options, plus techniques for clean snapshots.
Only Option 1 brings everything. The drawback is that mysqldumps created this way can only be reloaded into the same major release version of mysql that the mysqldump was generated on. In other words, a mysqldump from a MySQL 5.0 database cannot be loaded in 5.1 or 5.5. The reason? The mysql schema is totally different among major releases. Options 2 and 3 do not include saving usernames and passwords. Here is the generic way to dump the SQL grants for users that is readable and more portable:
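A sketch of that grants dump: the first mysql call generates a SHOW GRANTS statement per user, the second executes them, and sed terminates each emitted grant with a semicolon. The stub function stands in for the real mysql client so the block runs without a server (the 'app'@'%' user in the stub is made up); remove it for real use.

```shell
# Stand-in for the mysql client: the first call (with -e) generates
# SHOW GRANTS statements, the second executes whatever is piped in.
# Remove this function for real use.
mysql() {
    if [ "$#" -gt 1 ]; then
        echo "SHOW GRANTS FOR 'app'@'%';"
    else
        cat > /dev/null
        echo "GRANT SELECT ON *.* TO 'app'@'%'"
    fi
}

# Generate SHOW GRANTS for every user, execute them, and add the
# trailing semicolons so the output file is directly replayable.
mysql -AN -e "SELECT CONCAT('SHOW GRANTS FOR ''',user,'''@''',host,''';')
  FROM mysql.user" | mysql -AN | sed 's/$/;/' > MySQLGrants.sql
```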
Option 3 does not save the stored procedures, so you can do the following:
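A sketch of a routines-only dump (the stub stands in for the real mysqldump so the block runs without a server; remove it for real use):

```shell
# Stand-in for mysqldump; remove for real use.
mysqldump() { echo "-- routines dump ($*)"; }

# Dump only stored routines and triggers: no table rows, no CREATE TABLEs.
mysqldump --all-databases --routines --triggers --no-create-info --no-data > routines.sql
```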
Another point that should be noted concerns InnoDB. If you have a large InnoDB buffer pool, it makes sense to flush it as best you can before performing any backups. Otherwise, MySQL spends time flushing tables with leftover dirty pages out of the buffer pool. Here is what I suggest:

About 1 hour before performing the backup, run this SQL command:
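A sketch of that command and its companion for afterwards (75 is the 5.5 default discussed below; adjust to whatever your previous value was):

```sql
-- force InnoDB to flush dirty pages ahead of the backup
SET GLOBAL innodb_max_dirty_pages_pct = 0;

-- after the backup completes, restore the default (75 on MySQL 5.5)
SET GLOBAL innodb_max_dirty_pages_pct = 75;
```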
In MySQL 5.5 the default innodb_max_dirty_pages_pct is 75. In MySQL 5.1 and back, the default innodb_max_dirty_pages_pct is 90. Setting innodb_max_dirty_pages_pct to 0 hastens the flushing of dirty pages to disk. This will prevent, or at least lessen, the impact of cleaning up any incomplete two-phase commits of InnoDB data prior to performing any mysqldump against InnoDB tables.

FINAL WORD ON mysqldump

Most people shy away from mysqldump in favor of other tools, and those tools are indeed good. Such tools include Percona XtraBackup and MySQL Enterprise Backup.
Monday, April 14, 2014
Backups and Recovery
This is the most important task of a database administrator: you must protect your data at all costs. This means regular backups, and regular restores (even to another system) just to check the integrity of those backups. There is no point in putting yourself in a position where you are holding your breath while a restore is happening, only to find out that the backup is corrupt. If possible, perform regular restores; if not, then at least perform a disaster recovery test once per year. Not being able to restore could be a disaster for your company and your job. To check your backups you can use one or more of the following, which I have used in the past:
- use a reporting database, if the customers don't need real-time data and you have the money and time; Production data could be restored every day to this system, which is a very good test
- use a performance test server with Production data; it is ideal for testing releases of your software against Production data, which generally has more volume than a test system; restore perhaps once a week
- at least perform a DR test once per year to prove the backup solution is working; for example, you may have forgotten to back up something, not only regarding the database but from the systems as well
First, let's start with a few terms associated with backups:
logical backup | this type of backup is created by saving information that represents the logical database structures, using SQL statements like create database, create table and insert. This type of backup is ideal when you want to upgrade from one version of MySQL to another; however, it is a slower method of backing up. |
physical backup | this type of backup is a backup of the actual database files or disk partitions; this type of backup can be very fast to backup and restore. |
full backup | a full backup is a standalone backup containing everything in the database; this could then be restored on another server. A full backup can be either logical or physical. |
incremental backup | this type of backup only contains the data that has changed since the last backup. The advantage of this type of backup is that it is faster, as there is not as much data to backup; however, the disadvantage is that it takes longer to recover. |
consistent backup | this is a backup at an exact moment in time; generally you shut down the database (or put it in quiescent mode) and then take the backup. |
hot backup | this type of backup is taken while the database is running; neither reads nor writes are blocked during the backup. |
warm backup | this type of backup is taken while the database is running; reads are not blocked, but writes are prohibited from making any modifications to the database. |
cold backup | similar to a consistent backup, as the database is shut down before the backup begins. |
point-in-time restore | a restoration of a database to a specified date and time; some databases use a full backup and recovery logs to restore to that point-in-time, others can only use the last full backup, which means that data might have to be re-keyed into the system. |
The $64,000 question is how often you should take your backups, and I am afraid the answer is "it depends": some companies are happy with a once-a-month backup, others take two backups per day. The answer generally has to come from the business, based on what they are prepared to lose: the amount of data lost, or what has to be re-keyed into the system again. If you have a small company that has to re-key, say, 20-50 invoices, then that's no big deal; however, if you have a trading company that may have to re-key tens of thousands of entries/trades, then that becomes a problem. You have to add the time it takes to restore the system to the time it takes to recover it so that users are able to use it; it is this figure that you give to the business so they can decide what is an acceptable time period for the business to be down. The shorter the time, the more money will have to be thrown at the solution: if you are talking about zero downtime, then you would have to implement a high availability solution, which could cost a lot of money; if you are happy with one day's downtime, then this should be enough to restore and recover a database and to re-key some entries to make the database consistent with the company's paperwork.
As you saw above, there are a number of ways to back up a database, and the time available to perform a backup will drive the choice of method: if you have a short maintenance window with a large database, then an incremental backup may be the only option, but if you have a large maintenance window with a small database, then you could perform a full backup. Remember that whatever option you use will have an impact on the recovery time.
One point to make is that your backups should be taken off-site if held on tape, or copied across to another system in another location. If an incident happened to the original system, for example a fire, you don't want to lose your backups as well; the storing of off-site data should be part of your DR plan.
Enough talking about backups; let's see how you can actually take one. There are a number of backup tools that MySQL can use; see the table below.
Backup tools for MySQL
Backup method | Storage engine | Impact | Backup speed | Recovery speed | Recovery granularity
mysqldump | ALL | WARM | MEDIUM | SLOWEST | MOST FLEXIBLE
mysqldump | INNODB | HOT | MEDIUM | SLOWEST | MOST FLEXIBLE
select into outfile | ALL | WARM | SLOW | SLOW | MOST FLEXIBLE
mk-parallel-backup | ALL | WARM | MEDIUM | MEDIUM | FLEXIBLE
ibbackup | INNODB | HOT | FAST | FAST | FLEXIBLE
ibbackup | ALL | WARM | FAST | FAST | FLEXIBLE
backup command in mysqld | ALL | HOT | FAST | FAST | FLEXIBLE
filesystem (copy files) | ALL | COLD | FASTEST | FASTEST | NOT FLEXIBLE
snapshot (using LVM, ZFS, VMWare) | ALL | ALMOST HOT | FAST | FAST | LEAST FLEXIBLE
mysqlhotcopy | MyISAM | MOSTLY COLD | FAST | FAST | FLEXIBLE
mysqldump | ## backup all databases mysqldump --user=root --password --all-databases > backup_<date>_all.sql ## backup a specific database mysqldump --user=root --password <database_name> > backup_<date>_<database_name>.sql ## backup multiple databases mysqldump --user=root --password --databases <database_name> <database_name> > backup_<date>.sql ## backup a table from a database mysqldump --user=root --password <database_name> <table_name> > backup_<date>_<database_name>_<table_name>.sql ## backup some specific data mysqldump --user=root --password <database_name> <table_name> --where "last_name='VALLE' order by first_name" > backup_<date>.sql ## dumping from one database to another mysqldump --databases <database_name> | mysql -h <destination_host> <database_name> |
restore a mysqldump | ## all databases mysql --user=root --password < backup.sql ## specific database mysql --user=<user> --password <database_name> < backup_<database_name>.sql |
select into outfile / load data infile | ## dump of the accounts table select * into outfile '/tmp/accounts.txt' from accounts; ## load the dump load data infile '/tmp/accounts.txt' into table accounts; |
mk-parallel-dump, mk-parallel-restore | ## backup a database mk-parallel-dump --base-dir=/backups ## restore a database mk-parallel-restore /backups |
The online logical backup shown below (the BACKUP and RESTORE statements in mysqld) was developed for the MySQL 6.0 branch and never shipped in a 5.x GA release; it also offered compression and encryption, which is important when handling sensitive data.
backup | backup database <database_name> to '<database_name>-backup.sql' |
restore | restore from '<database_name>-backup.sql' |
history | select * from backup_history where backup_id = 321\G |
Its limitations include:
- no backup of the internal mysql datadir
- no native driver for InnoDB tables
- no native driver for Maria or Falcon
- no backup of partitions
- no incremental backups
mysqlhotcopy | ## backup a database mysqlhotcopy <database_name> /backups ## backup multiple databases mysqlhotcopy <database_name> accounts /backups ## backup a database to another server mysqlhotcopy --method=scp <database_name> \ username@backup.server:/backup ## use pattern match to backup databases and tables mysqlhotcopy <database_name>./^employees/ /backup |
Recovering from Crashes
Most often you have to recover to a point-in-time after the last backup. The normal procedure is as follows:
- restore the latest backup
- recover the data to a point-in-time using the recovery log files

In more detail, the steps are:
- restore the database using the last backup
- determine the first binary log and starting position needed
- determine the last binary log needed
- convert the binary log to text format with the mysqlbinlog utility using options to specify the start and stop time
- check the text file to make sure it's what you need
- import the converted binary log(s)
convert the log files | ## convert a specific binary log file mysqlbinlog mysql-bin.010310 > mysql-bin.010310.sql ## use a date to end at a specific time mysqlbinlog --stop-datetime='2012-04-29 17:00:00' mysql-bin.010312 > mysql-bin.010312.sql ## other options are --start-datetime --stop-datetime --start-position --stop-position |
restore the converted file | mysql --user=root --password < mysql-bin.010310.sql |
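When you don't need to inspect the converted text file first, the convert and import steps can be collapsed into one pipeline. A sketch follows; the log file name and stop time are placeholders, and the stub functions stand in for the real mysqlbinlog and mysql binaries so the block runs without a server (remove them for real use).

```shell
# Stand-ins so the sketch is runnable anywhere; remove for real use.
mysqlbinlog() { echo "INSERT INTO accounts VALUES (1);"; }
mysql() { cat > replayed.sql; }

# Replay binary-log events up to the failure time straight into the
# server, skipping the intermediate text file.
mysqlbinlog --stop-datetime='2012-04-29 17:00:00' mysql-bin.010312 | mysql --user=root
```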