How can I optimize a mysqldump of a large database?
I have a symfony application with an InnoDB database that is
~2GB with 57 tables. The majority of the size of the database resides
in a single table (~1.2GB). I am currently using mysqldump to backup
the database nightly.
Because of my Comcast connection, if I run a dump manually my connection to the server often times out before the dump completes, forcing me to rerun it. [I currently run a cron job that does the dump nightly; this question is about the dumps I run manually.] Is there a way to speed up the dumps to work around the connection timeout, and also to limit the time the server is occupied with this process? BTW, I am currently working on reducing the overall size of the database to resolve this issue.
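For reference, the nightly cron dump presumably looks something like this (the database name, user, and paths are placeholders, not taken from the question; credentials are assumed to live in that user's ~/.my.cnf):

```bash
# Hypothetical nightly dump at 02:00. --single-transaction gives a consistent
# InnoDB snapshot without locking tables; --quick streams rows instead of
# buffering whole tables in memory; gzip shrinks the file on the way out.
0 2 * * * mysqldump --single-transaction --quick --routines --triggers symfony_db | gzip > /var/backups/symfony_db_$(date +\%F).sql.gz
```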
The main bottleneck in a dump like this is drive I/O: you are reading a load of data and writing it out again. You can speed this up in a number of ways:

- Write the dump output to a different drive than the one holding the database files, so the drive heads are not constantly seeking between the location being read and the location being written.
- Pipe the output through gzip or similar; mysqldump output compresses very well, so this reduces the amount of data written (and therefore the overall I/O load) at the cost of some CPU time.
- Run the backup when other I/O load on the server is low.

You may be fixing the wrong issue, though: it might be easier to address the connection drops themselves. Run your manual dumps inside a screen (or tmux) session on the server; if your connection drops you can simply reconnect and reattach to the screen session without any processes getting interrupted.

If you are sending the data directly over the connection (i.e. you are running mysqldump on your local machine against a remote database, so the dump appears locally) you might be better off running the dump on the server first, compressing as needed, then transferring the data over the network using a tool (such as rsync) which supports partial transfers, so you can resume the transfer (instead of restarting) if a connection drop interrupts it. A rough sketch of that workflow appears at the end of this answer.

As part of your "reducing the size of the overall database to resolve this issue", I would guess that a large chunk of your data does not change. You might be able to move a large chunk of the 1.2GB out of that main table into another and exclude it from the tables copied by the mysqldump call. You don't need to back up this data every time if it never changes. Splitting data between tables and databases this way is usually referred to as data partitioning and can also allow you to spread the data and I/O load over multiple drives. High-end databases have built-in support for automatic partitioning, though in MySQL you will probably have to do it manually and alter your data access layer to account for it.

Straying off-topic for this site (so you should probably nip over to ServerFault or SuperUser if you need more detail): if you seem to be losing connections due to inactivity, check the options in your SSH server and SSH client to make sure keep-alive packets are enabled and being sent often enough. If you are seeing drops even when the connection is active, you could also try using OpenVPN or similar to wrap the connection - it should handle a short drop, even a complete drop of your entire connection for a few seconds, such that the SSH client and server don't notice.
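A rough sketch of the server-side dump plus resumable transfer described above (host names, paths, and the database name are made up for illustration):

```bash
# On the server, inside a screen/tmux session so a dropped SSH connection
# does not kill the dump:
mysqldump --single-transaction --quick mydb | gzip > /tmp/mydb.sql.gz

# From the local machine: rsync keeps partially transferred files, so a
# dropped connection can be resumed instead of restarted from scratch.
rsync --partial --progress user@dbserver:/tmp/mydb.sql.gz ./backups/
```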
INSIGHT INTO DOING BACKUPS WITH mysqldump
IMHO, doing backups has become more of an art form if you know just how to approach it. You have options.

Option 1 : mysqldump an entire mysql instance

This is the easiest one, the no-brainer !!!
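Something along these lines (host, credentials, and the output file name are placeholders):

```bash
# Dump every database, including routines and triggers, into one compressed
# file. --hex-blob keeps binary columns safe for reloading.
mysqldump -h dbhost -u root -p --hex-blob --routines --triggers --all-databases | gzip > MySQLInstance.sql.gz
```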
Everything written in one file: table structures, indexes, triggers,
stored procedures, users, encrypted passwords. Other mysqldump options
can also export different styles of INSERT commands, log file and
position coordinates from binary logs, database creation options,
partial data (--where option), and so forth.

Option 2 : mysqldump separate databases into separate data files

Start by creating a list of databases (there are 2 techniques to do this).

Technique 1
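A sketch of Technique 1, pulling the database list from information_schema.schemata (connection options and the exact set of excluded system schemas are illustrative):

```bash
mysql -u root -p --skip-column-names -A -e \
  "SELECT schema_name FROM information_schema.schemata
   WHERE schema_name NOT IN ('information_schema','mysql','performance_schema')" \
  > ListOfDatabases.txt
```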
Technique 2
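And a sketch of Technique 2, deriving the list from information_schema.tables instead, so that only schemas which actually contain tables are picked up:

```bash
mysql -u root -p --skip-column-names -A -e \
  "SELECT DISTINCT table_schema FROM information_schema.tables
   WHERE table_schema NOT IN ('information_schema','mysql','performance_schema')" \
  > ListOfDatabases.txt
```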
Technique 1 is the fastest way. Technique 2 is the surest and safest.
Technique 2 is better because, sometimes, users create folders for
general purposes in /var/lib/mysql (datadir) which are not database
related. The information_schema would register the folder as a database
in the information_schema.schemata table. Technique 2 would bypass
folders that do not contain mysql data.

Once you compile the list of databases, you can proceed to loop through the list and mysqldump them, even in parallel if so desired.
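A sketch of such a loop, dumping each database in the background so they run in parallel (credentials are assumed to come from an option file such as ~/.my.cnf):

```bash
for DB in $(cat ListOfDatabases.txt)
do
    mysqldump --hex-blob --routines --triggers "${DB}" | gzip > "${DB}.sql.gz" &
done
wait   # block until every background dump has finished
```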
If there are too many databases to launch at one time, parallel dump them 10 at a time:
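One way to batch them, again as a sketch: launch ten dumps in the background, wait for that batch to finish, then start the next ten.

```bash
COMMIT_COUNT=0
COMMIT_LIMIT=10
for DB in $(cat ListOfDatabases.txt)
do
    mysqldump --hex-blob --routines --triggers "${DB}" | gzip > "${DB}.sql.gz" &
    (( COMMIT_COUNT++ ))
    if [ ${COMMIT_COUNT} -eq ${COMMIT_LIMIT} ]
    then
        COMMIT_COUNT=0
        wait    # let the current batch of 10 finish before launching more
    fi
done
# Wait for any dumps remaining from the last, possibly partial, batch.
if [ ${COMMIT_COUNT} -gt 0 ]
then
    wait
fi
```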
Option 3 : mysqldump separate tables into separate data files

Start by creating a list of tables.
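A sketch for building that list as db.table pairs (connection options are placeholders):

```bash
mysql -u root -p --skip-column-names -A -e \
  "SELECT CONCAT(table_schema, '.', table_name)
   FROM information_schema.tables
   WHERE table_schema NOT IN ('information_schema','mysql','performance_schema')" \
  > ListOfTables.txt
```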
Then dump all tables in groups of 10
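A sketch of that loop; it splits each db.table entry apart and reuses the batch-of-10 pattern from Option 2 (credentials again assumed to come from an option file):

```bash
COMMIT_COUNT=0
COMMIT_LIMIT=10
for DBTB in $(cat ListOfTables.txt)
do
    DB=${DBTB%%.*}    # text before the first dot: the database
    TB=${DBTB#*.}     # text after the first dot: the table
    mysqldump --hex-blob --triggers "${DB}" "${TB}" | gzip > "${DB}_${TB}.sql.gz" &
    (( COMMIT_COUNT++ ))
    if [ ${COMMIT_COUNT} -eq ${COMMIT_LIMIT} ]
    then
        COMMIT_COUNT=0
        wait
    fi
done
if [ ${COMMIT_COUNT} -gt 0 ]
then
    wait
fi
```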
Option 4 : USE YOUR IMAGINATION

Try variations of the aforementioned Options, plus techniques for taking clean snapshots (for example, ordering the table list by size before dumping, holding FLUSH TABLES WITH READ LOCK in a separate session while the dumps run, or saving the dumps in dated folders and rotating out old ones).
Only Option 1 brings everything. The drawback is that mysqldumps created this way can only be reloaded into the same major release version of mysql that the mysqldump was generated from. In other words, a mysqldump from a MySQL 5.0 database cannot be loaded into 5.1 or 5.5. The reason? The mysql schema is totally different among major releases.

Options 2 and 3 do not include saving usernames and passwords. Here is the generic way to dump the SQL Grants for users that is readable and more portable.
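A sketch of that grants dump (connection options omitted): the inner query generates SHOW GRANTS statements for every user, the second mysql call executes them, and sed appends the terminating semicolons.

```bash
mysql --skip-column-names -A -e \
  "SELECT CONCAT('SHOW GRANTS FOR ''', user, '''@''', host, ''';')
   FROM mysql.user WHERE user <> ''" \
  | mysql --skip-column-names -A \
  | sed 's/$/;/g' > MySQLGrants.sql
```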
Option 3 does not save the stored procedures, so you can do the following
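A sketch of such a dump: --routines with the data and table definitions suppressed, so only the stored procedures and functions are written out (connection options omitted):

```bash
mysqldump --all-databases --no-data --no-create-info --routines > MySQLStoredProcedures.sql
```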
Another point worth noting concerns InnoDB. If you have a large InnoDB buffer pool, it makes sense to flush it as best you can before performing any backups. Otherwise, MySQL spends time flushing tables with leftover dirty pages out of the buffer pool. Here is what I suggest: about 1 hour before performing the backup, run this SQL command.
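Presumably the command in question is the one that zeroes out innodb_max_dirty_pages_pct, as described in the next paragraph:

```sql
-- Force InnoDB to start writing dirty pages out aggressively.
-- Remember to set it back to the default after the backup completes.
SET GLOBAL innodb_max_dirty_pages_pct = 0;
```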
In MySQL 5.5 the default innodb_max_dirty_pages_pct is 75. In MySQL 5.1 and back, the default innodb_max_dirty_pages_pct is 90. Setting innodb_max_dirty_pages_pct to 0 hastens the flushing of dirty pages to disk. This will prevent, or at least lessen, the impact of cleaning up any incomplete two-phase commits of InnoDB data prior to performing any mysqldump against any InnoDB tables.

FINAL WORD ON mysqldump

Most people shy away from mysqldump in favor of other tools, and those tools are indeed good. Such tools include Percona XtraBackup and MySQL Enterprise Backup (formerly InnoDB Hot Backup), which take hot, snapshot-style backups of InnoDB data.