This article is excerpted from O'Reilly's recently published book Linux System Administration .
If your MySQL server does not need to be available 24x7, a fast and easy offline raw backup method is:
Stop the MySQL server:
# /etc/init.d/mysqld stop
Copy MySQL's data files and directories. For example, if your MySQL data directory is /var/lib/mysql and you want to save it to /tmp/mysql-backup:
# cp -r /var/lib/mysql /tmp/mysql-backup
Instead of cp, you can use rsync, tar, gzip, or other commands.
Start the server again:
# /etc/init.d/mysqld start
Online backups are trickier. If you have mutually independent MyISAM tables (no foreign keys or transactions), you could lock each one in turn, copy its files, and unlock it. But you may have InnoDB tables, or someone could write a transaction involving multiple tables. Fortunately there are several reasonable noncommercial solutions, including mysqlhotcopy, mysqlsnapshot, replication, and mysqldump.
mysqlhotcopy is a Perl script that does online raw backups of ISAM or MyISAM tables. The manpage includes many options, but here's how to back up a single database named drupal:
# mysqlhotcopy -u user -p password drupal /tmp
Locked 57 tables in 0 seconds.
Flushed tables ('drupal'.'access', 'drupal'.'accesslog', 'drupal'.'aggregator_
category', 'drupal'.'aggregator_category_feed', 'drupal'.'aggregator_category_item',
'drupal'.'aggregator_feed', 'drupal'.'aggregator_item', 'drupal'.'authmap', 'drupal'.
'blocks', 'drupal'.'book', 'drupal'.'boxes', 'drupal'.'cache', 'drupal'.'client',
'drupal'.'client_system', 'drupal'.'comments', 'drupal'.'contact', 'drupal'.'file_
revisions', 'drupal'.'files', 'drupal'.'filter_formats', 'drupal'.'filters',
'drupal'.'flood', 'drupal'.'forum', 'drupal'.'history', 'drupal'.'locales_meta',
'drupal'.'locales_source', 'drupal'.'locales_target', 'drupal'.'menu', 'drupal'.
'node', 'drupal'.'node_access', 'drupal'.'node_comment_statistics', 'drupal'.'node_
counter', 'drupal'.'node_revisions', 'drupal'.'permission', 'drupal'.'poll',
'drupal'.'poll_choices', 'drupal'.'poll_votes', 'drupal'.'profile_fields', 'drupal'.
'profile_values', 'drupal'.'role', 'drupal'.'search_dataset', 'drupal'.'search_
index', 'drupal'.'search_total', 'drupal'.'sequences', 'drupal'.'sessions', 'drupal'.
'system', 'drupal'.'term_data', 'drupal'.'term_hierarchy', 'drupal'.'term_node',
'drupal'.'term_relation', 'drupal'.'term_synonym', 'drupal'.'url_alias', 'drupal'.
'users', 'drupal'.'users_roles', 'drupal'.'variable', 'drupal'.'vocabulary',
'drupal'.'vocabulary_node_types', 'drupal'.'watchdog') in 0 seconds.
Copying 171 files...
Copying indices for 0 files...
Unlocked tables.
mysqlhotcopy copied 57 tables (171 files) in 1 second (1 seconds overall).
mysqlsnapshot is even easier. It backs up all the ISAM or MyISAM tables on your server to one tar file per database:
# ./mysqlsnapshot -u user -p password -s /tmp --split -n checking for binary logging... ok backing up db drupal... done backing up db mysql... done backing up db test... done snapshot completed in /tmp
If you've set up MySQL replication for 24x7 availability, you can back up from a slave server using one of the methods just decribed. You'll also need to save replication info (logs, configuration files, and so on).
For extra protection from hardware corruption (but not human error), set up replication and provide your slave (and/or master) with RAID 1 (mirrored) disks.
Many MySQL sites migrate data from MyISAM to InnoDB tables to get true database transactions and better write performance. The authors of the InnoDB module have a commercial product for online InnoDB backups named InnoDB Hot Backup.
The last method is usually the first mentioned in most documentation: mysqldump. Rather than a raw (verbatim) copy, mysqldump produces an ASCII dump of the specified databases and tables. It works with all MySQL table types, including InnoDB. It's relatively slow, and the text files it produces are large, although they compress fairly well. It's useful to create these dumps from time to time, because they contain a straightforward script for recreating your databases and tables from scratch. You can use editors, grep, and other text tools to search through or modify the dump files.
To lock all of your tables and dump them to a single file, enter:
# mysqldump -u user -ppassword -x --all-databases > /tmp/mysql.dump
You can pipe the output through gzip to save some time and space:
# mysqldump -u user -ppassword -x --all-databases | gzip > /
tmp/mysql.dump.gz
A new open source tool (free download, pay for support) called Zmanda Recovery Manager for MySQL provides a useful front end to many of these alternatives. Notable features:
Has a command-line interface.
Backs up local databases, or remote databases over SSL.
Emails the status of the backup procedure.
Handles all table types, including InnoDB.
Does not provide any new backup methods. Instead, it chooses among mysqldump, mysqlhotcopy, MySQL replication, or LVM snapshots.
Supports recovery to a particular transaction or point in time.
Zmanda provides .tar.gz and .rpm files for many Linux distributions. For an installation how-to for Debian, see HowtoForge.
Note: Comments are owned by the poster. We are not responsible for their content.
You may be saying "yeah, but the backup took 1 second for all 57 tables." True, but we don't know how much data was in those 57 tables. How long does it take you to copy 100 megs of data? 100 gigs?
This is a good point, and it's why I store my mysql data on an lvm logical volume. I can flush the tables, create a snapshot of the logical volume, and then unlock the tables all in under a second regardless of the data size. I then have as long as I want to backup the data.
FWIW I found it quite easy to roll my own script, rather than using one of the ones mentioned in the article.
Additionally... if I may... this is crucial
Posted by: Anonymous Coward on May 29, 2007 04:41 PMAnyone running any form of CMS/CRM or Wiki should take note of this article (and maybe even read my comments)
Everyone should be aware that doing a plain database file copy *may work* on certain databases even while the database server is still running, if there is no activity on the database, all writes have been flushed to disk, and the file system allows read access to the database file. This is very dangerous because all the assumptions may not be valid during the lifetime of the copy process (for a large database this can easily be several minutes).
Most RDBMS's come with command-line tools to do live backups, essentially a script to start a new transactional snapshot, get a copy of full metadata, then get a copy of full data typically ignoring in-limbo or deleted transactions / content. If this live-backup fails, it may indicate on-disk errors (due to various reasons) and this should be a trigger for the database maintainer to prepare for pre-disaster recovery.
So a regular live-backup scheduled and run automatically but upon failure notifying some administrator is invaluable.
Restore-testing is also prudent, since a "successful backup" is only really worth anything if it can be restored, so modify your backup script to not only do a backup, but also a restore to a temporary file (immediately) to see if the backup file is uncorrupted. If this process is successful, delete the "temp restored database", and make sure you make an off-site copy of the backup, otherwise you're not insured against disk failure.
I hope someone may find this addition useful, and I apologise if I offended anyone in posting this.
Donald Klopper
donald.klopper@gmail.com
#