Monday, April 30, 2012

MySQL Replication error 1594: Relay log read failure

MySQL slaves and the “corrupted” relay log/binlog problem

When bringing a new slave up to date, perhaps through a binlog server (a MySQL server with no real tablespace, just acting as a master for slaves and feeding them replication data), you may encounter an error like this in your mysql.err file, after noticing that replication has just broken:

091028 11:32:49 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 1594
091028 11:32:49 [Warning] Slave: Field 'owner' doesn't have a default value Error_code: 1364
091028 11:32:49 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysqld-bin.000780' position 786513508



Well, everything on the web points to "OMG, my file is corrupt!! My disk is freaking out!". More often, though, the explanation is simpler: you forgot a FLUSH LOGS; before backing up the binary logs from the original master. So, when you recreate the binlog server from the backup, the final slave reaches the last statement, which is obviously incomplete, and boom, it stops.
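You can confirm this by trying to parse the suspect file with mysqlbinlog, exactly as the error message suggests (the file name here is from my case):

$ mysqlbinlog mysqld-bin.000780 > /dev/null

If it aborts with a read error near the end of the file, the backup was truncated mid-event rather than corrupted on disk.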

The solution is to copy the correct, complete binlog file (in my case mysqld-bin.000780) to the binlog server again, then issue a

STOP SLAVE;
CHANGE MASTER TO master_log_file='mysqld-bin.000780', master_log_pos=786513508;
START SLAVE;


and… problem solved!
(remember to adjust the file name and position to suit your needs)
Lesson learned (once again): always test your backups, and do not hesitate to FLUSH LOGS with MySQL :)

Thursday, April 26, 2012

MySQL Data export/import problem with SQL SECURITY DEFINER


Importing a MySQL database dump from a client into my system I got this error:
$ mysql -u root -p  example_db < example_db_dump.sql 
Enter password: 
ERROR 1449 (HY000) at line 5172: The user specified as a definer ('smith'@'localhost') does not exist
Running 'grep' for that user turned up a bunch of lines like this:
/*!50013 DEFINER=`smith`@`localhost` SQL SECURITY DEFINER */
These are created on the 'donor' MySQL system when creating one or more views of the data. That user does not exist on my system and so MySQL complains.

I don't care about those views so the easiest way to deal with this issue is to remove these '50013' lines. You can do that with 'sed':
$ sed '/\*\!50013 DEFINER/d' example_db_dump.sql > example_db_dump_clean.sql
You need to drop the new database (the import process partially worked), re-create it, and then you can reimport:
$ mysqladmin -u root -p drop example_db
$ mysqladmin -u root -p create example_db
$ mysql -u root -p example_db < example_db_dump_clean.sql
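If you do need to keep the views, an alternative sketch (untested, but a common approach) is to rewrite the definer to the importing user instead of deleting the lines:

$ sed 's/DEFINER=`smith`@`localhost`/DEFINER=CURRENT_USER/g' example_db_dump.sql > example_db_dump_clean.sql

DEFINER=CURRENT_USER is valid syntax in CREATE VIEW, so the views are recreated owned by whoever runs the import.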

Blekko: 3-Engine Monte search engine comparison (use: xxxx /monte)

Blekko, the user-curated search engine, has launched a "3-Engine Monte" feature that directly compares results from Google, Bing and Blekko for any given search query. The tool allows users to choose the SERP most relevant to them, and it can help marketers see how their SEO content stacks up across the engines.
Users are presented with three separate SERPs and asked to choose the one they find most useful. Before they make a selection, the results are not branded with any of the three engines, and the engine behind each possible SERP is revealed only after users make their choice based on the content provided.
Blekko may be attempting to show its own value to searchers who generally choose Bing or Google over it, but for marketers, the tool demonstrates the major differences between the engines.

Skipping MySQL replication errors


To reduce MySQL replication breakages, add this to the slave's my.cnf:

slave-skip-errors=1062,1053,1146,1051,1050

This tells the slave not to stop replication when the following errors are encountered:
Error: 1050 SQLSTATE: 42S01 (ER_TABLE_EXISTS_ERROR)
 Message: Table '%s' already exists

Error: 1051 SQLSTATE: 42S02 (ER_BAD_TABLE_ERROR)
Message: Unknown table '%s'

Error: 1053 SQLSTATE: 08S01 (ER_SERVER_SHUTDOWN)
Message: Server shutdown in progress

Error: 1062 SQLSTATE: 23000 (ER_DUP_ENTRY)
Message: Duplicate entry '%s' for key %d

Error: 1146 SQLSTATE: 42S02 (ER_NO_SUCH_TABLE)
Message: Table '%s.%s' doesn't exist

This will avoid the very common primary key collisions and "temporary tables aren't there" problems. Writing this down for posterity. Use with caution.
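Note that slave-skip-errors is a startup option, so it requires a server restart. Once the slave is back up, you can sanity-check the setting (assuming MySQL 5.1+, where it is exposed as a read-only system variable):

mysql> SHOW VARIABLES LIKE 'slave_skip_errors';

The Value column should list the error codes you configured.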

MySQL skip duplicate replication errors


Normally MySQL replication will stop whenever there is an error running a query on the slave. This happens so that we can identify the problem, fix it, and keep the data consistent with the master that sent the query. You can skip such errors, even though this is not recommended, as long as you know really well what those queries are and why they are failing.
For example you can skip just one query that is hanging the slave using:
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; START SLAVE;
There might be cases where you will want to skip more queries. For example you might want to skip all duplicate errors you might be getting (output from show slave status;):
"1062 | Error 'Duplicate entry 'xyz' for key 1' on query. Default database: 'db'. Query: 'INSERT INTO ..."
If you are sure that skipping those errors will not bring your slave inconsistent and you want to skip them ALL, you would add to your my.cnf:
slave-skip-errors = 1062
As shown in my example above, 1062 is the error you would want to skip, and from the MySQL error reference we have:
Error: 1062 SQLSTATE: 23000 (ER_DUP_ENTRY)
Message: Duplicate entry ‘%s’ for key %d

You can also skip other types of errors, but again, don't do this unless you understand very well what those queries are and what impact they have on your data:
slave-skip-errors=[err_code1,err_code2,...|all]
The error codes are all listed in the MySQL manual's server error reference (you will also see them in show slave status; output).
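Before skipping anything, it is worth inspecting the failing statement and then confirming the SQL thread is running again; a minimal workflow using only standard commands:

mysql> SHOW SLAVE STATUS\G        -- read Last_Errno / Last_Error first
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
mysql> START SLAVE;
mysql> SHOW SLAVE STATUS\G        -- Slave_SQL_Running should be Yes again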

Monday, April 23, 2012

MySQL master-slave and master-master replication. Step by step configuration



If you are looking for options to scale your MySQL installation, you may also be interested in MySQL partitioning and subpartitioning. It may improve each node's speed and capacity parameters.
One may say that there are a lot of MySQL replication manuals, but the latest versions of MySQL server have changed the way configuration should be applied, and most manuals do not reflect these changes. I also want to describe some other aspects of the configuration. Since there are plenty of good manuals about replication, there is no need to dive into the details of what replication is. I will just mention that this technique is usually used for load balancing across database servers. If you have a lot of read requests (the most common case for web applications), master-slave replication should suit your needs well: you perform write transactions on the master host and read requests on the slave hosts, because data propagates from master to slaves much faster than from slaves to the master and on to other slaves.
[Figure: MySQL master-slave replication]
But sometimes you might have more write requests, or other (application-related) reasons to use another type of replication. You can see it in the next figure: the so-called master-master replication.
[Figure: MySQL master-master replication]
In this article I will describe a simple master-slave architecture with 2 hosts, and then a simple master-master replication with the same 2 hosts. Our final goal is to configure master-master replication, which includes several sub-steps, so let's start. Of course, you should first configure network services on both systems. For example:
Master 1/Slave 2 ip: 192.168.16.4
Master 2/Slave 1 ip : 192.168.16.5

Iptables rules for MySQL replication

It is good practice to allow connections only from the other nodes involved in the replication and deny everyone else. (The same approach works well for any service that should communicate only with known hosts.) Note that the connecting slave uses an ephemeral source port, not 3306, so do not restrict --sport on the INPUT rule; if you want to narrow it, use an ephemeral range like 1025:65535 (I am going to write more about iptables soon, so follow my blog on twitter).
iptables -A INPUT -p tcp -s 192.168.16.4 -d 192.168.16.5 --dport 3306 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -p tcp -s 192.168.16.5 --sport 3306 -d 192.168.16.4 --dport 3306 -m state --state ESTABLISHED -j ACCEPT

MySQL master-slave replication

Basically master-master replication consists of two master-slave replications. Now we will configure master-slave replication from the first server to the second one.
Create the replication user on Master 1:
mysql> grant replication slave on *.* to 'replication'@'192.168.16.5' identified by 'slave';
Master 1 changes to /etc/my.cnf (restart the MySQL server after editing; note that "start master" is not a valid MySQL statement, the server acts as a master as soon as log-bin is enabled):
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
old_passwords=1

log-bin
binlog-do-db=<database name>  # input the database which should be replicated
binlog-ignore-db=mysql            # input the database that should be ignored for replication
binlog-ignore-db=test

server-id=1

[mysql.server]
user=mysql
basedir=/var/lib


[mysqld_safe]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
Slave 1 changes to /etc/my.cnf:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
old_passwords=1

server-id=2

[mysql.server]
user=mysql
basedir=/var/lib

[mysqld_safe]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
Important! Note that you should not configure master-host, master-user, master-password, or master-port via my.cnf on the slave server; the latest MySQL versions no longer accept these options.
On Master 1:
mysql> show master status;
+------------------------+----------+--------------+------------------+
| File                   | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------------+----------+--------------+------------------+
|mysqld-bin.000012       |      106 | adam         |                  |
+------------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
On Slave 1:
mysql> CHANGE MASTER TO MASTER_HOST='192.168.16.4', MASTER_USER='replication', MASTER_PASSWORD='slave', MASTER_PORT=3306, MASTER_LOG_FILE='mysqld-bin.000012', MASTER_LOG_POS=106, MASTER_CONNECT_RETRY=10;
Attention! This configures the slave, and the server will remember the settings, so in the latest versions of MySQL server this replaces the my.cnf settings.
Start slave on Slave 1:
mysql> start slave;
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.16.4
                   Master_User: replication
                  Master_Port: 3306
                Connect_Retry: 10
              Master_Log_File: mysqld-bin.000012
          Read_Master_Log_Pos: 1368129
               Relay_Log_File: mysqld-relay-bin.000005
                Relay_Log_Pos: 605530
        Relay_Master_Log_File: mysqld-bin.000012
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1368129
              Relay_Log_Space: 1367083
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
1 row in set (0.02 sec)

The highlighted rows above should point to the related log files, and both Slave_IO_Running and Slave_SQL_Running must be Yes.

MySQL master-master replication

Master-master replication is actually two master-slave replications. It allows read and write transactions on both servers, as data propagation from master to slave is very fast, as opposed to data propagation from slave to master, which requires much more time. So, to create master-master replication we should now configure the Master 2 - Slave 2 replication.
Create a replication slave account on Master 2 for Master 1/Slave 2:
mysql> grant replication slave on *.* to 'replication'@'192.168.16.4' identified by 'slave';
Master 2 changes to /etc/my.cnf (again, restart the MySQL server after editing instead of issuing any "start master" statement):
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
old_passwords=1

log-bin
binlog-do-db=<database name>  # input the database which should be replicated
binlog-ignore-db=mysql            # input the database that should be ignored for replication
binlog-ignore-db=test

server-id=2

[mysql.server]
user=mysql
basedir=/var/lib


[mysqld_safe]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
Slave 2 / Master 1 changes to /etc/my.cnf:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
old_passwords=1

log-bin
binlog-do-db=<database name>  # input the database which should be replicated
binlog-ignore-db=mysql            # input the database that should be ignored for replication
binlog-ignore-db=test

server-id=1

[mysql.server]
user=mysql
basedir=/var/lib

[mysqld_safe]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
Important! Again, you should not configure master-host, master-user, master-password, or master-port via my.cnf on the slave server.
On Master 2:
mysql> show master status;
+------------------------+----------+--------------+------------------+
| File                   | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------------+----------+--------------+------------------+
|mysqld-bin.000012       |      106 | adam         |                  |
+------------------------+----------+--------------+------------------+
1 row in set (0.00 sec)
On Slave 2:
mysql> CHANGE MASTER TO MASTER_HOST='192.168.16.5', MASTER_USER='replication', MASTER_PASSWORD='slave', MASTER_PORT=3306, MASTER_LOG_FILE='mysqld-bin.000012', MASTER_LOG_POS=106, MASTER_CONNECT_RETRY=10;
Attention! As mentioned in the previous section, this configures the slave and the server will remember the settings, so it replaces the my.cnf settings in the latest versions of MySQL server.

MySQL master-master replication and autoincrement indexes

If you are using master-slave replication, then most likely you will design your application to write to the master and read from the slave or several slaves. But when you are using master-master replication you are going to read from and write to either master server. In this case the problem of auto-increment collisions arises: when both servers have to add a record to the same table simultaneously, each one can assign it the same index value and then try to replicate it to the other, which creates a collision. A simple trick allows you to avoid such collisions on MySQL server.
On the Master 1/Slave 2 add to /etc/my.cnf:
auto_increment_increment= 2
auto_increment_offset   = 1
On the Master 2/Slave 1 add to /etc/my.cnf:
auto_increment_increment= 2
auto_increment_offset   = 2
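After restarting both servers you can verify the settings with a plain SHOW VARIABLES; with an increment of 2 and offsets of 1 and 2, Master 1 hands out ids 1, 3, 5, ... while Master 2 hands out 2, 4, 6, ..., so the two sequences never collide:

mysql> SHOW VARIABLES LIKE 'auto_increment%';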


--- The End ---

Saturday, April 21, 2012

MySQL: 10 essential MySQL tools for admins

MyTop
➢ http://jeremy.zawodny.com/mysql/mytop/
mytop is a console-based (non-GUI) tool for monitoring the threads and overall performance of MySQL.

mysqlsla :: Parse, filter and sort MySQL slow, general and binary logs
➢ http://hackmysql.com/mysqlsla
mysqlsla parses, filters, analyzes and sorts MySQL slow, general, binary and microslow-patched logs in order to create a customizable report.

MySQLTuner
➢ http://mysqltuner.pl/
To run the script with the default options, run it without arguments. Allow the MySQL server to run for at least 24-48 hours before trusting its suggestions.

innotop - the best top clone for MySQL, with special attention paid to InnoDB
➢ http://code.google.com/p/innotop/
innotop is a 'top' clone for MySQL with more features and flexibility than similar tools.

Tools - MySQL Performance Blog
➢ http://www.mysqlperformanceblog.com/tools/
Maatkit is a great set of tools for MySQL performance analysis and maintenance. A must-know (and must-have) for any MySQL admin.


Essential MySQL tool No. 1: mk-query-digest
Nothing frustrates like slow MySQL performance. All too often, faster hardware is thrown at the problem, a solution that works only if hardware is in fact to blame. More often than not, poor performance can be attributed to slowly executing queries that are blocking other queries, creating a ripple effect of slow response times. Since it's a lot cheaper to optimize queries than to upgrade hardware, the logical first step in MySQL optimization is query log analysis.

Database administrators should analyze query logs frequently, depending on the volatility of the environment. And if you've never performed query log analysis, it's time to start, even if you are relying on third-party software, which is often assumed to be optimized when, in fact, it is not.
Today's best query log analyzer is mk-query-digest. Co-written by Baron Schwartz and myself, it is actively developed, fully documented, and thoroughly tested. MySQL distributions include the query log analyzer mysqldumpslow, but the tool is outdated, poorly documented, and untested. Other query log analyzers, like mysqlsla, which I wrote several years ago, suffer the same problems as mysqldumpslow.

mk-query-digest analyzes query logs and generates reports with aggregated, statistical information about execution times and other metrics. Since query logs usually contain thousands, if not millions, of queries, query log analysis requires a tool.

mk-query-digest can help you find the queries that take the longest time to execute as compared to other queries. Optimizing these slow queries will make MySQL run faster by reducing the greatest delays. The real art of query optimization is more nuanced, but the basic goal is the same: find slow queries, optimize them, and improve query response times.
The tool is easy to use; executing mk-query-digest slow-query.log will print the slowest queries in slow-query.log. The tool includes support for "query reviews," for reporting queries you have not yet seen or approved, making frequent log analyses quick and efficient.
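For example (log path hypothetical; the --review syntax below is the documented maatkit DSN form, so double-check it against your version):

$ mk-query-digest /var/log/mysql/slow-query.log > slow-report.txt
$ mk-query-digest --review h=localhost,D=maatkit,t=query_review /var/log/mysql/slow-query.log

The second form stores a fingerprint of each query in a review table, so subsequent runs can report only the queries you have not yet seen or approved.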
Download: http://maatkit.org/get/mk-query-digest
Maintainers: Daniel Nichter and Baron Schwartz
More info: http://maatkit.org/ | http://code.google.com/p/maatkit/

 Essential MySQL tool No. 2: mydumper
Being able to generate data dumps quickly is vital for backups and server cloning. Unfortunately, mysqldump, which ships with MySQL distributions, is single-threaded and, thus, too slow for data-intensive jobs. Thankfully, the modern replacement, mydumper, uses multiple threads, making it roughly 10 times faster than mysqldump.
Also known as MySQL Data Dumper, this tool does not manage backup sets, differentials, or other parts of a complete backup plan. It just dumps data from MySQL as quickly as possible, enabling you to complete backups under tight time constraints, such as overnight, while employees are offline, or to perform backups more frequently than you would with mysqldump.
One technical point to know about mydumper is that it locks tables, so it is not the ideal tool for performing backups during operating hours. Then again, professional data recovery costs hundreds of dollars per hour, and you always get a bill even if the data isn't recoverable. mydumper is free and well worth exploring for even basic backups.
mydumper also comes in handy when cloning servers. Other tools perform complete hard drive duplications, but when all you need is MySQL data, mydumper is the fastest way to get it. Servers provisioned in a cloud are particularly suited to cloning using mydumper. Just dump your MySQL data from an existing server and copy it to the new instance.
Cloning is worthwhile for creating slave servers, benchmarking, and profiling, but nowhere is it more vital than in testing and development. Being able to spin up a replica for quick testing before going live is essential for dynamic MySQL environments. With mydumper, you can quickly create a server that is nearly identical to your production server, enabling your test results to better mimic production results.
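A minimal sketch of a dump-and-clone cycle (database name and paths hypothetical; flags per the mydumper/myloader documentation, so verify against your version):

$ mydumper -B example_db -o /backups/example_db -t 4    # dump with 4 threads
$ myloader -B example_db -d /backups/example_db         # load the dump on the clone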
Download: https://launchpad.net/mydumper/+download
Maintainers: Domas Mituzas, Andrew Hutchings, Mark Leith
More info: http://www.mydumper.org/ | https://launchpad.net/mydumper/

Essential MySQL tool No. 3: xtrabackup and xtrabackup-manager
If your databases are in use every day, all day, giving you no "overnight" during which tables can be locked for backup, xtrabackup is your solution. Also known as Percona XtraBackup, this tool performs nonblocking backups and is the only free, open source tool that can do this. By comparison, proprietary nonblocking backup software can cost more than $5,000 per server.
xtrabackup also offers incremental backups, allowing you to back up only the data that has changed since the last full backup. Adding incremental backups to your backup process is powerful, given the reduced performance hit of these tremendously smaller backups.
Furthermore, another project has grown up around xtrabackup that makes managing a full backup plan even easier: xtrabackup-manager. Although this tool is new and still in development, it holds a lot of potential because it offers advanced features like rotating backups with groups and backup set expiring. Together, xtrabackup and xtrabackup-manager are a formidable and free backup solution.
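A rough sketch of the basic cycle using the innobackupex wrapper that ships with XtraBackup (credentials and paths hypothetical):

$ innobackupex --user=backup --password=secret /data/backups        # nonblocking full backup
$ innobackupex --apply-log /data/backups/2012-04-21_02-00-00        # prepare it for restore

The --apply-log step replays the InnoDB log captured during the copy, leaving a consistent backup ready to be restored.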
Download: http://www.percona.com/software/percona-xtrabackup/downloads/
Maintainer: Percona
More info: http://www.percona.com/docs/wiki/percona-xtrabackup:start | https://launchpad.net/percona-xtrabackup
Download: http://code.google.com/p/xtrabackup-manager/
Maintainer: Lachlan Mulcahy
More info: http://code.google.com/p/xtrabackup-manager/ | http://mysqlsoapbox.blogspot.com/


Essential MySQL tool No. 4: tcprstat
tcprstat is probably the most esoteric of the 10 on this list. The tool monitors TCP requests and prints statistics about low-level response times. When you become familiar with the response time way of thinking about performance, the payoff of tcprstat is significant.
The principle is elaborated in the book "Optimizing Oracle Performance" by Cary Millsap and Jeff Holt, and it applies equally well to MySQL. The basic idea is that a service, in this case MySQL, accepts a request (query), fulfills that request (execution time), and responds with results (result set). The service's response time is the time span between receiving a request and sending a response. The shorter the response time, the more requests can be served in the same amount of time.
Parallel processing and other low-level factors play a significant part here, but the simplified upshot is that there are 28,800 seconds in an 8-hour workday, so reducing response times by just four-tenths of a second (from 0.5 to 0.1 second) results in 230,400 more requests served each day. tcprstat helps you achieve this. I have only enough space in this article to pique your curiosity, so I'll finish this tool's introduction by telling you the first step toward getting started with MySQL response time optimization: read "Optimizing Oracle Performance." Then start using tcprstat.
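To give a feel for the tool (flags per Percona's documentation; check your build's --help):

$ tcprstat -p 3306 -t 1 -n 0    # watch port 3306, 1-second intervals, run until interrupted

Each interval prints a line of request counts and response-time statistics (average, median, and percentiles).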
Download: (source) https://launchpad.net/tcprstat | (binary) http://www.percona.com/docs/wiki/tcprstat:start
Maintainers: Percona
More info: http://www.percona.com/docs/wiki/tcprstat:start | https://launchpad.net/tcprstat

Essential MySQL tool No. 5: mk-table-checksum
"Data drift" is a significant problem for dynamic MySQL environments. This problem, wherein slave data becomes out of sync with the master, is often caused by writing data to a slave or executing certain nondeterministic queries on the master. What's worse is that the data differences may go unnoticed until they become crippling. Enter mk-table-checksum, a tool that performs the complex, sensitive calculations necessary to verify the data in two or more tables is identical.
mk-table-checksum works with both stand-alone servers and servers in a replication hierarchy, where the tool's greatest value is easily seen. Verifying table data between a master and a slave must account for replication consistency. Because changes to the master are replicating to slaves with some amount of time delay ("lag"), simply reading data from the servers is an unreliable way to verify consistency, given that the data is constantly changing and incomplete until fully replicated. Locking tables and waiting for all data to replicate would allow consistent reads, but to do so would mean effectively halting the servers. mk-table-checksum allows you to perform nonblocking, consistent checksums of master and slave data. (For technical details on how this is accomplished, see the tool's documentation.)
Apart from replication consistency, there are other problems with verifying data. Table size is one of them. The MySQL command CHECKSUM TABLE is sufficient for small tables, but large tables require "chunking" to avoid long locks or overloading CPU or memory resources with checksum calculations.
Chunking solves a second problem: the need for regular data-consistency checks. While data drift can be a one-time occurrence, often it is recurring. mk-table-checksum is designed to continuously check tables, vetting certain chunks one run and other chunks the next run until eventually the whole table has been checked. The ongoing nature of this process helps ensure that recurring drift is corrected.
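A sketch of the replication-aware workflow (host name hypothetical; --replicate is the documented maatkit option that stores per-chunk checksums on the master so they replicate to the slaves for comparison):

$ mk-table-checksum --replicate=test.checksum h=master1
$ mk-table-checksum --replicate=test.checksum --replicate-check 2 h=master1

The second command recurses the slave hierarchy (here two levels deep) and reports any tables whose checksums differ from the master's.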
Download: http://maatkit.org/get/mk-table-checksum
Maintainers: Daniel Nichter & Baron Schwartz
More info: http://maatkit.org/ | http://code.google.com/p/maatkit/

Essential MySQL tool No. 7: mycheckpoint
You don't always want to wait for something to go wrong before addressing a problem, and dashboards provide an essential way for you to monitor your MySQL environment for potential problems before they arise.
There are many free and commercial monitoring applications for MySQL, some MySQL-specific and others generic with MySQL plug-ins or templates. mycheckpoint is notable because it is free, open source, MySQL-specific, and full-featured.

mycheckpoint can be configured to monitor both MySQL and server metrics, like InnoDB buffer pool flushes, temporary tables created, operating system load, memory usage, and more. If you don't like charts, mycheckpoint can also generate human-readable reports.
As with stalk (a tool from the original top-10 list that is not covered here), alert conditions can be defined with email notifications, but no secondary tool like collect will be run to log additional troubleshooting data. Another useful feature is mycheckpoint's ability to monitor MySQL variables to detect changes that can lead to problems, or signal that someone has modified MySQL when they shouldn't have.

Monitoring MySQL isn't just for data centers or large deployments. Even if you have a single MySQL server, monitoring is essential; as with your vehicle, there's a lot to know about the system while it's running to help you foresee or avoid malfunctions. mycheckpoint is one solution among many worth trying.
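mycheckpoint is typically run from cron against a dedicated schema; a rough sketch of the idea (schema name and interval hypothetical; check the project documentation for the exact options your version supports):

*/5 * * * * mycheckpoint --database=mycheckpoint

Each run snapshots status and variable values into that schema, and the charts and reports are generated from the accumulated samples.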
Download: http://code.google.com/p/mycheckpoint/downloads/list
Maintainer: Shlomi Noach
More info: http://code.openark.org/forge/mycheckpoint

Essential MySQL tool No. 9: mk-archiver
As tables become larger, queries against them can become slower. Many factors influence response times, but if you have optimized everything else and the only remaining suspect is a very large table, then archiving rows from that table can restore fast query-response times.
Unless the table is unimportant, you should not brazenly delete rows. Archiving requires finesse to ensure that data is not lost, that the table isn't excessively locked, and that the archiving process does not overload MySQL or the server. The goal is an archiving process that is reliable and unnoticeable except for the beneficial effect of reducing query times. mk-archiver achieves all this.

mk-archiver has two fundamental requirements. The first is that archivable rows must be identifiable: for example, if the table has a date column and you know that only the last N years of data are needed, then rows with dates older than N years can be archived. The second is that a unique index must exist to help mk-archiver identify archivable rows without scanning the entire table; scanning a large table is costly, so an index and specific SELECT statements are used to avoid table scans.
In practice, mk-archiver automatically handles the technical details. All you have to do is tell it what table to archive, how to identify archivable rows, and where to archive those rows. These rows can be purged, copied to another table, or written to a dump file for future restoration if needed. Once you're comfortable with the tool, there are many options to fine-tune the archiving process. Also, mk-archiver is pluggable, so it can be used to solve complex archiving needs without patching the code.
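A sketch of a typical purge-to-file run (DSN, table, and WHERE clause hypothetical; options per the maatkit documentation):

$ mk-archiver --source h=localhost,D=app,t=events \
    --file '/archive/%Y-%m-%d-events.txt' \
    --where "created_at < NOW() - INTERVAL 2 YEAR" --limit 1000

This deletes matching rows in small chunks, writing each row to the file first so the data can be restored later if needed.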
Download: http://maatkit.org/get/mk-archiver
Maintainers: Daniel Nichter and Baron Schwartz
More info: http://maatkit.org/ | http://code.google.com/p/maatkit/


Essential MySQL tool No. 10: oak-security-audit
When was the last time you audited the security of your MySQL servers? You're not alone if "never" is the answer. There are many companies that provide security audits, but unless nothing ever changes after an audit, the security of your MySQL environment should be checked regularly.
External threats are one obvious reason to enforce MySQL security, but internal threats like current or former employees are often more dangerous because they are (or were) trusted. Security is also important for enforcing privacy (medical/HIPAA regulations), preventing accidental access (for example, logging into the production server instead of the development server), or enabling third-party programs to interact with your systems.


For those looking to increase the security of their deployments, oak-security-audit is a worthwhile, free, open source tool that performs basic MySQL security audits. It doesn't require any setup; just run it against your MySQL servers, and it prints a report with risks and recommendations about accounts, account privileges, passwords, and some general best practices, like disabling network access. Here's a snippet of a report:
-- Looking for anonymous user accounts
-- -----------------------------------
-- Passed
--
-- Looking for accounts accessible from any host
-- ---------------------------------------------
-- Found 1 accounts accessible from any host. Recommended actions:
RENAME USER 'msandbox'@'%' TO 'msandbox'@'<specific host>';
oak-security-audit focuses just on MySQL security, so it's not a replacement for a full system security audit by a human, but it's a great first line of defense that is easy to use. You could run it weekly with cron and have the reports emailed to you.
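A sketch of that cron idea (schedule, credentials, and address hypothetical; check the tool's --help for the exact connection options it accepts):

0 6 * * 1 oak-security-audit --user=root --password=secret | mail -s "weekly MySQL security audit" dba@example.com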
Download: http://openarkkit.googlecode.com/svn/trunk/openarkkit/src/oak/oak-security-audit.py
Maintainer: Shlomi Noach
More info: http://openarkkit.googlecode.com/svn/trunk/openarkkit/doc/html/oak-security-audit.html

MySQL: Master – Master setup -- Do big schema changes with no downtime

Master – Master setup
• Set up two replicas of your database, copying changes to each other
• Keep it simple! (all writes to one master)
• Instant fail-over host – no slave changes needed
• Configuration is easy!
• set-variable = auto_increment_increment=2
• set-variable = auto_increment_offset=1
• (offset = 2 on second master)
• Set up both systems as a slave of the other


Online Schema Changes
The reasons we love master-master!
• Do big schema changes with no downtime!
• Stop A to B replication
• Move traffic to B
• Do changes on A
• Wait for A to catchup on replication
• Move traffic to A
• Re-start A to B replication (sketched below)
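A hedged SQL sketch of those steps (table and column names hypothetical; A and B are the two masters):

-- On B: stop applying A's changes
mysql> STOP SLAVE;
-- Move traffic to B, then run the big change on A:
mysql> ALTER TABLE big_table ADD COLUMN new_col INT;
-- A is still a slave of B, so wait for it to catch up, move traffic back to A,
-- then on B resume replication; B applies the ALTER while idle:
mysql> START SLAVE;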

MySQL: HOWTO Enable query logging

Enable query logging in your development DB!

• Are all those queries really necessary? Cache candidates?
• (you do have a devel db, right?)
• Just add "log=/var/lib/mysql/sql.log" to my.cnf
• Slow query logging:
log-slow-queries
log-queries-not-using-indexes
long_query_time=1
• mysqldumpslow parses the slow log
• 5.1+ does not require a server restart and can log directly into a CSV table, as shown below...
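In 5.1+ that looks like this (the mysql.general_log table uses the CSV engine):

mysql> SET GLOBAL log_output = 'TABLE';
mysql> SET GLOBAL general_log = 'ON';
mysql> SELECT * FROM mysql.general_log ORDER BY event_time DESC LIMIT 10;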

MySQL : Finding tables without primary keys

Finding tables without primary keys

I was checking a third-party server and needed to find out whether there were tables without primary keys. This is important to know, not only because the lack of primary keys affects performance and data accuracy in general, but also because in row-based replication performance can degrade beyond belief when updating tables without primary keys. I did not remember off the bat any method to get this information from a server with thousands of tables, so I went to find a solution on my own. My first instinct called for the COLUMNS table from INFORMATION_SCHEMA, and I came up with a query that sums the number of columns inside either a PRIMARY or UNIQUE key and keeps only the tables where that sum is zero (i.e. no primary or unique keys):


Check this query (NOT IN should be faster than an outer join if the result is only a few tables):

SELECT table_schema, table_name
FROM information_schema.tables
WHERE (table_catalog, table_schema, table_name) NOT IN
(SELECT table_catalog, table_schema, table_name
FROM information_schema.table_constraints
WHERE constraint_type in ('PRIMARY KEY', 'UNIQUE'))
AND table_schema NOT IN ('information_schema', 'mysql');

alt.

select tables.table_schema
, tables.table_name
, tables.engine
from information_schema.tables
left join (
select table_schema
, table_name
from information_schema.statistics
group by table_schema
, table_name
, index_name
having
sum(
case
when non_unique = 0
and nullable != 'YES' then 1
else 0
end
) = count(*)
) puks
on tables.table_schema = puks.table_schema
and tables.table_name = puks.table_name
where puks.table_name is null
and tables.table_type = 'BASE TABLE'

alt.

select 
    t.table_schema,t.table_name,engine 
from 
    information_schema.tables t 
    inner join information_schema.columns c
        on t.table_schema=c.table_schema and t.table_name=c.table_name 
group by 
    t.table_schema,t.table_name   
having 
    sum(if(column_key in ('PRI','UNI'), 1,0)) =0;
 
alt.
 


select 
    table_schema,table_name 
from  
    information_schema.columns  
group by 
    table_schema,table_name   
having 
    sum(if(column_key in ('PRI','UNI'), 1,0)) = 0;
 
alt.
 
select 
    t.table_schema, t.table_name 
from 
    information_schema.tables  t 
    left join information_schema.statistics s
       on t.table_schema=s.table_schema and t.table_name=s.table_name 
       and s.non_unique=0 
where 
    s.table_name is null;
 
 

Tuesday, April 10, 2012

The Data Charmer: MySQL replication for demanding users

The Data Charmer: MySQL replication for demanding users: I have been working with MySQL replication for quite a while. I have dealt with simple replication setups and I have experimented with compl...