Global Transaction ID (GTID) is one of the most compelling new
features of MySQL 5.6. It provides major benefits in failover,
point-in-time backup recovery, and hierarchical replication, and it's a
prerequisite for crash-safe multi-threaded replication. Over the course
of the last few months, we enabled GTID on every production MySQL
instance at Facebook. In the process, we learned a great deal about
deployment and operational use of the feature. We plan to open source
many of our server-side fixes via WebScaleSQL, as we believe others in the scale community can learn from this and benefit from the work we've done.
Background
Traditional
MySQL replication is based on relative coordinates — each replica keeps
track of its position with respect to its current master's binary log
files. GTID enhances this setup by assigning a unique identifier to
every transaction, and each MySQL server keeps track of which
transactions it has already executed. This permits "auto-positioning,"
the ability for a replica to be pointed at a master instance without
needing to specify a binlog filename or position in the CHANGE MASTER
statement.
Auto-positioning makes failover
simpler, faster, and less error-prone. It becomes trivial to get
replicas in sync after a master failure, without requiring an external
tool such as Master High Availability (MHA). Planned master promotions
also become easier, as it is no longer necessary to stop all replicas at
the same position first. Database administrators need not worry about
manually specifying incorrect positions; even in the case of human
error, the server is now smart enough to ignore transactions it has
already executed.
By permitting replicas to be
repointed to masters at different levels of the hierarchy, GTID greatly
simplifies complex replication topologies, including hierarchical
replication (slaves of slaves). Since a GTID-enabled binlog stream can
safely be taken from any member of a replica set, as well as replayed
without requiring relative positions, the feature also eases binlog
backup and recovery.
Additionally, by combining
GTID with semi-synchronous replication, we have implemented automation
to safely recover crashed masters as replicas. When a master crashes, we
can detect this and promote a replica within 30 seconds without losing
data. Later, if the original master was able to be recovered and our
automation detects its data is consistent, GTID allows us to repoint it
to the new master instead of having to kick off a copy operation to
replace it.
Deployment
GTID in MySQL
5.6 is extremely difficult to deploy to an existing large-scale
environment. The gtid_mode variable is not dynamic, and also must match
between master and replicas. The official deployment plan in the MySQL
documentation involves making a master read_only, shutting down all
MySQL instances in the replica set simultaneously at the same position,
enabling the gtid_mode variable in my.cnf, and then starting all of the
instances. This process is fundamentally incompatible with the notion of
high availability, making it unviable for production use at scale.
We worked around this problem in fb-mysql
by relaxing the constraint that gtid_mode must match between master and
replicas. We permit a gtid_mode=ON replica to have a gtid_mode=OFF
master, and we suppress assignment of GTIDs to transactions on servers
with read_only=ON (i.e., replicas). This permits a high-availability
deployment strategy as follows, for each replica set:
- On each replica, one at a time, restart MySQL to enable gtid_mode. Afterward, we are in a state where every replica has gtid_mode=ON, but the master still has gtid_mode=OFF.
- Perform a master promotion as normal, repointing the replicas and original master to a new master. The original master's replication will intentionally break when started, since it still has gtid_mode=OFF.
- Restart the original master to enable gtid_mode. It will now be able to replicate from the new master, and the entire replica set now has gtid_mode=ON.
With
sufficient safeguards and validation logic, it is safe to execute this
rollout process to a large number of replica sets at a time. During the
peak of the deployment process, we were running our rollout script on up
to hundreds of replica sets simultaneously.
fb-mysql changes
To
deploy GTID and make it work in our production environment, we had to
make huge changes both in the MySQL server and in the automation around
it. Apart from the deployment changes, during initial testing we have
encountered a number of serious bugs and performance regressions with
GTID.
In some cases, the MySQL server ended up
scanning all binary logs whenever a replica connected with the new
auto-position protocol, or during crash recovery, to initialize the
global variables GTID_PURGED and GTID_EXECUTED. Opening the binary logs
is required to read the previous_gtid_events present at the beginning of
the binary log. In fb-mysql, we fixed these issues by changing the
format of binary log index file to include previous_gtid_event
corresponding to each binary log file name. Fb-mysql uses the
previous_gtid_events present in binary log index file instead of opening
the binary logs directly thus improving performance.
We
also found that slaves with GTID are not crash-safe with less durable
settings (sync_binlog !=1 and innodb_flush_log_at_trx_commit != 1).
Using fully durable settings requires syncing both the binary log and
innodb transaction log to disk after each transaction in single-threaded
replication mode, which negatively affects slave apply performance. It
is important for any feature to be crash-safe to avoid operational
overhead at Facebook scale. So in fb-mysql, we decided to fix this issue
by adding a new transaction table (mysql.slave_gtid_info) to
consistently store the GTID information.
GTID
is a powerful feature that simplifies many replication complexities. For
example, we identified multi-threaded slave doesn’t work along with
relay_log_recovery=1, but relay_log_recovery=1 is required for crash
safety even with less durable replication settings (sync_relay_log !=
1). In fb-mysql we allowed the use of relay_log_recovery=1 with
multi-threaded slave when GTID is enabled, since gaps in execution
created after the multi-threaded slave crash are automatically filled by
GTID auto-positioning.
Preparation
There
were several steps we had to take prior to beginning our GTID
deployment. One major step involved updating all of our automation to
use GTID and auto-positioning. Enabling gtid_mode without use of
auto-positioning is detrimental to replica crash-safety, so it is
important to roll out both at once. The most substantial change was to
our promotion logic, which now had to cover additional permutations for
whether GTID was already enabled, or being enabled for the first time.
Another important prerequisite involves prevention of GTID-incompatible statements.
MySQL has an option, enforce_gtid_consistency, that causes these
statements to generate an error. For safety's sake, this should always
be enabled whenever gtid_mode is also enabled. However, before beginning
the rollout, it is necessary to audit applications and preemptively fix
any uses of these query patterns. To make this possible at our scale,
we augmented MySQL to add user stat counters for these statements, as
well as an option to write full information on them to the MySQL error
log. This allowed us to easily identify around 20 cases of these query
patterns being used, among our thousands of special-case workloads.
Finally,
we wrote a script to aid in skipping statements, in the rare cases
where that is necessary. The age-old variable sql_slave_skip_counter
does not work with gtid_mode enabled; instead, a DBA must fiddle with
the gtid_next variable and insert an empty transaction. This is painful
in an emergency, especially while a large DBA team is still ramping up
on GTID knowledge, so having a helper script is prudent.
For a more in-depth technical overview of our GTID-related MySQL changes and automation efforts, please see our slides from Percona Live MySQL Conference 2014.
Summary
Facebook's
Global Transaction ID deployment was a cross-functional collaboration
between our MySQL engineering, database operations, and data performance
teams. Deploying GTID to a Facebook-scale environment required
substantial effort, including major improvements to the MySQL server,
changes to our automation, and a custom rollout script. Early on in the
rollout process, we also uncovered, reported, and fixed several bugs and
performance issues in MySQL relating to GTID support. We can happily
state that it is now extremely stable in our use, with no new problems
encountered in recent months. We plan to make these same fixes to the WebScaleSQL branch of MySQL in the next few weeks, so that others can learn and benefit from this.
Despite
the effort involved, deploying GTID has proven to be well-worth the
time commitment. The feature has provided us immediate benefits, in
addition to being a base for further automation improvements in the near
future.
- #69059 GTID lack a reasonable deployment strategy
- #69097 Mysqld scans all binary logs on crash recovery
- #68386 Master scans all binlogs when slave reconnects with auto positioning
- #70659 Make crash safe slave work with gtid + less durable settings
- #69943 Transactions skipped on slave after "stop/start slave" using GTID replication
- #71575 Master logs two consecutive GTIDs causing slaves to miss the first GTID
- #72313 Stop sql_thread, start sql_thread causes a trx to log with a different GTID
- #72314 Stop io_thread, start io_thread with GTID may cause data inconsistencies
- #72635 Data inconsistencies when master has truncated binary log with GTID after crash
- #73032 Setting gtid_purged may break auto_position and thus slaves
- #70711 Mysqlbinlog prints invalid SQL from relay logs when GTID is enabled
- #73397 Make MTS work with relay_log_recovery=1 when GTID is enabled
No comments:
Post a Comment