Thursday, July 31, 2008

MySQL: triggers + replication = frustration

For the most part, I've been impressed with MySQL, but every once in a while I hit a problem that really surprises me. It seems like MySQL has this mentality that if there is a bug that is hard to fix, just document it, and then it's a feature and not a bug. Nice.

MySQL claims to have most of the power features of a robust RDBMS like triggers, foreign keys (although I consider that the most basic of features), stored procedures, etc. But it sure is frustrating to find out that most are incomplete.

Sure MySQL has foreign keys, and cascade deletes, but don't expect cascade deletes to fire triggers. That's a documented feature (bug).

Sure MySQL has triggers, but don't try using them if you are also using replication. We recently got bit by a feature (bug) where stored procedures or triggers that insert multiple records in tables with auto-increment don't work with replication. The auto-increment values on the replica will get off, and replication will break.

The problem is that before each insert statement in the binlog, there is a statement to set the auto-increment value. This is important to make sure that the values will always be the same between master and replica. On the master you may have two transactions that run in parallel, and use interleaved auto-increment values:

tx1: insert into table1 ... (uses auto-increment value 1)
tx2: insert into table1 ... (uses auto-increment value 2)
tx1: insert into table1 ... (uses auto-increment value 3)
tx1: commit;
tx2: commit;

In the binlogs, the transactions are serialized, so the auto-increment value has to be explicitly set:

set auto-increment to 1;
insert into table1...
set auto-increment to 3;
insert into table1...

set auto-increment to 2;
insert into table1...

But consider the case where the insert is on a table with a trigger that inserts a record into a second table (like an auditing table). The binlog only sets the auto-increment value for the actual insert statement. The trigger's insert will use whatever the replica's auto-increment value for the second table is set to. Since simultaneous transactions on the master are serialized in the binlogs, inserts on the second table may happen out of order, and auto-increment values will no longer match the master.

It seems that MySQL is not planning on fixing this bug in 5.0. My understanding is that in 5.1 the solution will be to use row-based replication. Statement based replication will still be broken, from what I can tell. I did see one bug report where someone said something about mixed mode replication, and switching to row-based temporarily for any statement or stored procedure call that will insert multiple records. Sounds like a can of worms to me.

Regardless of what happens in 5.1, there will be no solution for this in 5.0. And from my experience, I'll be nervous to move to 5.1 anytime soon. So it looks like I'll be stuck with this "feature" for a while to come. Nice.