In posts on June 30 and July 6, I explained how implementing the commands “replace into” and “insert ignore” with TokuDB’s fractal trees data structures can be two orders of magnitude faster than implementing them with B-trees. Towards the end of each post, I hinted at that there are some caveats that complicate the story a little. On July 21st I explained one caveat, secondary keys, and on August 3rd, Rich explained another caveat. In this post, I explain the other two caveats: triggers and replication.
First, let’s look at triggers. The key to “replace into” and “insert ignore” being fast is that the uniqueness check is not done when the command is executed. The uniqueness check is deferred to a more opportune time. Triggers do not allow the uniquness check to be deferred. For triggers to properly work, we must know exactly what rows applied what modifications. For “replace into”, we must know which rows resulted in an insertion and which rows resulted in a deletion and insertion. For “insert ignore”, we must know which rows resulted in an insertion and which rows performed no operation. This information is not available unless disk seeks are performed.
Now, let’s look at row based replication. The problem here lies within mysql (as I describe in bug 53561, although that bug only addresses “replace into”, similar issues exist with “insert ignore”). Row based replication is not designed to handle anything other than normal insertions, which return an error if a duplicate key is found. So, a successful execution of “replace into” or “insert ignore” on the master maps to a normal insertion on the slave. Because normal insertions have different semantics than “replace into” and “insert ignore”, row based replication does not work if you use the altered semantics.
The problems with triggers are semantic. Their requirements incur disk seeks, which slows performance. The problem with row based replication is a design issue which will hopefully be fixed one day.