September 17, 2014

Instrumentation and the cost of Foreign Keys

I occasionally get in to light arguments healthy discussions with students about whether or not to use Foreign Key constraints on InnoDB tables.  My standard response has always been: “it depends on how much of a tradeoff you are willing to make for performance. In some situations the cost can be considerable”.

.. that’s when they expect me to “come up with some real proof” to show them. I do not disagree with their logic or proof being on their list-of-demands.  I support the use of data to make decisions.

The problem is that MySQL has (traditionally) been lacking the instrumentation required to make these decisions easy.  This is getting better  – here is an example we recently added to our InnoDB course:

In the last statement, session2 will block waiting on a lock. Want to know where that lock is? Check information_schema.innodb_locks:

The same example without the foreign key constraints does not block on the last statement.  We also see the expected output change to:

This information_schema table is new to InnoDB plugin.  In earlier releases of MySQL you may be able to get the data, but it is not in such an easily digestible form.  Instrumentation is the most under talked about feature in all new releases of MySQL and Percona Server.  See BUG #53336 for more examples of pure awesomeness.

About Morgan Tocker

Morgan is a former Percona employee.
He was the Director of Training at Percona. He was formerly a Technical Instructor for MySQL and Sun Microsystems. He has also previously worked in the MySQL Support Team, and provided DRBD support.

Comments

  1. Since update 1 happens in a different (uncommitted) transaction then updates 2 and 3, they shouldn’t be affecting eachother, should they? (Also, are the transactions needed for this to work?)

    And anyway, now next time those students ask for proof, you’ve got it. :)

  2. @Jory – My interpretation of this is: Foreign key constraints add additional locking to make sure no changes can happen that would break consistency. i.e. when we update the child with parent_id=5, the parent row 5 can not be modified.

    You do not need START TRANSACTION for this to occur – this is just the easiest way to make it visible. By default each statement is it’s own transaction.

  3. InnoDB’s mantra consists in adding locks and key controls -so it’s undoubtable slower- in order to ensure that the database will not be corrupted by a statement. My opinion is, that kind of controls should be mandatory and the question of performance over security should not be asked.

    If you want to go fast, pass over controls and denormalize your data, NoSQL & RAM caches are the solutions.
    But please, let the database to be a database ^_^

  4. Denis, that’s exactly where the discussion starts ;)

    Foreign keys can give you some warning of application logic problems. I enjoy using them in development, but there is a tradeoff that can successfully be made:

    The parent id 5 row is locked to prevent against changes that could produce inconsistency. How often would you change the primary key of the parent row? The only situation I could think of is as part of a delete – in which case I would have the application cascade down and delete child rows. InnoDB is being more careful than I need it to be.

  5. Valdemar says:

    Morgan, please correct me if i’m wrong:

    we have 2 transactions, they start one after another. Then updating of child row occurs in 1-st transaction. Meanwhile in a second transaction two statements are executed. After all this, transactions are commited.

    So, the first question: why “UPDATE parent SET bogus_column = ‘new!’ WHERE id = 4;” is not blocked? As I see, the child row with parent_id = 4 is modified! And isn’t it strange that “UPDATE parent SET bogus_column = ‘new!’ WHERE id = 5;” is blocked — why it should be? There no operations over the child row with parent_id = 5!

    What am i missing?

  6. Valdemar says:

    Пожалуйста, ну хоть кто-нибудь!

  7. Gray says:

    You will notice that a child is changed to point to parent_id = 5:
    (UPDATE child SET parent_id = 5 WHERE parent_id = 4;)
    So now, one child is pointing to parent where parent_id = 5 but is not committed yet… It will be locked until committed…

    Question is, why is this lock needed?!?!

    a) I agree that only if the parent is deleted it should be a problem…
    b) If the parent is found deleted at time of trying to commit the transaction, it should fail with “original item/parent was deleted by other transaction” message…

    It seems to me that MySQL is not as robust as I thought…

    But, I am not as educated as I would like to, and come from many years of using PostgreSQL that seems to be awesome (yes, I do like journaling databases, they keep my data safe :-)

    Any comments of comparison?

    Will this only create temporary lock or will it create a deadlock if session 1 is never committed? (I do assume there is a max wait time where it is canceled, but even waiting 30 secs is a lifetime in webapps)

  8. Glenn says:

    This is an older discussion but still important since Google searches will still direct folks here.

    Denis says let the database be a database. What is a database? A means of storing data that allows you to get to the data quickly and logically. A database engine could have all sorts of cool bells and whistles but if those bells and whistles cause your application to drag down and result in crappy response times, they are not appropriate for that application.

    Usually I choose the bells and whistles based on the size and complexity of the database. And where in my application those capabilities becomes necessary. If I have one or two spots where a foreign key restraint becomes necessary in a large and complex application it seems a little code that accomplishes the same thing is better than having that rule cause performance problems throughout the application.

    By all means use the most sophisticated tools available within a data base engine that improves response and efficiency. But just because it has a really cool capability doesn’t mean that capability SHOULD or MUST be used.

    I have used everything from DB2 to Oracle and find mySQL every bit as robust and efficient as any commercial product for what I have developed so far. But the biggest difference between the two environments is that with DB2 and Oracle I always had a group of DBA types taking my database design and doing the implementation. Far better than my experience could manage. I do my own mySQL implementation. It is my believe that a good mySQL DBA could get just as much out of mySQL as any good DB2 DBA could get out of DB2.

    Maybe I’m wrong. But first I have to develop and application where the size and complexity of the data would make the difference.

  9. Dave says:

    So, locking and subsequent blocking is one thing, but as you said, the data referenced via your FK may not be changed all that often depending on how you application is written. For example, someone creates an account, that account gets assigned an ID, then a transaction is created which references that account via an FK; it’s unlikely that the “reference data” of the account information will be updated frequently. The blocking issue can be mitigated by keeping it in mind while coding the application.

    I have heard that with Oracle, FKs provide a performance benefit because the optimizer has more information about what needs to be done and the plan is created faster. Does innodb get any such performance benefit in the same manner?

    One thing not mentioned here is: I think we’ve all run into schemas where the only way to understand the entity relationships was to read through the application. FKs can provide a very intuitive view into data relationships, not to mention help prevent the application from creating inconsistent data. FKs can help make the schema the final gate for data consistency.

    So, how about some real performance data benchmarks with and without FKs to see if the additional locking (blocking aside) has any real concern?

Speak Your Mind

*