EmergencyEMERGENCY? Get 24/7 Help Now!

Beware of MySQL Data Truncation


Posted on:

|

By:


PREVIOUS POST
NEXT POST
Share Button

Here is nice gotcha which I’ve seen many times and which can cause just a minefield for many reasons.
Lets say you had a system storing articles and you use article_id as unsigned int. As the time goes and you see you may get over 4 billions of articles you change the type for article_id to bigint unsigned but forget linked tables.

For table “article” itself you would notice the issue quickly if you would forget to alter it as inserts will fail because of primary key collisions. But for linked tables like above inserts work you just get a warning:

And in my experience very few people mind to check the warnings promptly.

So two things happen, one is nastier than the other. First the comments end up associated to the wrong article (4294967295) which can be quite a problem in particular if these are private comments. The other thing which tends to happen – you get A LOT of comments associated with this article and few systems are designed with YouTube capacity to handle millions of comments per item.

Note the question in this case is not like there is just one article which has insane number of comments but a as a all articles outside of allowed range have it:

MySQL optimizer behaves rather strange in such case (at least in MySQL 5.0) – instead of instantly saying there is no such value (because value is out of range what could be stored in the column) it truncates the value, performs index lookup, scanning all rows (just 3 in my trivial example) and when filtering them out because value really does not match the where clause.

This makes such error even harder to catch – you will not see wrong comments for a lot articles, you will see 0 while queries will just run insanely slow.

So how to avoid this problem ?

It is actually quite easy – you just need to enable strict mode:

Note however strict mode should not be taken lightly – it is good for reliability but many applications will need to be fixed before they start working properly with this mode.

Share Button
PREVIOUS POST
NEXT POST


Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.



Tags:

Categories:
Insight for DBAs


Comments
  • Nice to know, and indeed, I hardly ever check the warnings…

    However, this seems like a bit of a futile problem.

    In the case comments can have the same id (i.e. comment #1 can exist on several articles) it would seem logical to me to simply create a unique key on the columns ‘article_id’ and ‘comment_id’ in the table preventing this issue.
    But more importantly; if the database were to be designed coherently, the article with ID 4300000000 would not get created in the first place, so how are people going to comment on an article that never got created in the first place? 😉

    Reply

  • uh oh…. typed to quickly…

    “you change the type for article_id to bigint unsigned but forget linked tables.”

    I would never do that, of course. Please forget my second remark
    :)

    Reply

  • Onno,

    In consulting we do not always deal with perfectly designed systems plus this is of course simplified example

    It does not matter what comment_id is – quite common comment_id is unique by itself (so we can fetch data by comment id easily) which may reduce the need of such index. Not to mention people hunting for performance may be easy on unique indexes because (in innodb) it blocks insert buffer.

    Reply

  • right you are :)

    Reply

Leave a Reply

Percona’s widely read Percona Data Performance blog highlights our expertise in enterprise-class software, support, consulting and managed services solutions for both MySQL® and MongoDB® across traditional and cloud-based platforms. The decades of experience represented by our consultants is found daily in numerous and relevant blog posts.

Besides specific database help, the blog also provides notices on upcoming events and webinars.
Want to get weekly updates listing the latest blog posts? Subscribe to our blog now! Submit your email address below and we’ll send you an update every Friday at 1pm ET.

No, thank you. Please do not ask me again.