Innodb row size limitationFernando Ipar
I recently worked on a customer case where at seemingly random times, inserts would fail with Innodb error 139. This is a rather simple problem, but due to it’s nature, it may only affect you after you already have a system running in production for a while.
Suppose you have the following table structure:
CREATE TABLE example (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
fname TEXT NOT NULL,
ftitle TEXT NOT NULL,
fsubtitle TEXT NOT NULL,
fcontent TEXT NOT NULL,
PRIMARY KEY (id)
Now you insert some test data into it:
mysql> INSERT INTO example
-> VALUES (
->Â Â NULL,
->Â Â 'First example',
->Â Â 'First comment',
->Â Â 'First title',
->Â Â 'First subtitle',
->Â Â 'First content',
->Â Â 'First header',
->Â Â 'First footer',
->Â Â 'First disclaimer',
->Â Â 'First copyright',
->Â Â 'First stylesheet',
->Â Â 'First terms'
Query OK, 1 row affected (0.03 sec)
All goes ok.
Now you begin using this structure in production, and ocassionally get this error:
ERROR 1030 (HY000) at line 1: Got error 139 from storage engine
What’s going on?
Innodb gives you this error when it can’t store all of the variable-length columns for a given row on a single database page.
Innodb has a limit for the maximum row size, which is slightly less than half a database page (the actual math is 16k-(page header + page trailer)/2. For the default page size of 16kb. this sets an ~8000 bytes limit on row size (8126 as reported by my test installation). This limit is a consequence of InnoDB storing two rows on a page. If you’re using compression, however, you can store a single row on a page.
If your row has variable length columns and the complete row size exceeds this limit, InnoDB chooses variable length columns for off-page storage.
In these cases, the first 768 bytes of each variable length column is stored locally, and the rest is stored outside of the page (this behavior is version specific, see https://www.percona.com/blog/2010/02/09/blob-storage-in-innodb/ for further reference).
It’s worth mentioning that this limit applies to the byte-size of values, not the character-size. This means if you insert a 500 character string with all multi-byte characters into a VARCHAR(500) column, that value will also be chosen for off page storage. This means it is possible that you hit this error after converting from a native language character set to utf8, as it can increase the size dramatically.
If you have more than 10 variable length columns, and each exceeds 768 bytes, then you’ll have at least 8448 bytes for local storage, not counting other fixed length columns. This exceeds the limit and therefore you get the error.
You don’t have to get it always, as this is not evaluated at table definition, but at row insertion. You may just have a table with 50 variable length columns, but if their max length is, say 50 bytes, you still have plenty of room for local storage.
The first thing to highlight here is the need for proper testing. If you just insert dummy data in your database, you may be missing to catch important data-dependent bugs like this one.
Now, how can you work around this problem?
Here are a few approaches you can take to solve this:
Upgrade to Barracuda format
This is probably the simplest approach, and as Barracuda only uses a 20 byte pointer for variable length types, this problem just goes away.
Limit the size of variable length columns
This is a quite obvious approach, and if your data allows it, it’s also one of the simplest to implement.
Just because you need a variable length field, it doesn’t mean you need to be able to store data up to that fields’ maximum allowed value. So, if your problem domain allows you to limit the size of these fields without affecting your use cases, this is the way to go.
Use the COMPRESS/UNCOMPRESS functions
This is also relatively simple to implement, though that depends on where in your code you use this fields. If you have a modular data access API, you shouldn’t have to change code in a lot of places. How much you can win with this approach depends a lot on the type of data you’re inserting (hint: the more random your data is, the worse compression will work). Something important to note is that the result of the COMPRESS function is always binary, so for example, you should use a BLOB instead of a TEXT column for storage. You can find more information about this functions here.
Split the table in a way that you don’t have more than 10 variable length columns per table
This approach has more impact on your existing code, as it requires you to rewrite as joins queries that access all columns in the original table. It also requires you to change the way you insert/modify data, as more tables are involved, and for example you will have to run any multi table update in a transaction. It will also impact both read and modification speed, precisely for the added join complexity. However, if you have too much variable length columns and you can’t limit their size, this may be the only possible solution.
Combine all your variable length fields into a single BLOB and do the splitting at the application level.
This will always work (unless you have so many fixed length columns that you still exceed 8000 bytes, but in that case I think you have a bigger problem than Innodb’s internal limitations), though you still need to change the way your application uses this data. You can limit the changes to your application if you choose XML as format and you use MySQL’s built in functions for handling this data. This approach, however, does have other negative impact, as it prevents you from using SQL operators/functions directly on the individual fields you want to store. It also forces you to read/write more data at once, while one of the interesting things about overflow blob storage is that if the columns aren’t requested as part of a row-read, they don’t have to be touched at all.
Of course, these approaches can be combined. For example, using the single BLOB field with COMPRESS/UNCOMPRESS can yield great results.
Finally, let me mention that handling of overflow pages involves other things one should consider. No overflow pages are shared, so a 769 byte blob will get a 16k overflow page for itself. Also, overflow pages are allocated 1 page at a time until you reach 32 pages. Then they are allocated an extent at a time (64 pages).