Big trouble with zero-length character columns in TokuDB

What good is a zero-length character column in a MySQL table? A zero-length character column has the type ‘char(0)’. If it is nullable, it can store exactly one bit of information: the value is either NULL or the empty string. If it is not nullable, the value of this column in every row is the empty string. IMO, not very useful. However, the MySQL Reference Manual says that there are valid uses for such a column, so TokuDB should support it. Unfortunately, TokuDB had a bug related to zero-length character columns, which we recently found and fixed.

A Random Query Generator (RQG) trial generated a table with a ‘char(0)’ column and caused TokuDB to crash when executing an ALTER TABLE statement that drops the column. The crash comes down to how TokuDB lays out the values in a row. TokuDB stores all of the fixed length columns first, followed by the variable length columns, and finally the blob columns.

The values of the fixed length columns are concatenated and stored at the beginning of each row. Access to the i’th fixed length column is fast since its offset is computed when the table is opened and is the same for each row in the table.
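To make the offset idea concrete, here is a minimal sketch in Python. It is not TokuDB's code: the function names and the example column lengths are made up for illustration. It only shows that the offsets are a running sum of the column lengths, computed once, and that a zero-length column fits this scheme without any special case.

```python
def fixed_offsets(lengths):
    """Compute the starting offset of each fixed length column (done once per table)."""
    offsets, position = [], 0
    for length in lengths:
        offsets.append(position)
        position += length
    return offsets


def read_fixed(row, offsets, lengths, i):
    """Return the bytes of the i'th fixed length column of a row."""
    start = offsets[i]
    return row[start:start + lengths[i]]


# Illustrative layout: a 4-byte column, a zero-length column, an 8-byte column.
lengths = [4, 0, 8]
offsets = fixed_offsets(lengths)          # [0, 4, 4]
row = b"AAAA" + b"" + b"BBBBBBBB"
middle = read_fixed(row, offsets, lengths, 1)   # empty bytes for the zero-length column
```

Note that the zero-length column shares its offset with the column that follows it, which is harmless because it occupies no bytes.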

TokuDB stores an array of the offsets of the variable length values in each row followed by the values of the variable length columns. One can easily compute the length of each of the values and retrieve the value with a single array look-up.
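A rough Python sketch of this layout follows. The details are assumptions made for the example, not TokuDB's actual on-disk format: each array entry is taken to be a 2-byte little-endian end offset, and the function names are invented.

```python
import struct


def encode_var(values):
    """Encode values as an array of end offsets followed by the concatenated values."""
    ends, position = [], 0
    for value in values:
        position += len(value)
        ends.append(position)
    header = struct.pack(f"<{len(ends)}H", *ends)
    return header + b"".join(values)


def read_var(data, count, i):
    """Return the i'th variable length value using only the offset array."""
    base = 2 * count  # the values start right after the offset array
    (end,) = struct.unpack_from("<H", data, 2 * i)
    start = 0
    if i > 0:
        # The previous entry's end offset is this value's start offset.
        (start,) = struct.unpack_from("<H", data, 2 * (i - 1))
    return data[base + start:base + end]


encoded = encode_var([b"ab", b"", b"xyz"])
second = read_var(encoded, 3, 1)   # the empty value in the middle
```

The length of a value is the difference between its end offset and the previous one, so no value has to be scanned to find the next.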

Blobs are encoded as a sequence of length-value pairs. Access to the i’th blob requires a sequential scan past the preceding blobs, much like walking a linked list.
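Here is a small Python sketch of length-value encoding and the scan it implies. The 4-byte little-endian length prefix and the function names are assumptions for the example; the post does not say how TokuDB encodes the lengths.

```python
import struct


def encode_blobs(blobs):
    """Encode each blob as a 4-byte length followed by its bytes."""
    return b"".join(struct.pack("<I", len(blob)) + blob for blob in blobs)


def read_blob(data, i):
    """Return the i'th blob by hopping over the i blobs stored before it."""
    position = 0
    for _ in range(i):
        (length,) = struct.unpack_from("<I", data, position)
        position += 4 + length  # skip the length prefix and the blob itself
    (length,) = struct.unpack_from("<I", data, position)
    start = position + 4
    return data[start:start + length]


blob_data = encode_blobs([b"first", b"", b"third"])
last = read_blob(blob_data, 2)
```

Unlike the fixed and variable length sections, the cost of reaching a blob grows with the number of blobs stored before it.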

TokuDB assumed that all fixed length fields have a length GREATER than zero. Because of this, TokuDB would classify a zero length char column as a blob rather than as a fixed length column. This classification failure caused the crash.
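The failure mode can be illustrated with a hypothetical classifier that infers a column's storage class from its lengths alone. This is a sketch of the kind of logic described above, with invented names and parameters; it is not taken from the TokuDB source.

```python
def classify_by_length(fixed_length, length_prefix_bytes):
    """Guess a column's storage class from its lengths (the flawed assumption).

    fixed_length: byte length if the column is fixed length, else 0.
    length_prefix_bytes: size of the length prefix if variable length, else 0.
    """
    if fixed_length > 0:
        return "fixed"
    if length_prefix_bytes > 0:
        return "variable"
    # Nothing else matched, so the column is assumed to be a blob.
    return "blob"


int_column = classify_by_length(4, 0)       # "fixed"
varchar_column = classify_by_length(0, 1)   # "variable"
char0_column = classify_by_length(0, 0)     # "blob", which is wrong for char(0)
```

A ‘char(0)’ column has a fixed length of zero and no length prefix, so it falls through to the last case and ends up in the wrong section of the row.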

TokuDB version 7.1.5 now computes a type for every column in the table when the table is opened. Fixed length columns can now have zero length, so the row encoders and decoders work correctly.
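In spirit, the fix amounts to deciding each column's storage class from its declared type once, at open time, rather than from its length. The Python sketch below shows that idea under stated assumptions: the type names, the mapping, and the function name are illustrative and do not reflect TokuDB's internal structures.

```python
from enum import Enum


class Kind(Enum):
    FIXED = 1
    VARIABLE = 2
    BLOB = 3


# Illustrative mapping from declared SQL type to storage class.
KIND_BY_TYPE = {
    "int": Kind.FIXED,
    "char": Kind.FIXED,
    "varchar": Kind.VARIABLE,
    "blob": Kind.BLOB,
    "text": Kind.BLOB,
}


def kinds_at_open(columns):
    """Compute the storage class of every column once, when the table is opened.

    columns: list of (name, sql_type, length) tuples.
    """
    # The length is deliberately ignored: a char(0) is still a fixed length column.
    return {name: KIND_BY_TYPE[sql_type] for name, sql_type, _length in columns}


kinds = kinds_at_open([("id", "int", 4), ("flag", "char", 0), ("body", "blob", 0)])
```

With the class recorded explicitly, the encoder and decoder no longer need to guess, and a zero-length fixed column simply contributes zero bytes to the fixed section.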
