Sometimes you have the task of storing multiple of boolean values (yes/now or something similar) in the table and if you get many columns and many rows you may want to store them as efficient way as possible.
For MyISAM tables you could use BIT(1) fields which get combined together for efficient storage:
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
CREATE TABLE `bbool` ( `b1` bit(1) NOT NULL, `b2` bit(1) NOT NULL, `b3` bit(1) NOT NULL, `b4` bit(1) NOT NULL, `b5` bit(1) NOT NULL, `b6` bit(1) NOT NULL, `b7` bit(1) NOT NULL, `b8` bit(1) NOT NULL, `b9` bit(1) NOT NULL, `b10` bit(1) NOT NULL ) ENGINE=MyISAM mysql> show table status like 'bbool' G *************************** 1. row *************************** Name: bbool Engine: MyISAM Version: 10 Row_format: Fixed Rows: 10 Avg_row_length: 7 Data_length: 70 Max_data_length: 1970324836974591 Index_length: 1024 Data_free: 0 Auto_increment: NULL Create_time: 2008-04-24 00:41:01 Update_time: 2008-04-24 00:45:40 Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment: 1 row in set (0.00 sec) |
As you can see for MyISAM 10 columns take just 7 bytes – less than a byte per column. This is just minimum row length we can have for this table – myisam_data_pointer_size is 6 default plus we need space for delete flag which makes 7 minimum row size MyISAM can have in this configuration.
This trick however does not work for Innodb which allocates 1 byte for each BIT(1) column. So we can get 1 byte per column for boolean flag storage in Innodb (not accounting for standard row overhead) if we use BIT(1), TINYINT or ENUM types but can we do better ?
In fact we can – by using CHAR(0) type (without NOT NULL flag) – this will be pretty much column containing NULL bit only which can store one of two values – NULL or Empty String.
Lets see how these 3 different table format look in Innodb (I’ve populated each with some 2M rows so difference is more visible)
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 |
CREATE TABLE `tbool` ( `t1` tinyint(4) NOT NULL, `t2` tinyint(4) NOT NULL, `t3` tinyint(4) NOT NULL, `t4` tinyint(4) NOT NULL, `t5` tinyint(4) NOT NULL, `t6` tinyint(4) NOT NULL, `t7` tinyint(4) NOT NULL, `t8` tinyint(4) NOT NULL, `t9` tinyint(4) NOT NULL, `t10` tinyint(4) NOT NULL ) ENGINE=InnoDB CREATE TABLE `cbool` ( `c1` char(0) default NULL, `c2` char(0) default NULL, `c3` char(0) default NULL, `c4` char(0) default NULL, `c5` char(0) default NULL, `c6` char(0) default NULL, `c7` char(0) default NULL, `c8` char(0) default NULL, `c9` char(0) default NULL, `c10` char(0) default NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1 mysql> show table status like "%bool%" G *************************** 1. row *************************** Name: bbool Engine: InnoDB Version: 10 Row_format: Compact Rows: 2097405 Avg_row_length: 37 Data_length: 78233600 Max_data_length: 0 Index_length: 0 Data_free: 0 Auto_increment: NULL Create_time: 2008-04-24 00:54:18 Update_time: NULL Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment: InnoDB free: 6144 kB *************************** 2. row *************************** Name: cbool Engine: InnoDB Version: 10 Row_format: Compact Rows: 2097678 Avg_row_length: 34 Data_length: 71942144 Max_data_length: 0 Index_length: 0 Data_free: 0 Auto_increment: NULL Create_time: 2008-04-24 00:37:48 Update_time: NULL Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment: InnoDB free: 4096 kB *************************** 3. row *************************** Name: tbool Engine: InnoDB Version: 10 Row_format: Compact Rows: 2097405 Avg_row_length: 37 Data_length: 78233600 Max_data_length: 0 Index_length: 0 Data_free: 0 Auto_increment: NULL Create_time: 2008-04-24 00:58:01 Update_time: NULL Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment: InnoDB free: 6144 kB 3 rows in set (0.11 sec) |
As you can see table which uses BIT(1) column type takes same space as the one which uses TINYINT NOT NULL while CHAR(0) is about 10% smaller. This is modest space savings of course but considering large per row overhead Innodb has this will transform to much larger savings if you have hundreds of such columns.
Lets see how things look for MyISAM for same tables:
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 |
mysql> show table status like "%bool%" G *************************** 1. row *************************** Name: bbool Engine: MyISAM Version: 10 Row_format: Fixed Rows: 2097152 Avg_row_length: 7 Data_length: 14680064 Max_data_length: 1970324836974591 Index_length: 1024 Data_free: 0 Auto_increment: NULL Create_time: 2008-04-24 01:14:06 Update_time: 2008-04-24 01:14:09 Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment: *************************** 2. row *************************** Name: cbool Engine: MyISAM Version: 10 Row_format: Fixed Rows: 2097152 Avg_row_length: 7 Data_length: 14680064 Max_data_length: 1970324836974591 Index_length: 1024 Data_free: 0 Auto_increment: NULL Create_time: 2008-04-24 01:14:13 Update_time: 2008-04-24 01:14:17 Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment: *************************** 3. row *************************** Name: tbool Engine: MyISAM Version: 10 Row_format: Fixed Rows: 2097152 Avg_row_length: 11 Data_length: 23068672 Max_data_length: 3096224743817215 Index_length: 1024 Data_free: 0 Auto_increment: NULL Create_time: 2008-04-24 01:14:23 Update_time: 2008-04-24 01:14:26 Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment: 3 rows in set (0.00 sec) |
As you can see for MyISAM BIT(1) NOT NULL type is as compact as CHAR(0) while TINYINT NOT NULL is a bit less compact.
Looking at results of these tests using CHAR(0) is the most efficient if you would like optimal structure both for MyISAM and Innodb tables, however it is not as convenient to work with. Using NULL as one of flag values means you can’t use normal “=” comparison operator with them:
|
1 2 3 4 5 6 7 |
mysql> select count(*) from cbool where c1=NULL; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.20 sec) |
You can use IS NULL operator which is painful because you need to have different query based on parameter (IS ” would not work) or you can use Null-Aware comparison operator:
|
1 2 3 4 5 6 7 |
mysql> select count(*) from cbool where c1<=>NULL; +----------+ | count(*) | +----------+ | 1048576 | +----------+ 1 row in set (0.22 sec) |
Should you go and change all flags to use this approach ? I do not think so – for most applications using TINYINT BIT(1) or ENUM for flags benefit would unlikely be worth the trouble. Due to complication I also would not recommend as a base approach for new applications. However in special cases if you have very many rows and very many flag values to deal with which you can’t pack to the bitmask this approach can be quite helpful.
Resources
RELATED POSTS