Just for fun, I created more than one billion InnoDB tables in MySQL 8.0 (tables, not rows). Here is the proof:
```
$ mysql -A
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1425329
Server version: 8.0.12 MySQL Community Server - GPL

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select count(*) from information_schema.tables;
+------------+
| count(*)   |
+------------+
| 1011570298 |
+------------+
1 row in set (6 hours 57 min 6.31 sec)
```
Yes, it took 6 hours and 57 minutes to count them all!
In my previous blog post, I created and tested MySQL 8.0 with 40 million tables (that was a real case study). The One Billion Tables project is not a real-world scenario, however. I was challenged by the Billion Tables Project (BTP) in PostgreSQL, and decided to repeat it with MySQL, creating 1 billion InnoDB tables.
As an aside: I think MySQL 8.0 is the first MySQL version where creating 1 billion InnoDB tables is even practically possible.
The first, and one of the most important, challenges is disk space. InnoDB allocates data pages on disk when creating .ibd files. Without disk-level compression we would need more than 25TB of disk space. The good news: we have ZFS, which provides transparent disk compression. Here is how the disk utilization looks:
Actual data (apparent-size):
```
# du -sh --apparent-size /mysqldata/
26T     /mysqldata/
```
Compressed data:
```
# du -sh /mysqldata/
2.4T    /mysqldata/
```
Compression ratio:
```
# zfs get compression,compressratio
...
mysqldata/mysql/data  compressratio  7.14x  -
mysqldata/mysql/data  compression    gzip   inherited from mysqldata/mysql
```
(It looks like the reported compression ratio is not 100% accurate: based on the numbers above, we would expect a ~10x compression ratio.)
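A quick sanity check on those numbers (a back-of-the-envelope sketch of my own, not from the measurements above): 26T of apparent data compressed into 2.4T works out to roughly 10x, and it also tells us the average on-disk footprint per table:

```python
# Rough figures taken from the du output above (in TiB)
apparent_tib = 26
compressed_tib = 2.4

ratio = apparent_tib / compressed_tib
print(f"effective compression ratio: ~{ratio:.1f}x")  # ~10.8x, vs the 7.14x ZFS reports

# Average apparent space per table, for ~1 billion tables
tables = 1_000_000_000
bytes_per_table = apparent_tib * 1024**4 / tables
print(f"~{bytes_per_table / 1024:.0f} KiB of apparent space per table")  # ~28 KiB
```

At a 4KB InnoDB page size, ~28 KiB per table is about 7 pages each, which is plausible for an empty table with a primary key and one secondary index.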
This is usually the big issue with databases that create a file per table. With MySQL 8.0 we can create a shared tablespace and “assign” a table to it. I created a tablespace per database, and created 1000 tables in each database.
The result:
```
mysql> select count(*) from information_schema.schemata;
+----------+
| count(*) |
+----------+
|  1011575 |
+----------+
1 row in set (1.31 sec)
```
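The per-database DDL can be sketched like this (a simplified generator of my own for illustration; the actual deploy script below does the equivalent through the mysql client):

```python
def ddl_for_db(db: str, num_tables: int = 1000) -> str:
    """Generate the DDL for one database: a shared general tablespace
    plus num_tables sysbench-style tables assigned to that tablespace."""
    stmts = [
        f"CREATE DATABASE {db}",
        f"CREATE TABLESPACE {db} ADD DATAFILE '{db}.ibd' ENGINE=InnoDB",
    ]
    for i in range(1, num_tables + 1):
        stmts.append(
            f"CREATE TABLE {db}.sbtest{i} ("
            " id INT UNSIGNED NOT NULL AUTO_INCREMENT,"
            " k INT UNSIGNED NOT NULL DEFAULT 0,"
            " c VARCHAR(120) NOT NULL DEFAULT '',"
            " pad VARCHAR(60) NOT NULL DEFAULT '',"
            " PRIMARY KEY (id), KEY k_1 (k)"
            f") ENGINE=InnoDB TABLESPACE {db}"  # assign the table to the shared tablespace
        )
    return ";\n".join(stmts) + ";"

print(ddl_for_db("sbtest_1")[:160])
```

The key detail is the `TABLESPACE` clause on each `CREATE TABLE`: all 1000 tables of a database land in one `.ibd` file instead of 1000 separate ones.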
Another big challenge is creating the tables fast enough that the whole run does not take months. I used three things: an aggressive (and unsafe) my.cnf configuration, a ZFS pool on NVMe drives, and a script that creates tables in parallel.
my.cnf config file (I repeat: do not use this in production):
```
[mysqld]
default-authentication-plugin = mysql_native_password
performance_schema=0

datadir=/mysqldata/mysql/data
socket=/mysqldata/mysql/data/mysql.sock
log-error = /mysqldata/mysql/log/error.log
skip-log-bin=1

innodb_log_group_home_dir = /mysqldata/mysql/log/
innodb_doublewrite = 0
innodb_checksum_algorithm=none
innodb_log_checksums=0
innodb_flush_log_at_trx_commit=0
innodb_log_file_size=2G
innodb_buffer_pool_size=100G
innodb_page_size=4k
innodb_flush_method=nosync
innodb_io_capacity_max=20000
innodb_io_capacity=5000
innodb_buffer_pool_instances=32
innodb_stats_persistent = 0

tablespace_definition_cache = 524288
schema_definition_cache = 524288
table_definition_cache = 524288
table_open_cache=524288
table_open_cache_instances=32
open-files-limit=1000000
```
ZFS pool:
```
# zpool status
  pool: mysqldata
 state: ONLINE
  scan: scrub repaired 0B in 1h49m with 0 errors on Sun Oct 14 02:13:17 2018
config:

        NAME        STATE     READ WRITE CKSUM
        mysqldata   ONLINE       0     0     0
          nvme0n1   ONLINE       0     0     0
          nvme1n1   ONLINE       0     0     0
          nvme2n1   ONLINE       0     0     0
          nvme3n1   ONLINE       0     0     0

errors: No known data errors
```
A simple “deploy” script to create tables in parallel (includes the sysbench table structure):
```bash
#!/bin/bash
# Creates databases sbtest_1, sbtest_2, ... each with one shared tablespace
# and 1000 sysbench-style tables, 30 databases at a time in parallel.
function do_db {
    db_exist=$(mysql -A -s -Nbe "select 1 from information_schema.schemata where schema_name = '$db'")
    if [ "$db_exist" == "1" ]; then echo "Already exists: $db"; return 0; fi;
    tbspace="create database $db; use $db; CREATE TABLESPACE $db ADD DATAFILE '$db.ibd' engine=InnoDB";
    tables=""
    for i in {1..1000}
    do
        table="CREATE TABLE sbtest$i ( id int(10) unsigned NOT NULL AUTO_INCREMENT, k int(10) unsigned NOT NULL DEFAULT '0', c varchar(120) NOT NULL DEFAULT '', pad varchar(60) NOT NULL DEFAULT '', PRIMARY KEY (id), KEY k_1 (k) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 tablespace $db;"
        tables="$tables $table"
    done
    echo "$tbspace;$tables" | mysql
}
c=0
echo "starting..."
# Resume from the highest existing sbtest_N database
c=$(mysql -A -s -Nbe "select max(cast(SUBSTRING_INDEX(schema_name, '_', -1) as unsigned)) from information_schema.schemata where schema_name like 'sbtest_%'")
for m in {1..100000}
do
    echo "m=$m"
    for i in {1..30}
    do
        let c=$c+1
        echo $c
        db="sbtest_$c"
        do_db &
    done
    wait
done
```
How fast did we create tables? Here are some stats:
```
# mysqladmin -i 10 -r ex | grep Com_create_table
...
| Com_create_table | 6497 |
| Com_create_table | 6449 |
```
The counters above are deltas over 10-second intervals, so we were creating ~650 tables per second.
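At that rate, a back-of-the-envelope estimate for the whole run (my own arithmetic, not a measured figure from the post):

```python
tables = 1_000_000_000
rate = 650  # tables per second, from the Com_create_table deltas above

seconds = tables / rate
print(f"~{seconds / 86400:.1f} days")  # ~17.8 days of continuous table creation
```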
It took more than 6 hours to run "select count(*) from information_schema.tables"! Here is why: MySQL 8.0 stores its data dictionary inside InnoDB, in the mysql.ibd file, and with a billion tables that file is huge:
```
# ls -lah /mysqldata/mysql/data/mysql.ibd
-rw-r----- 1 mysql mysql 6.1T Oct 18 15:02 /mysqldata/mysql/data/mysql.ibd
```
In addition, information_schema.tables in MySQL 8.0 is not a base table but a view over the data dictionary tables:

```
mysql> show create table information_schema.tables\G
*************************** 1. row ***************************
                View: TABLES
         Create View: CREATE ALGORITHM=UNDEFINED DEFINER=`mysql.infoschema`@`localhost` SQL SECURITY DEFINER VIEW `information_schema`.`TABLES` AS select `cat`.`name` AS `TABLE_CATALOG`,`sch`.`name` AS `TABLE_SCHEMA`,`tbl`.`name` AS `TABLE_NAME`,`tbl`.`type` AS `TABLE_TYPE`,if((`tbl`.`type` = 'BASE TABLE'),`tbl`.`engine`,NULL) AS `ENGINE`,if((`tbl`.`type` = 'VIEW'),NULL,10) AS `VERSION`,`tbl`.`row_format` AS `ROW_FORMAT`,internal_table_rows(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`table_rows`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `TABLE_ROWS`,internal_avg_row_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`avg_row_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `AVG_ROW_LENGTH`,internal_data_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`data_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `DATA_LENGTH`,internal_max_data_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`max_data_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `MAX_DATA_LENGTH`,internal_index_length(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`index_length`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `INDEX_LENGTH`,internal_data_free(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`data_free`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `DATA_FREE`,internal_auto_increment(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`auto_increment`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0),`tbl`.`se_private_data`) AS `AUTO_INCREMENT`,`tbl`.`created` AS `CREATE_TIME`,internal_update_time(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(cast(`stat`.`update_time` as unsigned),0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `UPDATE_TIME`,internal_check_time(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(cast(`stat`.`check_time` as unsigned),0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `CHECK_TIME`,`col`.`name` AS `TABLE_COLLATION`,internal_checksum(`sch`.`name`,`tbl`.`name`,if(isnull(`tbl`.`partition_type`),`tbl`.`engine`,''),`tbl`.`se_private_id`,(`tbl`.`hidden` <> 'Visible'),`ts`.`se_private_data`,coalesce(`stat`.`checksum`,0),coalesce(cast(`stat`.`cached_time` as unsigned),0)) AS `CHECKSUM`,if((`tbl`.`type` = 'VIEW'),NULL,get_dd_create_options(`tbl`.`options`,if((ifnull(`tbl`.`partition_expression`,'NOT_PART_TBL') = 'NOT_PART_TBL'),0,1))) AS `CREATE_OPTIONS`,internal_get_comment_or_error(`sch`.`name`,`tbl`.`name`,`tbl`.`type`,`tbl`.`options`,`tbl`.`comment`) AS `TABLE_COMMENT` from (((((`mysql`.`tables` `tbl` join `mysql`.`schemata` `sch` on((`tbl`.`schema_id` = `sch`.`id`))) join `mysql`.`catalogs` `cat` on((`cat`.`id` = `sch`.`catalog_id`))) left join `mysql`.`collations` `col` on((`tbl`.`collation_id` = `col`.`id`))) left join `mysql`.`tablespaces` `ts` on((`tbl`.`tablespace_id` = `ts`.`id`))) left join `mysql`.`table_stats` `stat` on(((`tbl`.`name` = `stat`.`table_name`) and (`sch`.`name` = `stat`.`schema_name`)))) where (can_access_table(`sch`.`name`,`tbl`.`name`) and is_visible_dd_object(`tbl`.`hidden`))
character_set_client: utf8
collation_connection: utf8_general_ci
```
and the explain plan looks like this:
```
mysql> explain select count(*) from information_schema.tables\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: cat
   partitions: NULL
         type: index
possible_keys: PRIMARY
          key: name
      key_len: 194
          ref: NULL
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: tbl
   partitions: NULL
         type: ALL
possible_keys: schema_id
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1023387060
     filtered: 100.00
        Extra: Using where; Using join buffer (Block Nested Loop)
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: sch
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY,catalog_id
          key: PRIMARY
      key_len: 8
          ref: mysql.tbl.schema_id
         rows: 1
     filtered: 11.11
        Extra: Using where
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: stat
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 388
          ref: mysql.sch.name,mysql.tbl.name
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 5. row ***************************
           id: 1
  select_type: SIMPLE
        table: ts
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: mysql.tbl.tablespace_id
         rows: 1
     filtered: 100.00
        Extra: Using index
*************************** 6. row ***************************
           id: 1
  select_type: SIMPLE
        table: col
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: mysql.tbl.collation_id
         rows: 1
     filtered: 100.00
        Extra: Using index
```
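The full scan of mysql.tables in step 2 (the 1,023,387,060-row estimate) is what dominates the runtime. A rough effective throughput, assuming most of the 6.1T mysql.ibd file had to be read (my own arithmetic, not a figure from the post):

```python
size_bytes = 6.1 * 1024**4           # mysql.ibd size, per the ls output above
elapsed = 6 * 3600 + 57 * 60 + 6.31  # 6 hours 57 min 6.31 sec, per the count(*) timing

mib_per_sec = size_bytes / elapsed / 1024**2
print(f"~{mib_per_sec:.0f} MiB/s effective scan rate")  # ~256 MiB/s
```

That is a plausible sequential-read rate through a gzip-compressed ZFS dataset, so the 6+ hour count is simply the cost of scanning a billion-row dictionary table through this view.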
