Peter remarks in his blog that write scaling can be a serious problem regarding MySQL replication (especially replication delay) since the replication works single threaded which means only one CPU on the slave side may effectively process the incoming replication data which is a problem, if the master is a e.g. a 8 core system with 20 hard drives:
http://www.mysqlperformanceblog.com/2006/07/07/thoughts-on-m ysql-replication/
Now, we exactly ran into this problem recently while inserting 220 million rows of stock realtime data each day into our so called "intraday" tables - which means that the data is cleared every night and only a small amount of historic data is written to some other tables.
The beginning of each day is no problem since the intraday tables are empty and both INSERTs and UPDATESs are fast of course. As tables get filled the slave significantly falls back to a peak of *1 hour!!* delay.
Both machines have identical hardware (4 disks RAID 10, dual core xeon, 8 GB RAM, mysql 4.0). Tables are InnoDB, manually partitioned by MOD'ing integer primary keys and quite small in row size:
Into these types of tables, bulk inserts are executed:
CREATE TABLE `intraday_0` ( `instruments_id` int(10) unsigned NOT NULL default '0', `hour` tinyint(2) unsigned zerofill NOT NULL default '00', `minute` tinyint(2) unsigned zerofill NOT NULL default '00', `second` tinyint(2) unsigned zerofill NOT NULL default '00', `bid` float unsigned NOT NULL default '0', KEY `instruments_id` (`instruments_id`,`hour`,`minute`)) TYPE=InnoDB;
These types of tables are constantly updated with the most recent values:
CREATE TABLE `snapshot_0` ( `instruments_id` int(10) unsigned NOT NULL default '0', `date` date NOT NULL default '0000-00-00', `bid` float unsigned NOT NULL default '0', `bid_high` float unsigned NOT NULL default '0', `bid_low` float unsigned NOT NULL default '0', `bid_open` float unsigned NOT NULL default '0', `bid_close` float unsigned NOT NULL default '0', `bid_trend` enum('up','down','inv') NOT NULL default 'inv', `bid_time` time NOT NULL default '00:00:00', `ask` float unsigned NOT NULL default '0', `ask_high` float unsigned NOT NULL default '0', `ask_low` float unsigned NOT NULL default '0', `ask_open` float unsigned NOT NULL default '0', `ask_close` float unsigned NOT NULL default '0', `ask_trend` enum('up','down','inv') NOT NULL default 'inv', `ask_time` time NOT NULL default '00:00:00', `last` float unsigned NOT NULL default '0', `last_high` float unsigned NOT NULL default '0', `last_low` float unsigned NOT NULL default '0', `last_open` float unsigned NOT NULL default '0', `last_close` float unsigned NOT NULL default '0', `last_trend` enum('up','down','inv') NOT NULL default 'inv', `last_time` time NOT NULL default '00:00:00', `volume_last_unit` bigint(20) unsigned NOT NULL default '0', `volume_cumulative_unit` bigint(20) unsigned NOT NULL default '0', `volume_cumulative_price` bigint(20) unsigned NOT NULL default '0', PRIMARY KEY (`instruments_id`)) ENGINE=InnoDB;
InnoDB settings:
innodb_data_home_dir = /var/lib/mysqlinnodb_data_file_path = ibdata1:60Ginnodb_log_file_size = 2000Minnodb_log_buffer_size = 8Minnodb_buffer_pool_size = 1800Minnodb_additional_mem_pool_size = 16Minnodb_file_io_threads = 4innodb_lock_wait_timeout = 50innodb_flush_log_at_trx_commit = 0innodb_flush_method = O_DIRECT
To solve this problem I am considering the following two solutions:
1. Set up a true master-master replication and distribute the write load among two servers: each server receives half of all inserts / updates
- In this case the replication thread of both servers would only need to insert / update half of the data - am I right? )
2. buy new machines with 16 RAID 10 disks
- although Peter remarked that more disks and cpu's won't help, I think more disks working in a RAID 10 will certainly do. of course only one cpu will work on the replication thread, but data will be inserted / updated much faster with 16 RAID 10 disks instead of 4 - am I right? )
Thanks for any further suggestions about distributing write load or enhancing write speed in general.
http://www.mysqlperformanceblog.com/2006/07/07/thoughts-on-m ysql-replication/
Now, we exactly ran into this problem recently while inserting 220 million rows of stock realtime data each day into our so called "intraday" tables - which means that the data is cleared every night and only a small amount of historic data is written to some other tables.
The beginning of each day is no problem since the intraday tables are empty and both INSERTs and UPDATESs are fast of course. As tables get filled the slave significantly falls back to a peak of *1 hour!!* delay.
Both machines have identical hardware (4 disks RAID 10, dual core xeon, 8 GB RAM, mysql 4.0). Tables are InnoDB, manually partitioned by MOD'ing integer primary keys and quite small in row size:
Into these types of tables, bulk inserts are executed:
CREATE TABLE `intraday_0` ( `instruments_id` int(10) unsigned NOT NULL default '0', `hour` tinyint(2) unsigned zerofill NOT NULL default '00', `minute` tinyint(2) unsigned zerofill NOT NULL default '00', `second` tinyint(2) unsigned zerofill NOT NULL default '00', `bid` float unsigned NOT NULL default '0', KEY `instruments_id` (`instruments_id`,`hour`,`minute`)) TYPE=InnoDB;
These types of tables are constantly updated with the most recent values:
CREATE TABLE `snapshot_0` ( `instruments_id` int(10) unsigned NOT NULL default '0', `date` date NOT NULL default '0000-00-00', `bid` float unsigned NOT NULL default '0', `bid_high` float unsigned NOT NULL default '0', `bid_low` float unsigned NOT NULL default '0', `bid_open` float unsigned NOT NULL default '0', `bid_close` float unsigned NOT NULL default '0', `bid_trend` enum('up','down','inv') NOT NULL default 'inv', `bid_time` time NOT NULL default '00:00:00', `ask` float unsigned NOT NULL default '0', `ask_high` float unsigned NOT NULL default '0', `ask_low` float unsigned NOT NULL default '0', `ask_open` float unsigned NOT NULL default '0', `ask_close` float unsigned NOT NULL default '0', `ask_trend` enum('up','down','inv') NOT NULL default 'inv', `ask_time` time NOT NULL default '00:00:00', `last` float unsigned NOT NULL default '0', `last_high` float unsigned NOT NULL default '0', `last_low` float unsigned NOT NULL default '0', `last_open` float unsigned NOT NULL default '0', `last_close` float unsigned NOT NULL default '0', `last_trend` enum('up','down','inv') NOT NULL default 'inv', `last_time` time NOT NULL default '00:00:00', `volume_last_unit` bigint(20) unsigned NOT NULL default '0', `volume_cumulative_unit` bigint(20) unsigned NOT NULL default '0', `volume_cumulative_price` bigint(20) unsigned NOT NULL default '0', PRIMARY KEY (`instruments_id`)) ENGINE=InnoDB;
InnoDB settings:
innodb_data_home_dir = /var/lib/mysqlinnodb_data_file_path = ibdata1:60Ginnodb_log_file_size = 2000Minnodb_log_buffer_size = 8Minnodb_buffer_pool_size = 1800Minnodb_additional_mem_pool_size = 16Minnodb_file_io_threads = 4innodb_lock_wait_timeout = 50innodb_flush_log_at_trx_commit = 0innodb_flush_method = O_DIRECT
To solve this problem I am considering the following two solutions:
1. Set up a true master-master replication and distribute the write load among two servers: each server receives half of all inserts / updates
- In this case the replication thread of both servers would only need to insert / update half of the data - am I right? )
2. buy new machines with 16 RAID 10 disks
- although Peter remarked that more disks and cpu's won't help, I think more disks working in a RAID 10 will certainly do. of course only one cpu will work on the replication thread, but data will be inserted / updated much faster with 16 RAID 10 disks instead of 4 - am I right? )
Thanks for any further suggestions about distributing write load or enhancing write speed in general.
).
Comment