My previous poll got a great response, so I thought I should continue the series.
The question of the day today is: how much data do you store in your largest MySQL instance? I’m interested in a single instance, not the total amount of data you have in MySQL across your sharded, replicated environment. Feel free to share details in the comments: are you using MyISAM or InnoDB? Is it working out well for you, or are there challenges you’re seeing?

What is the largest data size for a single MySQL instance you use in production?

  • 100GB to 1TB (34%, 298 Votes)
  • 10GB to 100GB (25%, 226 Votes)
  • 1GB to 10GB (16%, 145 Votes)
  • 1TB to 10TB (15%, 134 Votes)
  • less than 1GB (7%, 60 Votes)
  • more than 10TB (3%, 25 Votes)

Total Voters: 888


Thank you for your participation!

28 Comments
brad voth

15TB… painful to manage

AD

We have 16TB, and the hardest part is backups. XtraBackup works well, but incrementals do not work with compressed XtraBackup backups. We have a bug open for this. (We have Percona Gold support.)

So, if we could get (a) compressed XtraBackup backups, (b) compressed incrementals, and (c) parallel recovery (or, to put it more simply, a way to parallelize the restore such that the entire I/O capacity of the disk subsystem is utilized), we’d be much happier.
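
For reference, a minimal sketch of the workflow AD is asking about, using innobackupex options that exist in XtraBackup; paths, timestamps, and thread counts here are hypothetical, and whether --compress combines cleanly with incrementals depends on the XtraBackup version (the bug mentioned above):

  # compressed, parallel full backup
  innobackupex --compress --compress-threads=4 --parallel=4 /backups/full
  # incremental backup against that full backup
  innobackupex --incremental /backups/inc --incremental-basedir=/backups/full/2013-01-01_00-00-00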

Thanks,
AD

William

I’m sort of bummed out that the last category is from 100 GB to 1 TB. I’d like to see a more granular breakdown of that segment.

AD

A few other pain points with large MySQL DBs (16TB):

– Large DBs imply large tables, so DDL is too slow. We are using Percona 5.1.6x builds, so online index additions would be ideal. Not sure if 5.5.x or 5.6.x already addresses this.

– Large DBs also come with large buffer pool sizes, and that implies very slow shutdown and startup (we have fast shutdown disabled; all that seems to do is shift the slowness from shutdown to startup). I am thinking there is no easy solution here.

– A simpler way to distribute some DBs to a different mount point. We have to use symlinks to point some DBs to other places. Instead, if we could do something like “create database new_db using db_file_space ‘/new_mount_point/mysql_data_dir’;”, that would be nice. (A sketch of the symlink workaround follows this list.)

– Backport crash-safe replication to 5.1.6x 🙂 Yes, we want to get to 5.5, but now that 5.6.x is around the corner, we have put the 5.5.x migration on hold and are planning to go straight to 5.6.x.
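
A rough sketch of the symlink workaround described above, with hypothetical paths; it assumes the instance can be stopped briefly and, for InnoDB, that innodb_file_per_table is enabled so the database’s files actually live under its own subdirectory of the datadir:

  # stop the server before moving anything
  service mysql stop
  # move the database directory to the new mount point and leave a symlink behind
  mv /var/lib/mysql/some_db /new_mount_point/mysql_data_dir/some_db
  ln -s /new_mount_point/mysql_data_dir/some_db /var/lib/mysql/some_db
  chown -R mysql:mysql /new_mount_point/mysql_data_dir/some_db
  service mysql start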

Thanks,
AD

Brad voth

Our main pain point is that we have a single table that is ~12TB+. This is a rolling 90 days’ worth of data, mostly writes, but it needs to be available for immediate queries if the need presents itself. It is an InnoDB table. The main issue is that with a table this large, any table changes are out of the question, as are backups, etc. We write at a rate of 30k+ rows/sec into this table.

If DDL statements could be run in a multithreaded fashion through some sort of divide-and-conquer methodology, that would help immensely. It’s near impossible to run any sort of DDL on a table with 180B rows when each row needs to be processed sequentially.
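
The server itself has no parallel DDL, but a common non-blocking alternative is pt-online-schema-change from Percona Toolkit, which copies rows in chunks into a shadow table and swaps it in with an atomic rename. A hypothetical invocation (database, table, and column names are made up):

  pt-online-schema-change \
    --alter "ADD INDEX idx_created_at (created_at)" \
    --chunk-size=1000 \
    D=metrics,t=events_rolling_90d \
    --execute

On a table holding 180B rows and taking 30k+ writes/sec this may still be impractical, but it at least avoids locking the table for the duration of the change.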

Laurynas Biveinis

AD –

Percona Server 5.1 has the crash-resistant replication feature: https://www.percona.com/doc/percona-server/5.1/reliability/innodb_recovery_update_relay_log.html. Could you clarify: does it fail for you under some scenarios?
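
Per the linked documentation, the feature is controlled by the innodb_recovery_update_relay_log option in Percona Server 5.1. A quick way to confirm it is actually enabled on a slave (the my.cnf setting is shown as a comment):

  # in my.cnf on the slave:
  #   [mysqld]
  #   innodb_recovery_update_relay_log = 1
  mysql -e "SHOW VARIABLES LIKE 'innodb_recovery_update_relay_log'"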

Thanks,
Laurynas

Domas

I once had five minutes of stress when I couldn’t boot up a ~30TB MySQL instance; the errors were way too cryptic 😉
I don’t think we’re using that configuration any more, with regular ~2-3TB machines serving the workload nowadays.

With MySQL one must shard/partition/whatnot large tables because of index locking, so I don’t even see people running a single huge table 😉

Tim Chadwick

This info and the resulting thread are very interesting.

I’d be curious to see this survey continue with two more questions:

How big is the largest single table, and what % of the schema does it represent?

What is the largest number of tables / database artifacts you house in a MySQL instance?
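
Anyone curious about their own numbers can get a rough answer from information_schema (table sizes there are estimates for InnoDB). A sketch:

  mysql -e "
    SELECT table_schema, table_name,
           ROUND((data_length + index_length)/1024/1024/1024, 1) AS size_gb
      FROM information_schema.tables
     ORDER BY (data_length + index_length) DESC
     LIMIT 10;
    SELECT COUNT(*) AS table_count
      FROM information_schema.tables
     WHERE table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');"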

Kristian Köhntopp

– CentOS release 5.6 (Final)
– ProLiant DL380 G7, Processors: 24, Memory: 94.41 GB
– MySQL Community Server (GPL) 5.5.16-log (mysql_55)

Filesystem             Type  Size  Used  Avail  Use%  Mounted on
/dev/mapper/vg00-log   xfs   10G   255M  9.8G   3%    /mysql/logvault/log
/dev/mapper/vg00-data  xfs   40T   39T   1.2T   98%   /mysql/logvault/data

The inert part of this is compressed MyISAM. The data size here is clearly too large to manage, and the log data structure is not well suited for SQL. This is about to become a Hadoop/Hive cluster.

Krishna

2TB for a production report server and still growing (InnoDB).

Kristian Köhntopp

It would be, if it worked properly. Hive and Hadoop have their share of problems, but in general they handle this amount of data orders of magnitude better.

Krishna

Making the application intelligent also helps in managing terabytes of data. For example, when manipulating more than 3 months of data (900M records), the app should automatically offer ‘schedule by email’.

eRadical

Currently we have a new 300G internal-analytics instance that is growing by 3G daily.
This is the entry instance for raw data from ETL scripts.

The machine is a 16GB-RAM, Quad-Core Xeon with HT, 2 x 3TB HDD (Software-RAID 1).
It’s on CentOS 6.3 and I’m using MariaDB Server 5.5.27.

On the same machine there are also:
– an instance replicating our website prod database (too small to count; also MariaDB),
– a TokuDB instance that is computing data from the ETL instance (under 2G of permanent data).

The oddest thing is the size of computed data: 250MB… so far.

AD

Laurynas,
Thanks – I must have misread the documentation. For some reason I had thought that crash-resistant replication was in 5.5.x only.

Peter,
Re: the compressed incremental backups with the latest xtrabackup, I’ll double-check if it’s working now and report back.

Domas,
Re: your point about sharding: well, yes, true. We are right on the cusp of whether to shard or not. Sharding will greatly increase application complexity, and at the rate high-IOPS storage costs (FusionIO, Virident, even commodity OCZ) are plummeting, I suspect our needs are going to be met by a largish single-node MySQL for quite a few more years, allowing us to postpone the sharding decision.

Thanks,
AD

Jouni "rautamiekka" Järvinen

Our MySQL 5.5.28-1~dotdeb.0-log on 64-bit Debian 6 for our Minecraft server is running a 2.9GB database of 71 tables, most of which are InnoDB. It works fine as far as I’ve configured it, and I’m not awesome at that.

Neda

Hi Peter,
I have designed and replicated a MySQL instance as a slave for archiving our daily transactions for 6 months; we are using this server for some reporting activities as well. It has taken up around 700G so far.

Expired data is deleted every night after working hours. We started with 5 million transactions per day, and now I’m upgrading it to store around 9 million transactions every day (the extra transactions are fetched with ETL from raw data sent by our partners).

I’ve used the InnoDB engine and I have partitioned the big tables; they are around 300G now, and each table contains several data files.

vijay

Currently we are storing nearly 1TB of data on a MySQL instance. Is it possible to store 10TB on a single instance without impacting performance?

phani

We are planning to set up a system in MySQL.
The data would be around 16TB, and there is a chance it will grow further.
Kindly recommend the best hardware and capacity solutions.

Hasan Ovuc

2TB InnoDB on Percona MySQL 5.5 and still growing. No big problems for now. Just using mysqldump is almost impossible. I hope no one wants to add a new key to it. I am using innobackupex for backups.

masood

@Hasan Ovuc how long does your xtrabackup take to back up and restore?

Hasan Ovuc

@masood I gave up on taking backups with xtrabackup. It was taking too long. Now the database is about 4TB. I automatically take snapshot backups on a storage system.

Hasan Ovuc

I used innobackupex on a 1-gigabit network. It takes less than 1 day.

abhishekdb

I have a few questions; please take time to answer them.

We are planning to integrate around 250G of data into our main production server, which contains our user information. Our plan is to have the complete user information in one database. Below is the current configuration.

1. How much RAM do we need to request? (Currently we have 45G, which is fully occupied.)
2. How much disk space do we need to request? (MySQL data dir: 480G, total disk space allocated: 650G)

Memory utilization: 45G (99% used), active memory: 36G (max), memory in use: 45G (including idle connections)
Data dir: 480G
Disk allocated: 645G
Full backup size: 121G (with compression), incremental backup: 1.2G (per day)
Data growth per month: 4G

abhishekdb

Can MySQL handle 1TB of data where queries per second will be around 1500, with heavy writes?