What is the largest amount of data you store in MySQL?

My previous poll got a great response, so I thought I should continue the series.
The question of the day today is: how much data do you store in your largest MySQL instance? I’m interested in a single instance, not the total amount of data you have in MySQL across your sharded, replicated environment. Feel free to share details in the comments – are you using MyISAM or Innodb? Is it working out well for you, or are there challenges you’re seeing?
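
If you are not sure of the number off-hand, one rough way to check is to add up the data and index sizes MySQL reports in information_schema (the figures are estimates, especially for InnoDB, but close enough for a poll):

-- Approximate total data + index size for the instance, in GB
SELECT ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 1) AS total_gb
FROM information_schema.tables;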

What is the largest data size for a single MySQL instance you use in production?

  • 100GB to 1TB (34%, 298 Votes)
  • 10GB to 100GB (25%, 226 Votes)
  • 1GB to 10GB (16%, 145 Votes)
  • 1TB to 10TB (15%, 134 Votes)
  • less than 1GB (7%, 60 Votes)
  • more than 10TB (3%, 25 Votes)

Total Voters: 888

Thank you for your participation!

Comments (28)

  • brad voth

    15TB… painful to manage

    November 9, 2012 at 2:04 pm
  • Peter Zaitsev

    Brad,

    Thanks for the feedback. What is the biggest pain? MyISAM or Innodb?

    November 9, 2012 at 2:28 pm
  • AD

    We’ve got 16TB and the hardest part is backups. Xtrabackup works well, but incrementals do not work with compressed Xtrabackup backups. We had a bug open for this. (We have Percona Gold support.)

    So, if we can get (a) compressed xtrabackup, (b) compressed incrementals and (c) parallel recovery (or, to put it more simply, a way to parallelize the restore such that the entire I/O capacity of the disk subsystem is utilized), we’d be much happier.

    Thanks,
    AD

    November 9, 2012 at 3:13 pm
  • William

    I’m sort of bummed out that the last category is from 100 GB to 1 TB. I’d like to see a more granular breakdown of that segment.

    November 9, 2012 at 3:25 pm
  • AD

    A few other pain points with large MySQL DBs (16TB):

    – Large DBs imply large tables, so DDLs are too slow. We are using Percona 5.1.6x builds, so online index additions would be ideal. Not sure if 5.5.x or 5.6.x already addresses this (see the sketch after this list).

    – Large DBs also come with large buffer pool sizes, and that implies very slow shutdown and startup (we have fast-shutdown disabled; all that seems to do is shift the slowness from shutdown to startup). I am thinking there is no easy solution here.

    – A simpler way to distribute some DBs to a different mount point. We have to use symlinks to point some DBs to other places. Instead, being able to do something like “create database new_db using db_file_space ‘/new_mount_point/mysql_data_dir’;” would be nice.

    – Backport crash-safe replication to 5.1.6x 🙂 Yes, we want to get to 5.5, but now that 5.6.x is around the corner, we have put the 5.5.x migration on hold and are planning to go straight to 5.6.x.
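
    For illustration, MySQL 5.6 does address the first point, and at the table level the third one, with online DDL and per-table DATA DIRECTORY support. A rough sketch of the syntax (table, index and path names here are made up):

    -- Add a secondary index without blocking writes (MySQL 5.6+, InnoDB)
    ALTER TABLE orders ADD INDEX idx_created_at (created_at),
      ALGORITHM=INPLACE, LOCK=NONE;

    -- Place a new table's tablespace on a different mount point
    -- (requires innodb_file_per_table)
    CREATE TABLE archive_2012 (
      id BIGINT UNSIGNED NOT NULL PRIMARY KEY,
      payload BLOB
    ) ENGINE=InnoDB
      DATA DIRECTORY = '/new_mount_point/mysql_data';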

    Thanks,
    AD

    November 9, 2012 at 3:46 pm
  • Peter Zaitsev

    AD,

    Thanks for the feedback. A number of compressed backup issues are fixed in Xtrabackup 2.0. Do you have some specific unresolved issue? Please feel free to open a support request, or let me know if there is an unresolved support issue and I will investigate it.

    November 9, 2012 at 5:28 pm
  • Brad voth

    Our main pain point is that we have a single table that is ~12+TB. This is a rolling 90 days’ worth of data, write-mostly, but it needs to be available for immediate queries if the need presents itself. This is an InnoDB table. The main issue is that with a table this large, any table changes are out of the question, as are backups, etc. We write at a rate of 30k+ rows/sec into this table.

    If DDL statements were able to be run in a multithreaded fashion through some sort of divide-and-conquer methodology, that would help immensely. It’s just about impossible to run any sort of DDL on a table with 180B rows when each row needs to be processed sequentially.
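
    One common workaround for this kind of rolling-window table (not necessarily workable at Brad's write rate; the schema below is entirely made up) is to partition by day, so that expiring old data becomes a quick DROP PARTITION rather than a huge DELETE or table rebuild:

    -- Hypothetical rolling 90-day table, partitioned by day
    CREATE TABLE events (
      id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
      created_at DATETIME NOT NULL,
      payload VARBINARY(255),
      PRIMARY KEY (id, created_at)
    ) ENGINE=InnoDB
    PARTITION BY RANGE (TO_DAYS(created_at)) (
      PARTITION p20121101 VALUES LESS THAN (TO_DAYS('2012-11-02')),
      PARTITION p20121102 VALUES LESS THAN (TO_DAYS('2012-11-03')),
      PARTITION pmax      VALUES LESS THAN MAXVALUE
    );

    -- Expiring a day of data is then a fast metadata operation
    ALTER TABLE events DROP PARTITION p20121101;

    Domas makes a similar point about partitioning large tables further down the thread.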

    November 9, 2012 at 8:05 pm
  • Laurynas Biveinis

    AD –

    Percona Server 5.1 has the crash-resistant replication feature: https://www.percona.com/doc/percona-server/5.1/reliability/innodb_recovery_update_relay_log.html. Could you clarify, does it fail for you under some scenarios?

    Thanks,
    Laurynas

    November 10, 2012 at 3:52 am
  • Domas

    I once had five minutes of stress when I couldn’t boot up a ~30TB MySQL instance; the errors were way too cryptic 😉
    I don’t think we’re using that configuration any more, with regular ~2-3TB machines serving the workload nowadays.

    With MySQL one must shard/partition/whatnot large tables because of index locking, so I don’t even see people running a single huge table 😉

    November 10, 2012 at 8:41 am
  • Tim Chadwick

    This info and the resulting thread are very interesting.

    I’d be curious to see this survey continue with two more questions:

    How big is the largest single table, and what % of the schema does it represent?

    What is the largest number of tables / database artifacts you house in a MySQL instance?
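
    In the meantime, both can be approximated from information_schema (the size figures are rough estimates, especially for InnoDB); one possible sketch:

    -- Largest single table and its share of the instance's total data + index size
    SELECT table_schema, table_name,
           ROUND((data_length + index_length) / 1024 / 1024 / 1024, 1) AS size_gb,
           ROUND(100 * (data_length + index_length) /
                 (SELECT SUM(data_length + index_length)
                  FROM information_schema.tables), 1) AS pct_of_instance
    FROM information_schema.tables
    ORDER BY data_length + index_length DESC
    LIMIT 1;

    -- Number of user tables in the instance
    SELECT COUNT(*) AS user_tables
    FROM information_schema.tables
    WHERE table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');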

    November 10, 2012 at 12:22 pm
  • Peter Zaitsev

    Tim,

    I have 20 different poll ideas already, which I will be posting every so often so you people do not get bored 🙂

    November 10, 2012 at 1:23 pm
  • Kristian Köhntopp

    – CentOS release 5.6 (Final)
    – ProLiant DL380 G7, Processors: 24, Memory: 94.41 GB
    – MySQL Community Server (GPL) 5.5.16-log (mysql_55)

    Filesystem             Type  Size  Used  Avail  Use%  Mounted on
    /dev/mapper/vg00-log   xfs    10G  255M   9.8G    3%  /mysql/logvault/log
    /dev/mapper/vg00-data  xfs    40T   39T   1.2T   98%  /mysql/logvault/data

    The inert part of this is compressed MyISAM. The data size here is clearly too large to manage. Also, the log data structure is not well suited for SQL. This is about to become a Hadoop/Hive cluster.

    November 11, 2012 at 3:04 pm
  • Peter Zaitsev

    Kristian,

    Impressive it is….

    November 11, 2012 at 3:20 pm
  • Krishna

    2TB for a production report server and still growing. (InnoDB)

    November 11, 2012 at 10:14 pm
  • Kristian Köhntopp

    It would be, if it worked properly. Hive and Hadoop have their share of problems, but in general they handle this amount of data orders of magnitude better.

    November 12, 2012 at 12:55 am
  • Krishna

    Making the application intelligent also helps in managing terabytes of data. For example, when manipulating data spanning more than 3 months (900M records), the app should automatically offer ‘schedule by email’.

    November 12, 2012 at 2:00 am
  • eRadical

    Currently we have a new 300G instance (a new internal-analytics instance) that is growing daily by 3G.
    This is the entry instance for raw data from ETL scripts.

    The machine is a 16GB-RAM, Quad-Core Xeon with HT, 2 x 3TB HDD (Software-RAID 1).
    It’s on CentOS 6.3 and I’m using MariaDB Server 5.5.27.

    On the same machine there are also:
    – an instance replicating our website prod database (too small to count – also MariaDB),
    – a TokuDB instance that is computing data from the ETL instance (under 2G of permanent data).

    The oddest thing is the size of computed data: 250MB… so far.

    November 12, 2012 at 3:44 am
  • AD

    Laurynas,
    Thanks – I must have misread the documentation. For some reason I had thought that crash-resistant replication was in 5.5.x only.

    Peter,
    Re: the compressed incremental backups with the latest xtrabackup, I’ll double check if it’s working now and report back.

    Domas,
    Re: your point about sharding: well, yes, true. We are right on the cusp of whether to shard or not. Sharding will greatly increase application complexity, and at the rate at which high-IOPS (FusionIO, Virident, even commodity OCZ) storage costs are plummeting, I suspect our needs are going to be met by a largish single-node MySQL for quite a few more years, allowing us to postpone the sharding decision.

    Thanks,
    AD

    November 12, 2012 at 12:17 pm
  • Jouni "rautamiekka" Järvinen

    Our MySQL 5.5.28-1~dotdeb.0-log on 64-bit Debian 6 for our Minecraft server is running a 2.9GB database of 71 tables, where most tables are InnoDB. Works fine as far as I’ve configured it, and I’m not awesome at that.

    November 15, 2012 at 8:20 am
  • Neda

    Hi Peter,
    I have designed and replicated a MySQL instance as a slave for archiving our daily transactions for 6 months; we are using this server for some reporting activities as well. It has taken around 700G up to now.

    Expired data is deleted every night after working hours. We started with 5 million transactions per day, and now I’m upgrading it to store around 9 million transactions every day (the extra transactions are fetched with ETL from raw data sent by our partners).

    I’ve used the InnoDB engine and I have partitioned the big tables; they are around 300G now, and each table consists of several data files.
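
    As an aside, a quick way to see how such an archive table's data is spread across its partitions (the schema and table names here are hypothetical, and the figures are estimates):

    -- Approximate size and row count per partition
    SELECT partition_name,
           table_rows,
           ROUND(data_length / 1024 / 1024 / 1024, 1) AS data_gb
    FROM information_schema.partitions
    WHERE table_schema = 'archive' AND table_name = 'transactions'
    ORDER BY partition_ordinal_position;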

    June 9, 2014 at 3:21 am
  • vijay

    Currently we are storing nearly 1TB of data on a MySQL instance. Is it possible to store 10TB on a single instance without impacting performance?

    July 15, 2015 at 5:11 am
  • phani

    We are planning to set up a system in MySQL.
    The data would be around 16TB, and there is a chance of it growing further.
    Kindly recommend the best hardware and capacity solutions.

    October 4, 2016 at 2:03 am
  • Hasan Ovuc

    2TB of InnoDB on Percona MySQL 5.5 and still growing. No big problems for now, but using mysqldump is almost impossible. I hope no one wants to add a new key on it. I am using innobackupex for backups.

    February 24, 2017 at 5:59 am
  • masood

    @Hasan Ovuc how long does your xtrabackup take to back up and restore?

    June 5, 2017 at 11:03 am
    • Hasan Ovuc

      @masood I gave up taking backups with xtrabackup. It was taking too long. Now the database is about 4TB. I automatically take snapshot backups on a storage system.

      June 28, 2018 at 9:35 am
  • Hasan Ovuc

    I used innobackupex on a 1-Gigabit network. It takes less than a day.

    June 23, 2017 at 8:29 am
  • abhishekdb

    I have a few questions; please take the time to answer them.

    We are planning to integrate around 250G of data into our main production server, which contains our user information. Our plan is to have the complete user information in one database. Below is the current configuration.

    1. How much RAM do we need to request? (Currently we have 45G, which is fully occupied.)
    2. How much disk space do we need to request? (MySQL data dir: 480G; total disk space allocated: 650G.)

    Memory utilization: 45G (99% used); active memory: 36G (max); memory in use: 45G (along with idle connections)
    Data dir: 480G
    Disk allocated: 645G
    Full backup size: 121G (with compression); incremental backup: 1.2G per day
    Data growth per month: 4G

    June 26, 2018 at 5:53 am
  • abhishekdb

    Can MySQL handle 1TB of data where queries per second will be around 1,500, with heavy writes?

    June 26, 2018 at 6:33 am
