What is the largest amount of data you store in MySQL?

My previous poll got a great response, so I thought I should continue the series.
The question of the day today is: how much data do you store in your largest MySQL instance? I’m interested in a single instance, not the total amount of data you have in MySQL across your sharded, replicated environment. Feel free to share details in the comments – are you using MyISAM or Innodb? Is it working out well for you, or are there challenges you’re seeing?
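
If you are not sure of the number off-hand, one rough way to check is to add up the data and index sizes MySQL reports in information_schema (the figures are estimates, especially for InnoDB, but close enough for a poll):

-- Approximate total data + index size for the instance, in GB
SELECT ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 1) AS total_gb
FROM information_schema.tables;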

What is the largest data size for a single MySQL instance you use in production?

  • 100GB to 1TB (34%, 298 Votes)
  • 10GB to 100GB (25%, 226 Votes)
  • 1GB to 10GB (16%, 145 Votes)
  • 1TB to 10TB (15%, 134 Votes)
  • less than 1GB (7%, 60 Votes)
  • more than 10TB (3%, 25 Votes)

Total Voters: 888

Thank you for your participation!

Comments (28)

  • brad voth

    15TB… painful to manage

    November 9, 2012 at 2:04 pm
  • Peter Zaitsev

    Brad,

    Thanks for the feedback. What is the biggest pain? MyISAM or Innodb?

    November 9, 2012 at 2:28 pm
  • AD

    We’ve got 16TB and the hardest part is backups. Xtrabackup works well, but incrementals do not work with compressed Xtrabackup backups. We had a bug open for this. (We have Percona Gold support.)

    So, if we can get (a) compressed xtrabackup, (b) compressed incrementals and (c) parallel recovery (or, to put it more simply, a way to parallelize the restore such that the entire I/O capacity of the disk subsystem is utilized), we’d be much happier.

    Thanks,
    AD

    November 9, 2012 at 3:13 pm
  • William

    I’m sort of bummed out that the last category is from 100 GB to 1 TB. I’d like to see a more granular breakdown of that segment.

    November 9, 2012 at 3:25 pm
  • AD

    A few other pain points with large MySQL DBs (16TB):

    – Large DBs imply large tables, so DDLs are too slow. We are using Percona 5.1.6x builds, so online index additions would be ideal. Not sure if 5.5.x or 5.6.x already addresses this (see the sketch after this list).

    – Large DBs also come with large buffer pool sizes, and that implies very slow shutdown and startup (we have fast-shutdown disabled; all that seems to do is shift the slowness from shutdown to startup). I am thinking there is no easy solution here.

    – A simpler way to distribute some DBs to a different mount point. We have to use symlinks to point some DBs to other places. Instead, being able to do something like “create database new_db using db_file_space ‘/new_mount_point/mysql_data_dir’;” would be nice.

    – Backport crash-safe replication to 5.1.6x 🙂 Yes, we want to get to 5.5, but now that 5.6.x is around the corner, we have put the 5.5.x migration on hold and are planning to go straight to 5.6.x.
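
    For illustration, MySQL 5.6 does address the first point, and at the table level the third one, with online DDL and per-table DATA DIRECTORY support. A rough sketch of the syntax (table, index and path names here are made up):

    -- Add a secondary index without blocking writes (MySQL 5.6+, InnoDB)
    ALTER TABLE orders ADD INDEX idx_created_at (created_at),
      ALGORITHM=INPLACE, LOCK=NONE;

    -- Place a new table's tablespace on a different mount point
    -- (requires innodb_file_per_table)
    CREATE TABLE archive_2012 (
      id BIGINT UNSIGNED NOT NULL PRIMARY KEY,
      payload BLOB
    ) ENGINE=InnoDB
      DATA DIRECTORY = '/new_mount_point/mysql_data';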

    Thanks,
    AD

    November 9, 2012 at 3:46 pm
  • Peter Zaitsev

    AD,

    Thanks for the feedback. A number of compressed backup issues are fixed in Xtrabackup 2.0. Do you have some specific unresolved issue? Please feel free to open a support request, or let me know if there is an unresolved support issue and I will investigate it.

    November 9, 2012 at 5:28 pm
  • Brad voth

    Our main pain point is that we have a single table that is ~12+TB. This is a rolling 90 days’ worth of data, write-mostly, but it needs to be available for immediate queries if the need presents itself. This is an InnoDB table. The main issue is that with a table this large, any table changes are out of the question, as are backups, etc. We write at a rate of 30k+ rows/sec into this table.

    If DDL statements were able to be run in a multithreaded fashion through some sort of divide-and-conquer methodology, that would help immensely. It’s just about impossible to run any sort of DDL on a table with 180B rows when each row needs to be processed sequentially.
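
    One common workaround for this kind of rolling-window table (not necessarily workable at Brad's write rate; the schema below is entirely made up) is to partition by day, so that expiring old data becomes a quick DROP PARTITION rather than a huge DELETE or table rebuild:

    -- Hypothetical rolling 90-day table, partitioned by day
    CREATE TABLE events (
      id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
      created_at DATETIME NOT NULL,
      payload VARBINARY(255),
      PRIMARY KEY (id, created_at)
    ) ENGINE=InnoDB
    PARTITION BY RANGE (TO_DAYS(created_at)) (
      PARTITION p20121101 VALUES LESS THAN (TO_DAYS('2012-11-02')),
      PARTITION p20121102 VALUES LESS THAN (TO_DAYS('2012-11-03')),
      PARTITION pmax      VALUES LESS THAN MAXVALUE
    );

    -- Expiring a day of data is then a fast metadata operation
    ALTER TABLE events DROP PARTITION p20121101;

    Domas makes a similar point about partitioning large tables further down the thread.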

    November 9, 2012 at 8:05 pm
  • Laurynas Biveinis

    AD –

    Percona Server 5.1 has the crash-resistant replication feature: https://www.percona.com/doc/percona-server/5.1/reliability/innodb_recovery_update_relay_log.html. Could you clarify, does it fail for you under some scenarios?

    Thanks,
    Laurynas

    November 10, 2012 at 3:52 am
  • Domas

    I once had five minutes of stress when I couldn’t boot up a ~30TB MySQL instance; the errors were way too cryptic 😉
    I don’t think we’re using that configuration any more, with regular ~2-3TB machines serving the workload nowadays.

    With MySQL one must shard/partition/whatnot large tables because of index locking, so I don’t even see people running a single huge table 😉

    November 10, 2012 at 8:41 am
  • Tim Chadwick

    This info and the resulting thread are very interesting.

    I’d be curious to see this survey continue with two more questions:

    How big is the largest single table, and what % of the schema does it represent?

    What is the largest number of tables / database artifacts you house in a MySQL instance?
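
    In the meantime, both can be approximated from information_schema (the size figures are rough estimates, especially for InnoDB); one possible sketch:

    -- Largest single table and its share of the instance's total data + index size
    SELECT table_schema, table_name,
           ROUND((data_length + index_length) / 1024 / 1024 / 1024, 1) AS size_gb,
           ROUND(100 * (data_length + index_length) /
                 (SELECT SUM(data_length + index_length)
                  FROM information_schema.tables), 1) AS pct_of_instance
    FROM information_schema.tables
    ORDER BY data_length + index_length DESC
    LIMIT 1;

    -- Number of user tables in the instance
    SELECT COUNT(*) AS user_tables
    FROM information_schema.tables
    WHERE table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');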

    November 10, 2012 at 12:22 pm
  • Peter Zaitsev

    Tim,

    I have 20 different poll ideas already, which I will be posting every so often so you people do not get bored 🙂

    November 10, 2012 at 1:23 pm
  • Kristian Köhntopp

    – CentOS release 5.6 (Final)
    – ProLiant DL380 G7, Processors: 24, Memory: 94.41 GB
    – MySQL Community Server (GPL) 5.5.16-log (mysql_55)

    Filesystem             Type  Size  Used  Avail  Use%  Mounted on
    /dev/mapper/vg00-log   xfs    10G  255M   9.8G    3%  /mysql/logvault/log
    /dev/mapper/vg00-data  xfs    40T   39T   1.2T   98%  /mysql/logvault/data

    The inert part of this is compressed MyISAM. The data size here is clearly too large to manage. Also, the log data structure is not well suited for SQL. This is about to become a Hadoop/Hive cluster.

    November 11, 2012 at 3:04 pm
  • Peter Zaitsev

    Kristian,

    Impressive it is….

    November 11, 2012 at 3:20 pm
  • Krishna

    2TB for a production report server and still growing. (InnoDB)

    November 11, 2012 at 10:14 pm
  • Kristian Köhntopp

    It would be, if it worked properly. Hive and Hadoop have their share of problems, but in general they handle this amount of data orders of magnitude better.

    November 12, 2012 at 12:55 am
  • Krishna

    Making the application intelligent also helps in managing terabytes of data. For example, when manipulating data spanning more than 3 months (900M records), the app should automatically offer ‘schedule by email’.

    November 12, 2012 at 2:00 am
  • eRadical

    Currently we have a new 300G instance (a new internal-analytics instance) that is growing daily by 3G.
    This is the entry instance for raw data from ETL scripts.

    The machine is a 16GB-RAM, Quad-Core Xeon with HT, 2 x 3TB HDD (Software-RAID 1).
    It’s on CentOS 6.3 and I’m using MariaDB Server 5.5.27.

    On the same machine there are also:
    – an instance replicating our website prod database (too small to count – also MariaDB),
    – a TokuDB instance that is computing data from the ETL instance (under 2G of permanent data).

    The oddest thing is the size of computed data: 250MB… so far.

    November 12, 2012 at 3:44 am
  • AD

    Laurynas,
    Thanks – I must have misread the documentation. For some reason I had thought that crash-resistant replication was in 5.5.x only.

    Peter,
    Re: the compressed incremental backups with the latest xtrabackup, I’ll double check if it’s working now and report back.

    Domas,
    Re: your point about sharding: well, yes, true. We are right on the cusp of whether to shard or not. Sharding will greatly increase application complexity, and at the rate at which high-IOPS (FusionIO, Virident, even commodity OCZ) storage costs are plummeting, I suspect our needs are going to be met by a largish single-node MySQL for quite a few more years, allowing us to postpone the sharding decision.

    Thanks,
    AD

    November 12, 2012 at 12:17 pm
  • Jouni "rautamiekka" Järvinen

    Our MySQL 5.5.28-1~dotdeb.0-log on 64-bit Debian 6 for our Minecraft server is running a 2.9GB database of 71 tables, where most tables are InnoDB. Works fine as far as I’ve configured it, and I’m not awesome at that.

    November 15, 2012 at 8:20 am
  • Neda

    Hi Peter,
    I have designed and replicated a MySQL instance as a slave for archiving our daily transactions for 6 months; we are using this server for some reporting activities as well. It has taken around 700G up to now.

    Expired data is deleted every night after working hours. We started with 5 million transactions per day, and now I’m upgrading it to store around 9 million transactions every day (the extra transactions are fetched with ETL from raw data sent by our partners).

    I’ve used the InnoDB engine and I have partitioned the big tables; they are around 300G now, and each table consists of several data files.
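
    As an aside, a quick way to see how such an archive table's data is spread across its partitions (the schema and table names here are hypothetical, and the figures are estimates):

    -- Approximate size and row count per partition
    SELECT partition_name,
           table_rows,
           ROUND(data_length / 1024 / 1024 / 1024, 1) AS data_gb
    FROM information_schema.partitions
    WHERE table_schema = 'archive' AND table_name = 'transactions'
    ORDER BY partition_ordinal_position;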

    June 9, 2014 at 3:21 am
  • vijay

    Currently we are storing nearly 1TB of data on a MySQL instance. Is it possible to store 10TB on a single instance without impacting performance?

    July 15, 2015 at 5:11 am
  • phani

    We are planning to set up a system in MySQL.
    The data would be around 16TB, and there is a chance of it growing further.
    Kindly recommend the best hardware and capacity solutions.

    October 4, 2016 at 2:03 am
  • Hasan Ovuc

    2TB of InnoDB on Percona MySQL 5.5 and still growing. No big problems for now, but using mysqldump is almost impossible. I hope no one wants to add a new key on it. I am using innobackupex for backups.

    February 24, 2017 at 5:59 am
  • masood

    @Hasan Ovuc how long does your xtrabackup take to back up and restore?

    June 5, 2017 at 11:03 am
    • Hasan Ovuc

      @masood I gave up taking backups with xtrabackup. It was taking too long. Now the database is about 4TB. I automatically take snapshot backups on a storage system.

      June 28, 2018 at 9:35 am
  • Hasan Ovuc

    I used innobackupex on a 1-Gigabit network. It takes less than a day.

    June 23, 2017 at 8:29 am
  • abhishekdb

    I have a few questions; please take the time to answer them.

    We are planning to integrate around 250G of data into our main production server, which contains our user information. Our plan is to have the complete user information in one database. Below is the current configuration.

    1. How much RAM do we need to request? (Currently we have 45G, which is fully occupied.)
    2. How much disk space do we need to request? (MySQL data dir: 480G; total disk space allocated: 650G.)

    Memory utilization: 45G (99% used); active memory: 36G (max); memory in use: 45G (along with idle connections)
    Data dir: 480G
    Disk allocated: 645G
    Full backup size: 121G (with compression); incremental backup: 1.2G per day
    Data growth per month: 4G

    June 26, 2018 at 5:53 am
  • abhishekdb

    Can MySQL handle 1TB of data where queries per second will be around 1,500, with heavy writes?

    June 26, 2018 at 6:33 am
