October 21, 2014

Working with large data sets in MySQL

What does working with large data sets in mySQL teach you ? Of course you have to learn a lot about query optimization, art of building summary tables and tricks of executing queries exactly as you want. I already wrote about development and configuration side of the problem so I will not go to details […]

MySQL & OpenStack: How to overcome issues as your dataset grows

MySQL is the database of choice for most OpenStack components (Ceilometer is a notable exception). If you start with a small deployment, it will probably run like a charm. But as soon as the dataset grows, you will suddenly face several challenges. We will write a series of blog posts explaining the issues you may […]

Sample datasets for benchmarking and testing

Sometimes you just need some data to test and stress things. But randomly generated data is awful — it doesn’t have realistic distributions, and it isn’t easy to understand whether your results are meaningful and correct. Real or quasi-real data is best. Whether you’re looking for a couple of megabytes or many terabytes, the following […]

Missing Data – rows used to generate result set

As Baron writes it is not the number of rows returned by the query but number of rows accessed by the query will most likely be defining query performance. Of course not all row accessed are created equal (such as full table scan row accesses may be much faster than random index lookups row accesses […]

Rackspace doubling-down on open-source databases, Percona Server

Founded in 1998, Rackspace has evolved over the years to address the way customers are using data – and more specifically, databases. The San Antonio-based company is fueling the adoption of cloud computing among organizations large and small. Today Rackspace is doubling down on open source database technologies. Why? Because that’s where the industry is […]

MySQL compression: Compressed and Uncompressed data size

MySQL has information_schema.tables that contain information such as “data_length” or “avg_row_length.” Documentation on this table however is quite poor, making an assumption that those fields are self explanatory – they are not when it comes to tables that employ compression. And this is where inconsistency is born. Lets take a look at the same table […]

Generating test data from the mysql> prompt

There are a lot of tools that generate test data.  Many of them have complex XML scripts or GUI interfaces that let you identify characteristics about the data. For testing query performance and many other applications, however, a simple quick and dirty data generator which can be constructed at the MySQL command line is useful. […]

Galera data on Percona Cloud Tools (and other MySQL monitoring tools)

I was talking with a Percona Support customer earlier this week who was looking for Galera data on Percona Cloud Tools. (Percona Cloud Tools, now in free beta, is a hosted service providing access to query performance insights for all MySQL uses.) The customer mentioned they were already keeping track of some Galera stats on Cacti, and […]

OpenStack’s Trove: The benefits of this database as a service (DBaaS)

In a previous post, my colleague Dimitri Vanoverbeke discussed at a high level the concepts of database as a service (DBaaS), OpenStack and OpenStack’s implementation of a DBaaS, Trove. Today I’d like to delve a bit further into Trove and discuss where it fits in, and who benefits. Just to recap, Trove is OpenStack’s implementation […]

A closer look at the MySQL ibdata1 disk space issue and big tables

A recurring and very common customer issue seen here at the Percona Support team involves how to make the ibdata1 file “shrink” within MySQL. I can only imagine there’s a degree of regret by some of the InnoDB architects on their design decisions regarding disk-space management by the shared tablespace* because this has been a big […]