Search Results for: data set

MySQL & OpenStack: How to overcome issues as your dataset grows

MySQL is the database of choice for most OpenStack components (Ceilometer is a notable exception). If you start with a small deployment, it will probably run like a charm. But as soon as the dataset grows, you will suddenly face several challenges. We will write a series of blog posts explaining the issues you may […]

Sample datasets for benchmarking and testing

Sometimes you just need some data to test and stress things. But randomly generated data is awful — it doesn’t have realistic distributions, and it isn’t easy to understand whether your results are meaningful and correct. Real or quasi-real data is best. Whether you’re looking for a couple of megabytes or many terabytes, the following […]

The cost of not properly managing your databases

Every day hundreds of millions of dollars are wasted by allowing improperly tuned or misconfigured systems, misunderstood infrastructure, and inefficient IT operations to live and thrive in data centers around the globe. There are both direct and indirect costs associated with allowing these unhealthy systems to continue to exist. Let’s look at some. The setup: […]

How to setup a PXC cluster with GTIDs (and have async slaves replicating from it!)

This past week was marked by a series of personal findings related to the use of Global Transaction IDs (GTIDs) on Galera-based clusters such as Percona XtraDB Cluster (PXC). The main one being the fact that transactions touching MyISAM tables (and FLUSH PRIVILEGES!) issued on a giving node of the cluster are recorded on a GTID set […]

‘Indexing’ JSON documents for efficient MySQL queries over JSON data

MySQL meets NoSQL with JSON UDF I recently got back from FOSDEM, in Brussels, Belgium. While I was there I got to see a great talk by Sveta Smirnova, about her MySQL 5.7 Labs release JSON UDF functions. It is important to note that while the UDF come in a 5.7 release it is absolutely […]

How a set of queries can be killed in MySQL using Percona Toolkit’s pt-kill

You might have encountered situations where you had to kill some specific select queries that were running for long periods and choking the database. This post will go into more detail with an example of report query offloading. Report query (select) offloading to a slave server is a common practice to reduce the workload of the […]

Data inconsistencies on MySQL replicas: Beyond pt-table-checksum

Percona Toolkit’s pt-table-checksum is a great tool to find data inconsistencies between a MySQL master and its replicas. However it is sometimes not enough to know that there are inconsistencies and let pt-table-sync fix the issue: you may want to know which exact rows are different to identify the statements that created the inconsistency. This […]