May 06, 2011 |
MySQL
On April 1st, the Department of Computer Science at Rutgers University, where I am a professor, held an open house. I gave a talk called “Elephants on a Trapeze: Keeping Big Data Agile”. The talk is an introduction to performance issues related to big data without getting too technical. You’ll have to decide if I […]
Apr 08, 2011 |
MySQL
Why do B-trees need “Tricks” to work? Marko Mäkelä recently posted a couple of “tips and tricks” you can use to improve InnoDB performance. Tips and tricks. A general purpose relational database like MySQL shouldn’t need “tips and tricks” to perform well, and I lay the blame on design choices that were made in the […]
Apr 07, 2011 |
MySQL
Hot Column Addition and Deletion (HCAD) In the previous HCAD post, I described HCAD and showed that it can reduce the downtime of column addition (or deletion) from 18 hours to 3 seconds. In fact, the downtime of InnoDB is proportional to the size of the database, whereas the downtime for TokuDB 5.0 depends on […]
Apr 05, 2011 |
MySQL
From 31 minutes to 2 seconds Hot Indexing Overview TokuDB v5.0 introduces several features that are new to the MySQL world. Recently, we posted on HCAD: Hot Column Addition and Deletion. In this post, we talk about Hot Indexing. What happens when you try to add a new index, as follows?
mysql> create index example_idx on example_tbl (example_field);
In standard MySQL […]
Mar 30, 2011 |
MySQL
From 18 hours to 3 seconds! Hot Column Addition and Deletion (HCAD) Overview TokuDB v5.0 introduces several features that are new to the MySQL world. In this series of posts, we’re going to present some information on these features: what’s the feature, how does it work under the hood, and how do you get the […]
Mar 11, 2011 |
MySQL
In Part 1 and Part 2 of this series, I presented some thoughts on partitioning. I heard some great feedback on why people use partitioning. Here, I present a flow chart that summarizes what I’ve learned. In summary: with TokuDB in the picture there’s almost no reason to use partitioning. Or I should say, there […]
Jan 28, 2011 |
MySQL
Review In part one, I presented a very brief and particular view of partitioning. I covered what partitioning is, with hardly a mention of why one would use partitioning. In this post, I’ll talk about a few use cases often cited as justification for using partitions. Lots of disks → Lots of partitioning of tables […]
Jan 21, 2011 |
MySQL
Why Partition a Database? Partitioning is a commonly touted method for achieving performance in MySQL and other databases. (See here, here, here and many other examples.) I started wondering where the performance from partitions comes from, and I’ve summarized some of my thoughts here. But first, what is partitioning? (I’ve taken the examples from Giuseppe […]
Nov 17, 2010 |
MySQL
Summary B-trees suffer from fragmentation. Fragmentation causes headaches in query performance and space used. Solutions like dump and reload or OPTIMIZE TABLE are a pain and not always effective. Fractal trees don’t fragment. So if fragmentation is a problem, check out Tokutek. What is fragmentation? What do I mean when I say “fragmentation”? People […]
Aug 15, 2010 |
MySQL
Tokutek is pleased to announce immediate availability of TokuDB for MySQL, version 4.1. It is designed for continuous querying and analysis of large volumes of rapidly arriving and changing data, while maintaining full ACID properties. TokuDB v4.1 includes important improvements, most notably support for SAVEPOINT and improved Fast Loader performance (introduced in v4.0). […]
Aug 06, 2009 |
MySQL
Tokutek® announces the release of the TokuDB storage engine for MySQL®, version 2.1.0. This release offers the following improvements over our previous release: Faster indexing of sequential keys. Faster bulk loads on tables with auto-increment fields. Faster range queries in some circumstances. Added support for InnoDB. Upgraded from MySQL 5.1.30 to 5.1.36. Fixed […]
Apr 28, 2009 |
MySQL
Shlomi Noach recently wrote a useful primer on the depth of B-trees and how that plays out for point queries, in both clustered indexes, like InnoDB, and in unclustered indexes, like MyISAM. Here, I’d like to talk about the effect of B-tree depth on insertions and range queries. And, of course, I’ll talk about […]
Apr 04, 2008 |
MySQL
Recall that I’ve claimed that it takes 28 years to fill a disk with random insertions, given a set of reasonable assumptions. Recall what they are: We are focusing on the storage engine (a la MySQL) level, and we are looking at a database on a single disk — the one we are using for […]
Mar 11, 2008 |
MySQL
I’ve been waving my hands about lower bounds. Well, sometimes I haven’t been waving my hands, because the lower bounds are tight. But in other cases (lenient insertions, range queries), the lower bounds are very far from what we’re used to. So now, for a bit of math: Brodal and Fagerberg showed in 2003 that […]
Mar 04, 2008 |
MySQL
Sorry for the delay, now on to range queries and lenient updates. Let’s call them queries and updates, for short. So far, I’ve shown that B-trees (and any of a number of other data structures) are very far from the “tight bound.” I’ll say a bound is tight if it’s a lower bound and […]
Feb 11, 2008 |
MySQL
Last time, I introduced the notion of strict and lenient updates. Now it’s time to see what the performance characteristics are of each. Just to rehash, we are focusing on the storage engine (a la MySQL) level, and we are looking at a database on a single disk — the one we are using for […]
Feb 05, 2008 |
MySQL
So far, I’ve analyzed point and range queries. Now it’s time to talk about insertions and deletions. We’ll call the combination updates. Updates come in two flavors, and today we’ll cover both. Depending on the exact settings of your database, the updates give a varying amount of feedback. For example, when a key is deleted, […]
Jan 21, 2008 |
MySQL
Last time I talked about point queries. The conclusion was that big databases and point queries don’t mix. It’s ok to do them from time to time, but it’s not how you’re going to use your database, unless you have a lot of time. Today, I’d like to talk about range queries, which seem much […]