On April 9-10 the National Science Foundation hosted the Workshop on the Science of Power Management (SciPM 2009), where I gave an invited talk. Here I give a brief summary of my talk along with a pointer to the slides.
The talk describes how MySQL with TokuDB can provide a path to more energy-efficient database implementations. It’s a theoretical talk. That is, rather than presenting results from an existing implementation, it provides food for thought about future possibilities.
Here’s an executive summary of the talk.
Disks account for a substantial fraction of the power consumed by the computing hardware in a typical database application. Although different workloads and configurations can give very different values, somewhere around 1/3 to 2/3 of the total energy consumed by the computing unit seems like a good ballpark.
Computation is only one part of the power equation for a data center. However, many other components (such as cooling) consume power roughly proportionally to (or at least correlated with) computation. Thus, my analysis of power consumed by computation can give insight into how other parts of the system are affected.
Typically, B-tree-based databases are configured with many small-capacity disks rather than a small number of large-capacity disks. A 120GB disk consumes roughly half the power of a 2TB disk, even though its capacity is about 17 times smaller. Using larger-capacity disks has the potential to reduce the power consumed by disks by as much as an order of magnitude, measured in Watts per GB.
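As a rough sanity check on the order-of-magnitude claim, the arithmetic can be sketched in a few lines of Python. Only the capacities and the "roughly half the power" ratio come from the paragraph above; the absolute wattage is an arbitrary placeholder, since only the ratio matters.

```python
# Capacities and the power ratio are the figures quoted in the post.
SMALL_GB = 120
LARGE_GB = 2000  # 2TB, using decimal gigabytes

# Placeholder unit of power (an assumption); only ratios are meaningful here.
LARGE_WATTS = 1.0
SMALL_WATTS = 0.5 * LARGE_WATTS  # "roughly half the power"


def watts_per_gb(watts, capacity_gb):
    """Power cost of storing one gigabyte on a disk of the given size."""
    return watts / capacity_gb


capacity_ratio = LARGE_GB / SMALL_GB
efficiency_gain = (
    watts_per_gb(SMALL_WATTS, SMALL_GB) / watts_per_gb(LARGE_WATTS, LARGE_GB)
)

print(f"capacity ratio: {capacity_ratio:.1f}x")
print(f"Watts/GB improvement: {efficiency_gain:.1f}x")
```

With these inputs the large disk comes out roughly 8x better in Watts per GB, which is in the neighborhood of an order of magnitude.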
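So why not use large-capacity disks? Well, the performance of B-tree-based storage engines is limited by disk seeks. Increasing the number of spindles drives up the number of random seeks per second in the system, so replacing many small disks with a few large ones takes away exactly the resource that B-trees depend on.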
Fractal-tree-based storage engines (such as TokuDB) scale with bandwidth rather than with disk seeks. (Bandwidth scales as the square root of disk capacity.) Consequently, a well-balanced system based on fractal trees can use fewer larger-capacity disks.
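To see why that trade-off favors a bandwidth-bound engine, here is a minimal sketch comparing two arrays of equal total capacity. It uses the square-root bandwidth rule from the paragraph above and assumes, as a simplification, that every spindle delivers the same number of seeks per second regardless of capacity. The 12,000GB data set size and the function name are hypothetical, chosen only so the disk counts come out as whole numbers.

```python
import math

SMALL_GB = 120
LARGE_GB = 2000   # 2TB, using decimal gigabytes
TOTAL_GB = 12000  # hypothetical data set size


def array_profile(total_gb, disk_gb):
    """Return (disk count, relative seeks/s, relative bandwidth) for an array.

    Assumptions: each spindle contributes one unit of seeks per second, and
    per-disk bandwidth is proportional to sqrt(capacity), as stated in the post.
    """
    n_disks = math.ceil(total_gb / disk_gb)
    seeks = n_disks * 1.0
    bandwidth = n_disks * math.sqrt(disk_gb)
    return n_disks, seeks, bandwidth


n_small, seeks_small, bw_small = array_profile(TOTAL_GB, SMALL_GB)
n_large, seeks_large, bw_large = array_profile(TOTAL_GB, LARGE_GB)

seek_loss = seeks_small / seeks_large
bw_loss = bw_small / bw_large

print(f"disks: {n_small} small vs {n_large} large")
print(f"seek capacity drops by {seek_loss:.1f}x")
print(f"bandwidth drops by {bw_loss:.1f}x")
```

Under these assumptions, moving from 100 small disks to 6 large ones costs about 17x in seek capacity but only about 4x in aggregate bandwidth, so an engine that scales with bandwidth gives up far less than one that scales with seeks.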
Cutting the power consumption of disks can have a big impact on the overall power consumption of the system. (There are a couple of ways to measure power consumption, as explored in the talk itself.)
I’ve left out most of the details. That’s what the talk is for.
Comments welcome!