Fractal Tree™ indexes are green. They have the potential to be greener still. Here’s why:
Remarkably, data centers consume 1-3 percent of all US electricity. A majority of this power is used to drive servers and storage systems. Significant energy savings remain on the table.
Start with storage. Data centers typically use many small-capacity disks rather than a few large-capacity disks. Why? One reason is to harness more spindles to obtain more I/Os per second. In some high-performance applications, users go so far as to employ techniques such as “short stroking” to get more performance (and less storage) out of drives. But Fractal Tree indexes are so I/O-efficient that they don’t need as many I/Os, and so they don’t need as many spindles.
Consider the power consumption of disks. An enterprise 80 to 160 GB disk runs at something like 4W (idle power), while an enterprise 1-2 TB disk runs at something like 8W (idle power). If you replace many small-capacity disks with a small number of large-capacity disks, you can maintain the same capacity but reduce your storage power consumption per GB by close to an order of magnitude. So Fractal Tree indexes enable energy-efficient hardware when the metric is watts per GB.
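To make the arithmetic concrete, here is a minimal sketch in Python that computes idle watts per GB from the rough figures above. The function names are made up for illustration, and treating 1 TB as 1000 GB is an assumption.

```python
def watts_per_gb(idle_watts: float, capacity_gb: float) -> float:
    """Idle power divided by capacity, in watts per GB."""
    return idle_watts / capacity_gb


def improvement_factor(small_disk: tuple, large_disk: tuple) -> float:
    """How many times fewer watts per GB the large disk draws.

    Each disk is an (idle_watts, capacity_gb) pair.
    """
    return watts_per_gb(*small_disk) / watts_per_gb(*large_disk)


# Approximate figures from the text; 1 TB is taken as 1000 GB (assumption).
SMALL_DISKS = [(4.0, 80), (4.0, 160)]
LARGE_DISKS = [(8.0, 1000), (8.0, 2000)]

for small in SMALL_DISKS:
    for large in LARGE_DISKS:
        factor = improvement_factor(small, large)
        print(f"{small[1]} GB -> {large[1]} GB: {factor:.1f}x fewer watts per GB")
```

Depending on which ends of the ranges you pair up, the factor falls between roughly 3x and 12x, which is where "close to an order of magnitude" comes from. These are idle-power figures only; power under load would shift the numbers.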
For databases, however, joules per DB operation may be a better metric. Fractal Tree indexes are so I/O-efficient that they are terrific when measured in joules per operation.
What about power consumed by servers? A lot of our customers see an increase in server activity due to the increase in throughput. Fractal Tree indexes are so I/O-efficient that they drive CPUs harder, and the CPUs consequently use more power. But, assuming that a user is trying to hit the same overall target number of inserts/deletes, Fractal Tree indexes are still more efficient in terms of joules per database insert/delete.
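The reasoning here is that joules per operation is just average power divided by throughput, so a server can draw more watts and still spend less energy per insert. A minimal sketch follows; every number in it is hypothetical, chosen only to show the shape of the calculation, and is not a measurement of any real system.

```python
def joules_per_op(avg_watts: float, ops_per_second: float) -> float:
    """Energy per operation: watts are joules/second, so divide by ops/second."""
    return avg_watts / ops_per_second


# Hypothetical servers, for illustration only (not measured data).
# The second one works its CPUs harder, so it draws more power,
# but it also completes more inserts per second.
io_bound_server = {"avg_watts": 200.0, "inserts_per_second": 1_000.0}
busy_cpu_server = {"avg_watts": 250.0, "inserts_per_second": 10_000.0}

for name, s in [("I/O-bound", io_bound_server), ("busy CPU", busy_cpu_server)]:
    cost = joules_per_op(s["avg_watts"], s["inserts_per_second"])
    print(f"{name}: {cost:.3f} J per insert")
```

With these made-up figures, the busier server draws 25 percent more power yet spends an eighth of the energy per insert. The real ratio depends entirely on the workload and hardware.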
Given how important these topics are, Bradley and I recently attended the National Science Foundation workshop “Energy-Efficient Data Management” in Arlington, VA. This was a two-day planning meeting, where researchers from industry and academia convened to discuss open problems in energy-efficient data management. We discussed how to devise and deploy new data-management methods and new data-intensive applications that are more energy efficient.
I spoke about how better data structures have the potential to deliver energy savings. For details, see the slides themselves: “How Fast Indexing Makes Databases Greener.”
The main purpose of the talk was to discuss open areas for research. Here are three open problems I covered in my talk. For more details see the slides.
- Area 1: Develop a massively multithreaded Fractal Tree variant that could run on future-generation machines consisting of thousands of very slow cores.
- Area 2: Develop an Energy-Efficient SSD/Rotational Disk Hybrid.
- Area 3: The proof is in the pudding.
Thanks again to the NSF for supporting Tokutek through SBIR grants on topics like these.