Sharding and Time Base PartitioningPeter Zaitsev
For large number of online applications once you implemented proper sharding you can consider your scaling problems solved – by getting more and more hardware you can grow. As I recently wrote it however does not mean it is the most optimal way by itself to do things.
The “classical” sharding involves partitioning by user_id,site_id or somethat similar. This allows to spread data more or less evenly across the boxes and use any number of boxes. However this may be not the most optimal approach by itself because not all data belonging to same user is equal.
Consider Blog or Forum as example – most likely few last posts will get majority of hits while things written year ago are accessed with much less frequency. You can often level off this significantly for reads by using caching (if things are accessed frequently they are served from cache) but you still have to deal with writes which can be significant depending on your design.
It does not only have to be active portions of data same way you can have active users and ones which are almost dead.
Another interesting type of data which I find often kept on the same “cluster” without good reason is some sort of logs or history data. Think about Wikipedia page version history for example which accumulates to huge volume which very few users need to read, various change logs etc are other type of this data.
Besides separating “cold” data from “hot” it often makes sense to separate data based on its importance for system operation – for example if page versions data is currently unavailable for Wikipedia it is still possible to serve 99% of reads and even possibly handle writes by queuing new version creation.
It may make sense to separate data on table (or partition) level to get better “clustering” – because data is usually cached by pages rather than by rows and index entries having all hot data in separate table from cold data is much more efficient for caching than having table with the mix, even though the total size remains same.
It also often makes sense to separate data on the server level. Keeping Hot and Production Critical dataset small you can both make system to perform faster as well as well as get plenty of operating benefits – small database takes less time to backup and restore it is easier to do ALTER TABLE and replication would not fall behind as easily.
You can also use different hardware for different parts of data – you can hold “Hot” data on fast RAID volume or even SSD while place archive data on slow but large SATA volumes (unless response time will not become show stopper)
Of course not all applications need to use this technique but there is significant class of application which can benefit from it dramatically.