So far, I’ve analyzed point and range queries. Now it’s time to talk about insertions and deletions. We’ll call the combination updates. Updates come in two flavors, and today we’ll cover both.
Depending on the exact settings of your database, the updates give a varying amount of feedback. For example, when a key is deleted, all rows with that key are deleted (assuming the database allows duplicate keys). The normal behavior is to return the number of rows deleted. The normal behavior when deleting a key that has no corresponding rows in the database is to return an error message. On insertion, one can allow duplicate or not. In the latter case, the storage engine returns an error message if a duplication insertion is attempted.
We’ll see that the details of error messages have a profound influence on the lower-bound arguments I’ve been making (and we’ll see a bit later that they have a profound influence on how quickly you can actually implement these operations). Here, we define two classes of update operations:
Strict Updates
These are the insertions and deletions that come by default in MySQL, and many other databases.
As I say, these are the types of updates that we are all used to.
Lenient Updates
The alternative to strict updates is lenient updates. In a sense, they are equivalent, in that there’s nothing you can do with one that you can’t do with the other. It’s just going to turn out that lenient updates can be made to run much faster.
Conclusions
What’s the upshot? I’ve been presenting operations in terms of lower bounds that are dominated by disk seeks (slow!) versus those dominated by disk bandwidth (much faster). See if you can use this analysis to understand the behavior of these two types of updates.
Resources
RELATED POSTS