In this blog, I’ll go over my thoughts on what we can expect in the world of SQL databases.
After reading Baron’s prediction on databases, here:
https://www.xaprb.com/blog/defining-moments-in-database-history/
I want to provide my own view on what’s coming up next for SQL databases. I think we live in interesting times, when we can see the beginning of the next-generation of RDBMSs.
There are defining characteristics of such databases:
With this, I see following interesting projects:
Protocol compatibility is a wise approach, although not strictly necessary. It lowers an entry barrier for the existing applications.
Both CockroachDB and TiDB, at the moment of this writing, still have rough edges and can’t be used in serious deployments (from my experience). I expect both projects will make a big progress in 2017.
What shared characteristics can we expect from these systems?
As I mentioned above, we may see that the read performance is degraded (as latency increases), and often it will be defined more by network performance than anything else. Storage IO and CPU cycles will be secondary factors. There will be more work on how to understand and tune the network traffic.
We may need to get used to the fact that point or small range selects become much slower. Right now, we see very fast point selects for traditional RDBM (MySQL, PostgreSQL, etc.).
Heavy writes will be problematic. The problem is that all writes will need to go through the consistency protocol. Write-optimized storage engines will help (both CockroachDB and TiDB use RocksDB in the storage layer).
The long transactions (let’s say changing 100000 or more rows) also will be problematic. There is just too much network round-trips and housekeeping work on each node, making long transactions an issue for distributed systems.
Another shared property (at least between CockroachDB and TiDB) is the active use of the Raft protocol to achieve consistency. So it will be important to understand how this protocol works to use it effectively. You can find a good overview of the Raft protocol here: http://container-solutions.com/raft-explained-part-1-the-consenus-problem/.
There probably are more NewSQL technologies than I have mentioned here, but I do not think any of them captured critical market- or mind-share. So we are at the beginning of interesting times . . .
What about MySQL? Can MySQL become the database that provides all these characteristics? It is possible, but I do not think it will happen anytime soon. MySQL would need to provide automatic sharding to do this, which will be very hard to implement given the current internal design. It may happen in the future, though it will require a lot of engineering efforts to make it work properly.
Resources
RELATED POSTS