Welcome to the first of several discussions with some of our upcoming Percona Live Data Performance Conference 2016 speakers! In this series of blogs, we’ll highlight some of the talks that will happen at this year’s conference, as well as discuss the technologies and outlooks of the speakers themselves. Make sure to read to the end to get a special Percona Live registration bonus!
In this first installment, we’ll meet Sergej Jurecko, co-founder of Biokoda d.o.o. His talk will be ActorDB – an alternative view of a distributed database. ActorDB is a database that was developed using a distributed model: it uses an SQL database that speaks the MySQL client/server protocol. I had a chance to speak with Sergej and get some insight into what his talk will cover:
Percona: Give me a brief history of yourself: how did you get into database development, where do you work, and what do you love about it?
Sergej: I am a co-founder of a private company in Slovenia named Biokoda. Our clients range from small companies to telecoms who offer our solutions to their customers and government institutions. The requirements that our products try to solve always include high availability, ease-of-management, ease-of-scale and self-hosted.
A few years ago we were tasked with building a file sync app. This requires you to store a potentially very large file hierarchy for every user. When it came to choosing a database, our options were KV stores, traditional SQL databases and document stores (which were much less mature than they are now).
Designing a database that would be an ideal fit for our use case and requirements became a fun engineering challenge. Then writing it became a fun engineering challenge. Sure, it would have been safer and easier to stick with an existing mature SQL database, but I’m an eternal engineering optimist. Sometimes you have to take a crazy chance if you believe in it!
Percona: Your talk is going to be on “ActorDB – an alternative view of a distributed database.” What is it about distributed databases that causes concern for people? What are the pros and cons? And what effects could it have on application performance?
Sergej: The biggest concern, and rightfully so, is safety. There are many pitfalls developers of distributed databases can fall into. The most basic issues are: What is the consensus algorithm, and is it implemented correctly and thoroughly tested? What is the storage engine, and is it custom built? If yes, how well is its reliability tested?
The advantage of getting it right is a way to store state without a single point of failure. It is a way to horizontally grow your database with your needs. If your database is a part of your products, these things become important selling points for your products.
When it comes to performance, distributed databases tend to lose out on a per-node basis. But because they can scale out to more nodes, they can achieve higher performance by an order of magnitude.
What we tried to do is base ActorDB on as much solid, proven ground as possible. We avoided developing our own storage and SQL engine, and instead based it on existing proven technology. We even avoided developing our own client protocol and libraries.
Percona: Does scaling horizontally cause difficulties with expense justification? How would you characterize the ROI for horizontal versus vertical scaling?
Sergej: I’m not sure how much that is even a factor. It depends on what kind of customers you are speaking to and what their needs are. The kind of companies we are in contact with often use horrifically inefficient languages as a base for their products, because those languages make solving problems in them easier. The ROI is in faster development time.
One could make the same case with distributed databases. Excluding KV stores, you still have structured values, indexes and sometimes SQL. If horizontally distributed databases solve more problems for you than vertically distributed ones, then that is the ROI. They spare you from solving those difficult problems.
The industry has moved on from the one-size-fits-all mentality. Distributed databases are not a replacement for traditional monolithic ones. There are things possible in monolithic databases that are not possible in distributed ones, and vice versa. There is room for both, and now developers have a choice as to what best fits their needs.
I think the tech industry has more in common with the fashion industry than we like to admit. Technologies grow and die in popularity much like fashion. When new concepts like eventual consistency rise up, we sometimes get a bit too enthusiastic about them. Right now I think the traditional way of thinking is coming back a bit. It turns out a nice SQL interface to the database is important, and new and untested storage engines are pretty dangerous.
Percona: What do you see as an issue that we, the community, need to be on top of with regard to distributed database development? What keeps you up at night with regard to ActorDB and the implementation of your solution?
Sergej: Well, I’ve already mentioned the main issues: distributed consensus and the storage engine. The key issue for us is in our Raft implementation. At the end of the day, a database must have solid performance – which means you can’t just grab an off-the-shelf Raft implementation and use it. It must be tightly integrated with the storage engine.
But what literally keeps me up at night is the unexplored potential that we see in the product. There are so many interesting avenues we have not developed yet.
Percona: What are you most looking forward to at Percona Live Data Performance Conference 2016?
Sergej: I’m looking forward to discovering something new and finding out what others are doing. But mostly getting more feedback, especially if negative! That is often the most useful kind of feedback.
You can read more of Sergej’s thoughts, and more about ActorDB, at the Biokoda blog.
Want to find out more about Sergej and ActorDB? Register for Percona Live Data Performance Conference 2016, and come see his talk ActorDB – an alternative view of a distributed database. Use the code “FeaturedTalk” and receive $100 off the current registration price!
The Percona Live Data Performance Conference is the premier open source event for the data performance ecosystem. It is the place to be for the open source community as well as businesses that thrive in the MySQL, NoSQL, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.
The Percona Live Data Performance Conference will be April 18-21 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.