October 22, 2014

Commodity Hardware, Commodity Software and Commodity People

In the previous post I mentioned that not all architectures and solutions work for Commodity People, and people seem to agree with me.
A number of vendors claim they are in the Commodity Software or Hardware business, but few would say they are doing it for Commodity People, because few people like to be called commodity; each of us rightfully likes to think we are special and unique.

Thinking more about the topic, I think being “Commodity People Friendly” is one of the important properties of commodity products. Look for example at Dell, HP or whitebox x86 servers: they are not only cheaper but also easier to use than minicomputer systems from IBM. Directly attached storage is simpler to use than a SAN, MySQL is simpler to use than Oracle or DB2, and PHP is simpler than Java.

Even within the same vendor you can find that commodity products are designed to be used by commodity people: they tend to be more failsafe and easier to use than the higher-end ones. Look for example at Linksys home routers by Cisco and compare them to the traditional IOS-based series. HP is another example of a vendor present in both the commodity and high-end markets.

I believe that to a large extent MySQL gained its popularity by being simple to use and forgiving of errors. Look at early MySQL versions: if you inserted data that was too long it would just truncate it and continue working, and you could copy a (MyISAM) table while the server was running and it would typically work. Even if a table was corrupted, MySQL would still run, just giving you errors sometimes. No transactions meant you did not have to deal with deadlocks or learn isolation modes. A perfect solution for an absolute beginner.
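To make the "forgiving" behavior concrete, here is a minimal sketch of the difference between silently shortening over-long data and rejecting it. This is a toy model in Python, not MySQL code: the `Column` class, `max_len` and the `strict` flag are hypothetical names invented for the illustration.

```python
class Column:
    """Toy model of a fixed-width text column (hypothetical, not MySQL code)."""

    def __init__(self, max_len, strict=False):
        self.max_len = max_len
        self.strict = strict
        self.rows = []
        self.warnings = []

    def insert(self, value):
        if len(value) > self.max_len:
            if self.strict:
                # Unforgiving mode: refuse the row and make the caller deal with it.
                raise ValueError(f"data too long: {len(value)} > {self.max_len}")
            # Forgiving mode: shorten the value, note a warning, keep working.
            self.warnings.append(f"truncated value to {self.max_len} characters")
            value = value[: self.max_len]
        self.rows.append(value)


lenient = Column(max_len=5)
lenient.insert("commodity")
print(lenient.rows, lenient.warnings)

strict = Column(max_len=5, strict=True)
try:
    strict.insert("commodity")
except ValueError as exc:
    print("rejected:", exc)
```

The forgiving column never stops a beginner, at the price of quietly losing data; the strict one protects the data but requires the caller to understand and handle the error.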

Now, in terms of query design, MySQL supported a lot of functions (which you can always look up in the manual) but it did not support complex constructs like subqueries or views, so you would rarely scratch your head wondering what the query your peer has written is supposed to do.

Of course you could ask me who cares about “commodity people”: we will just hire the smartest guy out there, who can compute the optimal join order for any 15-way join in his head, not to mention understanding all the ins and outs of a database management system. This sounds nice in theory, but there are not so many smart guys out there, and those who exist may be pretty expensive, so you can’t have many of them on the team. In many MySQL projects there are not even dedicated MySQL people: Web developers simply use MySQL as they see fit, and one of them assumes the DBA role, installs MySQL and chases developers if they write really bad queries (in a good case). In a bad case you may find a MySQL which just happened to come with the OS, and queries which no one ever ran EXPLAIN on and which just happen to work anyway because the database is very small.

Now, for smaller projects, even if you happen to have a smart MySQL guy you might not be in better shape, as your business may be at risk if he leaves and no one else is able to understand his smart ways. This can be much worse than using commodity solutions which may not be optimal but which everyone on the team understands and is able to support if needed.

What scares me in MySQL development is that it is quickly leaving this Commodity Space in terms of overall feature complexity. MySQL 5.2 will give you many storage engines to play with (many with transactions and some with clustering), with support for partitioning, stored procedures, views and a lot of other stuff. Think how much freedom an evil smarty has to design something which would be hard for other people to understand and support. Interestingly enough, the fact that MySQL is Open Source puts it in worse shape here than Oracle and other systems: typically you get charged more for advanced features, but with MySQL everything is out there free to try, so there are no financial barriers stopping you from shooting yourself in the foot.

MySQL skills are also likely to lose portability. With MySQL 4.0 you could simply ask whether a person knows MySQL or not; with the new versions it becomes possible for someone to be an expert in one area and familiar with one design approach but not with another. Someone could be a great MySQL expert but have no experience with MySQL Cluster; another may be good with MySQL Cluster but have no idea how to write storage engines or optimize for the Falcon storage engine. The relatively simple 1000-page book which would cover pretty much all MySQL features in version 4.0 becomes a bookshelf for the new MySQL versions.

I can’t say this is an unusual development. I think it is natural for software to chase features, because this is where customers are leading the product (“implement this and we’ll buy it”), but this is also why old products become feature overkill, slow and complicated. I’m not in a position to complain though: increasing complexity will mean more people come to us for Consulting Services.

Do I see MySQL replaced in its space any time soon? I do not think so. I do not think we need yet another SQL database, because the SQL language itself is way too complicated and outdated, plus it is not expressive enough for many modern application needs. I would expect solutions to be developed which operate on more flexible data structures, handle the distributed semantics of the web better and are expressive in a different form. I know we already had a false start in this area with XML databases, but it is not unusual for a technology to rethink itself and win the market on the second attempt.

Some interesting developments we already have in this area are of course the famous Google BigTable and the Facebook API, and I know a few other companies have their own special “database” interfaces which run on top of MySQL or other databases. Another interesting development is Scalable Blob Streaming. This project starts with retrieving data from a storage engine using the HTTP protocol, and I expect it will not be long before other operations follow.

One thing I have been thinking about a lot is all these great storage engines: do they all really need MySQL to run? At this point all the Open Source storage engines out there are for MySQL, but why could one not develop a smaller and lighter top part? In the same way, you can run PHP as part of Apache but you can also hook it up to a bunch of other web servers, such as lighttpd.

In fact, in this area MySQL Cluster, which was always the best-isolated part of the MySQL source, leads the pack: there are a bunch of interfaces for talking to MySQL Cluster, for example as PHP session storage or as a REST web service, which are all rather interesting.

One more interesting development I expect we might see is more active use of remote storage services. Amazon has the S3 service, which deals with files; I bet a similar service could be designed for many data store applications, especially for specific needs, for example when large amounts of data need to be analyzed, which requires a large amount of hardware to offer quick response times, but only for a short time frame.

Anyway, it is hard to predict the future, so it will be fun to watch how things develop.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. lanh says:

    The way you use commodity can be a bit confusing.

  2. Use the phrase “busy people” instead of “commodity people” and all of the negativity just melts away…

  3. Amazon S3 actually deals with MIME-typed blobs. I’ve written a MySQL storage engine for it, available at http://fallenpegasus.com/code/mysql-awss3/

    One of the things in my “working on” pile is a generic schema storage engine for S3, so that someone could just take an existing MyISAM table, and say “ALTER TABLE ENGINE=AWSS3”, and it would keep on working.

    Also, getting out of the SQL language sphere, S3 has some interesting key search semantics, and there are some very interesting large database key search REST APIs from utility computing providers.

  4. Try calling it a commodity skillset……. doesn’t point the finger at people being commodity.

    Kevin

  5. Jan says:

    You might be interested in CouchDb (http://couchdb.org/). A few articles on my blog explain a bit more than the wiki-documentation.

    Cheers,
    Jan

  6. peter says:

    Robert, by commodity people I do not mean busy people. I think this is more a question of skill set, as Kevin notes, but also of state of mind. Some people try to innovate and make crazy decisions which people would later call smart; others use classical or simplistic approaches.

    A great example in this case is memcached. When it first came out a lot of people would say: what a crazy idea, the database already uses memory well and can cache things, not to mention the MySQL Query Cache. Many people I run into even now do not like to use memcached or other proven approaches like sharding because this makes things too complicated for them; they would prefer to live in the wonderful world where you can throw queries at your single database and it will handle any load.

  7. peter says:

    Jan, Mark

    Thanks, I should take a closer look at the S3 storage engine and CouchDB.

  8. I meant that when smart people become busy, they act like “commodity people”; that is, busy people also exhibit a low tolerance for complicated things, and want products which are more safe, easier to use, and more forgiving of errors.

    Once you realize that, you might even start to think of “commodity people” as just “busy with something other than what you want them to be busy with”. Heck, they might even be smart. ;)

  9. acer says:

    very good!
