November 29, 2014

Open Development vs Making a Big Splash

I find it very interesting how Sun does not get the very basic principle of true community Open Source development – you’ve got to give up on making a big splash.

Traditional closed-source companies often develop a product in secret, so that when it comes out it is a surprise for competitors and makes a big splash for users. Does it remind you of something? Yes! This is exactly how the InnoDB Plugin was released last year, and the MySQL 5.4 performance improvements this year.

The community did not know about them and did not participate early in these efforts.

Another big splash which seems to be planned for later this year is “Performance Schema” – which has been in development for years, as this post claims, but to date there is no code for the community to play with.

I believe if you want to earn true community respect, you’ve got to respect (and so involve) the community. You should show your early and buggy code and let the community play with it and complain. This is true Open Source development and it takes guts.

I should praise Drizzle for having code available from its very early days. I should praise Monty for having the Maria storage engine tree available from its early and buggy days. I should praise the Falcon and PBXT teams, which were not shy to show us the multiple redesigns they had to go through. This earns my respect and this is how I think Open Source development should be done.

At Percona I think we’re doing relatively well – our trees are public on Launchpad and I do not think we have ever had a project which was cooking internally for more than a month. As soon as we had something interesting enough to start playing with, such as Xtrabackup, we got it out, just stating that it was very early code.

Google does not get very many points on this one – I think they are very obsessed with an “everything Google does must be great” policy, which prevents them from releasing code before it is complete and properly tested, at least internally. Though I think Google does not position itself as doing Open Source development, but rather as being kind enough to publicly share development which is done internally. This is fair and I appreciate it.

So why are Sun and Oracle after big splashes? Well, I think this is the only way traditional companies know how to market their stuff. If a project starts from a single line of code and grows day by day to completion, it is very hard to build excitement around it. There is no big news, just a slow process.

I would argue, however, that releasing certain code as Stable/GA should be enough of an event for the masses – it is not, however, enough for the community, which sees it as just a minor incremental development from the initial Alpha/Beta releases.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Mark Callaghan says:

    What do you mean by Google — the company in general or me?

  2. Mark Matthews says:

    Peter,

    What’s https://code.launchpad.net/~marc.alff/mysql-server/mysql-6.0-perfschema then? “To Date”, it’s available. Maybe it wasn’t available as early as you wished, but it’s been live on launchpad for over a month now.

  3. peter says:

    Mark,

    I do not know if this is your decision or company policy, as I’m thinking more about it. But I hope you agree the story with the Google patches was largely that they were developed internally and then published. If I understand correctly, Google is not actively looking for people to contribute to its patch set… it is rather a contribution to MySQL on its own.

  4. peter says:

    Mark Matthews,

    Well, sorry for not catching when it appeared. I just know we were looking for it in February, when a bunch of posts showing cool stuff appeared, and we could not get our hands on it.
    It is great that it is available now. The question remains how long it took between the code starting to be written and the tree becoming publicly available.

  5. Mark Callaghan says:

    Yes, it is done in isolation. I am not sure if anyone would be interested in participating on an old branch (5.0.37) with a very particular benevolent dictator (me) as we have very specific and occasionally odd feature requests. We also don’t want most of the code to change.

  6. Mark,

    one reason they might do this is because they are scared someone might steal their thunder. If I remember right, this happened when Red Hat developed AIGLX. Ubuntu, with their very frequent release schedule, managed to include AIGLX in a release before Red Hat managed to get it into a Fedora release. Ubuntu got a lot of press thanks to AIGLX.

    Big corporations want control, and the “big splash” gives it to them.

  7. peter says:

    Mark,

    Right. I think you have a different goal: making the code solve the needs of a specific customer/use case – Google internal use. Community development inevitably includes balancing the interests of a lot of people. Though I think you would be surprised by the number of people willing to work with you if you opened up Google-MySQL development.

  8. peter says:

    There can be commercial reasons both to delay publishing the code until the very release and to keep feature plans private. I just think neither of these should have a place in true Open Source project development.

  9. Mark Leith says:

    I did not claim that PERFORMANCE_SCHEMA had been in development for years, what I said was:

    “In fact – it’s not “new”, it’s something that has been in the worklog system for a long time, and has had much much much discussion internally between some of the brightest engineers in the group.”

    I.e., it was a *spec* that had been around for years, and had been talked about for a long time.

    I also mentioned in the very same post (albeit in the comments):

    “Here’s one of the commits for the Maria instrumentation late last year:

    http://lists.mysql.com/commits/56225

    They’ve all been out in the open in the commits email list:

    http://www.google.co.uk/search?hl=en&safe=off&q=site%3Alists.mysql.com+mysql-6.0-perf&btnG=Search&meta=

    So in fact the development was perfectly in the open for anybody that cared to read the commits list – anybody can subscribe to that, just like anybody could subscribe to the percona lists. Just because we were not shouting about it does not mean we were doing it entirely behind closed doors. Sure, I admit there was not a tree available (as there is now, as already pointed out), but the patches are there for anybody to see/apply.

    Let’s look at when the tree was created – September 29 2008:

    http://lists.mysql.com/commits/54666

    Since we have joined Sun, we have done nothing *but* work our way towards a more open development model. Picking on the PERFORMANCE_SCHEMA is a poor show; it was a kind of guerrilla project when it was actually picked up and worked on by Marc and Peter, because they understood the importance to many users of having this information available (people just like you, and Google, and any other person that wants to tune the system pragmatically rather than with black magic).

    They took it, they ran with it, they got it to where it is today – ready for the community’s input – entirely on their own, outside of the main development roadmap / schedule, “under the radar” as it were. It was not a weekend project that you can throw onto a tree within a month; it took a lot of hours to get to a stable condition, where people could actively start using it and providing feedback.

    I salute them.

    You should too.

    Oh, and XtraDB was not a “big splash”? Was that done entirely out in the open – from day one?

    Is nobody allowed to announce great developments with a big splash?

    Do we constantly have to find ways to bicker?

    I despair.

  10. Mark Callaghan says:

    NIMBY applies here. We want someone else’s branch to be the one under which open development is done. Perhaps MariaDB will fill that role. But I expect a lot. Is any MySQL branch governed by rules as are PostgreSQL or Apache projects? Or is my participation subject to the decisions (and whims) of the branch leader(s)?

  11. Mark Leith says:

    Every “branch leader” has their own agenda.. ;)

  12. Peter,

    I don’t know how much influence you have in this case, but it’d be *awesome* if Sphinx followed the model you recommend. I’ve been working to upgrade to 0.9.9-rc2 and am finding and fixing bugs. I’d like a better way to share those with the world and/or see if they’ve already been fixed in some unreleased code.

    Thanks!

    Jeremy
    (off to build a test case for Andrew to demonstrate kill-list merge bugs)

  13. Mark Callaghan says:

    @Mark — yes they do. There isn’t anything wrong with an agenda, but they aren’t always a good fit for me. Most of the branches allow for technology transfer from each other, so our occasionally redundant efforts are not wasted.

  14. Mark Leith says:

    @Mark: Agreed, it’s open source, that’s the beauty of it – take it where you want it in the environment that you are in.

    Mikael has done a great job recently of looking at the merits of the Percona and Google branches to split up the buffer pool mutex, a great testament to knowledge share, and being able to evaluate what is best, outside of the main tree.

  15. peter says:

    Jeremy,

    Regarding Sphinx, I’m constantly suggesting that Andrew put a public tree somewhere and make it easy for people to have their own subtrees, so that exchanging ideas is easy. I’m not sure why he has not done it. I’ve shown him your post and hope it motivates him :)

  16. Mark Callaghan says:

    Mikael and the 5.4 team are doing a great job. I await more results from them so I can improve my patches. The Percona builds have been a great way for me to determine my progress; now I can compare performance between my patches, Percona and MySQL 5.4.

  17. There are two issues here.

    If you’re like Mark Callaghan and me, where releasing Open Source to the MySQL community is a secondary goal, you often don’t have time to work with the community as much as you would like.

    … this is the “work on it, prove it works for your workload, and then throw it over the fence” model.

    This is somewhat valid. The world isn’t a perfect place and we don’t all have infinite time.

    However, if this IS your primary goal, and you’re going to be working on it and REALLY wanting it to gain traction, then you should be public about your project design requirements, source code, bug database, etc – all from day zero.

    It takes you about 20% longer but you end up with a MUCH better code base.

    The only time I think this DOES NOT work is in the consumer space.

    If you release a consumer product the press RIPS you apart when it has bugs EVEN if it’s a beta product.

    It’s really a shame but it’s just the way it is… I can’t blame apple for taking the big splash approach.

    For Sun/MySQL/Oracle/InnoDB to do this on Open Source is just really stupid and costing them money.

    Seriously… if you WANT my money, if you WANT me to purchase your products, I need to know they are reliable…

    Dumping all this stuff on me all at once is not going to make me sign a check any sooner.

  18. Mike Hogan says:

    This post http://www.scaledb.com/component/option,com_wordpress/Itemid,243/p,47 looks at some of the other factors inside big companies that will invariably cause the open source process to become increasingly closed. Part of it is the underlying shift from developers being the customers, to end-users becoming the customers (especially paying customers). Just as Crossing the Chasm tells us that a company must change to accommodate its user base, so too will successful open source efforts.

  19. peter says:

    Kevin,

    Thanks for a great way to put it. Indeed, for Percona it was often a “we do what our customer needs” approach, with the results released to the public. We have, however, gradually moved to gathering more people’s ideas and doing “community” development too – as soon as we were able to support it. OurDelta and the Open Database Alliance have building a product as a goal, so it is going to be much more public.

    There are two things here though – public design and public release. You do not have to involve a lot of the public in designing a feature to release it early. For example, with Xtrabackup we did not publicly ask for ideas on how to create it. We just did it and released it at an early stage, soon after the idea’s inception.

    Now, speaking about consumer products – I think it is a lot about expectations and goals. You can see many Linux distribution development trees available from a very early stage. In the case of a tree, it is open for experts to participate, but it is not so easily downloadable that people who do not understand what it is will download it, use it, and ruin their data.

    If we’re speaking about MySQL 5.4, I do not think it would have been a problem to create the tree and announce it on the development site – press guys would not go into building and testing a Launchpad tree anyway. At the conference, the 5.4 beta could then be announced to the press.

  20. Vadim says:
  21. Mark Callaghan says:

    Is XtraDB development open? Can I propose changes on the launchpad branches?

  22. peter says:

    Mark,

    Sure we would appreciate any suggestions and code contributions.

  23. Mark Callaghan says:

    Wonderful

  24. Jason says:

    I’m not sure if I’m out of it or what, but I wasn’t even aware that 5.4 was released.. I’ve been using 5.1 lately, and planned on continuing to use it. Is 5.4 worth moving to?

  25. James Day says:

    Jason, if you don’t mind a demonstrator branch, please do and share your feedback. It’ll be a while before it’s had broad enough use for MySQL/Sun to say it’s ready for general availability. 5.4 definitely isn’t released for general use, it’s still a work in progress and doesn’t even have all of its optimiser and other improvements merged in from 6.0 yet. Those who are using it are having good experiences, though, so it shouldn’t be unduly painful.

    If stability is your main goal, forget 5.4 until it’s generally available and think of it, XtraDB and the Google patches as teasers for what’s to come. Or stick it on a non-critical slave or ten and have fun with it, reporting any bugs you find.

  26. shodan says:

    Jeremy,

    Re #12, I’ve recently created a public SVN mirror at http://code.google.com/p/sphinxsearch so the code’s transparent now. Since we converted to SVN it was merely lack of time… and knowledge. Once I learned about the possibility of syncing to Google Code so easily, I pushed our repo there immediately. We still have to provide automatically tested (!) tarballs though.

    Peter,

    Re #15, you know the whole story, at least I told it once or twice. ;)

    Kevin,

    Re #17, some companies, and I’ve a grim feeling *many* companies, will NOT actually purchase your product if it’s free, open, reliable, etc. Heard a war story about Sphinx recently. De-facto CTO approached his CEO about buying Sphinx support etc. CEO turned the initiative down with basically “it worked for free for almost a year now; why pay anything”… and that’s not really uncommon. And that’s not the only type of a war story. But that’s another story.

  27. Mark Callaghan says:

    Well, I am doing my best to drive customers towards expert consultants. I think we have doubled the number of my.cnf parameters for InnoDB. Most of them have great defaults that don’t need to be touched. But if you are looking for that extra 5%, then maybe you need to bring in the hired guns. Enjoy!

  28. David Lutz says:

    Peter,

    The Percona announcement for XtraDB last December certainly looks just like any other Big Splash announcement.
