I just posted slides from a talk I gave at a Facebook application developer conference in Las Vegas this weekend. The talk is titled Outrun the Lions. Our customers run several of the top 10 applications on Facebook right now (as measured by the number of active users), and I revealed the secrets to building applications that can handle the load.

The title is a pun on the story about lions and gazelles in Africa (every day, a gazelle wakes and knows it must outrun the fastest lion; every day a lion awakes and knows it must outrun the slowest gazelle). If you’re a Facebook application developer, the lion is not your competition. It’s your users. They will love you if you do a good job. They will love you so much that the load will destroy your application, and then they’ll go away and they won’t come back. (Another speaker at the conference has the stats to prove this.)

Many of the customers in our consulting practice are Facebook application developers, so we see the same patterns again and again. This talk was direct advice on how to avoid the problems they run into as they grow. I skipped all the “you might consider these 99 options” and just said “here’s something our clients use and it works. Read the book for the full details.” If you read the slides, you’ll get an idea of how our customers have successfully scaled their Facebook applications to perform well even when they become very popular.

The reference to the book is there because the company that sponsored the conference (Offerpal Media) gave everyone free copies of our new book, High Performance MySQL, 2nd Edition. Not only was this very nice of them, but hopefully it will also help the attendees avoid trouble with their applications.

10 Comments
Eran Galperin

Excellent presentation (though it’s just a giant teaser 😉 ). I have a question though regarding Apache – do you have any tangible evidence that the other web servers you mentioned are better performing in general? I think it should be a case-by-case basis.

jeffatrackaid

“A small server is just a big server waiting to happen” (Baron)

This is a key issue I’ve seen with several projects. People just do not realize how quickly they may need to scale. With Facebook and similar Web 2.0 social media, you need instant scalability. In a few cases, I’ve worked on the infrastructure side of projects that mushroomed from a hobby project to hundreds of thousands of users within a short time period. Some of these projects buckled under the load because they were never designed with scalability in mind. The application developers had over-relied on the database, optimized nothing, and then wondered why their latest quad-core boxes were crashing under the load.

I am not an app developer, so maybe it is difficult to know what will be a bottleneck. At a minimum, I think some baseline optimization, splitting read/write queries, and similar steps could be a starting point for most projects.
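As a rough illustration of what splitting read/write queries can look like at the application level, here is a minimal Python sketch. The `ReadWriteRouter` class and the server names are hypothetical and not taken from the talk; the routing rule (statements starting with SELECT or SHOW go to a replica, everything else goes to the primary) is a simplifying assumption.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical helper: choose a server for each SQL statement."""

    # Assumption: only these statement types are treated as reads.
    READ_KEYWORDS = ("select", "show")

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def pick(self, sql):
        words = sql.split(None, 1)
        if not words:
            raise ValueError("empty SQL statement")
        if words[0].lower() in self.READ_KEYWORDS:
            # Reads rotate round-robin across the replicas.
            return next(self._replicas)
        # Writes (and anything unrecognized) go to the primary.
        return self.primary


router = ReadWriteRouter("primary", ["replica1", "replica2"])
print(router.pick("SELECT * FROM users"))          # a replica
print(router.pick("UPDATE users SET name = 'x'"))  # the primary
```

A real application would also need to account for replication lag and for reads that must see the latest write (for example SELECT ... FOR UPDATE), which this sketch ignores.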

Arjen Lentz

A gazelle doesn’t need to outrun the fastest lion. Lions optimise, they won’t exert more energy than necessary and thus they always go for the slowest animal. Therefore, a gazelle merely needs to outrun the slowest gazelle, with some margin, and not be close to the edge of a group to accidentally get caught.
A lot of conditionals, I admit, but the point is that the gazelle too can optimise.

jeffatrackaid

Perhaps it is just our client base that gives me this impression, but I typically find that many web application developers (and their managers) still see scalability as largely an infrastructure issue. If the database is slow, add larger or more hardware. Management likes this approach as well, since it is often cheaper to add another DB server than to add another programmer to a project.

I am not sure whether Percona sees similar issues. I know you guys have a very different client base than we do. With our clients (SMBs, e-commerce, social apps, forums, etc.), I often find that application-level scalability is an afterthought.

Clint Byrum

I’ve seen exactly what Baron is talking about at several companies. Developers of web apps vary greatly. Some of them get it and realize how to scale apps. Some of them are used to writing 50, 100, maybe even 1000 user apps, and don’t understand the scaling problems of apps that get many thousands of requests per second at peak times. I see a lot of assumptions that infrastructure will cache, or “do the right thing”, when instead they could do things slightly differently in the app and sometimes get 100-fold increases in scalability and reliability.

Mark Callaghan

Given the performance issues with servers that have more than 4 cores, what is the short-term workaround?
1) put several mysql instances on the one server
2) put several mysql instances on the one server and pin servers to a subset of CPUs
3) turn off some of the cores – or use one server pinned to 4 cores
4) run mysql on 4 of the cores and sphinx on the other 4 cores
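
For what it is worth, option 2 could be sketched roughly as follows. This Python snippet only builds the command lines and does not start anything. It assumes a Linux host with the `taskset` utility from util-linux, and the config file paths are hypothetical placeholders.

```python
def cpu_groups(total_cores, cores_per_instance):
    """Split core IDs 0..total_cores-1 into consecutive, equal-sized groups."""
    if total_cores <= 0 or cores_per_instance <= 0:
        raise ValueError("core counts must be positive")
    if total_cores % cores_per_instance != 0:
        raise ValueError("total_cores must be a multiple of cores_per_instance")
    return [
        list(range(start, start + cores_per_instance))
        for start in range(0, total_cores, cores_per_instance)
    ]


def pinned_commands(total_cores, cores_per_instance, config_template):
    """Build one 'taskset -c <cpus> mysqld ...' command per instance."""
    commands = []
    for index, cores in enumerate(cpu_groups(total_cores, cores_per_instance)):
        cpu_list = ",".join(str(core) for core in cores)
        # Hypothetical per-instance config file, e.g. instance0.cnf.
        config = config_template.format(index=index)
        commands.append(
            ["taskset", "-c", cpu_list, "mysqld", f"--defaults-file={config}"]
        )
    return commands


# Two instances on an 8-core box, 4 cores each.
for command in pinned_commands(8, 4, "/etc/mysql/instance{index}.cnf"):
    print(" ".join(command))
```

Each instance would still need its own port, socket, and data directory in its config file, and whether this helps at all depends on the workload.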