On many web sites you would see a counter how many time given object – blog post, forum thread, image, movie etc was viewed. This is sometimes handy feature but it can be rather expensive from performance point of view.
The nasty thing with counters as they are implemented the most trivial way – they convert read load to write load. When you would simply fetch given object information now you do not only fetch the data but also update the view counter.
For smaller single system web site with no caching the overhead may well be insignificant – as you run update for the same row which just was fetched it does not cause any synchronous IO and background IO can be batched rather effectively.
As the system growths however and you implement some form of caching, ie memcache you end up reducing number of reads from database dramatically but writes still have to happen at full speed. It also really hurts if you choose replication (rather than partitioning) as your scaling path – it becomes very easy to overload the replication, especially as it has to serialize transactions on the slave which could be executed in parallel on the master.
So we learned it can be quite nasty, so how can we deal with this problem ?
I assume you really need to have these counters (in some cases there are more counters then there really needs to be) and also assume you would like to see semi-realtime update for them so we can’t simply update them nightly using batch job. What to do in this case ?
First separate them into special counter table, having say “id” and “counter” columns. Using separate table for updates already solves the problem in a lot of cases, and here is why. Assume you have a counter in the blogpost table instead which has 2000 byte rows in average. 10mil of posts result in 20GB table which may be too much to fit into the buffer pool (or other cache). Now the table which I just mentioned would have approximately 20 byte rows even for Innodb tables which makes it 200MB in total which easily fits in buffer pool. Fitting of the working set in buffer is paramount as if you get it you will be able to do many thousands of updates per second with no problem (assuming log commit would not be bottleneck). Of course the fact you update short rows also helps.
The next problem to deal with is log flushes. If you have innodb_flush_log_at_trx_commit=0 or 2 or if you use MyISAM tables you would not have the problem, if you have innodb_flush_log_at_trx_commit=1 log flushes can well be the problem – even with BBU thousands of log flushes per second may cause significant overhead or may simply be bottleneck. If you can reduce your durability requirements this is best thing to do if not you would either use MyISAM table as counter table (can well work if you do not run into table locks bottleneck) or implement delayed table update as explained below.
At this point we’ve optimized update complexity significantly however sheer volume of updates may be very high and ask for optimization. Thinking about this and some other problems I found it too bad memcache is a passive cache, meaning it just caches stuff and you can’t easily tell it to perform certain action (ie write back to the database) when object is removed from the cache, or just do it periodically. If you’re in system programming you perhaps could implement cache with such semantics and it will handle real-time counters very efficiently pushing few updates back to MySQL server.
If you rather use existing solutions you can use memcache + another mysql instance (or simply the database which is not replicated) to log updates in heap/myisam/archive table and aggregate it in the database for example each hour. If your memcache is sized so objects which touched within last hour are not removed from it this would always give you correct counters while being very efficient.
By doing these delayed updates you can not only reduce number of updates, say if popular object viewed 1000 times within an hour you still will need to do only one update but you also can implement all updates from the chunk in single transaction saving on log commits significantly.
I mentioned you can use MyISAM or HEAP tables for this purpose, how do you deal with Table Locks in this case – the script which processes the views within last hour has to run rather complicated group by query ? Sure it does but it does not have to do it for the same table as inserts go to. The trick you can use here is standard “shadow table” trick – use two tables insert into one and process and truncate another. MySQL offers atomic RENAME TABLE call which can be used to swap two tables.