About three months ago I announced that ClickAider had become available to the general public, and I think it is about time to write about the progress we have made with this project for those who are interested.

The project has generated decent interest: about 3000 sites have registered over this time, which I consider a good number, especially as we did not do much advertising or PR, keeping a low profile while working out the few bugs we had.

We use GeoIP-based DNS load balancing between “gathering” servers in Europe and the US, which seems to work very well, both providing a level of high availability if one of the servers goes down and improving accuracy by reducing round-trip time. Over time we plan to add more locations with a pair of servers in each, so we do not need to rely on relatively slow DNS-based failover when a server goes down.

We get some 600 tracking events per second on our lighttpd tracking servers, which currently works well, and there is still some capacity available, but we are planning to get rid of the little PHP code we have left at this layer to make it even more efficient. It would be good to handle some 5000 events/sec per server.

MySQL 5.1 with partitioning works nicely for data storage, with no MySQL bugs hitting us on this project so far. We use InnoDB tables now because checking and repairing MyISAM is a nightmare, and PBXT, which could be a good fit for this workload, is just not ready.
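For readers unfamiliar with 5.1 partitioning, a minimal sketch of what such a setup might look like. The table and column names here are hypothetical, not ClickAider's actual schema; the pattern is a date-based RANGE-partitioned InnoDB table:

```sql
-- Hypothetical click-tracking table, RANGE-partitioned by month (MySQL 5.1+).
-- TO_DAYS() on the event timestamp lets the optimizer prune whole months
-- from date-restricted queries. Note: with no MAXVALUE catch-all partition,
-- inserts past the last range fail, which is why "future" partitions are
-- created in advance.
CREATE TABLE clicks (
    site_id    INT UNSIGNED NOT NULL,
    click_time DATETIME     NOT NULL,
    url        VARCHAR(255) NOT NULL,
    referrer   VARCHAR(255),
    KEY (site_id, click_time)
) ENGINE=InnoDB
PARTITION BY RANGE (TO_DAYS(click_time)) (
    PARTITION p200706 VALUES LESS THAN (TO_DAYS('2007-07-01')),
    PARTITION p200707 VALUES LESS THAN (TO_DAYS('2007-08-01')),
    PARTITION p200708 VALUES LESS THAN (TO_DAYS('2007-09-01'))
);
```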

MySQL performance is in fact the most serious issue we have to work on; even now, reporting just click statistics for some huge sites, such as Mininova, can take quite a while.

The typical solution for trackers is to build summary data one way or another, and we may need to do some of that for the most common queries. At this point, however, we are looking at how much performance we can get from real-time aggregation, because we want absolutely unrestricted dynamic filters and dynamic timezones, and these make things hard to pre-aggregate. This is surely a fun challenge to deal with.
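To illustrate why dynamic timezones defeat pre-built summaries: grouping by “local day” produces different buckets for every timezone, so the aggregation must happen at query time. A hypothetical example (table and filter are illustrative; named timezones assume MySQL's time zone tables have been loaded via mysql_tzinfo_to_sql):

```sql
-- Clicks per local day for one site, with the viewer's timezone and an
-- arbitrary user-chosen filter applied at query time. Changing the
-- timezone shifts the day boundaries, so these counts cannot be
-- precomputed once and reused for every user.
SELECT DATE(CONVERT_TZ(click_time, 'UTC', 'America/New_York')) AS local_day,
       COUNT(*) AS clicks
FROM clicks
WHERE site_id = 42
  AND referrer LIKE '%mininova%'   -- dynamic filter chosen by the user
GROUP BY local_day
ORDER BY local_day;
```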

We also continue to add support for more advertisers and to improve the ways we track the existing ones. Recently we added support for Vibrant Media IntelliTXT (hovers only at this point). We also now support ShoppingAds, even though it is not yet out of private beta.

Looking at the site, we have added a major improvement to saved reports: you can now create custom reports with all the filters you want and save them for quick use later. For example, you can see the most popular click destinations for a US audience compared to your general audience, or track performance for referrals from a given domain to see whether a partnership makes sense for you.

Finally, we have added a demo account, so you no longer have to register to see the system in action. We used our MySQL Performance Forums site for the demo, which may be a bit low-traffic but is still good for seeing how the system works. This is also the reason we added Google AdSense ads on that site.

If you have any other ideas about what you would like to see implemented in ClickAider, let us know.

21 Comments
Brian Aker

Hi!

That is wonderful to hear about partitioning. How many partitions are you running with? Did you have to adjust file descriptors?

Cheers,
-Brian

Norbert Tretkowski

What are you using for GeoIP DNS based load balancing?

Norbert Tretkowski

Peter,

I’m very interested in it, please motivate Aurimas. 🙂 I’m currently looking for a DNS-based load balancing solution, but focused mostly on appliances (e.g. from Zeus). Good to know that PowerDNS does DNS-based load balancing as well.

Brian Aker

Hi Peter,

I am hearing that there are issues at around 300 partitions, so I am curious about your setup. The stated limit is 1024, but I have not met anyone who can make that work. In testing I can’t get MyISAM anywhere near this, and for Archive I modified the use of file descriptors and open handles so that it doesn’t have the problems MyISAM is having.

So did you set up your partitions so that they are “into the future”? Re-partitioning requires a complete rebuild.

Cheers,
-Brian

Vadim Tkachenko

Brian,

We have 25 tables per server, each divided into 12 partitions (one per month).
We cover the period from Jun-2007 to May-2008, so as you can see, we have “future” partitions.
Now we have to remember to add new partitions as May-2008 approaches. It would be good if MySQL did this automatically 🙂

Basi

Maybe you can use the event scheduler available in 5.1 to create new partitions on the fly, only at the moment the new partitions are needed.
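A sketch of what Basi suggests, assuming the `clicks` table naming scheme above is hypothetical and the table has no MAXVALUE catch-all partition (dynamic SQL via PREPARE is allowed in events, though not in functions or triggers; the scheduler must be enabled with `event_scheduler=ON`):

```sql
-- Sketch: monthly event that keeps the partition range one month ahead.
-- The partition name (pYYYYMM) and its upper boundary are built
-- dynamically, then executed as a prepared statement.
DELIMITER //
CREATE EVENT add_next_month_partition
ON SCHEDULE EVERY 1 MONTH STARTS '2008-04-01 00:00:00'
DO
BEGIN
    -- Upper boundary of next month's partition = first day of month+2.
    SET @boundary = DATE_FORMAT(NOW() + INTERVAL 2 MONTH, '%Y-%m-01');
    SET @sql = CONCAT('ALTER TABLE clicks ADD PARTITION (PARTITION p',
                      DATE_FORMAT(NOW() + INTERVAL 1 MONTH, '%Y%m'),
                      ' VALUES LESS THAN (TO_DAYS(''', @boundary, ''')))');
    PREPARE s FROM @sql;
    EXECUTE s;
    DEALLOCATE PREPARE s;
END//
DELIMITER ;
```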

Jonathon Coombes

I agree with Basi on this option. I have implemented a solution using partitioning under 5.1 using the event scheduler to automagically create new partitions as needed. This was because of the limitation Brian mentioned in trying to exceed the 300 mark. From memory, I ended up restricting it to 250 and did a roll of the partitions – create a new one and archive off the oldest. The event scheduler made this job much easier to achieve, but I never tried to solve the higher number of partitions, as I put it down to the beta state of the software.
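The partition roll Jonathon describes might look roughly like this (table and partition names are illustrative; the old partition's data would be archived first, e.g. with SELECT ... INTO OUTFILE):

```sql
-- Roll the window forward: add a partition for the coming month and
-- drop the oldest one. ADD PARTITION extends a RANGE-partitioned table
-- without touching existing partitions (as long as there is no MAXVALUE
-- partition), and DROP PARTITION discards its rows without a rebuild.
ALTER TABLE clicks ADD PARTITION
    (PARTITION p200806 VALUES LESS THAN (TO_DAYS('2008-07-01')));
ALTER TABLE clicks DROP PARTITION p200706;
```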

Brian Aker

Hi!

Lookup on a non-partitioned key is one of the issues with partitioning; the other is simply resource usage. When a table is opened, all of the underlying tables are opened at the same time. Depending on the engine and the number of partitions, this can turn out pretty badly.

Cheers,
-Brian

Brian Aker

Hi!

Right now only Archive does a lazy open (and I am thinking about adding a reaper to go through and close based on non-usage).

One thing to consider for those talking about events: an event that does an ALTER will lock the table throughout the alter. You also need 2X the disk space during the alter (half for the old partition and half for the new).

Cheers,
-Brian

Brian Aker

Hi!

It just wasn’t written that way (and I would agree that it is bad). Partitioning makes sense for “at rest” data, or for data that doesn’t require 24/7 availability. For 24/7… I suspect someone is going to have to come up with a solution that doesn’t require blocking.

I need to look and see if NDB can now do online partitioning adding, I know it was talked about.

Cheers,
-Brian

AlexN

Adding a demo account is the most important improvement. Now, even in the beta stage, the service looks attractive. SpyLog is too expensive, and Google Analytics has too many bugs that they are not going to fix soon. At a reasonable price this would be a nice alternative.
The biggest problem with all these services is connection availability. It is better to lose some information than to lose some visitors annoyed by a “connecting to #[email protected]” message.

Emin

I know it is not a bottleneck now, but I suggest looking at nginx instead of lighttpd as well. Much better performance.

Emin

Peter, I am running a website which, although much smaller, still gets a few hundred requests for static content and a few dozen requests for dynamic content per second.

I do not have formal benchmarks at hand, but I have experienced a significant performance improvement on the server myself. A very important point that is quite often omitted in benchmarks is that although nginx may be, say, 10% faster at serving requests, its CPU and memory usage is much, much lower, actually almost zero. This leaves the resources for PHP/MySQL and other memory/CPU-hungry processes. My server load averages dropped significantly after I moved to nginx, and its flexibility, for example in adjusting config files or even upgrading the web server without stopping service for a second, is unbeatable.

adrian ilarion ciobanu

re: What are you using for GeoIP DNS based load balancing? (Norbert)

There is actually a djbdns-based authoritative DNS server called GeoIPDNS that has some more functionality and adds views with per-record granularity (instead of per-zone) for geo-based filters/rules: http://pub.mud.ro/wiki/Geoipdns