Large result sets vs. compression protocolAndrew Aksyonoff
mysql_connect() function in PHP’s MySQL interface (which for reference maps to mysql_real_connect() function in MySQL C API) has a $client_flags parameter since PHP 4.3.0. This parameter is barely known and almost always overlooked but in some cases it could provide a nice boost to your application.
There’s a number of different flags that can be used. We’re interested in a specific one, MYSQL_CLIENT_COMPRESS. This flag tells the client application to enable compression in the network protocol when talking to mysqld. It reduces network traffic but at the cost of some CPU time: server has to compress the data and client has to decompress it. So there’s little sense in using it if your Web application is on the same host as the database.
When the database is on a dedicated server then compression essentially means trading CPU time (on both server and client) for network time. Obviously, if the network is fast enough, the benefit in network time will not outweight the loss in CPU time. The question is, where exactly does the border lie?
It turns out that 100 Mbit link (with 1.4 ms round-trip time) is not fast enough. Oleksandr Typlynski, one of the Sphinx users, has conducted a benchmark, indexing 600 MB of data over 100 Mbit link. The data was textual and compressed well, reducing traffic more than 3 times. With compression, total indexing time reduced to 87 sec from 127 sec. That’s almost 1.5x improvement in total run time. MySQL query time improvement is even greater. On the other hand 1 Gbit link was fast enough; and total run time was 1.2x times worse with compression.
The bottom line: if you’re fetching big result sets to the client, and client and MySQL are on different boxes, and the connection is 100 Mbit, consider using compression. It’s a matter of adding one extra magic constant to your application, but the benefit might be pretty big.