JOIN Performance & Charsets

October 21, 2008
Author
Ryan Lowe
Share this Post:

We have written before about the importance of using numeric types as keys, but maybe you’ve inherited a schema that you can’t change or have chosen string types as keys for a specific reason. Either way, the character sets used on joined columns can have a significant impact on the performance of your queries.

Take the following example, using the InnoDB storage engine:

t1 has been populated with 100,000 records, while t2 has 400,000. The data set fit easily in memory.

Notice the differences in key_len and the explicit call to CONVERT() in the WHERE clause. This is a result of the joined columns being of different character sets.

The above query took an average of 4.33 seconds to execute with t1 as utf8 and t2 as latin1. Converting both tables to utf8 resulted in an average execution time of 3.12 seconds and had the following EXPLAIN:

Notice here how there is no CONVERT() required and the key_len on both tables match. Just this simple change resulted in more than a 25% improvement in average execution time, from 4.33 to 3.12 seconds.

This test was performed with MySQL 5.0.67, FreeBSD 7, on a box with 2GB RAM.

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved