
Be Careful when Joining on CONCAT

October 16, 2007
Author
Aurimas Mikalauskas

The other day I had a case of awful performance on a rather simple join. The join condition was tb1.vid = CONCAT('prefix-', tb2.id), where tb1.vid is an indexed varchar(100) column and tb2.id is an int(11) column. No matter what I did (forcing the key, forcing a different join order), MySQL would not use the index on tb1.vid. Unsurprisingly, the query was far too slow, and the number of rows examined was huge.
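
For illustration, the query had roughly the following shape. This is a sketch rather than the original statement: only the table names, column names and join condition come from the description above, and the select list is an assumption.

```sql
-- Sketch of the slow join (the select list is assumed, not taken from the original query).
-- tb1.vid is an indexed VARCHAR(100); tb2.id is an INT(11).
EXPLAIN
SELECT tb1.vid, tb2.id
FROM tb2
JOIN tb1 ON tb1.vid = CONCAT('prefix-', tb2.id);
```

In the EXPLAIN output, the row to watch is the one for tb1: whether its key column names the index on vid, and how many rows it estimates.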

Then I took a look at the MySQL manual, and here’s a short quote about CONCAT:

…If all arguments are non-binary strings, the result is a non-binary string. If the arguments include any binary strings, the result is a binary string. A numeric argument is converted to its equivalent binary string form; if you want to avoid that, you can use an explicit type cast

OK, let’s check if that really helps:
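
A minimal sketch of the rewritten join, using the explicit cast the manual suggests. Only the join condition is taken from the text; the select list is an assumption. CAST(tb2.id AS CHAR) yields a non-binary string, so CONCAT returns a non-binary string as well, and the comparison with tb1.vid can presumably use the index.

```sql
-- Sketch of the join with an explicit cast (the select list is assumed).
-- Casting the INT to CHAR keeps the CONCAT result a non-binary string.
EXPLAIN
SELECT tb1.vid, tb2.id
FROM tb2
JOIN tb1 ON tb1.vid = CONCAT('prefix-', CAST(tb2.id AS CHAR));
```

Later MySQL releases reportedly convert numeric CONCAT arguments to non-binary strings on their own, so the cast may matter mostly on older servers. Running EXPLAIN on the version in use is the safe way to check.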

Much better now.

18 Comments
paul
18 years ago

I am surprised to see this blog posting, as there is nothing to be careful about when joining on CONCAT: if you are doing that, your DB design is simply grossly wrong. Stop and think how ridiculous it is to be concat’ing a string on every query and joining on it.

Dale
18 years ago

I agree. I had the same initial thought. I could not think of a time that I would need to use CONCAT() within the WHERE clause. The whole idea behind a good schema model is that you use good keys that match up without needing further tweaking.

Daniel Schneller
18 years ago

While I generally agree that the schema design is suboptimal if you get into situations like this, sometimes you just cannot help it. Unfortunately(?) we do not live in a perfect world, and existing systems cannot always easily be changed. So it is important to know the caveats that might bite you if you have to do such things.

Peter Zaitsev
Admin
18 years ago

Dale, Paul

This would be the case if we lived in a perfect world. True, in a database with good design you would not join on CONCAT, but there are a lot of databases written with not-so-good design, and these are the ones we are commonly called to fix.

Often, to do it right you would need to redo quite a lot. However, the customer may not be ready for that in many cases, so we end up squeezing as much as we can out of the current schema, which means dealing with strange queries that bring various weird issues.

Peter Zaitsev
Admin
18 years ago

Thanks Daniel,

I wanted to add one more thing about it: generally, this choice of making a number a binary string is counterintuitive to me.

The number is a string, and for numbers it does not matter whether the comparison is case sensitive or case insensitive, so I would expect CONCAT to simply handle numbers by converting them to the type of the other argument.

Though I do not know; maybe this more intuitive solution would have some ugly side effects.

paul
18 years ago

@peter

I understand where you’re coming from, and whilst I don’t recommend wrapping crap design with hacks, you could actually eliminate the CONCAT from the select query by adding an additional column to the table and using a trigger that fills this column by performing the CONCAT at the point of data insert/update. Then, by changing the select query to use the new column, we have eliminated the CONCAT from the WHERE clause and got all the speed benefits of a properly designed schema.

ron
15 years ago

Thanks for posting this!

I had a query with a CONCAT in it (the db I was forced to use had a bad schema) that was running really slow. Wrapping the integer part of the query in an explicit CAST made a huge difference.

Barry Fisher
15 years ago

Thanks very much. Reduced a timeout server load down to 6 seconds for me. Fantastic advice!

Pierre
13 years ago

Thanks! This worked perfectly!

Daniel
13 years ago

Thanks for your tip!

As some mentioned above, it would be great to not have to deal with these performance-killers but it still happens…
You really helped me out 🙂

Hopefully I will have time soon to fix the real problem with the whole setup.

Pierre
13 years ago

Thanks! I have already had to deal with this situation twice. There are always people idealizing things, but practical solutions like this can save a huge amount of time. Things cannot always be redesigned from scratch…

Titan
13 years ago

Hi there! Is there any way to EXPLAIN SQL that is imported with the SOURCE command?

Frank
12 years ago

Of course that one might need CONCAT in real world, only unexperienced person might think that’s not necessary. I need to calculate an IBAN from bank number + account number + suffix country number (3 separate columns that by design can only be separate). And CAST or CONVERT does not do a proper job with a 23-digit number.

Damodaran
12 years ago

Hi. It gave a remarkable performance improvement. Thanks for the post.

Well said, Frank.

“Of course that one might need CONCAT in real world, only unexperienced person might think that’s not necessary.”

williamjacques
11 years ago

Thanks for sharing these very useful tips.

johnives
10 years ago

Hi, Aurimas.

Excellent post. I want to thank you for this informative read; I really appreciate you sharing it. Keep up the good work.
