Where the open source database community meets: Use code PERCONA75 and secure your spot for Percona Live.  Register

Aurora Hash Join Optimization (with a Gentle Reminder on Lab Features)

January 31, 2018
Author
Jervin Real
Share this Post:

Aurora Hash Join Lab ModeThe Aurora hash join feature for relational databases has been around for a while now. But unlike MySQL Block Nested Loop algorithm, an Aurora hash join only caters to a specific number of use cases. When implemented with the optimizer properly, they can provide great benefits with certain workloads. Below we’ll see a brief example of a quick win.

This new feature is available in Aurora lab mode version 1.16. Because this is a lab feature, it’s important to make sure to test your queries before upgrading, especially if you are looking to scale up to the new R4 instances before the Superbowl to avoid hitting the same problem I discuss below.

When lab mode is enabled and hash_join is ON, you can verify the optimizer feature from the optimizer_switch variable:

Hash joins work well when joining large result sets because – unlike block nested loop in the same query – the optimizer scans the larger table and matches it against the hashed smaller table instead of the other way around. Consider the tables and query below:

With hash joins enabled, we can see from the Extra column in the EXPLAIN output how it builds the join conditions:

Without hash joins, it’s a straightforward Cartesian (almost) product of all three tables:

Now, the execution times without hash joins enabled:

Clearly with this optimization enabled, we have more than a 50% gain from the example query.

Now while this type of query might be rare, most of us know we need to avoid really large JOINs as they are not scalable. But at some point, we find some that take advantage of the feature. Here is an excerpt from an actual production query I’ve recently worked on. It shows the good execution plan versus the one using hash joins.

This particular EXPLAIN output only differs in the row where without a hash join, it uses an index, and the query executes normally. With the hash join enabled, the optimizer thought it was better to use it instead:

Needless to say, it caused problems. Unfortunately, a bug on Aurora 1.16 exists where hash joins cannot be turned off selectively (it is enabled by default) from the parameter group. If you try this, you get an error “Error saving: Invalid parameter value: hash_join=off for: optimizer_switch”. The only way to disable the feature is to turn off lab_mode, which requires an instance restart. An alternative is to simply add SET optimizer_switch='hash_join=off'; from the application, especially if you rely on some of the other lab mode features in Aurora.

To summarize, the new hash join feature is a great addition. But as it’s a lab feature, be careful when upgrading!

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved