Remember when Percona significantly improved query processing time by fixing the optimizer bug? I described all the details in the blog post More Performant Query Processing in Percona Server for MySQL. This time, we dug deeper into all the ideas from Enhanced for MySQL and, based on our analysis, proposed several new improvements. All the changes are available in Percona Server 8.0.43 and 8.4.6, and in newer versions.

Before discussing specific improvements, let me describe how we measured the performance gains we achieved. The top-level approach is common for benchmarks: you need a baseline version, and then you compare your modified version to that baseline. At Percona, we have a dedicated environment for performance testing. For detailed information about our testing methodology, the sysbench workloads used, and the hardware configuration, please see our previous blog post: Percona Server for MySQL Performance Improvements.

Let’s get to the improvements

One of the ideas was to implement the mem_root_deque class using std::list instead of a custom implementation. This class is used throughout the SQL engine to hold lists of expressions, fields, and table references during query parsing, optimization, and execution, so making its memory management more efficient and predictable affects the core of MySQL’s query processing pipeline. Implementing the change was quite easy, because mem_root_deque has a well-defined interface that fits almost perfectly with the std::list interface.
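To make the idea more concrete, here is a minimal sketch of a deque-like class that delegates to std::list while still taking its memory from an arena. It is not the Percona Server code: `Arena` is a hypothetical stand-in for MySQL's MEM_ROOT, and `ArenaAllocator` and `list_backed_deque` are illustrative names. The sketch assumes element types with ordinary (fundamental) alignment.

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <vector>

// Hypothetical stand-in for MEM_ROOT: a bump allocator that frees everything
// at once when it is destroyed. Individual deallocations are not supported.
class Arena {
 public:
  explicit Arena(std::size_t block_size = 4096) : block_size_(block_size) {}
  Arena(const Arena &) = delete;
  Arena &operator=(const Arena &) = delete;

  void *allocate(std::size_t bytes, std::size_t align) {
    // Round the current offset up to the requested alignment.
    std::size_t offset = (used_ + align - 1) / align * align;
    if (blocks_.empty() || offset + bytes > capacity_) {
      // Start a new block, large enough for this request.
      capacity_ = bytes > block_size_ ? bytes : block_size_;
      blocks_.push_back(std::make_unique<unsigned char[]>(capacity_));
      offset = 0;
    }
    used_ = offset + bytes;
    return blocks_.back().get() + offset;
  }

 private:
  std::size_t block_size_;
  std::size_t capacity_ = 0;
  std::size_t used_ = 0;
  std::vector<std::unique_ptr<unsigned char[]>> blocks_;
};

// Minimal standard-conforming allocator that forwards to the arena.
template <typename T>
struct ArenaAllocator {
  using value_type = T;

  explicit ArenaAllocator(Arena *a) : arena(a) {}
  template <typename U>
  ArenaAllocator(const ArenaAllocator<U> &other) : arena(other.arena) {}

  T *allocate(std::size_t n) {
    return static_cast<T *>(arena->allocate(n * sizeof(T), alignof(T)));
  }
  void deallocate(T *, std::size_t) {}  // no-op: the arena frees in bulk

  Arena *arena;
};

template <typename T, typename U>
bool operator==(const ArenaAllocator<T> &a, const ArenaAllocator<U> &b) {
  return a.arena == b.arena;
}
template <typename T, typename U>
bool operator!=(const ArenaAllocator<T> &a, const ArenaAllocator<U> &b) {
  return !(a == b);
}

// Deque-like interface implemented on top of std::list.
template <typename T>
class list_backed_deque {
 public:
  using container = std::list<T, ArenaAllocator<T>>;
  using iterator = typename container::iterator;
  using const_iterator = typename container::const_iterator;

  explicit list_backed_deque(Arena *arena) : items_(ArenaAllocator<T>(arena)) {}

  void push_back(const T &value) { items_.push_back(value); }
  void push_front(const T &value) { items_.push_front(value); }
  void pop_back() { items_.pop_back(); }
  void pop_front() { items_.pop_front(); }

  T &front() { return items_.front(); }
  T &back() { return items_.back(); }
  std::size_t size() const { return items_.size(); }
  bool empty() const { return items_.empty(); }

  iterator begin() { return items_.begin(); }
  iterator end() { return items_.end(); }
  const_iterator begin() const { return items_.begin(); }
  const_iterator end() const { return items_.end(); }

 private:
  container items_;
};
```

Because the wrapper only forwards calls, callers that use push, pop, and iteration do not need to change when the underlying container is swapped.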

Here are the benchmarking results. Each cell displays the improvement percentage compared to the baseline version. In other words, it says how much faster or slower the changed version is in comparison to the baseline version.

We can observe a noticeable improvement in nearly all cases. Once we filter out the results affected by jitter, only a single case remains that shows a slight decrease in performance. This minor exception does not detract from the overall trend: the changes lead to a consistent and measurable improvement.

Another improvement we decided to embrace is the idea of inlining several functions involved in hot execution paths. One case was especially interesting. In the commit for WL#13899: INSTANT DROP (and ADD) COLUMN, Oracle refactored the rec_init_offsets() function. A portion of the code was extracted into a separate function, and the if-else series was refactored into a switch-case. Even after inlining the new function, we did not observe the expected improvement. Deeper analysis revealed that refactoring the if-else into a switch-case introduced an additional conditional jump at the assembly level. Reverting this part to the original if-else version produced a performance boost.
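The following sketch illustrates the kind of transformation involved. It is not the rec_init_offsets() code: `RecKind`, `fields_if_else`, and `fields_switch` are hypothetical names chosen for illustration. Both functions return the same results, yet a compiler is free to generate different branch sequences for them, so the only reliable way to compare the two forms is to inspect the generated assembly and benchmark it.

```cpp
#include <cstdint>

// Hypothetical record kinds; these are NOT the real InnoDB status values.
enum class RecKind : std::uint8_t {
  Ordinary = 0,
  NodePtr = 1,
  Infimum = 2,
  Supremum = 3
};

// Variant A: if-else chain. The most frequent case is tested first, so the
// hot path can be resolved by a single comparison.
inline int fields_if_else(RecKind kind, int n_fields) {
  if (kind == RecKind::Ordinary) {
    return n_fields;
  } else if (kind == RecKind::NodePtr) {
    return n_fields + 1;
  } else {
    return 1;
  }
}

// Variant B: switch-case. Functionally equivalent, but the compiler may emit
// a range check plus a jump table, or a differently ordered compare chain.
inline int fields_switch(RecKind kind, int n_fields) {
  switch (kind) {
    case RecKind::Ordinary:
      return n_fields;
    case RecKind::NodePtr:
      return n_fields + 1;
    case RecKind::Infimum:
    case RecKind::Supremum:
      return 1;
  }
  return 1;  // only reached for values outside the enumerators
}
```

Whether either variant is faster depends on the compiler, the optimization flags, and how often each case occurs, which is why this kind of change needs to be measured rather than assumed.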

Summary

Yes, it is true that these improvements are not dramatic or game-changing in the sense of producing a 10% boost for every single case. They are, instead, relatively modest gains – subtle refinements that accumulate over time. Historically, we have seen similar patterns in systems like MySQL, where major bottlenecks were gradually identified and eliminated, leading to significant overall performance enhancements. Yet even after the most obvious inefficiencies were addressed, there remained – and likely still remain – numerous smaller opportunities for optimization. These incremental gains, usually on the order of a few tenths of a percent, may seem insignificant on their own, but in aggregate they meaningfully enhance overall system performance. The pursuit of these small improvements is a continuous process, highlighting the ongoing potential for fine-tuning and optimization even in mature, well-optimized software.
