How Percona Support Handles Bugs

One of the great things about Percona, and a Percona Support contract, is that we not only guarantee application performance but also provide bug fixes for covered software, not just advice on how to use it. This is most likely missing from most customers’ in-house support, as it requires a team with code knowledge to build and test infrastructure, which only a few companies can afford to invest in.

Whether you deploy MySQL®, MariaDB®, MongoDB®, or PostgreSQL—on-premise, in the cloud, in a DBaaS environment, bare metal, virtualized or containerized—Percona Support has you and your database covered.

Now, back to bugs. While there is no such thing as “bug-free” software, there are often some misunderstandings about bugs and how they are handled. What is a bug? What is a feature? What is a repeatable bug? How will Percona troubleshoot the bug? In this post, we’ll answer some of these questions, and detail how Percona Support supports bug reporting.

Features vs. Bugs

Sometimes, software is designed to work in a way that some users do not expect or want. That doesn’t make the behavior a “bug” in the true sense. It may simply mean that the software has to be used differently from the way it was used in the past. These behaviors are considered features rather than bugs.

Unfixable Bugs

There are some behaviors that most people would call a bug, but they arise from design limitations or oversights that are impossible to fix in the current GA version without introducing changes that would destabilize the software. These bugs will need to be fixed in future GA releases. Other behaviors are not bugs at all but design tradeoffs. These can’t be “fixed” without making a different tradeoff, and are therefore closer to “features” than bugs.

Workaround

There are going to be unexpected behaviors, unfixable bugs, and bugs that take time to fix, so our first practical response to running into this type of bug is finding a workaround that does not expose it. The Percona Support team helps identify these types of bugs and build workarounds that will result in minimal impact on your business. But be prepared: changes to the application, deployed version, schema, or configuration are often required.

Emergencies

Emergencies are just that—emergencies. When you have one, Percona’s first area of focus is to restore your system to working order. Percona offers 24x7x365 support for production outages to all of our support customers, as well as options for real-time electronic and phone access to its expert technical support team, not just asynchronous communications through a ticketing system.

Bug Turnaround Times

We cannot guarantee turnaround time on a bug fix, as all bugs are different. Some are rather trivial, and for these we can provide a hotfix as soon as 24 hours after we have a repeatable test case. Others are much more complicated and can take weeks of engineering to fix (or be determined to be non-fixable in the current GA version of the software). The best thing to do is to report a bug and provide any additional information that would help get it resolved. (Check out our article “How to report bugs, improvements, and new feature requests” for more information.)

Verified Bug Fixes

Once you submit a bug, we will first verify if it is actually a bug. As we detailed above, it might be a feature, or intended behavior, or a user mistake. It’s also possible that it only happened one time and it cannot be repeated. Having a repeatable test case that reveals a bug is the best way for it to be fixed quickly. Our support team is often able to help you create a test case if you’re unable to do so on your own.
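As a rough illustration of what “repeatable” means in practice, the sketch below runs a reproduction attempt several times and classifies the result. It is a minimal Python sketch, not a Percona tool: `classify_repro` and the `attempt` callable are hypothetical names, and the callable is assumed to return True whenever the bug was triggered.

```python
from typing import Callable


def classify_repro(attempt: Callable[[], bool], runs: int = 20) -> str:
    """Run a reproduction attempt `runs` times and classify the outcome.

    `attempt` is a hypothetical callable that executes the test case once
    and returns True if the bug was triggered on that run.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    hits = sum(1 for _ in range(runs) if attempt())
    if hits == runs:
        return "repeatable"      # triggered on every run
    if hits == 0:
        return "not reproduced"  # never triggered in this environment
    return "sporadic"            # triggered on some runs only
```

A test case that comes back as “repeatable” is the kind that can be handed over for debugging as is. The other two outcomes usually mean more information is needed first.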

Sporadic Bugs

Bugs that only show up sporadically are the hardest ones to fix. For example, you might have a system crash once every few months with no way to repeat it. The cause of such bugs can be very complicated, such as a buffer overrun in one piece of code causing corruption and crashes in other places hours later. And while there are a number of diagnostic tools for such bugs, they can still take some time to resolve. Finally, without a repeatable test case, it is often impossible to verify that the proposed fix actually resolves the bug.
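A back-of-the-envelope calculation shows why verification is so hard here. Assuming, as a simplification, that the bug is triggered independently on each run with a fixed probability, the Python sketch below estimates how many consecutive clean runs are needed before a fix can be trusted at a given confidence level. The function name and the independence assumption are illustrative only and are not part of any Percona process.

```python
import math


def clean_runs_needed(failure_rate: float, confidence: float) -> int:
    """Smallest n such that (1 - failure_rate) ** n <= 1 - confidence.

    failure_rate: assumed per-run probability that the unfixed bug appears.
    confidence:   desired confidence that n clean runs are not just luck.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # If the bug were still present, n clean runs in a row would happen
    # with probability (1 - failure_rate) ** n; solve for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


print(clean_runs_needed(0.01, 0.95))  # 299
```

Under this assumption, a bug that appears in roughly 1 of every 100 runs needs about 299 clean runs for 95% confidence in the fix. For a crash that happens once every few months, the equivalent waiting period is rarely practical.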

Environmental Bugs

Some bugs are caused by what can be called your “environment”, or setup. It could be hardware bugs or incompatibilities, a build not quite compatible with your version of the operating system, etc. In some cases, we can very clearly point to issues in your environment, and in others, we may suspect the environment is an issue and will ask to see if the bug also happens in other environments, such as with different hardware or OS installation.

Hot Fixes

Our default policy is that we fix bugs in the next release of our software so the fix can go through the full GA cycle and be properly documented. If a workaround can be found so that you can wait until the next release for a fix, this is the best choice. If not, with a Percona Support contract, we can provide you with a hotfix: a special build of the version of the software you’re running, with the bug fix of interest applied. Hotfixes are especially helpful if you don’t want to do a full software upgrade, which may span several revisions, but want to validate the fix with the minimum number of changes. Hotfixes might also differ from the final bug fix that goes into the GA release, as our goal is to provide a working solution for you faster. Afterward, we may optimize or re-architect the code, come up with better option names, or make other changes that resolve any outstanding bugs.

Bug Diagnostics

Depending on the nature of the bug, there are multiple tools that our support team will use for diagnostics and finding a way to fix the bug. To set expectations, this can be a very involved process requiring that you provide information or try things on your system, such as:

  • If you have a test case that can be repeated by the Percona team to trigger the bug, the diagnostic problem is solved from the customer side. Internal debugging starts at this point.
  • If we have a crash that we can’t repeat on our system, we may ask you to enable core files or run the program under a debugger so we can get more information when the crash happens.
  • If the problem is related to performance, you should be ready to gather information such as EXPLAIN, status counters, information from performance schema, etc. along with system-level information like pt-pmp output, pt-stalk, oprofile, or perf.
  • If the problem is a “deadlock,” we often need information from gdb about the full state of the system. Information from processlist, performance_schema, and SHOW ENGINE INNODB STATUS can also be helpful.
  • It may also be helpful to have a test system on which you can repeat the problem in your environment and experiment without impacting a production environment. It is not possible in all cases, but it is useful for bug resolution.
  • Sometimes, for hard-to-repeat bugs, we will need to run a special diagnostics build that provides us with additional debug information. Or, we might need to run a debug build or do a run under valgrind or other software designed to catch bugs. This can have a large performance impact, so it is good to see if your workload can be scaled down for this to be feasible.
  • Depending on your environment, we might need to log in to troubleshoot your bug or request that you upload the data needed to repeat the bug in our lab (assuming it is not too sensitive). In cases where direct login is not possible, we can help you create a repeatable test case via phone, chat, or email. Using screen sharing can also be very helpful.
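The items in the list above can be organized as a simple checklist keyed by problem type, which may help when preparing a support ticket. The following is a minimal Python sketch, not a Percona tool: the category names, the `DIAGNOSTICS` table, and the `diagnostics_for` helper are hypothetical, and the entries are only the items named in the list above.

```python
from typing import Dict, List

# Hypothetical checklist assembled from the items named in the list above.
DIAGNOSTICS: Dict[str, List[str]] = {
    "crash": [
        "core file",
        "run under a debugger",
    ],
    "performance": [
        "EXPLAIN output",
        "status counters",
        "performance_schema data",
        "pt-pmp output",
        "pt-stalk output",
        "oprofile or perf profile",
    ],
    "deadlock": [
        "gdb output showing the full state of the system",
        "processlist",
        "performance_schema data",
        "SHOW ENGINE INNODB STATUS",
    ],
}


def diagnostics_for(problem: str) -> List[str]:
    """Return the items to gather for a problem category (case-insensitive)."""
    key = problem.strip().lower()
    if key not in DIAGNOSTICS:
        known = ", ".join(sorted(DIAGNOSTICS))
        raise ValueError(f"unknown problem type {problem!r}; expected one of: {known}")
    # Return a copy so callers cannot modify the shared table.
    return list(DIAGNOSTICS[key])


for item in diagnostics_for("deadlock"):
    print("-", item)
```

Which items are actually needed depends on the bug, so a list like this is a starting point for the conversation with the support team rather than a fixed requirement.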

Bugs and Non-Percona Software

Percona Support does cover some software not produced by Percona. For open source software that is not exempt from bug fix support, we will provide a custom build with the bug fix and also pass the suggested fix to the software maintainer for possible inclusion in its next release. For example, if we find a bug in the MySQL Community Edition, we will pass our suggested fix to the MySQL Engineering team at Oracle. For software that is not open source, such as Amazon RDS, we can help create and submit a repeatable test case and find a workaround, but we can’t provide a fix as we do not have access to the source code.

In Conclusion

When we think about software bugs, there are some good parallels with human “bugs”. Some issues are trivial to diagnose and the fix is obvious, while others might be very hard to diagnose, with doctor after doctor still not able to determine the cause of your disease. Then, even when the diagnosis is found, a cure is not always available or feasible, and we have to settle for “managing” a disease—our parallel to implementing changes and settling for a workaround. In the same way as human doctors, we can’t guarantee we will get to the root of every problem, or fix every problem we find. However, as with having good doctors, having us on your team will help maximize your chances of a successful bug resolution.

How Percona Can Help

Percona’s experts can maximize your database performance with our open source database support, managed services or consulting professional services. For more information on our database services, contact us at +1-888-316-9775 (USA), +44 203 608 6727 (Europe), or have us reach out to you directly.
