How do we ensure that, when we have 35+ testable option combinations for mysqld, we test each and every combination of them? For example: will a different innodb_log_file_size combined with more innodb_log_files_in_group and a modified innodb_fast_shutdown setting truly not affect Percona’s log archiving feature?
Most option-related bugs are caused by the setting of 1 or 2 mysqld options to a non-standard value. Maybe in an odd situation 3 mysqld options need to be set in combination. So, starting with 2 option combinations (1 option set is easy to calculate: it matches the number of options to be tested), let’s see how many combinations we would have to run: 35^2 = 1225 combinations. aa, ab, ac, …. ba, bb, bc… etc.
In real life mysqld testing, this is not entirely true as one would never specify “aa” in relation to mysqld options: no-one specifies “mysqld –innodb_log_file_size=x –innodb_log_file_size=x”, as the second option would simply overwrite the first to no effect, and nothing (real) is tested, except for mysqld option handling. The actual formula in this case is 35*(35-1)=1190 etc.
In any case, this means 1000+ times mysqld start, run some test, mysqld stop, analyze logs. Even in parallel mode (we tend to run 8 parallel “trials” – i.e. specific combinations – at the same time). In short, that means: a whole lot of machine processing power.
Now, let’s take it one step further – 3 way testing: 35^3: 42875 combinations. aaa, aab, … aba, abb, … baa, bab, etc. Or, how about making absolutely sure we cover all grounds and go 4 way (like the example we started out with above combined where we employ only a single log archiving option in combination with 3 other options)?
This would mean 35^4 trials (or more correctly about 35*(35-1)*(35-2)*(35-3) trials) = 1256640: 1.2 million trial runs. At 5 minutes run time per trial, 8 trials in parallel, that means: 1256640*5/8/60/24/365=about 1.5 year of testing. Per software patch. And that is just assuming there are only two possible values for each option.
Clearly that will not happen, even with a very expensive server farm. So, we need to be smarter, and this is where option combinatorics comes in.
Though option combinatorics or “pairwise testing” is by no means a new concept (see here and here), and though advanced (sometimes free) software solutions exist (there is a good list here), as far as I know no one has applied this technique to mysqld option test case generation yet. That is until now.
If you have read the article on how we tested Percona Server 5.6, you already know that Percona Server 5.6 quality was always a number one priority. Ensuring that no stone was left unturned, a solution to cover the growing number of mysqld options (both in upstream Oracle’s MySQL and in Percona Server) was necessary.
Studying over time the methodology, existing software applications, their unique features and capabilities, I finally settled on PictMaster (English version) + PICT, a closed source predecessor/brother of QICT, both Microsoft tools. PICT can be downloaded from Microsoft here.
Sidenote: in terms of learning more about the methodology, use the links above and refer to the pages & whitepapers linked from them. You can also read more about PICT/QICT in combination with Pairwise Testing here, and here.
PICT stands for Pairwise Independent Combinatorial Testing. One could also call it PECT – Pairwise Exclusion – as that is what it does (and what Pairwise testing is all about): it excludes “duplicates” (2-way, 3-way, etc.) in option combinations. If you read the article on QICT, how this works should be clear now.
Next, consider PictMaster: a ‘GUI to’ or ‘shell around’ PICT (PICT being an DOS executable). A GUI build in – belief it or not – Microsoft Excel. It is an excellent tool, and very advanced. Even after a lot of use and manual study, I am yet to use some of it’s most advanced features like sub-models and extended sub-models. A basic usage guide is outside of the scope of the article, but if you get stuck, feel free to ping me on Skype (roel.mysql) at any time.
The original runs we did for Percona Server 5.6 were done more or less with these settings:
And in the final “quicker” rounds these settings were used:
As you can see, the 4 and eventually 3-way coverage is huge, even if the number of trials is only respectively 948 and 133. Notice also the long time it takes to generate the final number of testcases: hours. It really makes you wonder about the mathematical dynamics when your high-end computer is churning away on combinatorics for a few hours
One recommendation if you want to get started with PictMater is to immediately have a cursory look through the manual (included with the download), and to un-hide all hidden rows in the original PictMaster excel file. It will give you a better idea of the different areas available, even if you will not be using them.
PictMaster uses some interesting programming tactics, like having to mark cells in some color to relate them to another cell in the same color etc. Once you get used to using it, it is fast and quite easy to use, but be warned there is a learning curve; you will need the PictMaster manual on more then one occasion.
After PictMaster (and PICT) generates the testcases (inserted into a blank excel file), I do some post-processing using the transpose option of Excel (you’ll need a later Excel version or use OpenOffice Calc to transpose huge option sets) and Notepad++ to get the data into a format which can be easily inserted into a combinations.pl (CC) configuration file for RQG.
In terms of handling “not-used” options, or for options that do not have a parameter value assigned (like RQG’s “–notnull” option which is also part of my PictMaster option matrix), I used two workarounds by inserting “NOTUSED” and “SINGLE” into the “value hierarchy” (i.e. possible values) row in PictMaster, and then handle those keywords through post-processing in Notepad++.
Notepad++ has excellent extended syntax (n,t etc.) plus great regex support, so you can do whatever is necessary to get from PictMaster output to combinations.pl (RQG) ready syntax.
Percona’s widely read Percona Data Performance blog highlights our expertise in enterprise-class software, support, consulting and managed services solutions for both MySQL® and MongoDB® across traditional and cloud-based platforms. The decades of experience represented by our consultants is found daily in numerous and relevant blog posts.
Besides specific database help, the blog also provides notices on upcoming events and webinars.
Want to get weekly updates listing the latest blog posts? Subscribe to our blog now! Submit your email address below and we’ll send you an update every Friday at 1pm ET.