Latest MySQL Performance Blog posts

Percona's MySQL & InnoDB performance and scalability blog

Percona XtraBackup 2.2.9 is now available

February 17, 2015 - 10:22am

Percona is glad to announce the release of Percona XtraBackup 2.2.9 on February 17, 2015. Downloads are available from our download site or Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

Bugs Fixed:

  • Percona XtraBackup was vulnerable to a MITM attack which could allow exfiltration of MySQL configuration information via the --version-check option. This vulnerability was logged as CVE-2015-1027. Bug fixed #1408375.
  • The xtrabackup_galera_info file was not being overwritten during Galera auto-recovery. Bug fixed #1418584.
  • Percona XtraBackup man pages are now included with binary packages. Bug fixed #1156209.
  • Percona XtraBackup now sets the maximum supported session value for the lock_wait_timeout variable to prevent unnecessary timeouts when the global value is changed from the default. Bug fixed #1410339.
  • A new option, --backup-locks, enabled by default, has been implemented to control whether backup locks will be used even when they are supported by the server. To disable backup locks, run innobackupex with the --no-backup-locks option (see the sketch after this list). Bug fixed #1418820.
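
For reference, a minimal invocation sketch for disabling backup locks; the backup directory is only an example:

$ innobackupex --no-backup-locks /data/backups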

Release notes with all the bugfixes for Percona XtraBackup 2.2.9 are available in our online documentation. Bugs can be reported on the launchpad bug tracker. Percona XtraBackup is an open source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases.

The post Percona XtraBackup 2.2.9 is now available appeared first on MySQL Performance Blog.

‘Indexing’ JSON documents for efficient MySQL queries over JSON data

February 17, 2015 - 8:45am
MySQL meets NoSQL with JSON UDF

I recently got back from FOSDEM, in Brussels, Belgium. While I was there I got to see a great talk by Sveta Smirnova about her MySQL 5.7 Labs release JSON UDF functions. It is important to note that while the UDFs come in a 5.7 release, it is absolutely possible to compile and use them with earlier versions of MySQL because the UDF interface has not changed for a long time. However, the UDFs should still be considered alpha/preview quality and should not be used in production yet! For this example I am using Percona Server 5.6 with the UDFs.

That being said, the proof-of-concept that I’m about to present here uses only one JSON function (JSON_EXTRACT) and it has worked well enough in my testing to present my idea here. The JSON functions will probably be GA sometime soon anyway, and this is a useful test of the JSON_EXTRACT function.

The UDFs let you parse, search and manipulate JSON data inside of MySQL, bringing MySQL closer to the capabilities of a document store.

Since I am using Percona Server 5.6, I needed to compile and install the UDFs. Here are the steps I took to compile the plugin (a sketch of the registration step follows the list):

  1. $ cd mysql-json-udfs-0.3.3-labs-json-udfs-src
  2. $ cmake -DMYSQL_DIR=/usr/local/mysql .
  3. $ sudo make install
  4. $ sudo cp *.so /usr/local/mysql/lib/plugin
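
Compiling only builds the shared library; the UDF functions still need to be registered with the server before they can be used. A hedged sketch of that step follows; the library file name is an assumption on my part, so check which .so file make install actually produced:

mysql> CREATE FUNCTION json_extract RETURNS STRING SONAME 'libmy_json_udf.so'; -- library name is an assumption
-- register the other JSON functions you need the same way
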
JSON UDFs are great, but what’s the problem?

The JSON functions work very well for manipulating individual JSON objects, but like all other functions, using JSON_EXTRACT in the WHERE clause will result in a full table scan. This makes the functions virtually useless for searching through large volumes of JSON data. If you want to use MySQL as a document store, this severely limits its usefulness: the ability to extract key/value pairs from JSON documents is powerful, but without indexing it can’t scale.

What can be done to index JSON in MySQL for efficient access?

The JSON UDFs provide a JSON_EXTRACT function which can pull data out of a JSON document. There are two ways we can use this function to “index” the JSON data.

  1. Add extra columns to the table (or use a separate table, or tables) containing the JSON and populate the columns using JSON_EXTRACT in a trigger (a rough sketch follows this list). The downside is that this slows down inserts and modifications of the documents significantly.
  2. Use Flexviews materialized views to maintain an index table separately and asynchronously. The upside is that insertion/modification speed is not affected, but there is a slight delay before the index is populated. This is similar to eventual consistency in a document store.
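
For orientation only, here is a rough sketch of the trigger approach (option 1). The side table json_trigger_idx and its columns are purely illustrative, and matching UPDATE and DELETE triggers would also be needed:

DELIMITER ;;
CREATE TRIGGER json_ai AFTER INSERT ON json
FOR EACH ROW
BEGIN
  -- extract the keys we want to search on and store them in an indexed side table
  INSERT INTO json_trigger_idx (id, LO_OrderKey, LO_LineNumber)
  VALUES (NEW.id,
          CAST(json_extract(NEW.doc, 'LO_OrderKey') AS UNSIGNED),
          CAST(json_extract(NEW.doc, 'LO_LineNumber') AS UNSIGNED));
END;;
DELIMITER ;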

Writing complete triggers along those lines is an exercise I’ll leave up to the user. The rest of this post will discuss using Flexviews materialized views to create a JSON index.

What is Flexviews?

Flexviews can create ‘incrementally refreshable’ materialized views. This means that the views are able to be refreshed efficiently using changes captured by FlexCDC, the change data capture tool that ships with Flexviews. Since the view can be refreshed fast, it is possible to refresh it frequently and have a low latency index, but not one perfectly in sync with the base table at all times.

The materialized view is a real table that is indexed to provide fast access. Flexviews includes a SQL_API, or a set of stored procedures for defining and maintaining materialized views.

See this set of slides for an overview of Flexviews: http://www.slideshare.net/MySQLGeek/flexviews-materialized-views-for-my-sql

Demo/POC using materialized view as an index

The first step to creating an incrementally refreshable materialized view with Flexviews is to create a materialized view change log on all of the tables used in the view. The CREATE_MVLOG($schema, $table) function creates the log and FlexCDC will immediately begin to collect changes into it.

mysql> call flexviews.create_mvlog('ssb_json','json');
Query OK, 1 row affected (0.01 sec)

Next, the materialized view name, and refresh type must be registered with the CREATE($schema, $mvname, $refreshtype) function:

mysql> call flexviews.create('ssb_json','json_idx','INCREMENTAL');
Query OK, 0 rows affected (0.00 sec)
-- save the just generated identifier for the view. You can use GET_ID($schema,$mvname) later.
mysql> set @mvid := last_insert_id();
Query OK, 0 rows affected (0.00 sec)

Now one or more tables have to be added to the view using the ADD_TABLE($mvid, $schema, $table, $alias,$joinclause) function. This example will use only one table, but Flexviews supports joins too.

mysql> call flexviews.add_table(@mvid, 'ssb_json','json','json',null);
Query OK, 1 row affected (0.00 sec)

Expressions must be added to the view next. Since aggregation is not used in this example, the expressions should be ‘COLUMN’ type expressions. The function ADD_EXPR($mvid, $expression_type, $expression, $alias) is used to add expressions. Note that JSON_EXTRACT returns TEXT, so I’ve CAST its output (to DATE or UNSIGNED) so that it can be indexed. Flexviews does not currently have a way to define prefix indexes.

mysql> call flexviews.add_expr(@mvid, 'COLUMN', "cast(json_extract(doc,'D_DateKey') as date)", 'D_DateKey');
Query OK, 1 row affected (0.00 sec)
mysql> call flexviews.add_expr(@mvid, 'COLUMN',"cast(json_extract(doc,'C_CustomerKey') as unsigned)", 'C_CustomerKey');
Query OK, 1 row affected (0.00 sec)
mysql> call flexviews.add_expr(@mvid, 'COLUMN',"cast(json_extract(doc,'S_SuppKey') as unsigned)", 'S_SuppKey');
Query OK, 1 row affected (0.01 sec)
mysql> call flexviews.add_expr(@mvid, 'COLUMN',"cast(json_extract(doc,'P_PartKey') as unsigned)", 'P_PartKey');
Query OK, 1 row affected (0.00 sec)
mysql> call flexviews.add_expr(@mvid, 'COLUMN',"cast(json_extract(doc,'LO_OrderKey') as unsigned)", 'LO_OrderKey');
Query OK, 1 row affected (0.00 sec)
mysql> call flexviews.add_expr(@mvid, 'COLUMN',"cast(json_extract(doc,'LO_LineNumber') as unsigned)", 'LO_LineNumber');
Query OK, 1 row affected (0.00 sec)

I’ve also projected out the ‘id’ column from the table, which is the primary key. This ties the index entries to the original row, so that the original document can be retrieved.

mysql> call flexviews.add_expr(@mvid, 'COLUMN', 'id', 'id');
Query OK, 1 row affected (0.00 sec)

Since we want to use the materialized view as an index, we need to index the columns we’ve added to it.

mysql> call flexviews.add_expr(@mvid, 'KEY',"LO_LineNumber", 'LO_LineNumber_Idx');
Query OK, 1 row affected (0.00 sec)
mysql> call flexviews.add_expr(@mvid, 'KEY',"LO_OrderKey", 'LO_OrderKey_Idx');
Query OK, 1 row affected (0.00 sec)
mysql> call flexviews.add_expr(@mvid, 'KEY',"P_PartKey", 'P_PartKey_Idx');
Query OK, 1 row affected (0.00 sec)
mysql> call flexviews.add_expr(@mvid, 'KEY',"S_SuppKey", 'S_SuppKey_Idx');
Query OK, 1 row affected (0.00 sec)
mysql> call flexviews.add_expr(@mvid, 'KEY',"D_DateKey", 'D_DateKey_Idx');
Query OK, 1 row affected (0.00 sec)
mysql> call flexviews.add_expr(@mvid, 'KEY',"C_CustomerKey", 'C_CustomerKey_Idx');
Query OK, 1 row affected (0.00 sec)

Finally, the view has to be created. There are 6 million rows in my table, the JSON functions are UDFs so they are not as fast as built-in functions, and I indexed a lot of things (six different indexes are being populated at once), so it takes some time to build the index:

mysql> call flexviews.enable(@mvid);
Query OK, 2 rows affected (35 min 53.17 sec)

After the materialized view is built, you can see it in the schema. Note there is also a delta table, which I will explain a bit later.

mysql> show tables;
+--------------------+
| Tables_in_ssb_json |
+--------------------+
| json               |
| json_idx           |
| json_idx_delta     |
+--------------------+
3 rows in set (0.00 sec)

Here is the table definition of json_idx, our materialized view. You can see it is indexed:

CREATE TABLE `json_idx` (
  `mview$pk` bigint(20) NOT NULL AUTO_INCREMENT,
  `D_DateKey` date DEFAULT NULL,
  `C_CustomerKey` bigint(21) unsigned DEFAULT NULL,
  `S_SuppKey` bigint(21) unsigned DEFAULT NULL,
  `P_PartKey` bigint(21) unsigned DEFAULT NULL,
  `LO_OrderKey` bigint(21) unsigned DEFAULT NULL,
  `LO_LineNumber` bigint(21) unsigned DEFAULT NULL,
  `id` bigint(20) NOT NULL DEFAULT '0',
  `mview$hash` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`mview$pk`),
  KEY `LO_LineNumber_Idx` (`LO_LineNumber`),
  KEY `LO_OrderKey_Idx` (`LO_OrderKey`),
  KEY `P_PartKey_Idx` (`P_PartKey`),
  KEY `S_SuppKey_Idx` (`S_SuppKey`),
  KEY `D_DateKey_Idx` (`D_DateKey`),
  KEY `C_CustomerKey_Idx` (`C_CustomerKey`),
  KEY `mview$hash_key` (`mview$hash`)
) ENGINE=InnoDB AUTO_INCREMENT=6029221 DEFAULT CHARSET=latin1;

Here are some sample contents. You can see the integer values extracted out of the JSON:

mysql> select * from json_idx limit 10;
+----------+------------+---------------+-----------+-----------+-------------+---------------+----+------------+
| mview$pk | D_DateKey  | C_CustomerKey | S_SuppKey | P_PartKey | LO_OrderKey | LO_LineNumber | id | mview$hash |
+----------+------------+---------------+-----------+-----------+-------------+---------------+----+------------+
|        1 | 1996-08-08 |          6748 |         1 |    178778 |       35620 |             2 |  1 | 3151656687 |
|        2 | 1994-05-20 |          5272 |         1 |     52362 |      102790 |             4 |  2 | 2181615425 |
|        3 | 1995-05-04 |         22870 |         1 |    136407 |      146757 |             3 |  3 |  544130577 |
|        4 | 1996-06-16 |         12217 |         1 |    129103 |      151200 |             1 |  4 | 2969697839 |
|        5 | 1992-07-20 |         21757 |         1 |     35243 |      151745 |             1 |  5 | 1438921571 |
|        6 | 1997-08-16 |         18760 |         1 |    150766 |      159232 |             6 |  6 | 3941775529 |
|        7 | 1994-03-04 |           757 |         1 |     15750 |      188902 |             3 |  7 | 2142628088 |
|        8 | 1993-11-04 |         17830 |         1 |    192023 |      253828 |             5 |  8 | 3480147565 |
|        9 | 1993-07-12 |         16933 |         1 |     59997 |      269062 |             5 |  9 | 3572286145 |
|       10 | 1998-06-16 |         26660 |         1 |     30023 |      298272 |             3 | 10 | 1971966244 |
+----------+------------+---------------+-----------+-----------+-------------+---------------+----+------------+
10 rows in set (0.00 sec)

Now, there needs to be an easy way to use this index in a select statement. Since a JOIN is needed between the materialized view and the base table, a regular VIEW makes sense to access the data. We’ll call this the index view:

mysql> create view json_idx_v as select * from json natural join json_idx;
Query OK, 0 rows affected (0.00 sec)

And just for completeness, here is the contents of a row from our new index view:

mysql> select * from json_idx_v limit 1G *************************** 1. row *************************** id: 1 doc: {"LO_OrderKey":"35620","LO_LineNumber":"2","LO_CustKey":"6748","LO_PartKey":"178778","LO_SuppKey":"1","LO_OrderDateKey":"19960808","LO_OrderPriority":"3-MEDIUM","LO_ShipPriority":"0","LO_Quantity":"38","LO_ExtendedPrice":"7055726","LO_OrdTotalPrice":"14691804","LO_Discount":"8","LO_Revenue":"6491267","LO_SupplyCost":"111406","LO_Tax":"1","LO_CommitDateKey":"19960909","LO_ShipMode":"REG AIR","C_CustomerKey":"6748","C_Name":"Customer#000006748","C_Address":"RSPYBRlR7RX6 f7J8","C_City":"PERU 5","C_Nation":"PERU","C_Region":"AMERICA","C_Phone":"27-580-967-4556","C_MktSegment":"AUTOMOBILE","S_SuppKey":"1","S_Name":"Supplier#000000001","S_Address":"sdrGnXCDRcfriBvY0KL,i","S_City":"PERU 0","S_Nation":"PERU","S_Region":"AMERICA","S_Phone":"27-989-741-2988","D_DateKey":"19960808","D_Date":"Augest 8, 1996","D_DayOfWeek":"Friday","D_Month":"Augest","D_Year":"1996","D_YearMonthNum":"199608","D_YearMonth":"Aug1996","D_DayNumInWeek":"6","D_DayNumInMonth":"8","D_DayNumInYear":"221","D_MonthNumInYear":"8","D_WeekNumInYear":"32","D_SellingSeason":"Summer","D_LastDayInWeekFl":"0","D_LastDayInMonthFl":"1","D_HolidayFl":"0","D_WeekDayFl":"1","P_PartKey":"178778","P_Name":"turquoise powder","P_MFGR":"MFGR#1","P_Category":"MFGR#11","P_Brand":"MFGR#1125","P_Colour":"beige","P_Type":"STANDARD POLISHED NICKEL","P_Size":"25","P_Container":"JUMBO BAG"} mview$pk: 1 D_DateKey: 1996-08-08 C_CustomerKey: 6748 S_SuppKey: 1 P_PartKey: 178778 LO_OrderKey: 35620 LO_LineNumber: 2 mview$hash: 3151656687 1 row in set (0.00 sec)

Using the UDF to find a document

The UDF does a full table scan, parsing all six million documents (TWICE!) as it goes along. Unsurprisingly, this is slow:

mysql> select * from json where json_extract(doc,'LO_OrderKey') = 35620 and json_extract(doc,'LO_LineNumber') = 2G *************************** 1. row *************************** id: 1 doc: {"LO_OrderKey":"35620","LO_LineNumber":"2","LO_CustKey":"6748","LO_PartKey":"178778","LO_SuppKey":"1","LO_OrderDateKey":"19960808","LO_OrderPriority":"3-MEDIUM","LO_ShipPriority":"0","LO_Quantity":"38","LO_ExtendedPrice":"7055726","LO_OrdTotalPrice":"14691804","LO_Discount":"8","LO_Revenue":"6491267","LO_SupplyCost":"111406","LO_Tax":"1","LO_CommitDateKey":"19960909","LO_ShipMode":"REG AIR","C_CustomerKey":"6748","C_Name":"Customer#000006748","C_Address":"RSPYBRlR7RX6 f7J8","C_City":"PERU 5","C_Nation":"PERU","C_Region":"AMERICA","C_Phone":"27-580-967-4556","C_MktSegment":"AUTOMOBILE","S_SuppKey":"1","S_Name":"Supplier#000000001","S_Address":"sdrGnXCDRcfriBvY0KL,i","S_City":"PERU 0","S_Nation":"PERU","S_Region":"AMERICA","S_Phone":"27-989-741-2988","D_DateKey":"19960808","D_Date":"Augest 8, 1996","D_DayOfWeek":"Friday","D_Month":"Augest","D_Year":"1996","D_YearMonthNum":"199608","D_YearMonth":"Aug1996","D_DayNumInWeek":"6","D_DayNumInMonth":"8","D_DayNumInYear":"221","D_MonthNumInYear":"8","D_WeekNumInYear":"32","D_SellingSeason":"Summer","D_LastDayInWeekFl":"0","D_LastDayInMonthFl":"1","D_HolidayFl":"0","D_WeekDayFl":"1","P_PartKey":"178778","P_Name":"turquoise powder","P_MFGR":"MFGR#1","P_Category":"MFGR#11","P_Brand":"MFGR#1125","P_Colour":"beige","P_Type":"STANDARD POLISHED NICKEL","P_Size":"25","P_Container":"JUMBO BAG"} 1 row in set (54.49 sec) mysql> explain select * from json where json_extract(doc,'LO_OrderKey') = 35620 and json_extract(doc,'LO_LineNumber') = 2G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: json type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 5233236 Extra: Using where 1 row in set (0.00 sec)

Using the index view to find a document

mysql> select id, doc from json_idx_v where LO_OrderKey = 35620 and LO_LineNumber = 2G *************************** 1. row *************************** id: 1 doc: {"LO_OrderKey":"35620","LO_LineNumber":"2","LO_CustKey":"6748","LO_PartKey":"178778","LO_SuppKey":"1","LO_OrderDateKey":"19960808","LO_OrderPriority":"3-MEDIUM","LO_ShipPriority":"0","LO_Quantity":"38","LO_ExtendedPrice":"7055726","LO_OrdTotalPrice":"14691804","LO_Discount":"8","LO_Revenue":"6491267","LO_SupplyCost":"111406","LO_Tax":"1","LO_CommitDateKey":"19960909","LO_ShipMode":"REG AIR","C_CustomerKey":"6748","C_Name":"Customer#000006748","C_Address":"RSPYBRlR7RX6 f7J8","C_City":"PERU 5","C_Nation":"PERU","C_Region":"AMERICA","C_Phone":"27-580-967-4556","C_MktSegment":"AUTOMOBILE","S_SuppKey":"1","S_Name":"Supplier#000000001","S_Address":"sdrGnXCDRcfriBvY0KL,i","S_City":"PERU 0","S_Nation":"PERU","S_Region":"AMERICA","S_Phone":"27-989-741-2988","D_DateKey":"19960808","D_Date":"Augest 8, 1996","D_DayOfWeek":"Friday","D_Month":"Augest","D_Year":"1996","D_YearMonthNum":"199608","D_YearMonth":"Aug1996","D_DayNumInWeek":"6","D_DayNumInMonth":"8","D_DayNumInYear":"221","D_MonthNumInYear":"8","D_WeekNumInYear":"32","D_SellingSeason":"Summer","D_LastDayInWeekFl":"0","D_LastDayInMonthFl":"1","D_HolidayFl":"0","D_WeekDayFl":"1","P_PartKey":"178778","P_Name":"turquoise powder","P_MFGR":"MFGR#1","P_Category":"MFGR#11","P_Brand":"MFGR#1125","P_Colour":"beige","P_Type":"STANDARD POLISHED NICKEL","P_Size":"25","P_Container":"JUMBO BAG"} 1 row in set (0.00 sec) mysql> explain select id, doc from json_idx_v where LO_OrderKey = 35620 and LO_LineNumber = 2G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: json_idx type: ref possible_keys: LO_LineNumber_Idx,LO_OrderKey_Idx key: LO_OrderKey_Idx key_len: 9 ref: const rows: 4 Extra: Using index condition; Using where *************************** 2. row *************************** id: 1 select_type: SIMPLE table: json type: eq_ref possible_keys: PRIMARY key: PRIMARY key_len: 8 ref: ssb_json.json_idx.id rows: 1 Extra: NULL 2 rows in set (0.00 sec)

Keeping the index in sync

Flexviews materialized views need to be refreshed when the underlying table changes. Flexviews includes a REFRESH($mvid, $mode, $transaction_id) function.

I am going to remove one document from the table:

mysql> delete from json where id = 10000;
Query OK, 1 row affected (0.01 sec)

Note there is now one row in the materialized view change log. dml_type is -1 because it is a delete:

mysql> select * from flexviews.mvlog_f1673fac9814a93508a1c917566ecd4dG *************************** 1. row *************************** dml_type: -1 uow_id: 113 fv$server_id: 33 fv$gsn: 1083 id: 10000 doc: {"LO_OrderKey":"3359521","LO_LineNumber":"2","LO_CustKey":"10306","LO_PartKey":"77997","LO_SuppKey":"4","LO_OrderDateKey":"19951010","LO_OrderPriority":"2-HIGH","LO_ShipPriority":"0","LO_Quantity":"43","LO_ExtendedPrice":"8492457","LO_OrdTotalPrice":"27032802","LO_Discount":"2","LO_Revenue":"8322607","LO_SupplyCost":"118499","LO_Tax":"4","LO_CommitDateKey":"19951228","LO_ShipMode":"FOB","C_CustomerKey":"10306","C_Name":"Customer#000010306","C_Address":"4UR9tz8","C_City":"ROMANIA 5","C_Nation":"ROMANIA","C_Region":"EUROPE","C_Phone":"29-361-986-3513","C_MktSegment":"BUILDING","S_SuppKey":"4","S_Name":"Supplier#000000004","S_Address":"qGTQJXogS83a7MB","S_City":"MOROCCO 4","S_Nation":"MOROCCO","S_Region":"AFRICA","S_Phone":"25-128-190-5944","D_DateKey":"19951010","D_Date":"Octorber 10, 1995","D_DayOfWeek":"Wednesday","D_Month":"Octorber","D_Year":"1995","D_YearMonthNum":"199510","D_YearMonth":"Oct1995","D_DayNumInWeek":"4","D_DayNumInMonth":"10","D_DayNumInYear":"283","D_MonthNumInYear":"10","D_WeekNumInYear":"41","D_SellingSeason":"Fall","D_LastDayInWeekFl":"0","D_LastDayInMonthFl":"1","D_HolidayFl":"0","D_WeekDayFl":"1","P_PartKey":"77997","P_Name":"burnished olive","P_MFGR":"MFGR#2","P_Category":"MFGR#24","P_Brand":"MFGR#2426","P_Colour":"orchid","P_Type":"MEDIUM PLATED TIN","P_Size":"16","P_Container":"WRAP PKG"} 1 row in set (0.01 sec)

Now we can verify the materialized view is out of date:

mysql> select * from json_idx where id = 10000;
+----------+------------+---------------+-----------+-----------+-------------+---------------+-------+------------+
| mview$pk | D_DateKey  | C_CustomerKey | S_SuppKey | P_PartKey | LO_OrderKey | LO_LineNumber | id    | mview$hash |
+----------+------------+---------------+-----------+-----------+-------------+---------------+-------+------------+
|    10000 | 1995-10-10 |         10306 |         4 |     77997 |     3359521 |             2 | 10000 | 2937185172 |
+----------+------------+---------------+-----------+-----------+-------------+---------------+-------+------------+
1 row in set (2.60 sec)

To bring the index up to date we must refresh it. Usually you will use the ‘BOTH’ mode to ‘COMPUTE’ and ‘APPLY’ the changes at the same time, but I am going to use COMPUTE mode to show you what ends up in the delta table:
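
The compute-only pass uses the same refresh call shown further below, just with 'COMPUTE' as the mode:

mysql> call flexviews.refresh(flexviews.get_id('ssb_json','json_idx'), 'COMPUTE', NULL);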

mysql> select * from json_idx_delta\G
*************************** 1. row ***************************
     dml_type: -1
       uow_id: 113
       fv$gsn: 1083
    D_DateKey: 1995-10-10
C_CustomerKey: 10306
    S_SuppKey: 4
    P_PartKey: 77997
  LO_OrderKey: 3359521
LO_LineNumber: 2
           id: 10000
   mview$hash: 2937185172
1 row in set (0.00 sec)

Delta tables are similar to materialized view change log tables, except they contain insertions and deletions to the view contents. In this case, you can see dml_type is -1 and id = 10000, so the row in the view corresponding to the row we deleted will be deleted when the change is applied.

Finally the change can be applied:

mysql> call flexviews.refresh(flexviews.get_id('ssb_json','json_idx'), 'APPLY',NULL);
Query OK, 2 rows affected (0.47 sec)
mysql> select * from json_idx where id = 10000; -- note, we didn't index id in the MV
Empty set (2.61 sec)

Finally, it makes sense to try to keep the index in sync as quickly as possible using a MySQL event:

DELIMITER ;;
CREATE EVENT IF NOT EXISTS flexviews.refresh_json_idx
ON SCHEDULE EVERY 1 SECOND
DO
BEGIN
  DECLARE v_got_lock tinyint default 0;
  SELECT GET_LOCK('JSON_IDX_LOCK', 0) INTO v_got_lock;
  IF v_got_lock = 1 THEN
    CALL flexviews.refresh(flexviews.get_id('ssb_json','json_idx'),'BOTH',NULL);
    SELECT RELEASE_LOCK('JSON_IDX_LOCK') INTO @discard;
  END IF;
END;;
DELIMITER ;

So there you have it. A way to index and quickly search through JSON documents and keep the index in sync automatically.

The post ‘Indexing’ JSON documents for efficient MySQL queries over JSON data appeared first on MySQL Performance Blog.

Percona XtraDB Cluster 5.6: a tale of 2 GTIDs

February 13, 2015 - 6:27am

Say you have a cluster with 3 nodes using Percona XtraDB Cluster (PXC) 5.6 and one asynchronous replica connected to node1. If asynchronous replication is using GTIDs, moving the replica so that it is connected to node2 is trivial, right? Actually replication can easily break for reasons that may not be obvious at first sight.

Summary

Let’s assume we have the following setup with 3 PXC nodes and one asynchronous replica:


Regarding MySQL GTIDs, a Galera cluster behaves like a distributed master: transactions coming from any node will use the same auto-generated uuid. This auto-generated uuid is related to the Galera uuid; it is neither ABC, nor DEF, nor GHI (the server uuids of the individual nodes).

Transactions executed for instance on node1 but not replicated to all nodes with Galera replication will generate a GTID using the uuid of the node (ABC). This can happen for writes on MyISAM tables if wsrep_replicate_myisam is not enabled.

Such local transactions bring the same potential issues as errant transactions do for a regular master-slave setup using GTID-based replication: if node3 has a local transaction, when you connect replica1 to it, replication may break instantly.

So do not assume that moving replica1 from node2 to node3 is a safe operation if you don’t check errant transactions on node3 first.
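
Checking for errant transactions on node3 is straightforward: any GTID set in its gtid_executed that carries node3’s own server uuid instead of the shared cluster uuid is a local transaction. A rough sketch, where the quoted sets are placeholders to be filled in from each server:

node3> SELECT @@server_uuid;
node3> SELECT @@GLOBAL.gtid_executed;
-- transactions node3 has executed but replica1 has not; anything tagged with node3's own uuid is errant
mysql> SELECT GTID_SUBTRACT('<node3 gtid_executed>', '<replica1 gtid_executed>');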

And if you find errant transactions that you don’t want to get replicated to replica1, there is only one good fix: insert a corresponding empty transaction on replica1.
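
Injecting an empty transaction boils down to setting GTID_NEXT to the errant GTID and committing an empty transaction on replica1 (the uuid:N value below is a placeholder for the errant GTID found earlier):

replica1> SET GTID_NEXT='xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx:N';
replica1> BEGIN;
replica1> COMMIT;
replica1> SET GTID_NEXT='AUTOMATIC';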

Galera GTID vs MySQL GTID

Both kinds of GTIDs use the same format: <source_id>:<trx_number>.

For Galera, <source_id> is generated when the cluster is bootstrapped. This <source_id> is shared by all nodes.

For MySQL, <source_id> is the server uuid. So it is easy to identify from which server a transaction originates.

Knowing the Galera GTID of a transaction will give you no clue about the corresponding MySQL GTID of the same transaction, and vice versa. You should simply consider them as separate identifiers.

MySQL GTID generation when writing to the cluster

What can be surprising is that writing to node1 will generate a MySQL GTID where <source_id> is not the server uuid:

node1> select @@server_uuid;
+--------------------------------------+
| @@server_uuid                        |
+--------------------------------------+
| 03c236a0-f860-11e3-9b80-9cebe8067a3f |
+--------------------------------------+
node1> select @@global.gtid_executed;
+------------------------------------------+
| @@global.gtid_executed                   |
+------------------------------------------+
| b1f3f33a-0789-ee1c-43f3-f8373e12f1ea:1   |
+------------------------------------------+

Even more surprising is that if you write to node2, you will see a single GTID set as if both transactions had been executed on the same server:

node2> select @@global.gtid_executed;
+------------------------------------------+
| @@global.gtid_executed                   |
+------------------------------------------+
| b1f3f33a-0789-ee1c-43f3-f8373e12f1ea:2   |
+------------------------------------------+

Actually this is reasonable: the cluster acts as a distributed master regarding MySQL replication, so it makes sense that all nodes share the same <source_id>.

And by the way, if you are puzzled about how this ‘anonymous’ <source_id> is generated, look at this:

mysql> show global status like 'wsrep_local_state_uuid';
+------------------------+--------------------------------------+
| Variable_name          | Value                                |
+------------------------+--------------------------------------+
| wsrep_local_state_uuid | 4e0c0cc5-f876-11e3-bc0c-07c8c1ed0e15 |
+------------------------+--------------------------------------+
node1> select @@global.gtid_executed;
+------------------------------------------+
| @@global.gtid_executed                   |
+------------------------------------------+
| b1f3f33a-0789-ee1c-43f3-f8373e12f1ea:1   |
+------------------------------------------+

If you ‘sum’ both <source_id>, you will get ffffffff-ffff-ffff-ffff-ffffffffffff.

How can local transactions show up?

Now the question is: given that any transaction executed on any node of the cluster is replicated to all nodes, how can a local transaction (a transaction only found on one node) appear?

The most common reason is probably a write on a MyISAM table if wsrep_replicate_myisam is not enabled, simply because writes on MyISAM tables are not replicated by Galera by default:

# Node1
mysql> insert into myisam_table (id) values (1);
mysql> select @@global.gtid_executed;
+-----------------------------------------------------------------------------------+
| @@global.gtid_executed                                                             |
+-----------------------------------------------------------------------------------+
| 03c236a0-f860-11e3-9b80-9cebe8067a3f:1, b1f3f33a-0789-ee1c-43f3-f8373e12f1ea:1-8   |
+-----------------------------------------------------------------------------------+
# Node2
mysql> select @@global.gtid_executed;
+------------------------------------------+
| @@global.gtid_executed                   |
+------------------------------------------+
| b1f3f33a-0789-ee1c-43f3-f8373e12f1ea:1-8 |
+------------------------------------------+

As you can see the GTID of the local transaction on node1 uses the uuid of node1, which makes it easy to spot.

Are there other statements that can create a local transaction? Actually we found that this is also true for FLUSH PRIVILEGES, which is tracked in the binlogs but not replicated by Galera. See the bug report.

As you probably know, these local transactions can hurt when you connect an async replica on a node with a local transaction: the replication protocol used by GTID-based replication will make sure that the local transaction will be fetched from the binlogs and executed on the async slave. But of course, if the transaction is no longer in the binlogs, that triggers a replication error!

Conclusion

I will repeat here what I always say about MySQL GTIDs: it is a great feature but replication works a bit differently from regular position-based replication. So make sure you understand the main differences otherwise it is quite easy to be confused.

The post Percona XtraDB Cluster 5.6: a tale of 2 GTIDs appeared first on MySQL Performance Blog.

Online GTID rollout now available in Percona Server 5.6

February 10, 2015 - 12:00am

Global Transaction IDs (GTIDs) are one of my favorite features of MySQL 5.6. The main limitation is that you must stop all the servers at the same time to enable GTID-based replication. Not everyone can afford to take a downtime, so this requirement has been a showstopper for many people. Starting with Percona Server 5.6.22-72.0, enabling GTID replication can be done with almost no downtime. Let’s see how to do it.

Implementation of the Facebook patch

Finding a solution to migrate to GTIDs with no downtime is not a new idea, and several companies have already developed their own patch. The two best-known implementations are the one from Facebook and the one from Booking.com.

Both options have pros and cons, and we finally chose to port the Facebook patch and add a new setting (gtid_deployment_step).

Performing the migration

Let’s assume we have a master-slaves setup with 4 servers A, B, C and D. A is the master:


The 1st step is to take each slave out of rotation, one at a time, and set gtid_mode = ON and gtid_deployment_step = ON (and also log_bin, log_slave_updates and enforce_gtid_consistency).
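
On each slave, this step translates into a my.cnf fragment along these lines (a sketch only; exact values and any existing settings will vary):

[mysqld]
log_bin
log_slave_updates
enforce_gtid_consistency = ON
gtid_mode                = ON
gtid_deployment_step     = ON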


gtid_deployment_step = ON means that a server will not generate GTIDs when it executes writes, but it will record a GTID in its binary log if it gets an event from the replication stream tagged with a GTID.

The 2nd step is to promote one of the slaves to become the new master (for instance C) and to disable gtid_deployment_step. It is a regular slave promotion so you should do it the same way you deal with planned slave promotions (for instance using MHA or your own scripts). Our patch doesn’t help you do this promotion.

At this point replication will break on the old master as it has gtid_mode = OFF and gtid_deployment_step = OFF.


Don’t forget that you need to use CHANGE MASTER TO MASTER_AUTO_POSITION = 1 to enable GTID-based replication.

The 3rd step is to restart the old master to set gtid_mode = ON. Replication will resume automatically, but don’t forget to set MASTER_AUTO_POSITION = 1.
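
For reference, repointing a slave with auto-positioning looks roughly like this (host and credentials are placeholders):

mysql> STOP SLAVE;
mysql> CHANGE MASTER TO MASTER_HOST='nodeC', MASTER_USER='repl', MASTER_PASSWORD='repl_password', MASTER_AUTO_POSITION = 1;
mysql> START SLAVE;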


The final step is to disable gtid_deployment_step on all slaves. This can be done dynamically:

mysql> SET GLOBAL gtid_deployment_step = OFF;

and you should remove the setting from the my.cnf file so that it is not set again when the server is restarted.

Optionally, you can promote the old master back to its original role.

That’s it, GTID replication is now available without having restarted all servers at the same time!

Limitations

At some point during the migration, a slave promotion is needed. And at this point, you are still using position-based replication. The patch will not help you with this promotion so use your regular failover scripts. If you have no scripts to deal with that kind of situation, make sure you know how to proceed.

Also be aware that this patch provides a way to migrate to GTIDs with no downtime, but not a way to migrate away from GTIDs with no downtime. So test carefully and make sure you understand all the new stuff that comes with GTIDs, like the new replication protocol, or how to skip transactions.

Other topologies

If you are using master-master replication or multiple tier replication, you can follow the same steps. With multiple tier replication, simply start by setting gtid_mode = ON and gtid_deployment_step = ON for the leaves first.

Conclusion

If you’re interested in the benefits of GTID replication but taking a downtime has always scared you, you should definitely download the latest Percona Server 5.6 and give it a try!

The post Online GTID rollout now available in Percona Server 5.6 appeared first on MySQL Performance Blog.

Percona Server 5.6.22-72.0 is now available

February 6, 2015 - 9:04am

Percona is glad to announce the release of Percona Server 5.6.22-72.0 on February 6, 2015. Download the latest version from the Percona web site or from the Percona Software Repositories.

Based on MySQL 5.6.22, including all the bug fixes in it, Percona Server 5.6.22-72.0 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – and this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release can be found in the 5.6.22-72.0 milestone on Launchpad.

New Features:

  • Percona Server is now able to log the query response times into separate READ and WRITE INFORMATION_SCHEMA tables. Two new INFORMATION_SCHEMA tables, QUERY_RESPONSE_TIME_READ and QUERY_RESPONSE_TIME_WRITE, have been implemented for READ and WRITE queries respectively (a usage sketch follows this list).
  • Percona Server now supports Online GTID deployment. This enables GTID to be deployed on existing replication setups without making the master read-only and stopping all the slaves. This feature was ported from the Facebook branch.
  • New ps_tokudb_admin script has been implemented to make the TokuDB storage engine installation easier.
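
A hedged usage sketch for the new response time tables, assuming the response time distribution plugins are installed and that collection is still controlled by the query_response_time_stats variable as in previous releases:

mysql> SET GLOBAL query_response_time_stats = ON;
mysql> SELECT * FROM INFORMATION_SCHEMA.QUERY_RESPONSE_TIME_READ;
mysql> SELECT * FROM INFORMATION_SCHEMA.QUERY_RESPONSE_TIME_WRITE;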

Bugs Fixed:

  • SET STATEMENT ... FOR <statement> would crash the server if it could not execute the statement: 1) if the statement was Read-Write in a Read-Only transaction (bug #1387951), 2) if the statement needed to re-open an already open temporary table and would fail to do so (bug #1412423), 3) if the statement needed to commit the ongoing transaction implicitly and the implicit commit would fail (bug #1418049).
  • TokuDB storage engine would fail to load after the upgrade on CentOS 5 and 6. Bug fixed #1413956.
  • Fixed a potential low-probability crash in XtraDB linear read-ahead code. Bug fixed #1417953.
  • Setting the max_statement_time per query had no effect. Bug fixed #1376934.

Other bugs fixed: #1407941 and #1415843 (upstream #75642).

Release notes for Percona Server 5.6.22-72.0 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

The post Percona Server 5.6.22-72.0 is now available appeared first on MySQL Performance Blog.

The future of MySQL quality assurance: Introducing pquery

February 4, 2015 - 12:00am

Being a QA Engineer, how would you feel if you had access to a framework which can generate 80+ crashes – a mix of hitting developer-introduced assertions (situations that should not happen) and serious unforeseen binary crashes – for the world’s most popular open source database software – each and every hour? What if you could do this running on a medium-spec machine – even a laptop?

The seniors amongst you may object “But… generating a crash or assertion is one thing – creating a repeatable testcase for the same is quite another.”

Introducing pquery, mtr_to_sql, reducer.sh (the pquery-enabled version), and more:

  • 80+ coredumps per hour
  • Fully automatic testcase creation
  • Near-100% testcase reproducibility
  • C++ core
  • 15 seconds run time per trial
  • Up to 20-25k lines of SQL executed per trial
  • CLI testcases
  • Compatible with sporadic issues
  • High-end automation of many aspects

It all started when we realized how slow RQG really is. The Random Query Generator by Philip Stoev is a fantastic tool, and it has been greatly expanded over the years, but though it is a Swiss army knife in what you can do with it, it is not the fastest when it comes to speed. The Perl backend – combined with much randomness – has made the system slow, at least when compared with the ultra-fast execution of C++ code.

I discussed things with Alexey Bychko, and before long pquery was born. The C++ core code of pquery is Alexey’s creation. It easily executes 20k random lines of SQL in 15-20 seconds, with full logging (including errors) turned on. Though the tool looks fairly simple when reviewing the code, it is incredibly powerful.

Now, one thing which people being introduced to QA for MySQL (or any other large software with many features/functions/options etc.) have to grasp is “random spread testing”. If your random spread (and the amount of samples) is large enough (read: ‘sufficient’), it is relatively easy to get a good “overall quality” estimate of your software by doing a few runs (taking usually around 1-3 days – or longer if various options are being tested).

So, we now had speed (pquery) and near-perfect testcase simplification/reduction & reproducibility (the new pquery adaptation of reducer.sh) – but we needed one more thing: SQL which would cover… well… every area of mysqld. A fast framework without matching grammars is not worth much…

Converting the grammars from RQG seemed like a mammoth task – and we would really just be re-writing RQG in C. And creating a new SQL generator was an almost hopeless venture (I gave it a short try) – given the huge variety and complexity when constructing SQL statements.

I took the cheeky road. And it paid off. mtr_to_sql.sh was born.

The MTR testcases included (and delivered) with the MySQL server download contain individual tests for nearly every possible SQL syntax, including ones that – ought to – error out (but not crash). Not only that, there are specific MTR testcases for each feature, not to mention the many MTR testcases that were added to cover bug regression testing. Where there is smoke…

[roel@localhost 5.6]$ pwd
/bzr/5.6
[roel@localhost 5.6]$ find . | grep ".test$" | wc -l
3867
[roel@localhost 5.6]$ cat ~/percona-qa/mtr_to_sql.sh | grep BZR_PATH | head -n1
BZR_PATH="/bzr/5.6/mysql-test"
[roel@localhost 5.6]$ time ~/percona-qa/mtr_to_sql.sh
Done! Generated /tmp/mtr_to_sql.yy for RQG, and /tmp/mtr_to_sql.sql for pquery (SQL is indentical in both, but first is formatted for use with RQG)
real    0m20.150s
user    1m2.951s
sys     0m1.214s
[roel@localhost 5.6]$ cat /tmp/mtr_to_sql.sql | wc -l
107541

107.5K of high-quality SQL. Covering every possible functionality and buggy area out there. Free.

Let the testing begin!

I was quite dumbfounded when (after further tuning and scripting) we started seeing 50+, then 80+ cores per hour: Sig11’s (crashes), Sig6’s (asserts), server hangs, character set issues, error log errors and so on. Many crashes and issues in optimized code. Fun.

Our best weeks yet?

Pquery update: 220+ bugs logged, of which 140+ in MySQL, 60+ in Percona, 10+ in TokuTek (with limited TokuTek testing) #mysql #percona #qa

— Roel Van de Paar (@RoelVandePaar) November 21, 2014

Last week @rameshvs02 & @RoelVandePaar logged 70+ bugs. Today reducer.sh was updated w/ pquery functionality. Testcases on their way! #mysql

— Roel Van de Paar (@RoelVandePaar) October 27, 2014

All of the above can be done on commodity hardware, running a single server, running single-threaded SQL (single client) and with no special mysqld options activated.

Compare this to RQG. Even with combinations.pl running hundreds if not thousands of mysqld option combinations, and with nearly-everything-is-random-sql, it still comes nowhere near even one tenth of that number/speed. And this is even when using high-end hardware, 8 simultaneous mysqld servers, up to 20-25 client threads and at times special grammar-mix tools like MaxiGen etc.

In preparation for the Twitter week mentioned above we started running 4-5 simultaneous pquery runs (5x mysqld, still all single-threaded; a single client per mysqld) in different shell screen sessions, controlled by cron jobs.

A whole set of automation scripts was quickly added to handle the huge influx of bugs (you can get them all for free (GPLv2) at $ bzr branch lp:percona-qa – see the pquery*.sh files), and now you can quickly review a list of issues pquery discovered. For writing this article, I started a run and in its first hour it found exactly 85 crashes. Here is a report from around ~2h:

[roel@localhost 830147]$ ~/percona-qa/pquery-results.sh ================ Sorted unique issue strings (Approx 439 trials executed, 167 remaining reducer scripts) btr0btr.ic line 143 (Seen 31 times: reducers 123,124,135,150,159,164,16,173,175,178,179,18,196,199,224,22,238,245,286,310,319,324,366,388,38,401,67,73,78,88,9) btr0btr.ic line 169 (Seen 1 times: reducers 158) btr0cur.cc line 769 (Seen 1 times: reducers 304) buf0buf.cc line 2738 (Seen 2 times: reducers 113,257) fsp0fsp.cc line 1899 (Seen 5 times: reducers 145,174,409,69,85) . is_set (Seen 32 times: reducers 112,165,170,192,203,218,231,249,24,253,259,273,278,280,289,329,331,333,336,338,363,371,373,379,384,398,404,44,47,6,72,82) .length % 4 (Seen 4 times: reducers 169,220,307,80) m_lock .= __null .... thd->mdl_context.is_lock_owner.m_namespace, ...., ...., MDL_SHARED(Seen 3 times: reducers 297,403,86) row0quiesce.cc line 683 (Seen 1 times: reducers 97) row0umod.cc line 338 (Seen 1 times: reducers 357) .slen % 2 (Seen 21 times: reducers 106,122,131,144,221,250,251,252,275,282,296,316,318,32,359,375,39,405,407,43,46) .slen % 4 (Seen 5 times: reducers 103,382,76,7,81) sort_field->length >= length (Seen 1 times: reducers 138) timer == __null (Seen 36 times: reducers 133,139,149,160,161,181,183,184,185,20,212,227,229,234,244,260,266,274,292,294,295,298,301,308,326,327,330,343,346,364,367,400,48,50,59,99) .tlen % 2 (Seen 8 times: reducers 117,119,200,205,213,217,285,35) .tlen % 4 (Seen 3 times: reducers 25,355,365) trx0roll.cc line 264 (Seen 1 times: reducers 40) Z21mysql_execute_commandP3THD (Seen 4 times: reducers 182,237,291,393) ZN13Bounded_queueIhhE4pushEPh (Seen 3 times: reducers 101,118,214) ZN8Protocol13end_statementEv (Seen 4 times: reducers 211,410,42,61) ================

For these (standard by now) pquery runs, we use pquery-run.sh. It starts a server, executes and monitors the pquery binary, and then checks on the outcome:

[roel@localhost percona-qa]$ ./pquery-run.sh [07:23:21] [0] Workdir: /sda/459689 | Rundir: /dev/shm/459689 | Basedir: /sda/Percona-Server-5.6.21-rel69.0-687.Linux.x86_64-debug [07:23:21] [0] mysqld Start Timeout: 60 | Client Threads: 1 | Queries/Thread: 25000 | Trials: 100000 | Save coredump trials only: TRUE [07:23:21] [0] Pquery timeout: 15 | SQL file used: /home/roel/percona-qa/pquery/main-new.sql [07:23:21] [0] MYSAFE: --maximum-bulk_insert_buffer_size=1M --maximum-join_buffer_size=1M --maximum-max_heap_table_size=1M --maximum-max_join_size=1M --maximum-myisam_max_sort_file_size=1M --maximum-myisam_mmap_size=1M --maximum-myisam_sort_buffer_size=1M --maximum-optimizer_trace_max_mem_size=1M --maximum-preload_buffer_size=1M --maximum-query_alloc_block_size=1M --maximum-query_prealloc_size=1M --maximum-range_alloc_block_size=1M --maximum-read_buffer_size=1M --maximum-read_rnd_buffer_size=1M --maximum-sort_buffer_size=1M --maximum-tmp_table_size=1M --maximum-transaction_alloc_block_size=1M --maximum-transaction_prealloc_size=1M --log-output=none --sql_mode=ONLY_FULL_GROUP_BY [07:23:21] [0] Archiving a copy of this script (/home/roel/percona-qa/pquery-run.sh) in the workdir (/sda/459689) for later reference... [07:23:21] [0] Archiving a copy of the infile used for this run (/home/roel/percona-qa/pquery/main-new.sql) in the workdir (/sda/459689) for later reference... [07:23:21] [0] Generating datadir template (using mysql_install_db)... [07:23:29] [0] Making a copy of mysqld used to /sda/459689/mysqld (handy for core file analysis and manual bundle creation)... [07:23:29] [0] Storing a copy of ldd files for mysqld in same directory also... [07:23:29] [0] Starting pquery testing iterations... [07:23:29] [0] ====== TRIAL #1 ====== [07:23:29] [0] Ensuring there are no relevant servers running... [07:23:29] [0] Clearing rundir... [07:23:29] [0] Generating new workdir /dev/shm/459689/1... [07:23:29] [0] Copying datadir from template... [07:23:29] [0] Starting mysqld. Error log is stored at /dev/shm/459689/1/log/master.err [07:23:29] [0] Waiting for mysqld (pid: 10879) to fully start... [07:23:31] [0] Server started ok. Client: /sda/Percona-Server-5.6.21-rel69.0-687.Linux.x86_64-debug/bin/mysql -uroot -S/dev/shm/459689/1/socket.sock [07:23:31] [0] Starting pquery (log stored in /dev/shm/459689/1/pquery.log)... [07:23:31] [0] pquery running... ./pquery-run.sh: line 150: 10879 Aborted (core dumped) $CMD > ${RUNDIR}/${TRIAL}/log/master.err 2>&1 <---- success! [07:23:32] [0] Cleaning up... [07:23:36] [0] pquery summary: 578/769 queries failed (24.84% were successful) [07:23:36] [0] mysqld core detected at /dev/shm/459689/1/data/core.10879.1000.1000.6.1414700611.mysqld [07:23:36] [1] Copying rundir from /dev/shm/459689/1 to /sda/459689/1 [07:23:36] [1] ====== TRIAL #2 ====== [...]

(The MYSAFE settings are some generic settings specifically suited for QA testing – kindly provided by Shane Bester.)

Within the scripts many QA goodies are already built in: automated gdb query extraction from the core and the error log (each added 3x to the end of the SQL trace to ensure maximum reproducibility), unique issue classification using bug-relevant strings, /dev/shm execution for optimal speed, etc. – it is all based on and built from many years of mysqld QA experience.

If you can’t wait to spin off some I-crashed-mysqld (it’s easy you know…) bugs of your own, download the tools at lp:percona-qa ($bzr branch lp:percona-qa) and checkout the many pquery* scripts. Some shown in use above.

And, when you’re ready for slightly more serious feature testing, add whatever statements (plus matching mysqld options/parameters) your feature uses to the SQL file (or even replace it), plus the relevant mysqld options to pquery-run.sh (the MYEXTRA string). You can use sql-interleave.sh/sql-interleave-switch.sh, available in the pquery/ directory of the percona-qa branch mentioned above, to interleave new SQL into the main SQL file(s). Soon you will see fireworks.

Enjoy the show!

The post The future of MySQL quality assurance: Introducing pquery appeared first on MySQL Performance Blog.

Percona Live 2015 Lightning Talks, BoF submission deadline Feb. 13! And introducing “MySQL 101” program

February 3, 2015 - 5:00am

It’s hard to believe that the Percona Live MySQL Conference and Expo is just over two months away (April 13-16 in Santa Clara, California). So if you’ve been thinking about submitting a proposal for the popular “Lightning Talks” and/or “Birds of a Feather” sessions, it’s time to get moving because the deadline to do so is February 13.

Lightning Talks provide an opportunity for attendees to propose, explain, exhort, or rant on any MySQL-related topic for five minutes. Topics might include a new idea, successful project, cautionary story, quick tip, or demonstration. All submissions will be reviewed, and the top 10 will be selected to present during the one-hour Lightning Talks session on Wednesday (April 15) during the Community Networking Reception. Lighthearted, fun or otherwise entertaining submissions are highly welcome. Submit your proposal here.

“MySQL 101” is coming to Percona Live 2015.

Birds of a Feather (BoF) sessions enable attendees with interests in the same project or topic to enjoy some quality face time. BoFs can be organized for individual projects or broader topics (e.g., best practices, open data, standards). Any attendee or conference speaker can propose and moderate an engaging BoF. Percona will post the selected topics and moderators online and provide a meeting space and time. The BoF sessions will be held Tuesday night (April 14) from 6-7 p.m. Submit your BoF proposal here.

This year we’re also adding a new program for MySQL “newbies.” It’s called “MySQL 101,” and the motto of this special two-day program is: “You send us developers and admins, and we will send you back MySQL DBAs.” The two days of practical training will include everything they need to know to handle day-to-day MySQL DBA tasks.

“MySQL 101,” which is not included in regular Percona Live registration, will cost $400. However, the first 101 tickets are just $101 if you use the promo code “101” during checkout.

New: 25-Minute Sessions
On the first day of the conference, Percona is now offering 25-minute talks that pack tons of great information into a shorter format to allow for a wider range of topics. The 25-minute sessions include:

I also wanted to give another shout-out to Percona Live 2015’s awesome sponsors, which include: VMware, Yahoo, Deep Information Sciences, Pythian, Codership, Machine Zone, Box, Yelp, MariaDB, SpringbokSQL, Tesora, BlackMesh, SolidFire, Severalnines, Tokutek, VividCortex, FoundationDB, ScaleArc, Walmart eCommerce and more. (Sponsorship opportunities are still available.)

The great thing about Percona Live conferences is that there is something for everyone within the MySQL ecosystem – veterans and newcomers alike. And for the first time this year, that community expands to encompass OpenStack. Percona Live attendees can also attend OpenStack Live events. Those events run April 13-14, also at the Hyatt Regency Santa Clara and Santa Clara Convention Center.

OpenStack Live 2015’s awesome sponsors include: PMC Sierra and Nimble Storage!

With so much to offer this year, this is why there are several more options in terms of tickets. Click the image below for a detailed view of what access each ticket type provides.

Register here for the Percona Live MySQL Conference and Expo.

Register here for the OpenStack Live Conference and Expo.

For full conference schedule details please visit the Percona Live MySQL Conference and Expo website and the OpenStack Live Conference Website!

I hope to see you in Santa Clara in a couple months!

 

The post Percona Live 2015 Lightning Talks, BoF submission deadline Feb. 13! And introducing “MySQL 101” program appeared first on MySQL Performance Blog.

Faster fingerprints and Go packages for MySQL

February 2, 2015 - 7:31am

I’m happy to announce Go packages for MySQL. Particularly exciting is a new query fingerprint function which is very fast and efficient, but I’ll talk about that later. First, go-mysql is currently three simple Go packages for parsing and aggregating MySQL slow logs. If you’ve been following Percona development, you’ve no doubt heard of Percona Cloud Tools (PCT), a somewhat new performance management web service for MySQL.

One tool in PCT is “Query Analytics” which continuously analyzes query metrics from the slow log. The slow log provides the most metrics and therefore the most performance insight into MySQL. percona-agent, the open-source agent for PCT, uses go-mysql to parse and analyze the slow log, so the code has both heavy formal testing and heavy real-world testing. If you’re working with Go, MySQL, and MySQL slow logs, we invite you to try go-mysql.

Last October we implemented a completely new query fingerprint function. (See “Fingerprints” in the pt-query-digest doc for a background on query fingerprints.) Since mysqldumpslow, the very first slow log parser circa 2000, fingerprints have been accomplished with regular expressions. This approach is normally fine, but percona-agent needs to be faster and more efficient than normal to reduce the cost of observation. Regex patterns are like little state machines. One regex can be very fast, but several are required to produce a good fingerprint. Therefore, the regex approach requires processing the same query several times to produce a fingerprint. Even worse: a regex can backtrack, which means a single logical pass through the query can result in several physical passes. In short: regular expressions are a quick and easy solution, but they are very inefficient.
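
For readers unfamiliar with fingerprints, here is a small illustration (the table and values are made up, and the exact normalization rules may differ slightly from what the new function produces):

-- two literal queries...
SELECT * FROM orders WHERE customer_id = 42 AND status IN ('new', 'paid');
select * from orders where customer_id=7 and status in ('shipped');
-- ...both reduce to the same fingerprint:
-- select * from orders where customer_id = ? and status in(?+)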

Several years ago, a former colleague suggested a different approach: a single pass, purpose-built, character-level state machine. The resulting code is rather complicated, but the resulting performance is a tremendous improvement: 3-5x faster in informal benchmarks on my machine, and it handles more edge cases. In simplest terms: the new fingerprint function does more with less, which makes percona-agent and Query Analytics better.

Check out github.com/percona/go-mysql, browse the percona-agent source code if you wish, and try Percona Cloud Tools for free.

The post Faster fingerprints and Go packages for MySQL appeared first on MySQL Performance Blog.
