Where the open source community meets: Secure your spot for Percona Live Amsterdam! - Register

Downloads

Blog

How X Plugin Works Under the Hood

September 16, 2016

Author

Miguel Angel Nieto

Insight for DBAs

MySQL

Share this Post:

In this blog post, we’ll look at what MySQL does under the hood to transform NoSQL requests to SQL (and then store them in InnoDB transactional engine) when using the X Plugin.

X Plugin allows MySQL to function as a document store. We don’t need to define any schema or use SQL language while still being a fully ACID database. Sounds like magic – but we know the only thing that magic does is make planes fly! 🙂

Alexander already wrote a blog post exploring how the X Plugin works, with some examples. In this post, I am going to show some more query examples and how they are transformed.

I have enabled the slow query log to see what it is actually being executed when I run NoSQL queries.

Creating our first collection

We start the MySQL shell and create our first collection:

$ mysqlsh -u root --py
Creating an X Session to root@localhost:33060
No default schema selected.
[...]
Currently in Python mode. Use sql to switch to SQL mode and execute queries.
mysql-py> db.createCollection("people")

$ mysqlsh -u root --py

Creating an X Session to root@localhost:33060

No default schema selected.

[...]

Currently in Python mode. Use sql to switch to SQL mode and execute queries.

mysql-py> db.createCollection("people")

What is a collection in SQL terms? A table. Let’s check what MySQL does by reading the slow query log:

CREATE TABLE `people` (
  `doc` json DEFAULT NULL,
  `_id` varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$._id'))) STORED NOT NULL,
  PRIMARY KEY (`_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

CREATE TABLE `people` (

`doc` json DEFAULT NULL,

`_id` varchar(32) GENERATED ALWAYS AS (json_unquote(json_extract(`doc`,'$._id'))) STORED NOT NULL,

PRIMARY KEY (`_id`)

) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

As we correctly guessed, it creates a table with two columns. One is called “doc” and it stores a JSON document. A second column named “_id” and is created as a virtual column from data extracted from that JSON document. _id is used as a primary key, and if we don’t specify a value, MySQL will choose a random UUID every time we write a document.

So, the basics are clear.

- It stores everything inside a JSON column.

- Indexes are created on virtual columns that are generated by extracting data from that JSON. Every time we add a new index, a virtual column will be generated. That means that under the hood, an alter table will run adding the column and the corresponding index.

Let’s run a getCollections that would be similar to “SHOW TABLES” in the SQL world:

mysql-py> db.getCollections()
[
    
]

mysql-py> db.getCollections()

[

]

This is what MySQL actually runs:

SELECT C.table_name AS name, IF(ANY_VALUE(T.table_type)='VIEW', 'VIEW', IF(COUNT(*) = COUNT(CASE WHEN (column_name = 'doc' AND data_type = 'json') THEN 1 ELSE NULL END) + COUNT(CASE WHEN (column_name = '_id' AND generation_expression = 'json_unquote(json_extract(`doc`,''$._id''))') THEN 1 ELSE NULL END) + COUNT(CASE WHEN (column_name != '_id' AND generation_expression RLIKE '^(json_unquote[[.(.]])?json_extract[[.(.]]`doc`,''[[.$.]]([[...]][^[:space:][...]]+)+''[[.).]]{1,2}$') THEN 1 ELSE NULL END), 'COLLECTION', 'TABLE')) AS type FROM information_schema.columns AS C LEFT JOIN information_schema.tables AS T USING (table_name)WHERE C.table_schema = 'test' GROUP BY C.table_name ORDER BY C.table_name;

SELECT C.table_name AS name, IF(ANY_VALUE(T.table_type)='VIEW', 'VIEW', IF(COUNT(*) = COUNT(CASE WHEN (column_name = 'doc' AND data_type = 'json') THEN 1 ELSE NULL END) + COUNT(CASE WHEN (column_name = '_id' AND generation_expression = 'json_unquote(json_extract(`doc`,''$._id''))') THEN 1 ELSE NULL END) + COUNT(CASE WHEN (column_name != '_id' AND generation_expression RLIKE '^(json_unquote[[.(.]])?json_extract[[.(.]]`doc`,''[[.$.]]([[...]][^[:space:][...]]+)+''[[.).]]{1,2}$') THEN 1 ELSE NULL END), 'COLLECTION', 'TABLE')) AS type FROM information_schema.columns AS C LEFT JOIN information_schema.tables AS T USING (table_name)WHERE C.table_schema = 'test' GROUP BY C.table_name ORDER BY C.table_name;

This time, the query is a bit more complex. It runs a query on information_schema.tables joining it, with information_schema.columns searching for tables that have “doc” and “_id” columns.

Inserting and reading documents

I am going to start adding data to our collection. Let’s add our first document:

mysql-py> db.people.add(
      ...  {
      ...     "Name": "Miguel Angel",
      ...     "Country": "Spain",
      ...     "Age": 33
      ...   }
      ... )

mysql-py> db.people.add(

... {

... "Name": "Miguel Angel",

... "Country": "Spain",

... "Age": 33

... }

... )

In the background, MySQL inserts a JSON object and auto-assign a primary key value.

INSERT INTO `test`.`people` (doc) VALUES (JSON_OBJECT('Age',33,'Country','Spain','Name','Miguel Angel','_id','a45c69cd2074e611f11f62bf9ac407d7'));

1	INSERT INTO `test`.`people` (doc) VALUES (JSON_OBJECT('Age',33,'Country','Spain','Name','Miguel Angel','_id','a45c69cd2074e611f11f62bf9ac407d7'));

Ok, this is supposed to be schemaless. So let’s add someone else using different fields:

mysql-py> db.people.add(
      ...  {
      ...     "Name": "Thrall",
      ...     "Race": "Orc",
      ...     "Faction": "Horde"
      ...   }
      ... )

mysql-py> db.people.add(

... {

... "Name": "Thrall",

... "Race": "Orc",

... "Faction": "Horde"

... }

... )

Same as before, MySQL just writes another JSON object (with different fields):

INSERT INTO `test`.`people` (doc) VALUES (JSON_OBJECT('Faction','Horde','Name','Thrall','Race','Orc','_id','7092776c2174e611f11f62bf9ac407d7'));

1	INSERT INTO `test`.`people` (doc) VALUES (JSON_OBJECT('Faction','Horde','Name','Thrall','Race','Orc','_id','7092776c2174e611f11f62bf9ac407d7'));

Now we are going to read the data we have just inserted. First, we are going to find all documents stored in the collection:

mysql-py> db.people.find()

1	mysql-py> db.people.find()

MySQL translates to a simple:

SELECT doc FROM `test`.`people`;

1	SELECT doc FROM `test`.`people`;

And this is how filters are transformed:

mysql-py> db.people.find("Name = 'Thrall'")

1	mysql-py> db.people.find("Name = 'Thrall'")

It uses a SELECT with the WHERE clause on data extracted from the JSON object.

SELECT doc FROM `test`.`people` WHERE (JSON_EXTRACT(doc,'$.Name') = 'Thrall');

1	SELECT doc FROM `test`.`people` WHERE (JSON_EXTRACT(doc,'$.Name') = 'Thrall');

Updating documents

Thrall decided that he doesn’t want to belong to the Horde anymore. He wants to join the Alliance. We need to update the document:

mysql-py> db.people.modify("Name = 'Thrall'").set("Faction", "Alliance")

1	mysql-py> db.people.modify("Name = 'Thrall'").set("Faction", "Alliance")

MySQL runs an UPDATE, again using a WHERE clause on the data extracted from the JSON. Then, it updates the “Faction”:

UPDATE `test`.`people` SET doc=JSON_SET(doc,'$.Faction','Alliance') WHERE (JSON_EXTRACT(doc,'$.Name') = 'Thrall');

1	UPDATE `test`.`people` SET doc=JSON_SET(doc,'$.Faction','Alliance') WHERE (JSON_EXTRACT(doc,'$.Name') = 'Thrall');

Now I want to remove my own document:

mysql-py> db.people.remove("Name = 'Miguel Angel'");

1	mysql-py> db.people.remove("Name = 'Miguel Angel'");

As you can already imagine, it runs a DELETE, searching for my name on the data extracted from the JSON object:

DELETE FROM `test`.`people` WHERE (JSON_EXTRACT(doc,'$.Name') = 'Miguel Angel');

1	DELETE FROM `test`.`people` WHERE (JSON_EXTRACT(doc,'$.Name') = 'Miguel Angel');

Summary

The magic that makes our MySQL work like a document-store NoSQL database is:

- Create a simple InnoDB table with a JSON column.

- Auto-generate the primary key with UUID values and represent it as a virtual column.

- All searches are done by extracting data JSON_EXTRACT, and passing that info to the WHERE clause.

I would define the solution as something really clever, simple and clean. Congrats to Oracle! 🙂

0 0 votes

Article Rating

9 Comments

Oldest

Newest Most Voted

Paul DuBois

9 years ago

“I have enabled the slow query log to see what it is actually being executed when I run NoSQL queries.”

Why would you enable the slow query log (rather than the general query log)? Every NoSQL query is slow enough to show up in the slow log? 🙂

Author

Miguel Angel Nieto

9 years ago

Reply to Paul DuBois

Hehe. To avoid all the “connect – disconnect – connect – disconnect” noise 😛

Manjot Singh

9 years ago

Reply to Paul DuBois

we like to set long_query_time=0; this also works better with a lot of tools

9 years ago

thanks!

Alex

9 years ago

Seems that using a UUID without including caveats would negatively impact performance rather quickly .

http://mysqlserverteam.com/storing-uuid-values-in-mysql-tables/

Vasiliy Lyk'yanchikov

9 years ago

Hi Miguel,
Good post, thanks for that. I want to translate it on russian language. Please let me know if you have any objection.

Author

Miguel Angel Nieto

9 years ago

Reply to Vasiliy Lyk'yanchikov

No objection. Thanks 🙂

Vasiliy Lyk'yanchikov

9 years ago

Reply to Miguel Angel Nieto

Here is the translation of your article on russian language:
http://sqlinfo.ru/articles/info/29.html

You write:
mysql-py> db.getCollections()
[
]
but if previously we have created a collection “people”, then we get:
mysql-py> db.getCollections()
[

]

Happy new year.

Vasiliy Lyk'yanchikov

9 years ago

Reply to Vasiliy Lyk'yanchikov

the system removes the braces 🙂
must be “Collection:people” in braces

Resources

July 31, 2026

Bogdan Degtyariov

Stored Procedures memory consumption in Percona Server for MySQL

MySQL

July 22, 2026

Dennis Kittrell

How to Migrate from MySQL Galera Cluster to Percona XtraDB Cluster

PostgreSQL

July 15, 2026

Sonia Valeja

PostgreSQL Meta Commands that save time every day

Far
Enough.

Said no pioneer ever.

Get Started

Open source database software from experts who stand with you in production. Forever free from lock-in and other corporate BS.

Connect

Privacy

Legal

Security Center

MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.

How X Plugin Works Under the Hood

Creating our first collection

Inserting and reading documents

Updating documents

Summary

Stored Procedures memory consumption in Percona Server for MySQL

How to Migrate from MySQL Galera Cluster to Percona XtraDB Cluster

PostgreSQL Meta Commands that save time every day

Far Enough.

Far
Enough.