Author - Alexander Rubin

Using Apache Spark and MySQL for Data Analysis

What is Spark
Apache Spark is a cluster computing framework, similar to Apache Hadoop. Wikipedia has a great description of it:
Apache Spark is an open source cluster computing framework originally developed in the AMPLab at University of California, Berkeley but was later donated to the Apache Software Foundation where it remains today. In contrast to […]

Read more

Advanced Query Tuning in MySQL 5.6 and MySQL 5.7 Webinar: Q&A

Thank you for attending my July 22 webinar titled “Advanced Query Tuning in MySQL 5.6 and 5.7” (my slides and a replay available here). As promised here is the list of questions and my answers (thank you for your great questions).
Q: Here is the explain example:

MySQL

mysql> explain extended select id, site_id from test_index_id where […]

Read more

Generated (Virtual) Columns in MySQL 5.7 (labs)

About 2 weeks ago Oracle published the MySQL 5.7.7-labs-json version which includes a very interesting feature called “Generated columns” (also know as Virtual or Computed columns). MariaDB has a similar feature as well: Virtual (Computed) Columns.
The idea is very simple: if we store a column

MySQL

`FlightDate` date

1

`FlightDate` date

in our table we may want to filter or group […]

Read more

Using MySQL Event Scheduler and how to prevent contention

MySQL introduced the Event Scheduler in version 5.1.6. The Event Scheduler is a MySQL-level “cron job”, which will run events inside MySQL. Up until now, this was not a very popular feature, however, it has gotten more popular since the adoption of Amazon RDS – as well as similar MySQL database as a service […]

Read more

Advanced MySQL Query Tuning (Aug. 6) and MySQL 5.6 Performance Schema (Aug. 13) webinars

I will be presenting two webinars in August:

Aug 6, 10 a.m. PDT: Advanced Query Tuning in MySQL 5.6 and Beyond
Aug 13, 10 a.m. PDT:  Using Performance Schema to Monitor and Troubleshoot MySQL 5.6

This Wednesday’s webinar on advanced MySQL query tuning will be focused on tuning the “usual suspects”: queries with “Group By”, “Order By” and […]

Read more

Using UDFs for geo-distance search in MySQL

In my previous post about geo-spatial search in MySQL I described (along with other things) how to use geo-distance functions. In this post I will describe the geo-spatial distance functions in more details.
If you need to calculate an exact distance between 2 points on Earth in MySQL (very common for geo-enabled applications) you have at […]

Read more

Measure the impact of MySQL configuration changes with Percona Cloud Tools

When you make a change to your MySQL configuration in production it would be great to know the impact (a “before and after” type of picture). Some changes are obvious. For many variables proper values can be determined beforehand, i.e. innodb_buffer_pool_size or innodb_log_file_size. However, there is 1 configuration variable which is much less obvious for many […]

Read more

Using MySQL 5.6 Performance Schema in multi-tenant environments

Hosting a shared MySQL instance for your internal or external clients (“multi-tenant”) was always a challenge. Multi-tenants approach or a “schema-per-customer” approach is pretty common nowadays to host multiple clients on the same MySQL sever. One of issues of this approach, however, is the lack of visibility: it is hard to tell how many resources (queries, […]

Read more

Using InfiniDB MySQL server with Hadoop cluster for data analytics

In my previous post about Hadoop and Impala I benchmarked performance of analytical queries in Impala.
This time I’ve tried InfiniDB for Hadoop (open-source version) on the modern hardware with an 8-node Hadoop cluster. One of the main advantages (at least for me) of InifiniDB for Hadoop is that it stores the data inside the Hadoop cluster but uses the […]

Read more

MySQL Audit Plugin now available in Percona Server 5.5 and 5.6

The new Percona Server 5.5.37-35.0 and Percona Server 5.6.17-65.0-56, announced yesterday (May 6), both include the open source version of the MySQL Audit Plugin. The MySQL Audit Plugin is used to log all queries or connections (“audit” MySQL usage). Until yesterday’s release, the MySQL Audit Plugin was only available in MySQL Enterprise.
EDIT:  Just to be clear, this implementation […]

Read more