October 2, 2014

Using Apache Hadoop and Impala together with MySQL for data analysis

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from  MySQL to Hadoop, load the data to Cloudera Impala (columnar format) and run a reporting […]

Concatenating MyISAM files

Recently, I found myself involved in the migration of a large read-only InnoDB database to MyISAM (eventually packed). The only issue was that for one of the table, we were talking of 5 TB of data, 23B rows. Not small… I calculated that with something like insert into MyISAM_table… select * from Innodb_table… would take […]

Recovery after DROP & CREATE

In a very popular data loss scenario a table is dropped and empty one is created with the same name. This is because  mysqldump in many cases generates the “DROP TABLE” instruction before the “CREATE TABLE”:

If there were no subsequent CREATE TABLE the recovery would be trivial. Index_id of the PRIMARY index of […]

SELECT UNION Results INTO OUTFILE

Here’s a quick tip I know some of us has overlooked at some point. When doing SELECT … UNION SELECT, where do you put the the INTO OUTFILE clause? On the first SELECT, on the last or somewhere else? The manual has the answer here, to quote: Only the last SELECT statement can use INTO […]

GROUP_CONCAT useful GROUP BY extension

(There is an updated version of this post here) MySQL has useful extention to the GROUP BY operation: function GROUP_CONCAT: GROUP_CONCAT(expr) – This function returns a string result with the concatenated non-NULL values from a group. Where it can be useful?