Category Archives: data science

Using Apache Spark and MySQL for Data Analysis

What is Spark? Apache Spark is a cluster computing framework, similar to Apache Hadoop. Wikipedia has a great description of it: Apache Spark is an open-source cluster computing framework originally developed in the AMPLab …

Using Apache Hadoop and Impala together with MySQL for data analysis

Apache Hadoop is commonly used for data analysis. It loads data quickly and scales well. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how …

MySQL and Hadoop integration

Dolphin and Elephant: an Introduction. This post is intended for MySQL DBAs or sysadmins who need to start using Apache Hadoop and want to integrate the two solutions. In this post I will cover some …
