September 21, 2014

Join my Oct. 2 webinar: ‘Implementing MySQL and Hadoop for Big Data’

MySQL DBAs know that integrating MySQL with a big data solution can be challenging. That’s why I invite you to join me this Wednesday (Oct. 2) at 10 a.m. Pacific time for a free webinar in which I’ll walk you through how to implement a successful big data strategy with Apache Hadoop and MySQL. The webinar is tailored for MySQL DBAs and developers (or anyone with prior MySQL experience) who want to learn how to use Apache Hadoop together with MySQL for Big Data.

The webinar is titled, “Implementing MySQL and Hadoop for Big Data,” and you can register here.

Storing Big Data in MySQL alone can be challenging:

  • A single MySQL instance may not scale well enough to store hundreds of terabytes or even a petabyte of data.
  • “Sharding” MySQL is a common approach, but it can be hard to implement.
  • Indexes on terabytes of data can be a problem (updating an index of that size can slow down inserts significantly).

Apache Hadoop together with MySQL can solve many big data challenges. In the webinar I will present:

  • An introduction to Apache Hadoop and its components, including HDFS, MapReduce, Hive/Impala, Flume, and Sqoop
  • Common applications for Apache Hadoop
  • How to integrate Hadoop and MySQL using Sqoop and the MySQL Applier for Hadoop (see the sketch after this list)
  • Clickstream log statistical analysis and other examples of big data implementations
  • ETL and ELT processes with Hadoop and MySQL
  • A Star Schema implementation example for Hadoop
  • Star Schema Benchmark results with Cloudera Impala and columnar storage
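If you want to experiment before the webinar, here is a minimal sketch of moving a MySQL table into Hive with Sqoop. The hostname, database (“sales”), table (“orders”) and username are hypothetical placeholders; adjust them for your environment.

  # Import a MySQL table into HDFS and create a matching Hive table
  # ("mysql-host", database "sales", table "orders" and user "bi" are examples only)
  sqoop import \
    --connect jdbc:mysql://mysql-host/sales \
    --username bi -P \
    --table orders \
    --hive-import \
    --hive-table sales.orders \
    --num-mappers 4

Once the import finishes, the data is available in Hive (and Impala) for the kind of clickstream and star-schema analysis covered in the webinar.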

I look forward to the webinar and hope to see you there! If you have questions in advance, please ask them in the comments below.

About Alexander Rubin

Alexander joined Percona in 2013. He has worked with MySQL since 2000 as a DBA and application developer. Before joining Percona he spent more than 7 years doing MySQL consulting as a principal consultant (starting with MySQL AB in 2006, then Sun Microsystems and then Oracle). He has helped many customers design large, scalable and highly available MySQL systems and optimize MySQL performance, and has also helped customers design Big Data stores with Apache Hadoop and related technologies.

Comments

  1. mandm says:

    Hi Alexander,
    I really liked your talk on Sqoop. Is there a tutorial that you like for MySQL to Sqoop integration?
    I really want to try analyzing data that is in MySQL over in Hadoop.

  2. mandm says:

    Hi,
    I tried loading data from a MySQL db into a Hadoop cluster (EMR) using Sqoop and it worked fine, but when I try to use Hive to load that data into a table in S3 I get this error:

    hive> LOAD DATA INPATH '/user/hadoop/wordcount/part-m-0000*' INTO TABLE wordcount2;
    FAILED: SemanticException [Error 10028]: Line 1:17 Path is not legal "/user/hadoop/wordcount/part-m-0000*": Move from: hdfs://10.xx.xx.xx:9000/user/hadoop/wordcount/part-m-0000* to: s3://iform-dev-s3-bucket-1/samples is not valid. Please check that values for params "default.fs.name" and "hive.metastore.warehouse.dir" do not conflict.

    I think during the presentation you mentioned that it was possible to store data directly into S3 using Sqoop with EMR?
    Am I doing something incorrect?

  3. Hi Mandm,

    Please try /user/hadoop/wordcount/ instead of /user/hadoop/wordcount/part-m-0000*. Hive takes a “directory” name. You can also use hadoop fs -cp to copy from HDFS into S3 (it will use map reduce) and then create an external table on top of it in Hive.
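
    Putting those two suggestions together, a minimal sketch could look like this (the table name wordcount2 and the S3 bucket come from your error message; the column definitions and delimiter are assumed for illustration):

    -- Load the whole Sqoop output directory instead of a part-m-0000* wildcard
    hive> LOAD DATA INPATH '/user/hadoop/wordcount/' INTO TABLE wordcount2;

    -- Alternatively, copy the files to S3 first and define an external table on top of them
    $ hadoop fs -cp /user/hadoop/wordcount/ s3://iform-dev-s3-bucket-1/samples/wordcount/
    hive> CREATE EXTERNAL TABLE wordcount2_s3 (word STRING, cnt INT)
        > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','   -- assumed delimiter
        > LOCATION 's3://iform-dev-s3-bucket-1/samples/wordcount/';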

  4. This is a link to the sqoop documentation: http://sqoop.apache.org/docs/1.99.2/

  5. sandhaya says:

    Hi,

    Nice article related to big data and Hadoop in this blog post. With this advanced technology, new tools are very important for handling big data problems, and for this the Apache Hadoop concept is important to learn. I haven’t seen your webinar, but after going through your blog I wanted to see it so that I could also learn something from your online Hadoop training courses. Hadoop is quite complex and challenging, but it can lead to a new career in Big Data.

    Thanks for the post.
    Your time and effort are much appreciated.
