Join my Oct. 2 webinar: ‘Implementing MySQL and Hadoop for Big Data’

October 1, 2013

Author

Alexander Rubin

Insight for DBAs

Insight for Developers

MySQL

MySQL DBAs know that integrating MySQL and a big data solution can be challenging. That’s why I invite you to join me this Wednesday (Oct. 2) at 10 a.m. Pacific time for a free webinar in which I’ll walk you through how to implement a successful big data strategy with Apache Hadoop and MySQL. This webinar is specifically tailored for MySQL DBAs and developers (or anyone with previous MySQL experience) who want to know how to use Apache Hadoop together with MySQL for Big Data.

The webinar is titled, “Implementing MySQL and Hadoop for Big Data,” and you can register here.

Storing Big Data in MySQL alone can be challenging:

A single MySQL instance may not scale enough to store hundreds of terabytes or even a petabyte of data.

“Sharding” MySQL is a common approach, but it can be hard to implement.

Indexes for terabytes of data may be a problem (updating an index of that size can slow down inserts significantly).

Apache Hadoop together with MySQL can solve many big data challenges. In the webinar I will present:

An introduction to Apache Hadoop and its components, including HDFS, MapReduce, Hive/Impala, Flume, and Sqoop

Common applications for Apache Hadoop

How to integrate Hadoop and MySQL using Sqoop and MySQL Applier for Hadoop

Statistical analysis of clickstream logs and other examples of big data implementations

ETL and ELT processes with Hadoop and MySQL

Star Schema implementation example for Hadoop

Star Schema Benchmark results with Cloudera Impala and columnar storage.

I look forward to the webinar and hope to see you there! If you have questions in advance, please ask them below.

5 Comments

mandm

12 years ago

Hi Alexander,
I really liked your talk on Sqoop. Is there a tutorial that you like for MySQL-to-Sqoop integration?
I really want to try moving data that is in MySQL over to Hadoop and analyzing it there.

Author

Alexander Rubin

12 years ago

Reply to mandm

This is a link to the sqoop documentation: http://sqoop.apache.org/docs/1.99.2/

mandm

12 years ago

Hi,
I tried loading data from a MySQL database into a Hadoop cluster (EMR) using Sqoop, and it worked fine. But when I try to use Hive to load that data into a table stored in S3, I get this error:

hive> LOAD DATA INPATH '/user/hadoop/wordcount/part-m-0000*' INTO TABLE wordcount2;
FAILED: SemanticException [Error 10028]: Line 1:17 Path is not legal ”/user/hadoop/wordcount/part-m-0000*”: Move from: hdfs://10.xx.xx.xx:9000/user/hadoop/wordcount/part-m-0000* to: s3://iform-dev-s3-bucket-1/samples is not valid. Please check that values for params “default.fs.name” and “hive.metastore.warehouse.dir” do not conflict.

I think you mentioned during the presentation that it is possible to store data directly in S3 using Sqoop with EMR.
Am I doing something incorrectly?

Author

Alexander Rubin

12 years ago

Reply to mandm

Hi Mandm,

Please try /user/hadoop/wordcount/ instead of /user/hadoop/wordcount/part-m-0000*. Hive takes a “directory” name. You can also copy the data from HDFS into S3 (hadoop fs -cp works; hadoop distcp does the copy as a MapReduce job) and then create an external table in Hive on top of it.
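
A minimal sketch of the two suggestions in this reply, written in Python only to assemble the command strings; it does not run Hive or Hadoop. The table name wordcount_s3, the columns word and cnt, and the samples/wordcount/ destination are hypothetical, and the comma delimiter assumes the Sqoop import used its default text format.

```python
import posixpath
import shlex


def parent_directory(part_glob: str) -> str:
    """Turn a part-file glob into the directory that holds those files."""
    return posixpath.dirname(part_glob) + "/"


def load_statement(hdfs_dir: str, table: str) -> str:
    """HiveQL that moves the files under hdfs_dir into an existing table."""
    return f"LOAD DATA INPATH '{hdfs_dir}' INTO TABLE {table};"


def copy_command(hdfs_dir: str, s3_dir: str) -> str:
    """Shell command that copies an HDFS directory to an S3 location."""
    return shlex.join(["hadoop", "fs", "-cp", hdfs_dir, s3_dir])


def external_table_statement(table: str, columns: dict, s3_dir: str) -> str:
    """HiveQL for an external table whose data already sits in s3_dir."""
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns.items())
    return (
        f"CREATE EXTERNAL TABLE {table} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "  # assumed Sqoop default
        f"LOCATION '{s3_dir}';"
    )


if __name__ == "__main__":
    hdfs_dir = parent_directory("/user/hadoop/wordcount/part-m-0000*")
    # Hypothetical destination under the bucket named in the error message.
    s3_dir = "s3://iform-dev-s3-bucket-1/samples/wordcount/"

    # Option 1: point LOAD DATA at the directory instead of the glob.
    print(load_statement(hdfs_dir, "wordcount2"))

    # Option 2: copy to S3 first, then define an external table over the copy.
    print(copy_command(hdfs_dir, s3_dir))
    # Hypothetical schema; replace with the real columns of the imported table.
    print(external_table_statement("wordcount_s3", {"word": "STRING", "cnt": "INT"}, s3_dir))
```

With the second option the files stay where the copy put them; dropping an external table removes only its metadata, not the data in S3.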

sandhaya

12 years ago

Hi,

Nice article on big data and Hadoop. With today's technology, new tools are very important for handling big data problems, and Apache Hadoop is an important concept to learn. I haven’t seen your webinar, but after going through your blog I wish I had, so that I could also learn something from your online Hadoop training courses. Hadoop is quite complex and challenging, but it can lead to a new career in big data.

Thanks for the post.
Your time and effort are well appreciated.
