Choosing your Analytics Platform

MySQL and NoSQL
3 November 9:00am - 12:00pm @ Sentosa 5-8

Duration: 
3-hour tutorial
There are many choices available for your analytical database needs. Some you can use without moving from your existing environment. Others require some thought and technology to move the data to where you need it. For example, you can use a separate MySQL server as a specialized interface to your data through replication, or you can load or replicate your data into an external database such as Cassandra or MongoDB, or into a data warehouse solution such as Vertica, Teradata, Hadoop or others. But how do you choose the right solution to match your data and analytics requirements?

In this tutorial we'll take a completely hands-on approach, using live instances to demonstrate the different techniques available to analyse your data:

• What are the options for analytics and how do they differ? We’ll look at relational databases, NoSQL, SQL data warehouses, and Hadoop, using MySQL, Cassandra, Amazon Elastic MapReduce, and Amazon Redshift as working examples.
• How do you measure your analytics data volumes?
• How do you want to query the data?
• What latency do you need on the analytics: live or staged?
• How do you move, export and import the information?
• Can replication ever be made to work effectively?
• What topologies should you use to make your data and analytics work for you?

The hands-on portions will show how to load data from MySQL into EMR and Redshift, as well as how to write queries against them using publicly available, non-trivial data sets such as the IMDB movies archive. We’ll supply the Amazon instances for the lab work. You just need to log in from your laptop using ssh and the public key we’ll provide to you.


Speakers

Senior Product Line Manager, VMware
Biography: 
A professional writer and technologist for over 20 years, MC Brown is the author of and contributor to over 26 books covering an array of topics, including programming, system management, networking, data centres and web technologies. His expertise spans myriad development languages and platforms, including Perl, Python, Java, JavaScript, C, C++, shell script, Windows, Solaris, Unix, HP-UX, open source, Linux, BeOS, Mac OS X and many more. He is a former LAMP Technologies Editor for LinuxWorld magazine and a regular contributor to ServerWatch.com, LinuxPlanet, ComputerWorld and IBM developerWorks. He has also acted as a Subject Matter Expert for Microsoft on Windows Server and server certification projects. He draws on a rich and varied background as founder member of a leading UK ISP, systems manager and IT consultant for an advertising agency and Internet solutions group, technical specialist for an intercontinental ISP network, and database designer and programmer. Most recently he has concentrated on building high-quality, user-focused information and products through his books, articles, and his work with MySQL and the MySQL groups within Sun and Oracle, producing both the content and the content-delivery systems, including documentation, white papers, and marketing materials. Throughout his career he has acted as architectural advisor on a wide variety of products, focusing on user-centric functionality and use cases, and on enhancing current feature sets flexibly with an eye to future functionality and requirements, to ensure both ease of use and ease of development. These activities have given him a keen eye for, and experience in, Big Data, Hadoop, MySQL, NoSQL, Oracle, virtualisation, datacentres, content delivery, data migration and replication technology for heterogeneous databases.
Senior Staff Engineer, VMware
Biography: 
Robert Hodges is the former CEO of Continuent, which was acquired by VMware in late 2014. He works on cloud data management services for VMware’s cloud offering, vCloud Air. Robert has over three decades of experience in the field of databases and has worked with technologies ranging from pre-relational systems to Hadoop and NoSQL. He has a wide range of technical interests and is currently focused on DBMS operation in hybrid clouds.