Taking a design requirement for a high-volume, high-availability architecture from requirements to finished product
Design is always based on requirements. But what if the requirements are unrealistic? Or they don't tell you the whole picture? Or some requirements are not obvious? Or advertised features don't work as expected? Using a real-life architecture, this talk shows you how not to take anything for granted - including requirements, MySQL features, and performance.
Searching text data for words and substrings is deceptively difficult to do efficiently. What are the options, and how do they compare? This talk will compare several solutions for full-text search, including MyISAM FULLTEXT, InnoDB FULLTEXT, Sphinx Search, Solr, Xapian, and trigrams.
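As a hedged illustration of the trigram approach mentioned above (the data and function names here are invented for the example, not taken from any of the listed engines), substring search can be served by indexing every three-character window of each document and intersecting posting lists:

```python
# Minimal trigram-index sketch: index every 3-character window of each
# document, intersect posting lists to find candidates, then verify
# each candidate with a real substring scan to weed out false positives.

def trigrams(text):
    """Return the set of all 3-character windows of a lowercased string."""
    t = text.lower()
    return {t[i:i + 3] for i in range(len(t) - 2)}

def build_index(docs):
    """Map each trigram to the set of doc ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for g in trigrams(text):
            index.setdefault(g, set()).add(doc_id)
    return index

def search(index, docs, query):
    """Intersect posting lists for the query's trigrams, then verify."""
    grams = trigrams(query)
    if not grams:
        return set()
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    return {d for d in candidates if query.lower() in docs[d].lower()}

docs = {1: "full text search", 2: "substring matching", 3: "text indexing"}
idx = build_index(docs)
print(search(idx, docs, "text"))   # doc ids whose text contains "text"
```

The verification pass matters: trigram intersection only narrows the candidate set, so a final scan is still needed for exact substring matches.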
The MySQL manual states that InnoDB compression works well for read-mostly workloads. We want to use it for OLTP workloads, so we improved its monitoring and performance to make that possible.
We added monitoring to track the number of compressions and compression failures per table. This allowed us to efficiently determine how well each table compresses and decide which ones to compress in production.
Another improvement was a reduction in the amount of logging performed by InnoDB. By default, the entire compressed page image is written to...
Since Drizzle was forked from MySQL several years ago, some aspects continue to mirror MySQL while others are completely new and different. Consequently, even experienced MySQL DBAs need to learn Drizzle 7 from the bottom up: configuration, administration, plugins, and replication. This session introduces those aspects of Drizzle 7.1 assuming no prior knowledge of Drizzle. The information presented will allow a DBA new to Drizzle to get the database server up and running in a realistic environment, from which point they can study, learn, compare, and evaluate it more thoroughly.
The Gizzard Open Source framework is the engine that scales the databases for much of Twitter. Gizzard handles sharding, data replication, cluster expansion, data migration, and provides an API for applications to access the data.
Learn how Twitter uses Gizzard to manage its largest data stores, including:
How Gizzard systems are implemented: common parts, custom code
Hadoop has become a popular platform for managing large datasets of structured and unstructured data. It does not replace existing infrastructure, but instead augments it. Most companies will still use relational databases for transaction processing and low-latency queries, but can benefit from Hadoop for reporting, machine learning, or ETL. This talk will include:
What is Hadoop and why do I care?
What do people do with Hadoop?
How can a MySQL DBA add Hadoop to their architecture?
Many Java developers using MySQL as a data backend rely on Hibernate to bridge their OO designs with the relational database world.
This talk will review Hibernate and some of its related projects, with a focus on performance. We will also cover performance-related considerations for Connector/J, discussing settings and usage scenarios that will be useful even for Java developers not using Hibernate.
Virident Systems will host a 50-minute customer discussion about using SSDs in web environments. It will cover use cases where SSDs are deployed today, how to select suitable SSDs, and lessons learned from SSD implementations.
We’ll take a close look at how to offload the MySQL server by moving queries that can be handled efficiently on the Sphinx side. We’ll walk through real-world examples of handling non-full-text queries, including geodistance search, implementing time segments, using multi-value attributes, and more. We’ll also learn how to keep massive text collections out of MySQL databases and how to harness the power of Sphinx for full-text queries.
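The geodistance use case can be illustrated with the haversine great-circle formula. This is a plain-Python sketch of the underlying math, not Sphinx's implementation, and the coordinates are just example values:

```python
# Haversine great-circle distance between two (lat, lon) points,
# assuming a spherical Earth with mean radius 6371 km.
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Paris -> London, roughly 340-350 km
print(round(haversine_km(48.8566, 2.3522, 51.5074, -0.1278)))
```

In a search engine this distance is typically computed as a virtual attribute per match, so results can be filtered and sorted by proximity without touching MySQL.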
If you are tired of managing your own cron jobs, email alerts, and backup retention with a mess of various scripts, then this talk is for you.
Introducing XtraBackup Manager, the OpenSource backup management software made especially for use with XtraBackup. This talk will cover the features and functionality offered by the tool as well as giving a basic demonstration of how easy it is to use and some explanation of how it works, before finally opening up the floor to questions and discussion.
Craigslist uses a variety of data storage systems in its backend systems: in-memory, SQL, and NoSQL. This talk is an overview of how craigslist works with a focus on the data storage and management choices that were made in each of its major subsystems. These include MySQL, memcached, Redis, MongoDB, Sphinx, and the filesystem. Special attention will be paid to the benefits and tradeoffs associated with choosing from the various popular data storage systems, including long-term viability, support, and ease of integration.
MariaDB had its first GA release in February 2010 (MariaDB 5.1, based on MySQL 5.1). Since then, we've released MariaDB 5.2 (based on MySQL 5.1), MariaDB 5.3 (based on MySQL 5.1), and MariaDB 5.5 (based on MySQL 5.5, with all features up to MariaDB 5.3). Two years and four major releases with a tonne of major features. Why should you care? This is not a talk about the community around MariaDB, but a feature-by-feature breakdown of why you should consider this database.
What if you had all the data you needed to measure system performance and scalability at any tier, discover performance and stability problems before they happen, and plan for capacity and performance by modeling the system's behavior at greater load than you currently have? Now it is as easy as running tcpdump and processing the result with a tool. In this two-part talk you will first learn how to do black-box performance analysis to discover hidden problems in your systems. In the second part you will learn about mathematical performance and scalability modeling.
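One building block of this kind of black-box modeling can be sketched with Little's law (N = X * R). This is an illustrative toy under invented numbers, not the speaker's tool: given query arrival timestamps and response times as might be extracted from a tcpdump capture, it estimates throughput and mean concurrency:

```python
# Little's law sketch: mean concurrency N equals throughput X times
# mean response time R, assuming a steady workload over the interval.

def littles_law_concurrency(arrivals, response_times):
    """Estimate mean concurrency from arrival timestamps (seconds)
    and per-query response times (seconds)."""
    span = max(arrivals) - min(arrivals)
    throughput = len(arrivals) / span                          # X, queries/s
    mean_response = sum(response_times) / len(response_times)  # R, seconds
    return throughput * mean_response                          # N = X * R

# Invented sample: 11 queries over one second, each taking 50 ms
arrivals = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
responses = [0.05] * 11
print(littles_law_concurrency(arrivals, responses))  # about 0.55
```

The appeal of this approach is that arrival and response timestamps are exactly what a passive tcpdump capture gives you, with no instrumentation inside the server.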
To understand large and busy environments, one needs a collection of tools and methodologies. We will discuss how performance analysis, management, and diagnostics are done at Facebook - from use cases of our cluster-scale MySQL client pmysql, to how we use additional instrumentation both inside MySQL (table/user statistics, slocket) and outside it, in web-stack performance reporting. Additionally, we will talk about how the Poor Man's Profiler - MySQL internals inspection - can be helpful in production environments.
In this talk I will cover Solid State Drive internals and how they affect database performance. I will present I/O-level benchmarks for SATA (Intel 320 SSD) and PCI-e (FusionIO, Virident) cards to show absolute performance and give an idea of performance per dollar. And finally, I will show how you can use MySQL and Percona Server with SSDs, which tuning parameters are most important, and what performance you may expect in real workloads.
MySQL has a long history at Google and, enhanced with several patches, has been providing us with the combination of speed, reliability and scalability which makes MySQL the preferred relational storage solution for many projects. In this presentation we want to talk about what makes MySQL attractive in our infrastructure, give an overview of the life of a MySQL installation at Google and share our experience in moving from MySQL 5.0 to 5.1. We will discuss the reasons behind this migration effort and the way we are trying to do it.
In this session we will look at different tuning aspects of MySQL Cluster.
As well as going through performance-tuning basics in MySQL Cluster, we will look closely at the new parameters and status variables of MySQL Cluster 7.2 to diagnose issues with, e.g., disk data performance and query (join) performance.