Scaling MySQL infrastructure is challenging: traditional setups don't scale horizontally and require manual configuration management.
This talk is about how we scaled MySQL infrastructure at Indeed. Using HAProxy, we dynamically take backends in and out of rotation based on replication lag. Through this proxy, we load-balance reads across a pool of replicas, ensure replication lag is below a threshold, and easily take replicas out of rotation for maintenance, removing the work of manually updating application configuration.
You will also learn about the different routing strategies we use, such as fail-to-primary vs. fail-open, and about surprising application connection pool behaviors we discovered along the way.
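To make the lag-based routing idea concrete, here is a minimal Python sketch. It is not Indeed's implementation or HAProxy configuration: the `Replica` type, the `pick_backends` function, and the reading of the two strategies (fail-to-primary sends reads to the primary when no replica is within the lag threshold; fail-open keeps serving from any replica that is not in maintenance, even if it is lagging) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """Hypothetical view of a replica as seen by a health checker."""
    name: str
    lag_seconds: Optional[float]  # None means the lag could not be measured
    in_maintenance: bool = False


def pick_backends(replicas: List[Replica], primary: str,
                  max_lag_seconds: float, strategy: str) -> List[str]:
    """Return the names of backends that should receive read traffic."""
    # Replicas that are not in maintenance and are within the lag threshold.
    healthy = [
        r.name for r in replicas
        if not r.in_maintenance
        and r.lag_seconds is not None
        and r.lag_seconds <= max_lag_seconds
    ]
    if healthy:
        return healthy
    # No replica qualifies: fall back according to the chosen strategy.
    if strategy == "fail-to-primary":
        # Assumed meaning: send reads to the primary instead.
        return [primary]
    if strategy == "fail-open":
        # Assumed meaning: accept possibly stale reads rather than fail.
        return [r.name for r in replicas if not r.in_maintenance]
    raise ValueError(f"unknown strategy: {strategy}")
```

With a 5-second threshold, a pool where one replica lags by 2 seconds and another by 30 seconds would route reads only to the first; the fallback strategy matters only when every replica is over the threshold.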
In a master/slave replication cluster, there are many branch nodes. They are full-blown database servers, yet their only purpose is to serve binlogs to their slaves.
The Blackhole storage engine does not store any data in the table, but allows the transactions to be recorded in the binlogs. Therefore it is suitable for binlog servers with reduced capacity requirements.
With the release of Percona Server 5.7.20-19 in January 2018, MySQL server becomes a viable solution, as the release fixes two critical bugs related to the Blackhole engine.
This talk describes the implementation details.
Open source relational databases like MySQL and PostgreSQL power some of the world's largest websites, including Yelp. They can be used out of the box with few adjustments, and rarely require a dedicated Database Administrator (DBA) right away. This means that System Administrators, Site Reliability Engineers, or Developers are usually the first to respond to some of the more interesting issues that can arise as you scale your databases.
In this talk, I'll assume that you already have a database up and running and will first go over a broad set of basics to introduce you to MySQL Database Administration. Next, I will cover the InnoDB storage engine, high performance and availability, monitoring and database defense. Finally, I'll cover the wide array of online resources, books, open source toolkits and scripts from MySQL, Percona and the Open Source community that will make the job easier.
While not hands-on, I'll be encouraging questions and this is expected to be a very interactive tutorial!
During this tutorial, attendees will get their hands on virtual machines and migrate standard Master - Slave architectures to the new MySQL InnoDB Cluster (native Group Replication) with minimal downtime. After explaining what Group Replication is and how it works (the magic behind it), we will experiment with multiple use cases to understand MySQL Group Replication. We will also get the attendees more comfortable with this new technology.
During our experiments, we will try to:
- Cleanly stop a node
- Kill a node
- Re-join a node
- Produce conflicts and see how the cluster behaves
- Create data inconsistency
- Recover from full cluster outage
Finally, we will check how we can integrate MySQL InnoDB Cluster with external routing solutions like ProxySQL. We will also see how to use what the cluster exposes to performance_schema and sys schema to make the right choice.
We will highlight the new improvements made in MySQL 8.0 regarding Group Replication and InnoDB Cluster.
Orchestrator is a MySQL topology manager and a failover solution, used in production on many large MySQL installations. It allows for detecting, querying and refactoring complex replication topologies, and provides reliable failure detection and intelligent recovery and promotion.
This practical tutorial focuses on and demonstrates Orchestrator's failure detection and recovery, and provides real-world examples and cookbooks for handling failovers.
- Brief introduction to Orchestrator
- Brief overview of basic configuration
- Reliable detection
- The complexity of successful failover
- Orchestrator's approach to failover
- Failover meta: anti-flapping, acknowledgments, auditing, downtime, promotion rules
- Master service discovery schemes: VIP, DNS, Proxy, Consul
- Cookbooks and considerations for master service discovery and for failover configuration
We will run demos in class. As time allows, the attendees may have time for hands-on operations.
In this hands-on lab, you will learn to troubleshoot and fix common MySQL errors. You will be given a pre-configured EC2 instance to use, so please have an SSH client installed on your laptop. This tutorial is for beginner MySQL database administrators who are comfortable with the command line.
1. Instance Crashes and Hangs
* Basic troubleshooting methods
* Evaluating MySQL and system error messages
* Determining issue causes (OS, filesystem, MySQL configuration, MySQL crashes)
* Recovery methods
2. MySQL Replication
* High-level overview and diagnosing issues
* IO thread and SQL thread issues, relay log corruption, duplicate key errors and data drift
* Recovery methods including checksum and sync
3. Performance Issues and Bottlenecks
* Diagnosing performance issues via OS tools and MySQL utilities
* Determining problem queries and connections
* Reducing contention with MySQL configuration changes and commands
* Troubleshooting memory and NUMA issues
Are you dealing with the challenges of rapid growth? Are you thinking about how to scale your database layer? Should you use NoSQL? Should you shard your relational database? If you are facing these kinds of problems, this tutorial is for you.
Vitess is a database solution for deploying, scaling and managing large clusters of MySQL instances. It's architected to run as effectively in a public or private cloud architecture as it does on dedicated hardware. It combines and extends many important MySQL features with the scalability of a NoSQL database.
By the end of the session, attendees will understand the basics of how to get a Vitess cluster up and running. They will also understand how to perform most of the typical operations you face while running Vitess clusters in a production environment.
TiDB is an open-source distributed HTAP database with both MySQL and Spark SQL interface. TiDB aims to work as an HTAP database that enables real-time business analysis based on live transactional data. Ever since TiDB 1.0 was released in Oct. 2017, hundreds of users entrust TiDB to tackle the scalability problem of MySQL, and to run real-time OLAP queries within a single TiDB cluster deployment with the help of TiSpark, the TiDB connector for Spark. In this talk, I'll talk about the best practices in deploying and operating the TiDB cluster.
High-performance SSDs have been widely applied to database applications. Besides the random access ability provided by the hardware itself, we have made several advances based on our unique architecture to make database applications even faster and easier to manage. In this talk, I will introduce what we have done and how users can benefit from our work.
Getting Percona XtraDB Cluster running is pretty simple these days. But what about after it is running? How do you handle high availability (HA)? How do you handle DDLs? How do you handle backups?
In this short three-hour tutorial, we will cover these aspects and a bit more advanced Percona XtraDB Cluster ideas and practices. We will cover setting up ProxySQL, and using it to handle Percona XtraDB Cluster node failure and recovery.
Table migrations remain a pain point for MySQL DBAs. There are more options than ever for running migrations, with the later versions' in-place alters and new third-party tools (like gh-ost). But with the increase in tools and procedures it's been shown that there is no one-size-fits-all tool. Depending on the table size, available disk space, database traffic, server performance and SLAs, some migration methods make more sense than others.
In this tutorial we will discuss and demonstrate the different tools and methods and the best practices and scenarios for each.
Optional Lab Requirements:
- MacOS or Linux laptop (or VM)
- MySQL Sandbox, Percona Toolkit, gh-ost, sysbench
- MySQL 5.7 generic binary (for MySQL Sandbox)
Migration Concepts and Types
- Straight and in-place ALTER TABLE
- Alter on replicas, then promote
Caveats and Best Practices
- Test each of the migration types in a database cluster
Performance Schema in MySQL is maturing from version to version. It includes extended lock instrumentation, memory usage statistics, new tables for server variables, first time ever instrumentation for user variables, prepared statements and stored routines.
Version 8.0 adds instrumentation for additional variables, replication, error messages and data locks. A lot! Amazing! And complicated!
In this tutorial, we will try all these instruments out. We will provide a test environment and a few typical problems that would be difficult to solve before MySQL 5.7. Just a few examples:
- "Where is memory going?"
- "Why are these queries hanging?"
- "How huge is the overhead of my stored procedures?"
- "Why are queries waiting for metadata locks?"
You will not only learn how to collect and use this information but will gain practical experience with it. You will also learn many details on how to set up Performance Schema.
This tutorial covers all parallel replication implementations in MariaDB 10.0 and 10.1 and MySQL 5.6, 5.7 and 8.0 (including how it works in Group Replication).
MySQL and MariaDB have different types of parallel replication. In this tutorial, we present the different implementations that allow us to understand their limitations and tuning parameters. We cover how to make parallel replication faster and what to avoid for maximizing its benefits. We also present tests from Booking.com workloads.
Some of the subjects that are covered are group commit and optimistic parallel replication in MariaDB, the parallelism interval of MySQL and its Write Set optimization, and the "slowing down the master to speed up the slave" optimization.
After this tutorial, you will know everything you need to implement and tune parallel replication in your environment. But more importantly, we will show how you can test parallel replication benefit in a non-disruptive way before deployment.
This hands-on tutorial covers how to set up monitoring for MySQL database servers using the Percona Monitoring and Management (PMM) platform.
PMM is an open-source collection of tools for managing and monitoring MySQL and MongoDB performance. It provides thorough time-based analysis for database servers to ensure that they work as efficiently as possible.
You will learn about:
- MySQL monitoring best practices
- Metrics and time series
- Data collection, management and visualization tools
- Monitoring deployment
- How to use graphs to spot performance issues
- Query analytics
- Trending and capacity planning
- How to monitor HA
Please bring a laptop with VirtualBox installed.
Laurie Coffin welcomes everyone to Percona Live Open Source Database Conference 2018
As open source database adoption continues to grow in enterprise organizations, the expectations and definitions of what constitutes success continue to change. In today's environment, it's no longer a question of which database to use, but which databases do you need, what platforms will you deploy them on, and how do you get them to work together. A single technology for everything is no longer an option; welcome to the polyglot world.
At Percona, we see a lot of compelling open source projects and trends that we think the community will find interesting. Following Peter's keynote, we will have a round of lightning talks from projects that we think are stellar and deserve to be highlighted.
Automatization of Postgres Administration
Cloud services like Amazon RDS or Google Cloud SQL help to automate half of DBA tasks: launch database instances, provision replicas, create backups. But the other, very important part is almost not automated now: database tuning and query optimization.
High Performance, Scalable, and Available MySQL Clustering System for the Cloud
Vitess is now used in production at multiple companies. Vitess shines in this area by providing query logs, transaction logs, information URLs, and status variables that can feed into a monitoring system like Prometheus.
Ghostferry: the Swiss Army Knife of Live Data Migrations with Minimum Downtime
Inspired by gh-ost, our tool is named Ghostferry and allows application developers at Shopify to migrate data without assistance from DBAs. We plan to open source Ghostferry at the conference so that anyone can migrate their own data with minimal hassle and downtime.
What is a Self-Driving Database Management System?
People are touting the rise of "self-driving" database management systems (DBMSs). But nobody has clearly defined what it means for a DBMS to be self-driving. Thus, in this keynote, Andy provides the history of autonomous databases and what is needed to make a true self-driving DBMS.
The usual keynote revisited explaining the state of MySQL.
Mr Tomas Ulin will talk about the focus, strategy, investments and innovations evolving MySQL to power next generation Web, mobile, Cloud and embedded applications. He will also discuss the latest and the most significant MySQL database release ever in its history, MySQL 8.0.
If you are new to Percona XtraDB Cluster, or haven't heard about it before but would like to meet it, then this is the session for you. We will try to understand:
- What is Percona XtraDB Cluster?
- Is it useful for your use-case?
- What are the important features of Percona XtraDB Cluster (including recent 5.7 features)
This session will cover what makes Percona XtraDB Cluster enterprise-ready and one of the most popular products when it comes to the clustering solution space. I am sure you will fall in love with it.
In this session, Geir will describe the new key features that have already been announced for MySQL 8.0.
In addition to the data dictionary, CTEs and window functions, the session covers:
* Move to utf8(mb4) as MySQL's default character set
* Language specific case insensitive collation for 21 languages (utf8)
* Invisible index
* Descending indexes
* Improve usability of UUID and IPV6 manipulations
* SQL roles
* SET PERSIST for global variable values
* Performance Schema, instrumenting data locks
* Performance Schema, instrumenting error messages
* Improved cost model with histograms
The presentation ends with some words on scalability, plugin infrastructure and GIS.
Twitter has been using their own fork of MySQL for many years. Last year the team decided to migrate to the community version of MySQL 5.7 and abandoned their own version. The road to the community version was full of challenges.
In this session we will present the motivation and how we came to the decision. We will also discuss the challenges and surprises we encountered and how we overcame them. Finally, we will talk about the lessons learned, recommendations and our future plans.
The interaction between applications and the database is one of the most intricate and important in the systems we build. This boundary region is special and complex for a number of reasons. It's the surface that separates stateful from stateless, off-the-shelf from custom, and battle-tested atoms from fast-changing molecules. But more than that, it's not crisply defined. There's no bright line that divides these worlds. Instead, there's a zone where each region is partially enmeshed. Consider schema and indexing design, for example, which is a combination of predefined and user-defined; the query language and its interplay with the planner is another. This is a zone of complexity and richness, where lots of gnarly things sprout, but also where there's a lot of opportunity to learn, so you can design and improve applications better. Baron will share lessons learned from his career of observing how applications and databases interact. You'll leave with insights that will help you see new ways to build better.
Many administrators responsible for databases confront two clashing phenomena:
· Data is coming at increasingly higher rates (from an expanding number of sources)
· The time required to process transactions and analyze data is rapidly shrinking
The most common approaches to address these issues and speed up databases are to deploy new hardware and refactor code. At times, however, these approaches are not viable - particularly in the short term - due to implementation risks, cost, and timelines.
In this session, you will learn:
· How parallelism in the I/O layer impacts performance, particularly in database servers
· How interrupt-based I/O limits throughput in systems with high core count
· The connection between I/O waits and CPU context switches
· The impact of parallelizing I/O on solving these problems
· Cloud-based VMs, storage cost, and database performance
· A software-based alternative to mitigating I/O problems
Designing highly available database systems isn't a new topic. On the contrary, with so many options available to architects now, choosing the best fit isn't always trivial. This talk will walk through various options including standard Multi-AZ MySQL RDS, Aurora for RDS, and Percona XtraDB Cluster in EC2. Comparison points will cover failover processes, accompanying software (primarily ProxySQL), and general use cases.
This is not meant to be a deep dive into any particular design, but rather assist in choosing the proper architecture for a given use case.
This session will be interesting to everyone looking for the latest news about MySQL 8.0 performance:
- MySQL 8.0 is getting closer and closer to GA now
- But what about MySQL 8.0 performance? ;-)
The latest benchmark results obtained with MySQL 8.0 will be the center of the talk because every benchmark workload for MySQL is a "problem to resolve" and each resolved problem is a potential gain in your production!
Many important internal design changes are coming with MySQL 8.0:
- How to bring them in action most efficiently?
- What kind of trade-offs to expect, what is already good, and what is "not yet"?
- How well is MySQL 8.0 able to use the latest HW?
- Could you really speed-up your IO by deploying your data on the latest flash storage?
These and many other questions are answered during this talk, plus proven by benchmark results.
The latest developments and the enticing roadmap show that MySQL Replication is addressing requirements in different areas such as operations, flexibility, elasticity, automation and seamless scale-out. Moreover, new replication features appear not only in MySQL 8 but also in MySQL 5.7, as the list of backports in the release changelogs shows.
Come and join the engineers behind the product to get to know the latest and greatest replication features and how these enable the creation of rock solid, scalable and resilient database services able to keep up with the most demanding work and fault-loads. Take this opportunity to expand your MySQL knowledge and learn more about hot topics such as Group Replication.
For the cloud environment, we want a MySQL cluster to fail over and choose the new master node automatically by itself, without third-party middleware. So we built the Raft protocol into MySQL.
In the MySQL-Raft version, every cluster usually has three nodes, one master and two slaves, but we can support more nodes. When the master node is down, the cluster can choose the new master via the Raft protocol, and use Flashback to roll back committed transactions if needed, to make sure all of the nodes are the same.
At Square we operate several thousand MySQL instances to power a financial network, from payments to payroll. In a word: money. "Mission-critical" isn't critical enough. Come learn how we operate MySQL with billions of dollars at stake. We'll look at everything: configuration, management, monitoring, tooling, security, high-availability, replication, etc.
X-DB is Alibaba's next generation distributed and intelligent database which is ACID compliant, horizontally scalable, globally deployed and highly available. Motivated by the ideas of decoupling compute and storage, and intelligent database, we proposed a hardware/software co-designed architecture for X-DB to pursue extreme performance cost ratio, in order to support the world's largest and still fast-growing e-commerce platform. In this talk we'll introduce the work we have done in X-DB's SQL Engine.
- Plan Cache: a plus to MySQL Engine which boosts QPS by up to 170% on sysbench workloads and 34% on Alibaba's online purchasing system. We'll explain details of how it is created and used to skip heavy index dives at the optimization stage, the efficient cache management, and the performance benefits.
- A state-of-the-art distributed SQL processing framework: the key component of X-DB built on top of MySQL Engine. We'll take a deep dive into the architecture, implementation details and leveraged technologies (i.e. high-performance RPC and scheduling sub-systems). We'll also share the achievements and lessons learnt so far, as well as our roadmap.
In this session, John Jainschigg, Master of the Universe, and Bill Bauman, Head of Innovation and Strategy at Opsview, will review the steps taken to build a simple Kubernetes infrastructure to support serverless workloads.
The infrastructure includes an installation of Percona Server for MySQL for writing form, structured data.
This is relatively high level, focused on the capabilities and possibilities of what can be accomplished using modern technologies including orchestrated containers and serverless infrastructure in your own datacenter. A key point of interest during the discovery phase was, how do we monitor this thing?
We're eager to discuss your ideas, questions and comments, so join us!
The presentation will be a real-life study of how we use PMM for monitoring 120+ MySQL and ProxySQL servers, as well as for query optimisation.
During the project, we found a few caveats that others embarking on this journey should be aware of. We've also found a few "hidden features", where we're able to use PMM in ways beyond the standard interfaces, due to the fact that it's all built on open and battle-tested software.
Keeping data safe is the top responsibility of anyone running a database. Learn how the Google Cloud SQL team protects against data loss. Cloud SQL is Google's fully-managed database service that makes it easy to set up and maintain MySQL databases in the cloud. In this session, we'll dive into Cloud SQL's storage architecture to learn how we check data down to the disk level. We will also discuss MySQL checksums and infrastructure Cloud SQL uses to verify that checksums for data files are accurate without affecting performance of the database.
The next version of MySQL will be a major release of new features and capabilities, including a new data dictionary hosted in InnoDB, new REDO logs design, new UNDO logs, new scheduler, descending indexes, and much more!
Learn all about the changes coming in the next version of InnoDB delivered with MySQL 8.0!
MySQL replication allows you to write on one writer server and easily scale out reads by redirecting reads to reader servers. But how do we guarantee read consistency with their last write? Galera replication can guarantee that, while MySQL Group Replication and standard MySQL async replication cannot.
If you are running MySQL Server or Percona Server, version 5.7 or newer, with GTID enabled, ProxySQL 2.0 is now able to ensure reads are consistent with the last write. ProxySQL is able to stream GTID information from all the reader servers, and in real time is able to determine which reader server(s) can execute the SELECT statement and produce a result set that is consistent with the last write (and GTID) executed by each client.
This presentation will show the technical details that allow you to build an architecture with thousands of ProxySQL instances and MySQL servers, and how GTID information is processed in real-time with limited bandwidth footprint.
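The reader-selection idea described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not ProxySQL's implementation: each reader's executed GTID set is modeled as a dictionary mapping a source server UUID to the highest transaction number applied (assuming no gaps), and the names `parse_gtid` and `consistent_readers` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def parse_gtid(gtid: str) -> Tuple[str, int]:
    """Split a single GTID of the form 'source_uuid:transaction_id'."""
    uuid, _, txn = gtid.rpartition(":")
    return uuid, int(txn)


def consistent_readers(last_write_gtid: Optional[str],
                       readers: Dict[str, Dict[str, int]]) -> List[str]:
    """Return readers that have already applied the client's last write.

    `readers` maps a reader name to its executed position per source UUID
    (simplified: highest applied transaction number, assuming no gaps).
    """
    if last_write_gtid is None:
        # The client has not written anything yet, so any reader will do.
        return sorted(readers)
    uuid, txn = parse_gtid(last_write_gtid)
    return sorted(
        name for name, executed in readers.items()
        if executed.get(uuid, 0) >= txn
    )
```

If no reader qualifies, a router would need a fallback, such as sending the SELECT to the writer; that policy is outside this sketch.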
Accelerating MySQL with Just-In-Time (JIT) compilation is emerging as a quick and easy way to achieve greater efficiencies with MySQL. In this talk, I'll go over the benefits and caveats of using Dynimizer, a binary-to-binary JIT compiler, with MySQL workloads. I'll discuss how to identify situations where JIT compilation can help, how to get set up and running, and go over benchmark results along with other performance metrics. We'll also peek under the hood and take a look at what's happening at a lower level.
There are substantial improvements in the Optimizer in MySQL 8.0. Most noticeably, we have added support for advanced SQL features like common table expressions, windowing functions and grouping() function. We also made DBAs' lives easier with invisible indexes, and additional hints that can be used together with the query rewrite plugin.
On the performance side, cost model changes will make a huge impact. We have made JSON support even more powerful by adding JSON table function, aggregation functions and more. Come and learn about new features in MySQL 8.0!
The full title of this presentation should be: "Save some bandwidth by not transmitting the full resultset metadata over the wire when you don't need it." Indeed, one of the latest features in the MySQL protocol allows you to save some network bandwidth by not sending the metadata with the resultsets for which you know the metadata.
Join this talk to learn how to turn this on, and how much data it saves per query.
Existing tools like mysqldump and replication cannot migrate data between GTID-enabled MySQL and non-GTID-enabled MySQL -- a common configuration across multiple cloud providers that cannot be changed. These tools are also cumbersome to operate and error-prone, thus requiring a DBA's attention for each data migration. We introduced a tool that allows for easy migration of data between MySQL databases with constant downtime on the order of seconds.
Inspired by gh-ost, our tool is named Ghostferry and allows application developers at Shopify to migrate data without assistance from DBAs. It has been used to rebalance sharded data across databases. We plan to open source Ghostferry at the conference so that anyone can migrate their own data with minimal hassle and downtime. Since Ghostferry is written as a library, you can use it to build specialized data movers that move arbitrary subsets of data from one database to another.
This talk is about measuring and reducing noise in benchmark results. Properly tuning the operating system and hardware to achieve stable results in benchmarks becomes an art in itself these days. There may be many reasons for that:
- jitter in CPU and I/O schedulers
- dynamic CPU frequency scaling
- process address space randomization
- kernel configuration
If you are not seeing stable results in your performance comparisons, you are wasting your time. Since I do a lot of MySQL benchmarks as a part of my job, I have collected a number of recipes to measure and reduce system noise and achieve more stable numbers in benchmarks. I'm going to describe those recipes as well as the new sysbench module implemented to automate those tasks and simplify system tuning for other people.
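One simple way to quantify the run-to-run noise discussed above is the coefficient of variation (standard deviation divided by mean) across repeated runs. The Python sketch below is an illustration only, not the sysbench module mentioned in the abstract; the function names and the 2% threshold are assumptions.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(samples: Sequence[float]) -> float:
    """Sample standard deviation relative to the mean of repeated runs."""
    if len(samples) < 2:
        raise ValueError("need at least two benchmark runs")
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("mean of samples is zero")
    return statistics.stdev(samples) / mean


def is_stable(samples: Sequence[float], max_cv: float = 0.02) -> bool:
    """Treat results as stable if the variation stays under max_cv.

    The 2% default is an arbitrary illustrative threshold.
    """
    return coefficient_of_variation(samples) <= max_cv
```

For example, throughput figures of 90, 100 and 110 transactions per second give a coefficient of variation of 0.1, which this check would flag as too noisy for a meaningful comparison.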
Would you like to monitor your MySQL and MongoDB instances but don't know where to start? Come to this talk, where we will explain what PMM (Percona Monitoring and Management) is and how it will bring improved visibility into your database environment!
In this session, we will explore how we managed to scale Percona XtraDB Cluster: what major issues we found and how we fixed them (a technical walkthrough), what added advantages this optimization had on the overall product, and how Percona XtraDB Cluster is now truly an enterprise-ready clustering solution.
Orchestrator uses Raft consensus as of version 3.x. This setup improves the high availability of both the orchestrator service itself as well as that of the managed topologies and allows for easier operations.
This session will briefly introduce Raft consensus concepts, and elaborate on orchestrator's use of Raft: from leader election, through high availability, cross DC deployments and DC fencing mitigation, and lightweight deployments with SQLite.
Of course, nothing comes for free, and we will discuss considerations to using Raft: expected impact, eventual consistency and time-based assumptions.
Orchestrator/Raft is running in production at GitHub, Wix and other large and busy deployments.
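As background for the leader election and high availability points above, Raft requires a strict majority of nodes to elect a leader. The short Python sketch below shows only that arithmetic; it is not orchestrator code, and the function names are hypothetical.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster must have at least one node")
    return cluster_size // 2 + 1


def wins_election(votes: int, cluster_size: int) -> bool:
    """A candidate becomes leader only with votes from a majority."""
    return votes >= quorum_size(cluster_size)


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return cluster_size - quorum_size(cluster_size)
```

This is why odd cluster sizes are the usual choice: three nodes tolerate one failure, five tolerate two, while a fourth node adds no extra tolerance over three.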
Query tuning can be complex. It's often hard to know which knob to turn or button to press to get the biggest performance boost. In this presentation, Janis Griffin, database performance evangelist at SolarWinds, will share her secrets for determining the best approach for tuning queries by utilizing the performance schema (specifically instrumented wait events and thread states), query execution plans, SQL diagramming techniques and more.
Regardless of the complexity of your database or your skill level, this systematic approach will lead you down the correct tuning path with no guessing, saving countless hours of tuning queries and optimizing performance of your MySQL® databases.
- Learn how to effectively use the performance schema to quickly identify bottlenecks and get clues on the best tuning approach
- Quickly identify inefficient operations through review of query execution plans
- Learn how to use SQL diagramming techniques to find the best plan
Laurie Coffin welcomes everyone to Percona Live Open Source Database Conference 2018
How companies build applications and deploy databases has changed drastically over the last 5 years. Enterprises are moving applications and workloads to the cloud in order to take advantage of flexibility, match resource consumption to actual needs and reduce hardware and software expenses. This panel will discuss the rapid changes occurring with databases deployed in the cloud and what that means for the future of databases, management and monitoring and the role of the DBA and developer.
It's obvious that macro trends such as cloud computing, microservices, containerization, and serverless applications are fundamentally changing how we architect, build, deploy, and operate modern applications. We've already seen how these changes have affected our data platforms dramatically over the past few years. Where is this going? Are we about to see the total obsolescence of the basic administration things we do today, like backups and upgrades? What about schema design, query optimization, and indexing? Will there even BE a database as we know it in ten years? And what role will open source and free software play? Bring your bitpens and write Baron's predictions into the blockchain, because one thing's sure: he's going to say a lot of things that will be proven wrong.
MySQL is the backbone of Slack's data storage infrastructure, handling billions of queries per day across thousands of sharded database hosts. We are in the midst of migrating this system to use Vitess' flexible sharding and topology management instead of simple application-based shard routing and manual administration. This effort aims to provide an architecture that scales to meet the growing demands of our largest customers and features while under pressure to maintain a stable and performant service.
This talk will present the core motivations behind our decision, why Vitess won out as the best option, and how we laid the groundwork for the migration within our development teams. We will then present some challenges and surprises (both good and bad) found during our transition and our contributions to the Vitess project that mitigated them. Finally, we will discuss the future plans for our migration and suggest improvements to the Vitess ecosystem to aid other adoption efforts.
The presentation will discuss some of the best practices in determining whether to put your MySQL instances on Amazon RDS or Amazon Aurora, or just leave them on-premises. The session will go into detail on the pros and cons of each platform, such as performance, versioning, limitations and more. After this session, you will be equipped with
The database team at GitHub is tasked with keeping the data available and with maintaining its integrity. Our infrastructure automates away much of our operation, but automation requires trust, and trust is gained by testing. This session highlights three examples of infrastructure testing automation that help us sleep better at night:
- Backups: scheduling backups; making backup data accessible to our engineers; auto-restores and backup validation. What metrics and alerts we have in place.
- Failovers: how we continuously test our failover mechanism, orchestrator. How we set up a failover scenario, what defines a successful failover, how we automate away the cleanup. What we do in production.
- Schema migrations: how we ensure that gh-ost, our schema migration tool, which keeps rewriting our (and your!) data, does the right thing. How we test new branches in production without putting production data at risk.
Since the beginning, Facebook has used a conventional username/password to secure access to production MySQL instances. Over the last few years we've been working on moving to x509 TLS client certificate authenticated connections. Given the many types of languages and systems at Facebook that use MySQL in some way - this required a massive amount of changes for a lot of teams.
This talk is part technical overview of how our new solution works and part hard-learned tricks for getting an entire company to change their underlying MySQL client libraries.
Starting with MySQL 5.7 a new Document Store feature has been introduced that makes working with JSON documents an integral part of the MySQL experience. The new X DevAPI gives MySQL users the best of both worlds - SQL and NoSQL - and allows an entirely new category of use cases for managing data. It is constantly evolving based on the community feedback and can be run on top of the brand new MySQL InnoDB Cluster feature. This session gives a broad, high level introduction as to what the Document Store is about, its client components, the latest developments, what you can do with it, why you'd want to and how. MySQL 8.0 as Document Store will change the way people use MySQL.
MariaDB has made it easy to switch from MySQL to MariaDB by aiming to be a drop-in replacement. MySQL doesn't make switching back nearly as easy, however. This talk will walk you through the basics of moving from MariaDB to MySQL and back, the best practices, and the problems you will encounter along the way.
We recently finished migrating from InnoDB to MyRocks in our user database (UDB) at Facebook. We have been running MyRocks in production for a while and we have learned several lessons. In this talk, I will share several interesting lessons learned from production deployment and operations, and will introduce future MyRocks development roadmaps.
Vitess is now used in production at multiple companies. This has led to many inquiries about Observability. Vitess shines in this area by providing query logs, transaction logs, information URLs, and status variables that can feed into a monitoring system like Prometheus.
This session will cover these features, along with a demonstration on how they can be used to troubleshoot production issues.
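As a rough illustration of feeding status variables into a monitoring system, the sketch below renders a flat JSON document of numeric counters in the Prometheus text exposition format. It is a minimal sketch under stated assumptions: the variable names (`QueryCount`, `ErrorCount`, `BuildHost`), the `demo` prefix, and the function `to_prometheus_text` are all hypothetical and are not taken from Vitess itself.

```python
import json
import re


def to_prometheus_text(status_json: str, prefix: str = "demo") -> str:
    """Render flat numeric status variables in the Prometheus text format.

    Non-numeric and nested values are skipped to keep the sketch small.
    """
    variables = json.loads(status_json)
    lines = []
    for name in sorted(variables):
        value = variables[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        # Replace characters that are not valid in a metric name.
        metric = re.sub(r"[^a-zA-Z0-9_]", "_", f"{prefix}_{name}").lower()
        lines.append(f"{metric} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample; real status endpoints expose many more variables.
    sample = '{"QueryCount": 1200, "ErrorCount": 3, "BuildHost": "db1"}'
    print(to_prometheus_text(sample), end="")
```

A real deployment would more likely rely on an existing exporter or a native metrics endpoint; the point here is only that status variables map naturally onto one metric line each.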
POLARDB provides read scale-out on a shared-everything architecture. It features 100% backward compatibility with MySQL 5.6 and the ability to expand the capacity of a single database to over 100TB. Users can expand the computing engine and storage capability in just a matter of seconds! POLARDB offers a 6x performance improvement over MySQL 5.6 and a significant drop in costs compared to other commercial databases.
POLARDB leverages InnoDB's redo logs for physical replication. InnoDB stores physical page level operations in redo logs for crash recovery. POLARDB extends this functionality to deploy multiple read replicas for read load sharing.
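The idea of physical replication from page-level redo records can be pictured with a toy model. The sketch below is not POLARDB's or InnoDB's actual record format or apply logic; `RedoRecord`, `Replica`, and the 16-byte page size are hypothetical names and values chosen for illustration. It shows a replica applying records in log sequence number (LSN) order and ignoring records it has already applied, so that replaying the same log twice is harmless.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RedoRecord:
    lsn: int        # log sequence number, strictly increasing
    page_id: int    # page the change applies to
    offset: int     # byte offset within the page
    data: bytes     # bytes to write at that offset


@dataclass
class Replica:
    page_size: int = 16   # tiny pages, purely for illustration
    applied_lsn: int = 0  # highest LSN applied so far
    pages: dict = field(default_factory=dict)

    def apply(self, records):
        """Apply records in LSN order, skipping ones already applied."""
        for rec in sorted(records, key=lambda r: r.lsn):
            if rec.lsn <= self.applied_lsn:
                continue  # already applied; replay must be a no-op
            page = self.pages.setdefault(rec.page_id, bytearray(self.page_size))
            page[rec.offset:rec.offset + len(rec.data)] = rec.data
            self.applied_lsn = rec.lsn


if __name__ == "__main__":
    log = [RedoRecord(1, 7, 0, b"ab"), RedoRecord(2, 7, 2, b"cd")]
    replica = Replica()
    replica.apply(log)
    replica.apply(log)  # second pass changes nothing
    print(bytes(replica.pages[7][:4]), replica.applied_lsn)
```

Tracking the applied LSN is what lets a read replica resume from where it stopped; a real engine also has to deal with buffer pool state, MVCC read views, and DDL, which this sketch ignores.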
In this talk we'll take a deep dive into InnoDB internals and explain the changes we made to the core InnoDB code. We'll touch upon design issues around logging, crash recovery, buffer pool management, MVCC, DDL synchronization etc.
This talk will be mostly about the core internals of InnoDB. Some basic knowledge of internals like redo logs, undo logs, read view (transaction isolation), purge and buffer pool management will be very helpful.
Kubernetes is the most popular container orchestrator and is enabling enterprises to rapidly containerize their application stacks. Kubernetes' adoption still faces many challenges, particularly when it comes to stateful applications.
The engineers at Kasten have open sourced Kanister to allow ops teams to incorporate their existing tools into Kubernetes. Kanister is a framework for domain experts to write blueprints specifying how to perform data management in Kubernetes. Each blueprint is specific to a data service, like MySQL and can be modified to integrate with your infrastructure. The talk will conclude with demos of backup and restore of MySQL and MongoDB using example blueprints included with Kanister.
This talk will be targeted towards anyone interested in running stateful applications in Kubernetes. The audience will learn why the current primitives exposed by Kubernetes aren't sufficient for data operations and how Kanister fills in the gaps.
Like any modern DBA team, we lean more and more towards development in our daily activities. We write more software and do fewer routine tasks.
On the way to fully automating our development workflow we faced many problems. This session covers our solutions to them:
* Git flow adaptation for highly restrictive compliance requirements;
* Unit testing. What to mock and how;
* Surviving dependency hell;
* Packaging Python code.
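To make the "what to mock" question concrete, here is a minimal sketch that mocks a DB-API style connection with Python's standard `unittest.mock`, so the code under test runs without a database server. The function `count_rows` and the table name `users` are hypothetical examples and are not taken from the talk.

```python
from unittest import mock


def count_rows(conn, table):
    """Return the row count of `table` using a DB-API style connection.

    The table name is interpolated into the SQL text, so it must come
    from trusted code and never from user input.
    """
    cursor = conn.cursor()
    try:
        cursor.execute(f"SELECT COUNT(*) FROM `{table}`")
        (count,) = cursor.fetchone()
        return count
    finally:
        cursor.close()


def test_count_rows():
    # Mock the connection boundary; no MySQL server is needed.
    conn = mock.Mock()
    cursor = conn.cursor.return_value
    cursor.fetchone.return_value = (42,)

    assert count_rows(conn, "users") == 42
    cursor.execute.assert_called_once_with("SELECT COUNT(*) FROM `users`")
    cursor.close.assert_called_once_with()


if __name__ == "__main__":
    test_count_rows()
```

Mocking at the connection boundary keeps the test fast and deterministic, but it only checks that the expected SQL text was issued; whether that SQL is valid still has to be verified against a real server in integration tests.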
In this talk we will review the new functionality released by Amazon Web Services that allows us to import data from our non-RDS MySQL instances into RDS instances (MySQL or Aurora). We'll see what works, what doesn't, and how to do it.
At the end of 2016, Oracle released a new plugin called MySQL Group Replication, a new MySQL replication method that aims to provide better high availability and built-in failover with consistency guarantees.
I evaluated the initial GA versions back in early 2017. I presented my initial findings, along with several best practices and concerns about the implementation at the time, which led me to state that Group Replication was not quite ready yet.
(Un)lucky as I was, a large part of the attendees were Oracle developers, and in the months after this, many of these bugs were fixed and missing features implemented in MySQL 8.0 as well as backported to MySQL 5.7. (Thank you!)
This is a follow-up presentation to my previous analysis, in which I will look into the changes since then, re-evaluate the readiness of Group Replication for production usage, and provide my insights and opinion on the state of GR.
How could Amazon Migration Service work in your environment to migrate away from that proprietary colossus?
How does it work?
Why would you use this tool, and why would you avoid it?
What components does it have?
How does it perform?
These are all questions that will be answered during this talk. My goal is to provide you with an overview of its functionality and to explain my findings.