Welcome to Percona Live!
It has been an exciting year in the open source database industry, with more choice, more cloud, and key changes in the industry. In his keynote address, Peter Zaitsev of Percona will dive into the key developments over the last year, including the most important open source database software releases, the significance of cloud-native solutions in a multi-vendor multi-cloud world, the new criticality of security challenges, the evolution of the open source software industry, and interesting new results from a new open source database survey.
In this keynote, Yoshinori will share interesting lessons learned from Facebook's production deployment and operations of MyRocks and future MyRocks development roadmaps.
MariaDB Server has come a long way since it forked MySQL in 2009. Nine releases later, it is now the default in most major Linux distributions and is no longer a mere alternative to MySQL, but also to Oracle, through its compatibility mode with support for PL/SQL and other non-MySQL functionality. MariaDB Foundation has also matured since it was established in 2012. With board members from MariaDB Corporation, Visma, Booking.com, IBM, Automattic, Tencent, Alibaba and Microsoft, MariaDB Foundation has created a vibrant community of contributors, with the number of submitted pull requests being four times that of MySQL over their respective lifetimes. Join MariaDB Foundation Senior Team Lead Vicențiu Ciorbaru for this keynote, where he compares the mission of MariaDB Foundation with its achievements and highlights the latest contributions and functionality in its freshest stable release, 10.4.
TiDB is a popular open source distributed NewSQL database. It speaks the MySQL protocol and the majority of its syntax, so to your applications it looks like MySQL 5.7. Since its birth four years ago, TiDB has been used by more than 300 companies in scenarios including core banking systems, high-concurrency internet services, and real-time analysis. In this talk, I will introduce the current status of the TiDB community, the core features of TiDB 3.0, and the roadmap to make TiDB a true distributed HTAP database.
Customer obsession is one of the key leadership principles at Amazon. We innovate on our customers' behalf and focus on solving their problems. Over the years, customer usage and dependencies on open source technologies have been steadily increasing; this is why we've long been committed to open source, and our pace of contributions to open source projects continues to accelerate. Attend this talk to learn about the motivation for open source at AWS and what drives our active participation in open source communities.
As database solutions that are dedicated to solving the scale-out pain points in MySQL, Amazon Aurora MySQL and TiDB share a lot in common. They can both scale out your database storage capacity horizontally and their compatibility with standard MySQL enables you to re-use your existing code, applications, drivers and tools without changing a single line of code, in most cases.
This session covers more than the similarities. You will take away a comparison of the performance benchmarking results, the design and architecture, and the different use cases of Amazon Aurora MySQL and TiDB. Most importantly, Ed will demonstrate how you can combine the two to empower Amazon Aurora MySQL users with a Hybrid Transactional and Analytical Processing (HTAP) database.
In an era where threats and challenges seemingly abound, protecting your big data and open source workloads is no longer optional... It is vital to the success of your business. Join Veritas for this information-packed session, where we will discuss the why, how and business impacts of protecting today's business-critical big data and open source workloads. We will also discuss how to move beyond basic replication to a true data protection and application recovery strategy.
When you want to process your application logs, Elasticsearch bubbles up as the go-to technology. Elasticsearch is a popular, open source distributed search and analytics package. The Elasticsearch stack adds a usability layer to search, analyze and process your Apache Lucene data. Open Distro for Elasticsearch is a full open source package designed by AWS to enhance and protect the open source capabilities of the base Elasticsearch engine. This distribution bundles critical open source components including security, cluster diagnostics, alerting and SQL capabilities for Elasticsearch. My talk will guide you through Open Distro features and build tools. I will also cover the project's community driven approach to building a great open source search stack where you can join in and collaborate.
MySQL is the world's most popular open source database and Kubernetes is the most popular and rapidly-developing project currently. The purpose of this talk is to explain and demonstrate how running a complex stateful application such as a database is made easier using Kubernetes and that there are a number of options available.
The MySQL deployment patterns covered start with how to run MySQL with a single command using a Helm chart, move on to how an asynchronously replicated master/slave MySQL pattern works on Kubernetes, and continue with a detailed discussion of how several different MySQL operators can be used. A demonstration will showcase the Oracle MySQL Operator, which uses group replication and the MySQL Router and makes creating MySQL clusters, backups, and restorations trivial.
Last to be covered is Vitess, which is used for horizontal scaling of MySQL and has numerous benefits such as built-in sharding and shard management, connection pooling, and query sanitization.
It's easy for modern, distributed, high-scale applications to hide database performance and efficiency problems. Optimizing performance of such complex systems at scale requires some skill, but more importantly, it requires a sound strategy and good observability, because you can't optimize what you can't measure. This session explains a performance measurement and optimization process anyone can use to deliver results predictably, optimizing customer experience while freeing up computing resources and saving money.
The session begins with what to measure and how; how to analyze it; how to categorize problems into one of three types; and three matching strategies to use in optimization as a result. It is a recursive method that can be used at any scale, from a data center with many types of databases cooperating as one, to a single server and drilling down to a single query. Along the way, we'll discuss related concepts such as internally- and externally-focused golden signals of performance and resource sufficiency, workload quality of service, and more.
Facebook has been one of the largest MySQL users and contributors. Facebook has a MySQL Production Engineering team and a MySQL Engineering team. We worked together to add the features we need and improved quality so that we could run reliably in production. In this session, the speaker will introduce how we evolved MySQL for our workloads and modern hardware, including enhanced replication, a space-efficient storage engine, and multi-tenancy.
In this session, we will review key elements to take into account when migrating MySQL into the cloud. We will share our experience of working with many different customers across the globe and describe the most effective procedures.
- IaaS vs DBaaS
- Migrating data
- Replication between On-Premises and Cloud
- Testing Cloud environments
- Load Balancers
- High Availability
MySQL 8.0 is a major release of new features and capabilities, including a new data dictionary hosted in InnoDB, new REDO logs design, new UNDO logs, a new scheduler, descending indexes, and much more!
Learn all about the changes in InnoDB delivered with MySQL 8.0 and how they affect the performance and the manageability of your database!
Running an analytical (OLAP) workload on top of MySQL can be slow and painful. A specifically designed storage format ("Column Store") can significantly improve the performance of analytical queries. There are a number of open source column store databases around. In this talk, I will focus on two of them that support the MySQL protocol: MariaDB ColumnStore and ClickHouse.
I will show some realtime benchmarks and use cases, and demonstrate how MariaDB ColumnStore and ClickHouse can be used for typical OLAP queries. I will also do a quick demo.
Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. In this tech talk, we will introduce Amazon DocumentDB and its unique architecture that makes it easy to run, manage and scale MongoDB workloads in the cloud. Amazon DocumentDB is designed from the ground up and uses a unique, distributed, fault-tolerant, self-healing storage system that automatically scales storage. Additionally, with Amazon DocumentDB's architecture, storage and compute are decoupled, allowing each to scale independently. Through common use cases, we'll discuss why this architecture helps developers respond faster to their business needs. We'll also talk about how developers can use the same MongoDB application code, drivers, and tools as they do today to run, manage, and scale workloads on Amazon DocumentDB without having to worry about managing the underlying infrastructure.
Percona XtraDB Cluster 8.0 (PXC-8.0) is the latest addition to the PXC family.
Starting with MySQL/PS-8.0, upstream has made many significant changes, including atomic DDL, replication channels, locking algorithm changes, security/encryption, and more.
During this session, we will explore how these changes affect PXC-8.0, what has changed in PXC-8.0, new and deprecated features, important bugs and more. If you are already a PXC user or are considering becoming one, attend this session to find out more about PXC-8.0.
Building a robust and reliable distributed database is not easy. In TiDB, to ensure our users' data is always safe and the system is always stable, we use Chaos Engineering to help uncover system-level weaknesses before they appear in production environment.
In this talk, I will cover the different types of fault injection techniques our team uses to test TiDB, how we build our own Chaos Engineering platform, and how we integrate various Chaos Engineering techniques into our own automated testing framework, called Schrodinger, to support continuous testing of TiDB.
It's 2019 and there are so many choices for storing your data. There are old players on the market and there are some new kids on the block. Making the wrong choice for your database can effectively sink your product, project, and even your reputation! Should you go with SQL or NoSQL or Document database or something in between? What about polyglot persistence? Reactive, event-driven, non-blocking, async applications? What about the language bindings? What about support? Performance, Tuning, Tooling, Monitoring, Observability, Upgrades, Rollbacks, Migrations, Search & Indexing, Analytics, Availability, Durability, ACID or BASE? Building, running & maintaining storage infrastructure is non-trivial. We will look at the state of databases in 2019 and try to answer some of these questions.
PostgreSQL is undoubtedly the second most popular open source RDBMS and it is within the top five most popular DB engines as per db-engines.com. Why not take a rest from your favorite database server for a while, start learning some more about it, and aim for another logo on your resume?
In this session, we are going to explore this RDBMS, compare it with its arch-rival, MySQL, to both correlate concepts and understand where Postgres excels. Among the topics we will cover:
* Server architecture
* Replication and HA
* Postgres specific features
* and more!
You can have fully automated high availability PostgreSQL on your Kubernetes cluster ... today. The Patroni system for automating PostgreSQL deployment, failover, and migration is ready to use and in production in several places. In this live demo session, we will show you how you can make use of this technology.
We will run through setting up PostgreSQL clusters, both using basic Patroni and using a PostgreSQL Operator. We will then demonstrate failover and disaster recovery, go over some basic configuration options, show how security and authentication work, and explore some plugins and options. After that, you'll learn about the current state of the Patroni project as well as what the alternatives are.
If you need to administer more than one PostgreSQL replication cluster, you'll want to see how Patroni and Operators can make your daily DBA headaches and 2am wakeups go away.
Are you a user of the world's most popular Cloud provider, and someone who leverages their Database As A Service platforms of RDS or Aurora? Come to this talk to hear how PMM can provide rich visibility of these platforms for MySQL and PostgreSQL. We'll cover:
* Connecting PMM Server to RDS and Aurora for MySQL and PostgreSQL metrics activity
* Accessing the AWS CloudWatch API for fundamental resource consumption monitoring (CPU, Disk, Memory)
* Using PMM's Query Analytics against RDS MySQL or Aurora MySQL
* Section1: Terminology. Quick intro to shard/replicaset/instance/host and the scale of FB
* Section2: Lifecycle of an instance
* Section3: Tools that are used for moving instance between states: MPS Copy, rebuild_db etc
* Section4: Shard movements(OLM) and touch base on balancing
* Section5: Test Infrastructure: Shadow and Merlindb
We at Percona are constantly looking at how we can help our customers to have the right solution at the right time. We are constantly looking to identify the right tool for the job.
In our constant research, we identified that migrating from Oracle, MS SQL or any other closed source data platform is becoming more and more relevant for large organizations.
The maturity reached by open source solutions and their wide adoption in medium-size companies and for new projects, in conjunction with the agility they offer to adapt platforms to newer and more modern needs, have finally led large organizations and enterprises to understand that "It can be done".
Still, the journey from closed source to open source is not always a walk in the park.
There are many factors that must be considered and analyzed to migrate successfully.
There are simple situations, where a simple data migration with a few schema adjustments is enough, and much more complex scenarios, where the whole logic must be reorganized in order to migrate.
Finally, there are situations where migrating is simply not possible or valuable.
Having a clear path that can help you to assess what is what, and how much effort is needed for each case, is gold. Literally, because it will allow you to focus on what makes sense and to commit the right effort and resources.
The scope of this presentation is to illustrate how we at Percona perform that assessment, and the methodology we use to help our customers decide whether and how to migrate.
The evolution of MySQL replication shows that a lot of effort has been put into reducing operations cost and minimizing administration overhead. Thus MySQL DBAs and DevOps can spend more time expanding their infrastructure, rather than catering for it.
Many different areas have been enhanced, for example security, operations, fail-over, observability, failure detection, consistency, split-brain protection and primary election, flexible replication workflows and more.
This session highlights the new replication features in MySQL 8.0, both those released before GA and those released after it. Come and learn, directly from the engineers, how the new features help you operate, sustain, and extend your MySQL Replication infrastructure.
MariaDB 10.4 will come with new Galera Replication version 4. This presentation will outline the new features of Galera 4 Replication as present in MariaDB 10.4 and share the early user experiences with it.
Galera is a generic replication plugin that makes it possible to deploy synchronous multi-master cluster topologies with database servers supporting the Galera Replication plugin API (i.e. write set replication, wsrep API). Currently both MySQL and MariaDB servers have Galera Replication support, and today there are thousands of MySQL and MariaDB based cluster installations around the world, processing production system loads in bare-metal or cloud deployments.
With Galera 4, the MariaDB 10.4 cluster further extends the capabilities of synchronous Galera replication. The most prominent feature in Galera 4 is streaming replication technology, which implements distributed transaction processing within the cluster. With streaming replication, a transaction can be launched to execute in all cluster nodes in parallel. As a result, a large transaction can be executed in small fragments throughout the transaction lifetime, and the cluster will not choke on the replication of one large transaction write set, as happened in earlier Galera Cluster versions.
Streaming replication works as a foundation for many more features to be released in the short term. For example, XA transaction support will now be possible thanks to streaming replication technology.
The changes that MongoDB is bringing to the world in 4.0 are tempting to many companies and administrators, but if you are on an older version such as 3.2, upgrading with minimum disruption can be challenging.
Some of the problems that a company might face are that older language driver versions may not be compatible with newer versions of MongoDB, the upgrade path might change, and there is a lot of documentation to read before you embark on the upgrade path.
In 3.2, the config servers could be either mirrored or replica sets; in 3.4, the config servers must be replica sets. Upgrading without interruption can be difficult. With so many things to read about, it's always good to have a reference that gives you an understanding of the things you should consider: a form of cheat sheet.
In this 50-minute session, I will present the steps needed to move from 3.2 to 3.4, including some tips for upgrading the config servers to replica sets with minimal disruption, and then from 3.4 to 3.6 and from 3.6 to 4.0, again with minimal disruption.
The session will also mention best practices that will apply when running any version upgrade.
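The hop-by-hop upgrade described above can be sketched as a small planning helper. This is only an illustration: the release list and the config server rule come from the abstract, while the function names and the wording of the checks are hypothetical, and nothing here connects to a MongoDB deployment.

```python
# Ordered major releases named in the abstract; each upgrade moves one hop.
RELEASES = ["3.2", "3.4", "3.6", "4.0"]


def upgrade_path(current, target):
    """Return the (from, to) hops needed to go from current to target."""
    start = RELEASES.index(current)
    end = RELEASES.index(target)
    if start > end:
        raise ValueError("downgrades are not covered by this sketch")
    return [(RELEASES[i], RELEASES[i + 1]) for i in range(start, end)]


def pre_checks(hop):
    """List checks to run before a hop (hypothetical wording)."""
    checks = ["confirm driver compatibility with " + hop[1]]
    if hop == ("3.2", "3.4"):
        # From the abstract: 3.4 requires config servers to be replica sets.
        checks.append("convert mirrored config servers to a replica set")
    return checks


if __name__ == "__main__":
    for hop in upgrade_path("3.2", "4.0"):
        print(hop, pre_checks(hop))
```

Keeping the plan as data makes it easy to review each hop and its checks before touching production.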
ProxySQL, the high performance, high availability, protocol-aware proxy for MySQL is now GA in version 2.0.
This version introduces several new features, like causal reads using GTID, better support for AWS Aurora, native support for Galera Cluster, LDAP authentication and SSL for client connections.
This session provides an overview of the most important new features.
The panel discussion will feature topics such as views on cloud providers strip-mining open source and how recent licence changes at MongoDB, Redis, Confluent and others impact the community. It will also cover Kubernetes and containerization and how they will impact the database ecosystem.
Countless data breaches and massive outages have affected all cloud providers and several large companies, and most are caused by human error or configuration issues. How do we harden our environments and guard against these? Finally, our panellists will share the most exciting technologies they see on the horizon.
Amazon Relational Database Service (RDS) is a fully managed relational database service that enables you to launch an optimally configured, secure, and highly available database with just a few clicks. It manages time-consuming database administration tasks, freeing you to focus on your applications and business. We review the latest available features and capabilities.
Join us for the MySQL Community Awards presented by Emily Slocombe!
Benchmarking a distributed database is no easy task. Frank will share some best practices, tricks of the trade, and general guidelines on how to benchmark distributed databases on Intel Optane, using TiDB as an example, to generate valuable results and worthwhile insights. He will also talk about free Intel tools that are available to help developers do effective benchmarking.
MailChimp has grown from a small company to serving millions of micro-businesses, in addition to SMBs and Enterprises. We have a fairly pedestrian approach to MySQL, but we now run hundreds and perhaps soon to be thousands of MySQL instances. Our present state is thanks to great full-stack engineering teamwork. This is a glimpse into what makes Mailchimp tick. Hint, it's not just technology. This talk is about "Momentum" and "Pragmatism", and creating a great outcome for our mission of empowering the underdog.
With the advent of the Health Insurance Portability and Accountability Act (HIPAA) of 1996 all entities that handle health information are required by law to secure all data which contains personally identifiable information (PII) and private health information (PHI). Fines for leaking this data can range from $100 to $50,000 per leaked record. A data breach or leak is extremely costly for both the patients as well as the companies that are entrusted with their PHI. In our presentation, we introduce Gonymizer, a tool that is written in Go at SmithRx to handle the anonymization of PHI and PII data from our production database instances.
This data is anonymized and loaded into non-production environments to allow us to use representative data to develop and test against. This makes anonymization of sensitive information quick and simple using a simple column map that is defined in a single JSON file for your dataset. There is a selection of custom processors that we have built to handle basic tasks, such as first and last name anonymization, changing data to fake locations such as street addresses, cities, zips, and states. The interface for building processors is also completely extendable and anyone with basic Go experience should be able to build processors that can anonymize your data efficiently. We will also show how this tool decreases our development time for new features as well as simplifying testing in a compliant environment with non-sensitive data sets (HIPAA, PCI, etc).
Toward the end of our presentation, we will be discussing how we built our infrastructure using Docker to containerize Gonymizer and schedule anonymization and loading of our test environments using Kubernetes. This talk is targeted for anyone working in the healthcare space where collected data contains PHI and/or PII and is regulated by HIPAA.
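To make the column map idea above concrete, here is a minimal sketch in Python. It is not Gonymizer's code or its JSON schema: the map layout, the processor names, and the sample row are all hypothetical, chosen only to show how a single JSON file can drive per-column anonymization.

```python
import hashlib
import json

# Hypothetical column map: table -> column -> processor name.
COLUMN_MAP_JSON = """
{
  "patients": {
    "id": "identity",
    "first_name": "fake_first_name",
    "zip": "fake_zip"
  }
}
"""

FIRST_NAMES = ["Alex", "Sam", "Jordan", "Taylor"]


def _bucket(value, size):
    """Map a value to a stable index so equal inputs anonymize identically."""
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return int(digest, 16) % size


# Hypothetical processors, keyed by the names used in the column map.
PROCESSORS = {
    "identity": lambda v: v,
    "fake_first_name": lambda v: FIRST_NAMES[_bucket(v, len(FIRST_NAMES))],
    "fake_zip": lambda v: "%05d" % _bucket(v, 100000),
}


def anonymize_row(table, row, column_map):
    """Apply the mapped processor to each column of one row."""
    rules = column_map[table]
    # Columns without a rule are dropped rather than copied, to fail safe.
    return {col: PROCESSORS[rules[col]](val)
            for col, val in row.items() if col in rules}


if __name__ == "__main__":
    column_map = json.loads(COLUMN_MAP_JSON)
    row = {"id": 7, "first_name": "Maria", "zip": "94105", "ssn": "000-00-0000"}
    print(anonymize_row("patients", row, column_map))
```

Hashing the input keeps the output deterministic, so the same source value maps to the same fake value across tables. Whether that property is wanted depends on the dataset.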
Vitess has continued to evolve into a massively scalable sharded solution for the cloud. It is now used for storing core business data for companies like Slack, Square, JD.com, and many others.
This session will cover the high-level features of Vitess with a focus on what makes it cloud-native.
We'll conclude with a demo of the powerful materialized views feature that most sharded systems have yet to solve.
PMM 2.0 represents a significant advance in terms of monitoring for Open Source Databases. Come to this session in order to learn about the following:
* Query Analytics improvements
* Architectural changes
* API and GUI configuration
The MySQL replication and HA suite of tools and technologies ensures that we have efficient and safe replication at Facebook scale. In spite of different kinds of process and system failures, MySQL continues to safely replicate trillions of transactions a year. The MySQL Replication/HA stack at Facebook, which delivers this scale, involves a Facebook-enhanced Semi-Sync replication plugin, Binlog Server, and a high-availability suite of tools called DBStatus, Logtailer and FastFailover. Come learn about these exciting technologies and how we handle these challenges of scale.
In the first portion of this two-part talk, we will focus on replication topologies, semi-synchronous replication, 2-PC with Engine, Binlog transmittal, Binlog Server and specific MySQL server features that we developed at Facebook to improve replication throughput and HA.
Introductory-level session for database administrators interested in taking their first steps in migrating an existing Oracle database to PostgreSQL.
- What are the steps?
- What are the major challenges?
- Which tools should we use?
The session will focus on the basic steps, procedures and tools every DBA should use for migrating Oracle databases to PostgreSQL and will cover: source database migration assessment, schema conversion, data replication and performance tuning.
1. Overview of the five steps for database migrations: assessment and migration planning, schema conversion, data migration, application conversion.
3. Why planning and assessment is important for successful migration execution.
4. Key take-away insights from planning your migration that will impact implementation.
5. Using migVisor for the migration assessment.
6. Using Ora2PG for schema conversion.
7. Data replication: concepts, best practices and available tools.
This session will be interesting to everyone looking for the latest news about MySQL 8.0 Performance:
- since MySQL 8.0 we moved to "continuous release" model
- so with every update, many new improvements are delivered
- but how does it also improve MySQL 8.0 Performance ? ;-)
- the latest benchmark results obtained with MySQL 8.0 will be in the center of the talk
- because every benchmark workload for MySQL is a "problem to resolve"
- and each resolved problem is a potential gain in your production!
- many important internal design changes are coming with MySQL 8.0
- how to bring them all in action most efficiently?
- what kind of trade-offs to expect, what is already good, and what is "not yet"?
- how well MySQL 8.0 is able to use the latest HW?
- could you really speed-up your IO by deploying your data on the latest flash storage?
- these and many other questions are answered during this talk + proven by benchmark results..
- and as usual, some surprises to expect ;-))
Open source software is not the new kid on the block anymore! But it takes some time and effort to take these independent software components and bundle them into working software that meets your organization's needs. At Walmart Labs, we are having fun automating the full lifecycle of a database using MariaDB in private and public clouds. In this talk, we will go over how we are building a fully automated database platform and the lessons learned from running these distributed systems in a cloud environment at scale.
If you are interested in open source technologies and curious to know how we manage 6000 (and growing!) computes in the cloud with a small team - then come join us for a fun-filled tech talk!
Either because of a new feature, a bug, or just for archival purposes, it is often necessary to update or remove large amounts of documents in production.
The challenge with this type of operation is not only to design an efficient process query-wise, but to be able to execute it in production without debilitating the servers or causing secondaries to lag.
There are strategies that can be used to create highly controlled write processes that could run for days under the radar, getting the job done without greatly impacting your application's performance.
In this session, I'm going to share with you key points to consider when creating massive write operations in MongoDB, examples of real-life processes executed, and a few lessons learned.
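One common way to build the kind of highly controlled write process described above is to split the work into small batches, pause between them, and back off while secondaries lag. The sketch below assumes that approach; it is not the speaker's process. `apply_batch` and `lag_seconds` are hypothetical hooks supplied by the caller, and the default sizes and pauses are placeholders.

```python
import time


def run_in_batches(ids, apply_batch, batch_size=1000, pause_seconds=0.5,
                   lag_seconds=None, max_lag_seconds=10, sleep=time.sleep):
    """Apply a write to ids in small batches, pausing between batches.

    apply_batch(batch) performs the write for one batch and returns the
    number of documents it changed. lag_seconds() optionally reports the
    current secondary lag in seconds.
    """
    total = 0
    for start in range(0, len(ids), batch_size):
        # Back off while secondaries are behind, so the job never piles on.
        while lag_seconds is not None and lag_seconds() > max_lag_seconds:
            sleep(pause_seconds)
        total += apply_batch(ids[start:start + batch_size])
        # Fixed pause after every batch keeps the write rate bounded.
        sleep(pause_seconds)
    return total
```

With PyMongo, `apply_batch` could for instance wrap `collection.delete_many({"_id": {"$in": batch}}).deleted_count`. How to measure replication lag is left to the caller, since it depends on the deployment.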
Running MySQL in the cloud isn't magic - it can be brilliant, but it can also be a real challenge. We'll learn about some of the big wins that can be had in the cloud (such as elasticity and easy provisioning). But, we'll talk about the dark underbelly as well - and face some of the challenges to run a realistic MySQL installation in the cloud (errrr... performance?).
TiDB is an open-source NewSQL database that speaks the MySQL protocol. However, it does not have any source code in common with MySQL. This presentation will dive into how SQL processing occurs in TiDB, from the initial query parsing to retrieving rows from TiKV as part of the execution.
As consultants, we are often asked to perform a one-off assessment of existing environments. These requests involve basically three phases: data collection, analysis, and results presentation. In order to reduce the time spent on low-value tasks, we introduced automation to turn collected data into a deliverable and to assist consultants with the analysis and recommendations.
In this session, we are going to share the process details, the technologies involved, and the benefits introduced to our organization.
Audience Takeaways/Take Back to Work:
- Discover a way to construct/manipulate dynamic documents
- Learn about the benefits of introducing automation into documentation processes
- Hear a few lessons learned during the development of the solution
"I migrated from a proprietary database software to PostgreSQL. I am curious to know whether I can get the same features I used to have in the proprietary database software."
The market coined the term "enterprise grade" or "enterprise ready" to differentiate products and service offerings for licensed database software. For example, there may be a standard database software or an entry-level package that delivers the core functionality and basic features. Likewise, there may be an enterprise version, a more advanced package which goes beyond the essentials to include features and tools indispensable for running critical solutions in production. With such a differentiation found in commercial software, we may wonder whether a solution built on top of an open source database like PostgreSQL can satisfy all the enterprise requirements.
So, in this talk, we shall discuss how you can build an Enterprise Grade PostgreSQL using open source solutions.
We'll discuss a list of Enterprise-grade features that include -
1. Securing your PostgreSQL database cluster
2. High Availability for your PostgreSQL setup
3. Preparing a Backup strategy and the tools available to achieve it
4. Scaling PostgreSQL using connection poolers and load balancers
5. Tools/extensions available for your daily DBA life and detailed logging in PostgreSQL.
6. Monitoring your PostgreSQL and real-time analysis.
This talk covers some of the challenges we sought to address by creating a Kubernetes Operator for Percona XtraDB Cluster, as well as a look into the current state of the Operator, a brief demonstration of its capabilities, and a preview of the roadmap for the remainder of the year. Find out how you can deploy a 3-node PXC cluster in under 6 minutes, how you can handle providing self-service databases on the cloud in a cloud-vendor agnostic way, and ask the Product Manager questions and provide feedback on what challenges you'd like us to solve in the Kubernetes landscape.
This talk will focus on the self-managed nature of Uber's database monitoring and how we've leveraged our open source time series database M3DB to support massive multi-region scale and high cardinality monitoring.
We'll cover how we monitor applications, databases, and their interactions, and how we automatically set up application-specific dashboards and alerts. This includes things like the ability to alert on metrics like P99 latency and slow queries for a given application at the per-table and per-query level.
Of course, automated and fine-grained monitoring requires the ability to ingest, persist, and query massive amounts of high-cardinality time series data. We'll talk about the architecture of M3DB and how we've leveraged it at Uber to scale our monitoring systems to billions of unique time series and tens of millions of data points per second.
We'll conclude the talk with an overview of our Prometheus and Kubernetes integrations, explaining how you can start leveraging M3DB for your own workloads. Finally, we'll give a brief overview of our plans to evolve M3DB into a general purpose, horizontally scalable event store.
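As a small illustration of the per-table, per-query P99 alerting mentioned above, the sketch below computes a nearest-rank percentile over raw latency samples and flags the keys that exceed a threshold. It is a plain-Python assumption for clarity, not how M3DB or Uber's alerting computes percentiles; the key layout `(application, table, query)` and the threshold are hypothetical.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating point rounding.
    rank = -(-pct * len(ordered) // 100)
    return ordered[max(rank, 1) - 1]


def p99_alerts(latencies_ms_by_key, threshold_ms):
    """Return the keys whose P99 latency exceeds threshold_ms, sorted."""
    return sorted(key for key, samples in latencies_ms_by_key.items()
                  if percentile(samples, 99) > threshold_ms)


if __name__ == "__main__":
    data = {
        ("app", "orders", "q1"): [10] * 98 + [500] * 2,
        ("app", "users", "q2"): [10] * 100,
    }
    print(p99_alerts(data, threshold_ms=200))
```

Keying the samples by application, table, and query is what makes the cardinality high: every distinct combination becomes its own series to store and evaluate.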
The MySQL replication and HA suite of tools and technologies ensures that we have efficient and safe replication at Facebook scale. In spite of different kinds of process and system failures, MySQL continues to safely replicate trillions of transactions a year. The MySQL Replication/HA stack at Facebook, which delivers this scale, involves a Facebook-enhanced Semi-Sync replication plugin, Binlog Server, and a high-availability suite of tools called DBStatus, Logtailer and FastFailover. Come learn about these exciting technologies and how we handle these challenges of scale.
In the second portion of this two-part talk, we will focus on the automation we've been building on top of FB MySQL replication technologies. The automation maintains MySQL high availability by dealing with everything from day-to-day hiccups to large-scale disasters. The automation areas we will cover include automatic master failover, data consistency invariants, replication failure domains, power-loss failure recovery and continuous disaster drills.
Modern relational databases are all tied together by a well-defined standard for SQL, the latest version published in 2016. But not all SQL implementations follow the standard and even when they do, they often embellish on it and users tend to use the features they see available to them. As a result, migration between databases tends to be a challenging task. In this talk I will give some examples of database migrations I have participated in over the last twenty years and what issues we ran into with each. I will also talk in a more general sense about:
• Database only migration vs application refactoring - which is likely to have the better ROI?
• Analysis tools to help perform migration like Amazon's DMS, SQLines and DBconvert
• Database agnostic replication
• What it means to get support in the Open Source world and how to get it.
All major considerations and decisions made by the MySQL Query Optimizer may be recorded in an Optimizer Trace. While EXPLAIN will show the query execution plan the Query Optimizer has decided to use, MySQL's Optimizer Trace will tell WHY the optimizer selected this plan.
This presentation will introduce you to the inner workings of the MySQL Query Optimizer by showing you examples with Optimizer Trace. We will cover the main phases of the MySQL optimizer and its optimization strategies, including query transformations, data access strategies, the range optimizer, the join optimizer, and subquery optimization. We will also show how optimizer trace gives you insight into the cost model and how the optimizer does cost estimations.
By attending this presentation you will learn how you can use information from Optimizer Trace to write better-performing queries. The presentation will also cover tools that can be used to help process the vast amount of information in an optimizer trace.
Most enterprises today own information of critical value such as intellectual property, customers' personal data or private financial data. This type of data should never be exposed to unauthorized or malicious access.
Our session covers the best security practices for a MariaDB deployment, the latest security related features in the MariaDB Server as well as general information related to potential threats in Enterprise systems and our recommended defense mechanisms.
Subjects covered in this session:
- Potential threats and protection mechanisms
- Secure installation with mysql_secure_installation
- At Rest and in-transit data encryption
* MariaDB TLS support
* Securing client-server communication
* Securing data exchange in Replication and Galera Cluster
* Data at Rest and Binlog Encryption
- User Management best practices
* Password validation plugins
* User Account Locking
* Expiration of User Passwords
* Blocking user accounts with --max-password-errors
- External authentication with PAM and Kerberos
- Role-based Access Control
- Monitoring activity using the MariaDB Audit Plugin
Redundancy and high availability are the basis for all production deployments. With MongoDB this can be achieved by deploying a replica set. In this talk, we'll explore how MongoDB replication works and what the components are of a replica set. Using examples of wrong deployment configurations, we will highlight how to properly run replica sets in production, whether it comes to on-premise deployment or in the cloud.
- How MongoDB replication works
- Replica set components/deployment topologies
- Practices for wrong deployment configuration
- Hidden nodes, Arbiter nodes, Priority 0 nodes
- Availability zones and HA in a single region
- Monitoring replica sets status
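One piece of arithmetic underlies several of the topics above: a replica set can elect a primary only while a strict majority of its voting members can reach each other. The sketch below works through that arithmetic and applies it to availability zone placement. The function names and zone layouts are hypothetical illustrations, not material from the talk.

```python
def majority(voting_members):
    """Votes needed to elect a primary: a strict majority of voting members."""
    if voting_members < 1:
        raise ValueError("a replica set needs at least one voting member")
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """How many voting members can be lost while a primary can still be elected."""
    return voting_members - majority(voting_members)


def survives_zone_loss(members_per_zone):
    """members_per_zone maps a zone name to its number of voting members.
    True if losing any single zone still leaves a voting majority."""
    total = sum(members_per_zone.values())
    needed = majority(total)
    return all(total - count >= needed for count in members_per_zone.values())


# Hypothetical layouts for a three-member replica set.
two_zones = {"zone-a": 2, "zone-b": 1}
three_zones = {"zone-a": 1, "zone-b": 1, "zone-c": 1}
```

Note that a fourth voting member adds no fault tolerance over three (both tolerate one failure), and that placing two of three members in the same zone leaves the set unable to elect a primary if that zone goes down.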
GTIDs were introduced to solve replication problems and improve database consistency.
When, accidentally, transactions occur on a replica, this introduces GTIDs on that replica that don't exist on the master. When, on a master failover, this replica becomes the new master, and the corresponding binlogs of the errant GTIDs are already purged, replication breaks on the replicas of this new master, because those missing GTIDs can't be retrieved from the binlogs of this new master.
This presentation will talk about GTIDs and how to detect errant GTIDs on a replica (before the corresponding binlogs are purged) and how to look at the corresponding transactions in the binlogs. I'll give some examples of transactions that could happen on a replica that didn't originate from a primary node, explain how this is possible and share some tips on how to avoid this.
Basic understanding of MySQL database replication is assumed.
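The detection step described above amounts to a set subtraction: anything in the replica's executed GTID set that is missing from the primary's executed set is errant. The sketch below models that subtraction in plain Python. The helper names and the placeholder server identifiers are hypothetical, and expanding intervals into Python sets is only reasonable for illustration. On a live server, the built-in `GTID_SUBTRACT()` function applied to the two servers' `gtid_executed` values performs the same subtraction.

```python
def parse_gtid_set(text):
    """Parse a GTID set such as 'uuid1:1-5:7,uuid2:3' into
    {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        transactions = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            transactions.update(range(int(first), int(last or first) + 1))
    return result


def errant_gtids(replica_executed, primary_executed):
    """Return {uuid: sorted transaction numbers} executed on the replica
    but absent from the primary."""
    replica = parse_gtid_set(replica_executed)
    primary = parse_gtid_set(primary_executed)
    errant = {}
    for uuid, transactions in replica.items():
        extra = transactions - primary.get(uuid, set())
        if extra:
            errant[uuid] = sorted(extra)
    return errant


# Placeholder identifiers stand in for real server UUIDs.
primary_set = "primary-uuid:1-100"
replica_set = "primary-uuid:1-98,replica-uuid:1-2"
found = errant_gtids(replica_set, primary_set)
```

Here the replica is two transactions behind the primary, which is normal replication lag, but it also carries two transactions under its own identifier. Only the latter are reported.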
From the very beginning, TiDB was designed with the combination of Hybrid Transactional and Analytical Processing (HTAP) workloads in mind. However, since TiKV stores data in a row-oriented fashion, you may have been left wondering how well suited that is for analytics?
In this talk, I will introduce TiFlash which is a native extension of TiDB that offers column-oriented storage to speed up heavy-duty OLAP queries. I will then go over the architecture and some of the design decisions, and how you will be able to use it for your next generation data processing needs.
In this presentation, we'll cover the following areas:
1. RDBMS history and landscape (Oracle and MySQL) at PayPal
2. Current and future use cases of MySQL in PayPal
-- back office
-- internally facing applications
-- 3rd party applications
-- PayPal site applications
3. PayPal's effort of making MySQL as a first-class data store. We'll cover the active/active and active/passive architectural choices, managing a large number of application to database connections, enforcing security, backup, and recovery, etc.
4. Outlook of MySQL in PayPal
In the past few years, Postgres has advanced a lot in terms of features, performance, and scalability for many core systems. However, one of the problems that many enterprises still complain of is that its size increases over time which is commonly referred to as bloat. EnterpriseDB is leading a community effort to build a new storage system for Postgres called zheap, which will provide better control over bloat. This session will discuss:
* The objectives of the initiative including better control over "bloat", eliminating write amplification in most cases and reduction in on-disk storage size of Postgres databases
* Technical architecture of the new storage system including contrasting the new design with the current implementation in Postgres today
* Current status of the project, including the introduction of a new storage management interface, undo handling and when zheap will be released as part of Postgres.
During our presentation at Percona Live 2019, Intel and its software partners will introduce the audience to the work we're doing to enable an open-source framework we call Cloud Native Database (CNDB). This is a collaborative effort between Intel, Rockset, PlanetScale, MariaDB, and Percona. Why is Intel giving this talk? We believe such an open-source CNDB is the perfect complement to the Intel Optane DC Persistent Memory, QLC-NAND-based NVMe, and Cascade Lake CPU products.
Over the last couple of years we've talked to numerous database practitioners, across many companies and industries. What emerged from these discussions is clarity on the demand for an open-source equivalent to a Cloud Native Database such as Amazon's Aurora, Facebook's MySQL, and Azure's CosmosDB. In this talk, you will learn about our effort to make such an open-source Cloud Native Database available to the community.
Through the presentation, the audience will be introduced to a set of principles and architectural elements that define what we mean by Cloud Native Database. We will discuss Rockset's RocksDB-Cloud library and how it works with Facebook's MyRocks storage engine. We also will cover PlanetScale's Vitess project and their use of Kubernetes for deployment of our Database-as-a-Service (DBaaS) mechanisms. Lastly, we share data on the performance and scale characteristics of the architecture and components that we have developed.
Have you struggled with identifying issues in MySQL?
Come listen to how Verisure experienced an issue that was identified and resolved using PMM (Percona Monitoring and Management). Verisure was able to find the offending query/tuning parameter, make modifications, and then observe the impact over time thanks to PMM.
The client stack provides the data plane required to access MySQL from client applications. At Facebook we have a fully featured client stack along with MySQL protocol proxies that we use to scale MySQL across regions. This talk will provide an overview of the MySQL client libraries in use at Facebook, our MySQL protocol proxies, and our integrations with internal service discovery, logging, and monitoring systems.
As more and more people are moving to PostgreSQL from Oracle, a pattern of mistakes is emerging. They can be caused by the tools being used or just by not understanding how PostgreSQL is different from Oracle. In this talk, we will discuss the top mistakes people generally make when moving to PostgreSQL from Oracle and what the correct course of action is.
Optimizing MySQL performance and troubleshooting MySQL problems are two of the most critical and challenging tasks for MySQL DBAs. The databases powering your applications need to be able to handle heavy traffic loads while remaining responsive and stable so that you can deliver an excellent user experience. Further, DBAs are also expected to find cost-efficient means of solving these issues. In this presentation, we will discuss how you can optimize and troubleshoot MySQL performance and demonstrate how Percona Monitoring and Management (PMM) enables you to solve these challenges using free and open source software. We will look at specific, common MySQL problems and review the essential components in PMM that allow you to diagnose and resolve them.
When your SQL query reaches the DBMS, it's the optimizer's job to decide how to execute it for you to get the result as fast as possible. To make this decision the optimizer can examine the actual table data, but with multi-gigabyte and terabyte tables, the only practical solution is to use various data statistics that were collected in advance. The better the statistics and the more precisely they describe the actual data, the faster the plan will be because the optimizer's picture of reality will be closer to the actual reality.
In this talk, you'll learn what data statistics MariaDB and MySQL can collect, what statements do that, how to tell the optimizer to use it (it won't necessarily do it automatically!) and how it can make your queries many times faster.
And, of course, when not to use indexes, when up-to-date statistics are enough.
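To make the role of collected statistics concrete, here is a minimal sketch of a simplified equi-height histogram and how it could be used to estimate the fraction of rows matching a predicate without touching the table. It is not MariaDB's or MySQL's actual implementation. The function names and the sample column are hypothetical, and real engines refine the estimate within a partially covered bucket, which this sketch does not.

```python
import bisect


def build_equi_height(values, buckets):
    """Return the upper bound of each bucket, where every bucket holds
    roughly len(values) / buckets rows."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0 or buckets < 1:
        raise ValueError("need at least one value and one bucket")
    bounds = []
    for i in range(1, buckets + 1):
        # Index of the last row belonging to bucket i: ceil(i * n / buckets) - 1.
        bounds.append(ordered[(i * n + buckets - 1) // buckets - 1])
    return bounds


def estimate_fraction_le(bounds, x):
    """Estimated fraction of rows with value <= x: the share of buckets
    whose upper bound is <= x. This underestimates when x falls inside a bucket."""
    return bisect.bisect_right(bounds, x) / len(bounds)


# Hypothetical skewed column: 90 rows hold the value 1, ten rows hold 2..11.
column = [1] * 90 + list(range(2, 12))
histogram = build_equi_height(column, buckets=10)
estimate = estimate_fraction_le(histogram, 1)
```

For this skewed column the histogram estimates that 90% of rows satisfy `value <= 1`, which matches the data. An optimizer that knew only the minimum and maximum and assumed a uniform distribution would guess a far smaller fraction and might choose a worse plan.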
In this presentation, we will discuss how to create custom rules when the default rules are not enough for the application.
Have you needed to give a more permissive role to a user just because this user wanted to run a specific command?
Also, we will discuss how to use views for hiding fields from users when we don't want them to read the entire collection. If you have concerns about security, come to this talk.
There are a few ways to take a backup. Among the most widely used tools are Percona XtraBackup, Mariabackup, and MySQL Enterprise Backup.
In this talk, the audience will get an in-depth overview of:
- Differences between the tools
- Comparison of features
- Which tool works on which MySQL/MariaDB flavor
- Supported Storage Engines
TiDB has bet big on Kubernetes; it is our compatibility layer to operate on any cloud platform and deploy with a single command. This talk explains from experience the challenges of automating a distributed database on Kubernetes with TiDB-Operator. Then we will show how to architect a DBaaS (Database as a Service) to work on top of an operator.
The talk will include a short demo showing how a DBaaS can bring together open source tooling (Prometheus, Netdata, TiDB, Loki, Grafana, Sentry) to monitor and troubleshoot database performance.
Technology is ever evolving. New open source databases pop up and fade away.
Even the role of a database administrator changes as DBaaS adoption becomes status quo.
The burden of maintaining your skills and knowledge is on you.
Charles Duhigg, the author of The Power of Habit, writes "If you believe you can change - if you make it a habit - the change becomes real."
Through anecdotes of how a global managed services provider like Pythian enables learning for its employees, this session will provide strategies to make learning a habit.
These strategies will help you thrive in your career, whether you are an individual contributor, or a leader hoping to enable learning for your team.
Integrating the most suitable highly available, multi-master, non-clustered, freely accessible relational database management system (RDBMS) solution for large-scale environments is a challenging task; it resembles evaluating marathon champions trained by different Olympic coaches.
This presentation will show the analogies and differences of MySQL Group Replication and PostgreSQL Bi-Directional Replication (BDR), two freely accessible and highly available multi-master, non-clustered RDBMS products that have been successfully employed in large-scale environments requiring high availability.
Both have been shown, through validated prototypes, to be successful, satisfying data integrity, reliability, and scalability with slightly different strategies and different upcoming pathways.
When it comes to choosing a distributed streaming platform for real-time data pipelines, everyone knows the answer: Apache Kafka! And when it comes to deploying applications at scale without needing to integrate different pieces of infrastructure yourself, the answer nowadays is increasingly Kubernetes. However, with all great things, the devil is truly in the details. While Kubernetes does provide all the building blocks that are needed, a lot of thought is required to truly create an enterprise-grade Kafka platform that can be used in production. In this technical deep dive, Viktor will go through challenges and pitfalls of managing Kafka on Kubernetes as well as the goals and lessons learned from the development of the Confluent Operator for Kubernetes.
Loki is a horizontally-scalable, highly-available log aggregation system inspired by Prometheus. It is designed to be very cost effective and easy to operate, as it does not index the contents of the logs, but rather labels for each log stream.
Loki initially targets Kubernetes logging, using Prometheus service discovery to gather labels and metadata about log streams. By using the same index and labels as Prometheus, Loki enables you to easily switch between metrics and logs, enhancing observability and streamlining the incident response process, a workflow we have built into the latest version of Grafana.
In this talk, we will discuss the motivation behind Loki, its design and architecture, and what the future holds. It's early days, but so far the response to the project has been overwhelming, with more than 4.5k GitHub stars and over 12 hours at the top spot on Hacker News.
Loki is an open source project, Apache licensed.
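The design choice described above, indexing only the labels of each log stream and not the log contents, can be sketched with a toy in-memory store. This is not Loki's code or API. The class name, method names, and sample labels are hypothetical, and the sketch ignores time ranges, chunking, and storage entirely.

```python
from collections import defaultdict


class TinyLogStore:
    """Toy label-indexed log store: the index covers labels only."""

    def __init__(self):
        self.streams = {}               # frozenset of (label, value) -> log lines
        self.index = defaultdict(set)   # (label, value) -> stream keys

    def push(self, labels, line):
        key = frozenset(labels.items())
        if key not in self.streams:
            self.streams[key] = []
            for pair in key:
                self.index[pair].add(key)
        # The line itself is appended without being indexed.
        self.streams[key].append(line)

    def query(self, selector, contains=""):
        """Select streams through the label index, then scan only those
        streams for lines containing the given substring."""
        if selector:
            candidates = set.intersection(
                *(set(self.index.get(pair, set())) for pair in selector.items())
            )
        else:
            candidates = set(self.streams)
        return [
            line
            for key in sorted(candidates, key=sorted)
            for line in self.streams[key]
            if contains in line
        ]


store = TinyLogStore()
store.push({"app": "mysql", "pod": "a"}, "slow query took 2s")
store.push({"app": "mysql", "pod": "b"}, "server started")
store.push({"app": "nginx", "pod": "c"}, "slow upstream response")
matches = store.query({"app": "mysql"}, contains="slow")
```

The index stays small because it grows with the number of distinct label pairs and not with log volume; the price is that content filters scan every line of the selected streams.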
Backups are our last line of defense in the event of a data loss, be it a hardware failure, malicious attack or a test script run in production. At Facebook, not only do we backup every database, we also continuously restore our backups to have full confidence in our recovery capabilities. This talk will provide a high-level overview of our automation and tooling around MySQL backups and restores.
With the amount of stored data growing quickly, analytics work has turned into a very intensive workload that hits our MySQL servers very hard.
During the last few years we have seen a bunch of new engines designed to digest big portions of data and help with analytical queries. Now there is a new contender in the arena that claims to be MySQL Compatible, High Impact and Open Source Analytics.
During the course of this session we will go through each of these topics and try to answer:
- Compatible - how hard is it to take data out of MySQL and how much lag may it suffer under a real workload?
- High Impact - near real time query response, low cost of entry, high ROI?
- Open Source - open source only?
Moreover, we will compare the solution against some other popular products already in the market like ColumnStore, Cassandra, Clickhouse and we'll see how TiDB behaves in comparison.
MySQL, as a database, thrives on a filesystem but not all filesystems are equal! On Linux, ZFS is gaining attention and there are good reasons for that, especially if you happen to also run MySQL. In this talk, I'll describe the main characteristics and features of ZFS and draw parallels with the architecture of InnoDB. From easy backups to compression and improved caching, you'll see that MySQL has a lot to gain from ZFS. I'll discuss the configurations of both MySQL and ZFS so they play well together and perform at their best. Finally, cost-saving MySQL/ZFS reference architectures using both bare metal and cloud servers will be presented and reviewed.
Make the most of your ColumnStore columnar analytics engine!
Deep dive into best practices for columnar engines in general, the best use cases for columnar, and tips and tricks for both ColumnStore and other analytics engines.
Analytics without data ingestion
Indexing a no-index engine
How and when to use cross engine joins
Enabling low latency data ingestion
Row+column hybrid approaches
Optimizing data loads
Splitting columns for performance
FoundationDB is a distributed database designed to handle large volumes of structured data across clusters of commodity servers. It organizes data as an ordered key-value store and employs ACID transactions for all operations. Document Layer is a stateless micro server on top of FoundationDB that allows management of JSON documents at large scale. It exposes the traits of FoundationDB through a document data model, such as fully ordered documents, consistent indexes, and serializable transactions. It does all this while maintaining wire compatibility with the MongoDB® API. As the compatibility is done at the wire level, existing MongoDB® tools and drivers work seamlessly. In this talk, we explore how FoundationDB's core strengths play well together with the document data model to make it an easier to use and reliable database.
Since version 8.0.14, MySQL supports LATERAL derived tables, sometimes called the for each loop of SQL. What are they? How do they work? Why do you need them? What can they do? How can you use them? Should you use them? What is all this talk about for each loops?
In this session we'll explain the concepts, look at examples of how and when to use this new feature, and talk about how LATERAL derived tables are optimized and executed by MySQL.
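The "for each loop" description can be made concrete with a small emulation in Python: for every row of the outer table, a derived table that depends on that row is evaluated, and the row is joined with each result. The tables, column names, and helper functions are hypothetical. The comment shows the kind of query being modeled, using the comma-join form of `LATERAL` that MySQL accepts from 8.0.14.

```python
# Query being modeled (hypothetical tables):
#   SELECT s.name, t.amount
#   FROM salesperson AS s,
#        LATERAL (SELECT amount FROM sale
#                 WHERE sale.seller_id = s.id
#                 ORDER BY amount DESC LIMIT 2) AS t;


def lateral_join(outer_rows, derived_table):
    """For each outer row, evaluate derived_table(row) and pair the row with
    every result. Outer rows with no results are dropped (inner-join semantics)."""
    for row in outer_rows:
        for inner in derived_table(row):
            yield {**row, **inner}


salespeople = [
    {"id": 1, "name": "Ann"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Cy"},
]
sales = [
    {"seller_id": 1, "amount": 50},
    {"seller_id": 1, "amount": 90},
    {"seller_id": 1, "amount": 70},
    {"seller_id": 2, "amount": 20},
]


def top_two_sales(person):
    """The derived table: it refers to the current outer row, which an
    ordinary (non-LATERAL) derived table is not allowed to do."""
    mine = [s["amount"] for s in sales if s["seller_id"] == person["id"]]
    return [{"amount": amount} for amount in sorted(mine, reverse=True)[:2]]


result = list(lateral_join(salespeople, top_two_sales))
```

Ann contributes her two largest sales, Bob his only one, and Cy, who has no sales, does not appear. Keeping such rows would correspond to a `LEFT JOIN LATERAL`.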
When evaluating if a new database is a fit for your organization, it can pay to be risk-averse. In this talk, I will demonstrate how you can replicate from MySQL to TiDB and evaluate for performance and correctness initially only as a read-only slave -- reducing the risk of evaluating TiDB.
I will then demonstrate how MySQL can be set up as a replica of TiDB, with the ability to fail-back if any problems are discovered when it becomes a master.
As we move through each step in validating TiDB, you may be familiar with some of the tools being used (mydumper, pt-upgrade, ProxySQL). One of the great parts about speaking the MySQL protocol is you benefit from the ecosystem surrounding it.
Would you like MySQL Protocol Compatibility and near real time dashboard analytics with that?
We are hoarding data in the hopes of making meaningful information out of it to make smart business decisions. But we also resort to inflating costs just to satisfy this need (or not). While there have been a number of open source roll-your-own options, they have also been operationally costly and sometimes stressful to maintain, not to mention the shortcomings of each. If you have been traditionally storing your data with MySQL and have been enduring running your queries from the same set of servers, fret no more - Clickhouse might just be what you need. In this presentation, we will discuss how to deploy, design, and maintain a Clickhouse analytics platform that continuously reads data from your MySQL servers in near real-time*.
* Depends if you have transformations or complex schema.
PostgreSQL performance varies between different releases. Every new version comes with added features and performance improvements that help the growing adoption of PostgreSQL. These ongoing developments and improvements drive us to plan on upgrading our PostgreSQL environments, but that's not always an easy task. A few common concerns when it comes to upgrading PostgreSQL are extended downtime, converting tables partitioned using triggers to declarative partitions, the search for the safest upgrade options and the effort required in the upgrade process itself. In this talk we'll discuss the variety of options available that can help you upgrade your PostgreSQL servers in the best possible way.
This talk includes:
1. The most significant performance improvements in recent PostgreSQL versions including PostgreSQL 11 and 10.
2. A summary of features that have been implemented with every new version since PostgreSQL 9.1.
3. What is it that you need to consider before upgrading your PostgreSQL server.
4. What are the options you have available to help you upgrade your PostgreSQL server.
5. What are the solutions available to minimize the downtime during upgrades.
6. A list of parameters you need to consider in particular when upgrading PostgreSQL to 10 or 11.
As modern organizations have rapidly embraced containers in recent years, stateful applications like databases have proven tougher to transition into this brave new world than other workloads. When a persistent state is involved, more is required both of the container orchestration system and of the stateful application itself to ensure the durability and availability of the data.
This talk will walk through my experiences trying to reliably run CockroachDB, the open source distributed SQL database, on Kubernetes, optimize its performance, and help others do the same in their heterogeneous environments. We'll look at what kinds of stateful applications can most easily be run in containers, which Kubernetes features and usage patterns are most helpful for running them, and many, many pitfalls I encountered along the way. Finally, we'll ponder what's missing and what the future may hold for running databases in containers.
Do you already run stock PMM in your environment and want to learn how you can extend the PMM platform? Come learn about:
1. Dashboard Customizations
How to create a custom dashboard from existing graphs, or build Cross Server Dashboards
2. External Exporters - Monitor any service, anywhere!
From adding an exporter and viewing the data in data exploration to deploying a working Dashboard
3. Working with custom queries (MySQL and PostgreSQL)
Execute SELECT statements against your database and store in Prometheus
Build Dashboards relevant to your environment
4. Customizing Exporter Options
Enable de-activated functionality that applies to your environment
5. Using Grafana Alerting
How to set up channels (SMTP, Slack, etc)
How to configure thresholds and alerts
6. Using MySQL/PostgreSQL Data Source
Execute SELECT statements against your database and plot your application metrics
Facebook has been running MySQL 5.6 for a very long time and is in the middle of developing support for MySQL 8.0. We will present the challenges we face, our progress thus far, and some of the problems we hit along the way.
The Player Accounts team at Riot Games needed to consolidate the player account infrastructure and provide a single, global accounts system for the League of Legends player base. To do this, they migrated hundreds of millions of player accounts into a consolidated, globally replicated composite database cluster in AWS. This provided higher fault tolerance and lower latency access to account data. In this talk, we discuss this effort to migrate eight disparate database clusters into AWS as a single composite MySQL database cluster replicated in four different AWS regions, provisioned with terraform, and managed and operated by Ansible.
This talk will briefly overview the evolution of the player accounts services from legacy isolated datacenter deployments to a globally replicated database cluster fronted by our account services and outline some of the growing pains and experiences that got us to where we are today.
In this talk, we discuss the use of the open source MySQL Community Edition and Percona Server projects and Intel's Cascade Lake server platform as the primary building blocks for hosting a Database-as-a-Service (DBaaS) deployment. Data demonstrating the advantage of this configuration will be presented. Emphasis will be on deployment of an offering that provides developers with self-service, on-demand delivery of databases that run over shared infrastructure in a multi-tenant environment. Demand for DBaaS from a diverse span of market segments has driven the large Cloud Service Providers to invest in the buildout of such offerings.
Today, these offerings are widely available from multiple cloud providers. Until recently, however, organizations wanting to provide DBaaS from within their own data centers were left without good options. This demand is set to be addressed with the emergence of several open source projects. A second, intersecting trend is the imminent release of Intel's next-generation Cascade Lake Xeon platform and its support for byte-addressable, persistent memory via the Intel® Optane™ DC Persistent Memory product.
Our talk will dive into the intersection of these two trends starting with an overview of the Cloud Native DBaaS model. Next, a concrete description of a deployment model using Percona's MySQL and MyRocks distributions will be introduced. This model will be supported by a two-fold, data-driven discussion on performance and density. First, a performance characterization for both the InnoDB and MyRocks storage engines is presented with the focus on comparing/contrasting the use of NVMe vis-a-vis Intel® Optane™ DC Persistent Memory. For the latter, we include Intel® Optane™ DC Persistent Memory volumes in fsdax as well as sector modes. Secondly, data on database instance density using a single Intel Cascade Lake server outfitted with NVMe vis-a-vis Intel® Optane™ DC Persistent Memory will be presented. The talk will conclude with a discussion on the results and how they have influenced our plans for further work in MySQL CE/Percona Server open source projects going forward.
MySQL and MariaDB have a long list of old, potentially trivial bugs that are annoying if you hit them. No one's really bothered to fix them, so why not you? It doesn't take a large amount of C/C++ knowledge to pick out an old bug and build/test a MySQL/MariaDB patch to fix an old bug.
I'll talk about what is a sane choice of bug, how to use the existing mtr to help, small test cases, and packaging up and submitting your changes.
By doing this, you'll pay it forward for all the positive MySQL/MariaDB experiences you've had.
MySQL JSON Support
Open source reject license
MySQL Group Replication
Distros dropping MongoDB package
This year has seen a good deal of changes and challenges in the MongoDB Space. But where are things going? We are going to cover the most common questions I get asked even today.
What are my risks if I keep using it?
Is MySQL/Postgres Proxy, replication, and HA finally catching up?
Why is JSON support not the only thing to consider?
Will MongoDB lose my data?
Past these I will be discussing the ecosystem as a whole, and how I see the next few years shaping out. While this talk is most helpful for business leaders and architects, for DBAs and engineers it will also help you decide what to focus on in your career for the emerging technological future.
MySQL Shell is the new client for MySQL. It understands both the standard and X protocols. It also allows you to send JS, Python or SQL commands to the server. In this talk I will first show you how to configure the Shell for a nice, good-looking experience with MySQL, then show some of the basic commands and objects. After this brief overview I will show how it's possible to extend the Shell, and how I hacked it to create an Innotop clone inside the MySQL Shell. This session is a live demo with extra explanation, and the code will be shared during and after the presentation.
The goal of the talk is also to introduce to the Community how to hack the Shell and to call for contributions and feature requests.