MySQL is the world's most popular open source database, and Kubernetes is currently among the most popular and rapidly developing open source projects. The purpose of this talk is to explain and demonstrate how Kubernetes makes it easier to run a complex stateful application such as a database, and to survey the options available for doing so.
The MySQL deployment patterns covered start with running MySQL from a single command using a Helm chart, move on to how an asynchronously replicated master/slave MySQL pattern works on Kubernetes, and then to a detailed discussion of several MySQL operators. The demonstration will showcase the Oracle MySQL Operator, which uses Group Replication and MySQL Router and makes creating MySQL clusters, backups, and restorations trivial.
Last to be covered will be Vitess, which is used to scale MySQL horizontally and offers numerous benefits such as built-in sharding and shard management, connection pooling, and query sanitization.
You can have fully automated, high-availability PostgreSQL on your Kubernetes cluster ... today. The Patroni system for automating PostgreSQL deployment, failover, and migration is ready to use and in production in several places. In this live demo session, we will show you how you can make use of this technology.
We will run through setting up PostgreSQL clusters, both using basic Patroni and using a PostgreSQL Operator. We will then demonstrate failover and disaster recovery, go over some basic configuration options, show how security and authentication work, and explore some plugins and options. After that, you'll learn about the current state of the Patroni project as well as what the alternatives are.
If you need to administer more than one PostgreSQL replication cluster, you'll want to see how Patroni and Operators can make your daily DBA headaches and 2am wakeups go away.
With the advent of the Health Insurance Portability and Accountability Act (HIPAA) of 1996, all entities that handle health information are required by law to secure all data that contains personally identifiable information (PII) and protected health information (PHI). Fines for leaking this data can range from $100 to $50,000 per leaked record. A data breach or leak is extremely costly both for the patients and for the companies that are entrusted with their PHI. In our presentation, we introduce Gonymizer, a tool written in Go at SmithRx to handle the anonymization of PHI and PII data from our production database instances.
This data is anonymized and loaded into non-production environments so that we can develop and test against representative data. Gonymizer makes anonymization of sensitive information quick and simple by using a column map that is defined in a single JSON file for your dataset. We have built a selection of custom processors to handle basic tasks, such as anonymizing first and last names and replacing location data such as street addresses, cities, zips, and states with fake values. The interface for building processors is also fully extensible, and anyone with basic Go experience should be able to build processors that anonymize their data efficiently. We will also show how this tool decreases our development time for new features and simplifies testing in a compliant environment (HIPAA, PCI, etc.) with non-sensitive data sets.
Toward the end of our presentation, we will discuss how we built our infrastructure using Docker to containerize Gonymizer and how we schedule anonymization and loading of our test environments using Kubernetes. This talk is intended for anyone working in the healthcare space where collected data contains PHI and/or PII and is regulated by HIPAA.
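The column-map idea from the Gonymizer abstract above can be illustrated with a minimal Go sketch. This is not Gonymizer's actual API or file format: the JSON layout, the type names (ColumnMap, ColumnRule, Processor), the processor names (fake_first_name, fake_zip), and the helper functions are all hypothetical, chosen only to show how a JSON file could map columns to pluggable anonymization functions.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// Processor rewrites a single column value.
// Hypothetical type for illustration; not Gonymizer's real interface.
type Processor func(value string) string

// ColumnRule names the processor to apply to one column (assumed layout).
type ColumnRule struct {
	Column    string `json:"column"`
	Processor string `json:"processor"`
}

// ColumnMap is an assumed JSON layout: one table and its per-column rules.
type ColumnMap struct {
	Table string       `json:"table"`
	Rules []ColumnRule `json:"rules"`
}

var fakeFirstNames = []string{"Alex", "Jordan", "Sam", "Taylor"}

// pick maps a value to one of the options deterministically, so the same
// input always receives the same replacement.
func pick(value string, options []string) string {
	sum := sha256.Sum256([]byte(value))
	return options[int(sum[0])%len(options)]
}

// processors is the registry; adding a processor means adding an entry here.
var processors = map[string]Processor{
	"identity":        func(v string) string { return v },
	"fake_first_name": func(v string) string { return pick(v, fakeFirstNames) },
	"fake_zip":        func(v string) string { return "00000" },
}

// LoadColumnMap parses the JSON column map.
func LoadColumnMap(data []byte) (ColumnMap, error) {
	var cm ColumnMap
	err := json.Unmarshal(data, &cm)
	return cm, err
}

// AnonymizeRow returns a copy of row with every mapped column rewritten.
// Columns without a rule are copied unchanged; an unknown processor name
// is reported as an error rather than silently skipped.
func AnonymizeRow(cm ColumnMap, row map[string]string) (map[string]string, error) {
	out := make(map[string]string, len(row))
	for k, v := range row {
		out[k] = v
	}
	for _, rule := range cm.Rules {
		p, ok := processors[rule.Processor]
		if !ok {
			return nil, fmt.Errorf("unknown processor %q for column %q", rule.Processor, rule.Column)
		}
		if v, present := out[rule.Column]; present {
			out[rule.Column] = p(v)
		}
	}
	return out, nil
}

func main() {
	raw := []byte(`{"table": "patients", "rules": [
		{"column": "first_name", "processor": "fake_first_name"},
		{"column": "zip", "processor": "fake_zip"}]}`)
	cm, err := LoadColumnMap(raw)
	if err != nil {
		panic(err)
	}
	row := map[string]string{"id": "42", "first_name": "Alice", "zip": "94105"}
	anon, err := AnonymizeRow(cm, row)
	if err != nil {
		panic(err)
	}
	fmt.Println(anon)
}
```

In this sketch the replacement name is derived from a hash of the input, so repeated values stay consistent across rows. That is a design choice of the example only; the abstract does not say how the real processors choose their replacement values.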
Vitess has continued to evolve into a massively scalable sharded solution for the cloud. It is now used for storing core business data for companies like Slack, Square, JD.com, and many others.
This session will cover the high-level features of Vitess with a focus on what makes it cloud-native.
We'll conclude with a demo of the powerful materialized views feature, which addresses a problem that most sharded systems have yet to solve.
Tyler Duzan (Percona) delivers the talk "Building a Kubernetes Operator for Percona XtraDB Cluster" on Day 1 (5/29) of the Percona Live Open Source Database Conference 2019 in Austin, TX.
This talk covers some of the challenges we sought to address by creating a Kubernetes Operator for Percona XtraDB Cluster, as well as a look into the current state of the Operator, a brief demonstration of its capabilities, and a preview of the roadmap for the remainder of the year. Find out how you can deploy a 3-node PXC cluster in under 6 minutes and how you can provide self-service databases on the cloud in a cloud-vendor-agnostic way. You can also ask the Product Manager questions and provide feedback on what challenges you'd like us to solve in the Kubernetes landscape.
During our presentation at Percona Live 2019, Intel and its software partners will introduce the audience to the work we're doing to enable an open-source framework that we call Cloud Native Database (CNDB). This is a collaborative effort between Intel, Rockset, PlanetScale, MariaDB, and Percona. Why is Intel giving this talk? We believe such an open-source CNDB is the perfect complement to the Intel Optane DC Persistent Memory, QLC-NAND-based NVMe, and Cascade Lake CPU products.
Over the last couple of years we've talked to numerous database practitioners, across many companies and industries. What emerged from these discussions is clarity on the demand for an open-source equivalent to a Cloud Native Database such as Amazon's Aurora, Facebook's MySQL, and Azure's CosmosDB. In this talk, you will learn about our effort to make such an open-source Cloud Native Database available to the community.
Throughout the presentation, the audience will be introduced to a set of principles and architectural elements that define what we mean by Cloud Native Database. We will discuss Rockset's RocksDB-Cloud library and how it works with Facebook's MyRocks storage engine. We will also cover PlanetScale's Vitess project and their use of Kubernetes for deployment of our Database-as-a-Service (DBaaS) mechanisms. Lastly, we will share data on the performance and scale characteristics of the architecture and components that we have developed.
When it comes to choosing a distributed streaming platform for real-time data pipelines, everyone knows the answer: Apache Kafka! And when it comes to deploying applications at scale without needing to integrate different pieces of infrastructure yourself, the answer nowadays is increasingly Kubernetes. However, as with all great things, the devil is truly in the details. While Kubernetes does provide all the building blocks that are needed, a lot of thought is required to create a truly enterprise-grade Kafka platform that can be used in production. In this technical deep dive, Viktor will go through the challenges and pitfalls of managing Kafka on Kubernetes as well as the goals and lessons learned from the development of the Confluent Operator for Kubernetes.
As modern organizations have rapidly embraced containers in recent years, stateful applications like databases have proven tougher to transition into this brave new world than other workloads. When persistent state is involved, more is required both of the container orchestration system and of the stateful application itself to ensure the durability and availability of the data.
This talk will walk through my experiences trying to reliably run CockroachDB, the open source distributed SQL database, on Kubernetes, optimize its performance, and help others do the same in their heterogeneous environments. We'll look at what kinds of stateful applications can most easily be run in containers, which Kubernetes features and usage patterns are most helpful for running them, and many, many pitfalls I encountered along the way. Finally, we'll ponder what's missing and what the future may hold for running databases in containers.