The main objective for TiDB 3.0 is to provide a distributed database that can achieve stability at large scale. In this deep-dive talk, Ed will share the relevant benchmarks that show how we are achieving this goal, along with the shortcomings these benchmarks exposed and how our team is addressing them. (No benchmarketing, we promise!)
In an era of ever-growing threats and challenges, protecting your big data and open source workloads is no longer an option... it is vital to the success of your business. Join Veritas for this information-packed session, where we will discuss the why, how, and business impacts of protecting today's business-critical big data and open source workloads. We will also discuss how to move beyond basic replication to a true data protection and application recovery strategy.
MySQL is the world's most popular open source database, and Kubernetes is one of the most popular and rapidly developing open source projects today. The purpose of this talk is to explain and demonstrate how running a complex stateful application such as a database is made easier using Kubernetes, and that there are a number of options available.
The MySQL deployment patterns covered will start with how to run MySQL with a single command using a Helm chart; move on to how an asynchronous replicated master/slave MySQL pattern works on Kubernetes; and then to a detailed discussion of several different MySQL operators. A demonstration will showcase the Oracle MySQL Operator, which uses Group Replication and MySQL Router, and makes creating MySQL clusters, backups, and restores trivial.
Last to be covered will be project Vitess, which is used for horizontal scaling of MySQL and has numerous benefits such as built-in sharding and shard management, connection pooling, and query sanitization.
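As a minimal illustration of the first pattern, a single MySQL instance can be described declaratively. The manifest below is only a sketch: the object names, the `mysql-secret` Secret, and the storage size are illustrative assumptions, not taken from any chart or operator discussed in the talk.

```yaml
# Minimal sketch: one MySQL instance with persistent storage on Kubernetes.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 1
  selector:
    matchLabels: {app: mysql}
  template:
    metadata:
      labels: {app: mysql}
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef: {name: mysql-secret, key: password}  # assumed Secret
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
```

A Helm chart condenses roughly this kind of manifest into a single install command; the YAML shows what such a chart generates under the hood.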
It's easy for modern, distributed, high-scale applications to hide database performance and efficiency problems. Optimizing performance of such complex systems at scale requires some skill, but more importantly it requires a sound strategy and good observability, because you can't optimize what you can't measure. This session explains a performance measurement and optimization process anyone can use to deliver results predictably, optimizing customer experience while freeing up compute resources and saving money.
The session begins with what to measure and how; how to analyze it; how to categorize problems into one of three types; and three matching strategies to use in optimization as a result. It is a recursive method that can be used at any scale, from a data center with many types of databases cooperating as one, to a single server and drilling down to a single query. Along the way, we'll discuss related concepts such as internally- and externally-focused golden signals of performance and resource sufficiency, workload quality of service, and more.
In this session we will review the key elements to take into account when migrating MySQL into the cloud. We will share our experience of working with many different customers across the globe, describing the most effective procedures.
- IaaS vs DBaaS
- Migrating data
- Replication between On-Premises and Cloud
- Testing Cloud environments
- Load Balancers
- High Availability
MySQL 8.0 is a major release of new features and capabilities, including a new data dictionary hosted in InnoDB, a new redo log design, new undo logs, a new scheduler, descending indexes, and much more!
Learn all about the changes in InnoDB delivered with MySQL 8.0 and how they affect the performance and the manageability of your database!
Running an analytical (OLAP) workload on top of MySQL can be slow and painful. A specifically designed storage format ("column store") can significantly improve analytical queries' performance. There are a number of open source column store databases around. In this talk, I will focus on two of them which can support the MySQL protocol: MariaDB ColumnStore and ClickHouse.
I will show some real-time benchmarks and use cases, and demonstrate how MariaDB ColumnStore and ClickHouse can be used for typical OLAP queries. I will also do a quick demo.
Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. In this tech talk, we will introduce Amazon DocumentDB and its unique architecture that makes it easy to run, manage, and scale MongoDB workloads in the cloud. Amazon DocumentDB is designed from the ground up and uses a unique, distributed, fault-tolerant, self-healing storage system that automatically scales storage. Additionally, with Amazon DocumentDB's architecture, storage and compute are decoupled, allowing each to scale independently. Through common use cases, we'll discuss why this architecture helps developers respond faster to their business needs. We'll also talk about how developers can use the same MongoDB application code, drivers, and tools as they do today to run, manage, and scale workloads on Amazon DocumentDB without having to worry about managing the underlying infrastructure.
Percona XtraDB Cluster 8.0 (PXC-8.0) is the latest addition to the PXC family.
Starting with MySQL/Percona Server 8.0, upstream has made a lot of significant changes, including atomic DDL, replication channels, locking algorithm changes, security/encryption improvements, and more.
During this session, we will explore how these changes affect PXC-8.0, what has changed in PXC-8.0, new and deprecated features, important bugs, and more. If you are already a PXC user, or are considering becoming one, attend this session to find out more about PXC-8.0.
Building a robust and reliable distributed database is not easy. In TiDB, to ensure our users' data is always safe and the system is always stable, we use Chaos Engineering to help uncover system-level weaknesses before they appear in a production environment.
In this talk, I will cover the different types of fault injection techniques our team uses to test TiDB, how we build our own Chaos Engineering platform, and how we integrate various Chaos Engineering techniques into our own automated testing framework, called Schrodinger, to support continuous testing of TiDB.
It's 2019 and there are so many choices for storing your data. There are old players on the market and there are some new kids on the block. Making the wrong choice for your database can effectively sink your product, project, and even your reputation! Should you go with SQL, NoSQL, a document database, or something in between? What about polyglot persistence? Reactive, event-driven, non-blocking, async applications? What about the language bindings? What about support? Performance, tuning, tooling, monitoring, observability, upgrades, rollbacks, migrations, search & indexing, analytics, availability, durability, ACID or BASE? Building, running, and maintaining storage infrastructure is non-trivial. We will look at the state of databases in 2019 and try to answer some of these questions.
PostgreSQL is undoubtedly the second most popular open source RDBMS, and it is within the top five most popular DB engines as per db-engines.com. Why not take a rest from your favorite database server for a while, start learning some more about this one, and aim for another logo on your resume?
In this session, we are going to explore this RDBMS, compare it with its arch-rival, MySQL, to both correlate concepts and understand where Postgres excels. Among the topics we will cover:
* Server architecture
* Replication and HA
* Postgres specific features
* and more!
You can have fully automated high availability PostgreSQL on your Kubernetes cluster ... today. The Patroni system for automating PostgreSQL deployment, failover, and migration is ready to use and in production in several places. In this live demo session, we will show you how you can make use of this technology.
We will run through setting up PostgreSQL clusters, both using basic Patroni and using a PostgreSQL Operator. We will then demonstrate failover and disaster recovery, go over some basic configuration options, show how security and authentication work, and explore some plugins and options. After that, you'll learn about the current state of the Patroni project as well as what the alternatives are.
If you need to administer more than one PostgreSQL replication cluster, you'll want to see how Patroni and Operators can make your daily DBA headaches and 2am wakeups go away.
Are you a user of the world's most popular Cloud provider, and someone who leverages their Database As A Service platforms of RDS or Aurora? Come to this talk to hear how PMM can provide rich visibility of these platforms for MySQL and PostgreSQL. We'll cover:
* Connecting PMM Server to RDS and Aurora for MySQL and PostgreSQL metrics activity
* Accessing the AWS CloudWatch API for fundamental resource consumption monitoring (CPU, Disk, Memory)
* Using PMM's Query Analytics against RDS MySQL or Aurora MySQL
We at Percona are constantly looking at how we can help our customers have the right solution at the right time, and at identifying the right tool for the job.
In this ongoing research, we identified that migrating from Oracle, MS SQL Server, or any other closed source data platform is becoming more and more relevant for large organizations.
The maturity reached by open source solutions, their wide adoption in medium-size companies and for new projects, and the agility with which these platforms can be adapted to newer and more modern needs, have finally led large organizations and enterprises to understand that "it can be done".
Still, the journey from closed source to open source is not always a walk in the park.
There are many factors that must be considered and analyzed in order to migrate successfully.
There are simple situations, where a data migration with a few schema adjustments is enough, and much more complex scenarios, where the whole logic must be reorganized in order to migrate.
Finally, there are situations where migrating is simply not possible or not valuable.
Having a clear path that helps you assess what is what, and how much effort is needed in each case, is gold. Literally, because it will allow you to focus on what makes sense, with the right effort and resources.
The scope of this presentation is to illustrate how we at Percona perform that assessment, and our methodology for assisting our customers to successfully decide whether and how to migrate.
The evolution of MySQL replication shows that a lot of effort has been put into reducing operations cost and minimizing administration overhead, so that MySQL DBAs and DevOps engineers can spend more time expanding their infrastructure rather than catering for it.
Many different areas have been enhanced: for example, security, operations, failover, observability, failure detection, consistency, split-brain protection and primary election, flexible replication workflows, and more.
This session highlights the new replication features in MySQL 8.0, both those released pre-GA and post-GA. Come and learn, directly from the engineers, how the new features help you operate, sustain, and extend your MySQL replication infrastructure.
MariaDB 10.4 will come with the new Galera Replication version 4. This presentation will outline the new features of Galera 4 replication as present in MariaDB 10.4 and share early user experiences with it.
Galera is a generic replication plugin, making it possible to deploy synchronous multi-master cluster topologies with database servers supporting the Galera Replication plugin API (i.e., write-set replication, the wsrep API). Currently both MySQL and MariaDB servers have Galera Replication support, and today there are thousands of MySQL- and MariaDB-based cluster installations around the world, processing production system loads in bare-metal or cloud deployments.
With Galera 4, a MariaDB 10.4 cluster further extends the capabilities of synchronous Galera replication. The most prominent feature in the Galera 4 version is streaming replication, which implements distributed transaction processing within the cluster. With streaming replication, a transaction can be launched to execute in all cluster nodes in parallel. With this, a large transaction can be executed in small fragments throughout the transaction's lifetime, and the cluster will not choke on the replication of one large transaction write set, as happened in earlier Galera Cluster versions.
Streaming replication works as a foundation for many more features to be released in the short term; e.g., XA transaction support will now be possible thanks to streaming replication technology.
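To make the fragmenting idea concrete: in Galera 4, streaming replication is controlled per session through the wsrep fragment variables. The snippet below is an illustrative sketch (the fragment size of 10 rows is an arbitrary example, not a recommendation):

```sql
-- Enable streaming replication for the current session (Galera 4 / MariaDB 10.4).
SET SESSION wsrep_trx_fragment_unit = 'rows';
SET SESSION wsrep_trx_fragment_size = 10;
-- A large transaction is now replicated in 10-row fragments over its lifetime,
-- instead of as one huge write set at commit time.
```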
The changes that MongoDB is bringing to the world in 4.0 are tempting to many companies and administrators, but if you are on an older version such as 3.2, upgrading with minimum disruption can be challenging.
Some of the problems that a company might face are that older language driver versions may not be compatible with newer versions of MongoDB, the upgrade path might change, and there is a lot of documentation to read before you embark on the upgrade path.
While in 3.2 the config servers could be either mirrored servers or a replica set, in 3.4 it is a must to have the config servers deployed as a replica set. Upgrading without interruption can be difficult. With so many things to read about, it's always good to have a reference that gives you an understanding of the things you should consider, a form of cheat sheet.
In this 50-minute session, I would like to present the steps needed to move from 3.2 to 3.4, including some tips to upgrade the config servers to replica sets with minimal disruption, and then from 3.4 to 3.6 and from 3.6 to 4.0, again with minimal disruption.
The session will also mention best practices that will apply when running any version upgrade.
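One recurring step on this path, shown here as a hedged mongo shell sketch: after upgrading the binaries at each stage, the feature compatibility version must be bumped before moving on to the next release (introduced in 3.4; the exact sequencing for your deployment should follow the official upgrade docs):

```
// After the 3.2 -> 3.4 binary upgrade:
db.adminCommand({ setFeatureCompatibilityVersion: "3.4" });
// After the 3.4 -> 3.6 binary upgrade:
db.adminCommand({ setFeatureCompatibilityVersion: "3.6" });
// After the 3.6 -> 4.0 binary upgrade:
db.adminCommand({ setFeatureCompatibilityVersion: "4.0" });
```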
ProxySQL, the high-performance, high-availability, protocol-aware proxy for MySQL, is now GA in version 2.0.
This version introduces several new features, like causal reads using GTID, better support for AWS Aurora, native support for Galera Cluster, LDAP authentication and SSL for client connections.
This session provides an overview of the most important new features.
When evaluating if a new database is a fit for your organization, it can pay to be risk-averse. In this talk, I will demonstrate how you can replicate from MySQL to TiDB and evaluate for performance and correctness initially only as a read-only slave -- reducing the risk of evaluating TiDB.
I will then demonstrate how MySQL can be set up as a replica of TiDB, with the ability to fail back if any problems are discovered when it becomes a master.
As we move through each step in validating TiDB, you may be familiar with some of the tools being used (mydumper, pt-upgrade, ProxySQL). One of the great parts about speaking the MySQL protocol is you benefit from the ecosystem surrounding it.
Mailchimp has grown from a small company to serving millions of micro-businesses, in addition to SMBs and enterprises. We have a fairly pedestrian approach to MySQL, but we now run hundreds, and perhaps soon thousands, of MySQL instances. Our present state is thanks to great full-stack engineering teamwork. This is a glimpse into what makes Mailchimp tick. Hint: it's not just technology. This talk is about "momentum" and "pragmatism", and creating a great outcome for our mission of empowering the underdog.
With the advent of the Health Insurance Portability and Accountability Act (HIPAA) of 1996 all entities that handle health information are required by law to secure all data which contains personally identifiable information (PII) and private health information (PHI). Fines for leaking this data can range from $100 to $50,000 per leaked record. A data breach or leak is extremely costly for both the patients as well as the companies that are entrusted with their PHI. In our presentation we introduce Gonymizer, a tool that is written in Go at SmithRx to handle the anonymization of PHI and PII data from our production database instances.
This data is anonymized and loaded into non-production environments, allowing us to develop and test against representative data. Gonymizer makes anonymization of sensitive information quick and simple, using a column map defined in a single JSON file for your dataset. There is a selection of custom processors that we have built to handle basic tasks, such as anonymizing first and last names, or changing data to fake locations: street addresses, cities, zips, and states. The interface for building processors is also completely extendable, and anyone with basic Go experience should be able to build processors that can anonymize your data efficiently. We will also show how this tool decreases our development time for new features as well as simplifying testing in a compliant environment with non-sensitive data sets (HIPAA, PCI, etc.).
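To illustrate the column-map idea, here is a conceptual Python sketch of mapping sensitive columns to anonymizing processors. This is illustration only: it does not reproduce Gonymizer's actual JSON schema, processor names, or API, and the tables, pools, and processors below are invented for the example.

```python
import json
import hashlib

# Hypothetical column map: which processor to apply per table/column.
COLUMN_MAP = json.loads("""
{
  "patients": {
    "first_name": "fake_name",
    "ssn": "scramble",
    "city": "fake_city"
  }
}
""")

FAKE_NAMES = ["Alex", "Sam", "Jordan"]
FAKE_CITIES = ["Springfield", "Riverton"]

def _pick(pool, value):
    # Deterministic pick: the same input always maps to the same fake value,
    # which keeps anonymized data consistent across tables.
    digest = int(hashlib.sha256(value.encode()).hexdigest(), 16)
    return pool[digest % len(pool)]

PROCESSORS = {
    "fake_name": lambda v: _pick(FAKE_NAMES, v),
    "fake_city": lambda v: _pick(FAKE_CITIES, v),
    # Replace with a same-length hash prefix so formats stay plausible.
    "scramble": lambda v: hashlib.sha256(v.encode()).hexdigest()[:len(v)],
}

def anonymize_row(table, row):
    """Apply the mapped processor to each sensitive column; pass others through."""
    mapping = COLUMN_MAP.get(table, {})
    return {col: PROCESSORS[mapping[col]](val) if col in mapping else val
            for col, val in row.items()}
```

Unmapped columns (e.g. a numeric `id`) pass through untouched, which is what lets the anonymized copy remain usable for development and testing.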
Toward the end of our presentation we will be discussing how we built our infrastructure using Docker to containerize Gonymizer and schedule anonymization and loading of our test environments using Kubernetes. This talk is targeted for anyone working in the healthcare space where collected data contains PHI and/or PII and is regulated by HIPAA.
Vitess has continued to evolve into a massively scalable sharded solution for the cloud. It is now used for storing core business data for companies like Slack, Square, JD.com, and many others.
This session will cover the high-level features of Vitess with a focus on what makes it cloud-native.
We'll conclude with a demo of the powerful materialized views feature that most sharded systems have yet to solve.
PMM 2.0 represents a significant advance in monitoring for open source databases. Come to this session to learn about the following:
* Query Analytics improvements
* Architectural changes
* API and GUI configuration
Introductory-level session for database administrators interested in taking their first steps in migrating an existing Oracle database to PostgreSQL.
- What are the steps?
- What are the major challenges?
- Which tools should we use?
The session will focus on the basic steps, procedures and tools every DBA should use for migrating Oracle databases to PostgreSQL and will cover: source database migration assessment, schema conversion, data replication and performance tuning.
1. Overview of the five steps for database migrations: assessment and migration planning, schema conversion, data migration, application conversion, and performance tuning.
2. Why planning and assessment are important for successful migration execution.
3. Key take-away insights from planning your migration that will impact implementation.
4. Using migVisor for the migration assessment.
5. Using Ora2PG for schema conversion.
6. Data replication: concepts, best practices, and available tools.
This session will be interesting to everyone looking for the latest news about MySQL 8.0 performance:
- since MySQL 8.0 we have moved to a "continuous release" model
- so with every update many new improvements are delivered
- but how does this also improve MySQL 8.0 performance? ;-)
- the latest benchmark results obtained with MySQL 8.0 will be at the center of the talk
- because every benchmark workload for MySQL is a "problem to resolve"
- and each resolved problem is a potential gain in your production!
- many important internal design changes are coming with MySQL 8.0
- how do you bring them all into action most efficiently?
- what kind of trade-offs should you expect, what is already good, and what is "not yet"?
- how well is MySQL 8.0 able to use the latest hardware?
- could you really speed up your I/O by deploying your data on the latest flash storage?
- these and many other questions are answered during this talk, and proven by benchmark results...
- and as usual, some surprises to expect ;-))
Open source software is not the new kid on the block anymore! But it takes some time and effort to take these independent software components and bundle them into working software that meets your organization's needs. At Walmart Labs, we are having fun automating the full lifecycle of a database using MariaDB in private and public clouds. In this talk, we will go over how we are building a fully automated database platform and the lessons learned from running these distributed systems in a cloud environment at scale.
If you are interested in open source technologies and curious to know how we manage 6000 (and growing!) compute instances in the cloud with a small team, then come join us for a fun-filled tech talk!
Either because of a new feature, a bug, or just for archival purposes, it is often necessary to update or remove large amounts of documents in production.
The challenge with this type of operation is not only to design an efficient process query-wise, but to be able to execute it in production without debilitating the servers or causing secondaries to lag.
There are strategies that can be used to create highly controlled write processes that could run for days under the radar, getting the job done without greatly impacting your application's performance.
In this session, I'm going to share with you key points to consider when creating massive write operations in MongoDB, examples of real-life processes executed, and a few lessons learned.
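One core ingredient of such controlled write processes is splitting the work into small, throttled batches. The sketch below is an assumed shape, not a process from the talk: `collection` stands for any object with an `update_many(filter, update)` method (e.g. a pymongo `Collection`), and the batch size and pause are illustrative knobs.

```python
import time

def batched_update(collection, ids, update, batch_size=500, pause=0.0):
    """Split a massive update into small batches so secondaries can keep up."""
    done = 0
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        # One small, bounded write instead of a single multi-million-doc update.
        collection.update_many({"_id": {"$in": batch}}, update)
        done += len(batch)
        time.sleep(pause)  # breathing room: lets replication lag drain
    return done
```

In practice the pause (or a check of replication lag between batches) is what keeps a days-long process "under the radar" for the application.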
Running MySQL in the cloud isn't magic - it can be brilliant, but it can also be a real challenge. We'll learn about some of the big wins that can be had in the cloud (such as elasticity and easy provisioning). But, we'll talk about the dark underbelly as well - and face some of the challenges to run a realistic MySQL installation in the cloud (errrr... performance?).
TiDB is an open-source NewSQL database that speaks the MySQL protocol. However, it does not have any source code in common with MySQL. This presentation will dive into how SQL processing occurs in TiDB, from the initial query parsing to retrieving rows from TiKV as part of execution.
As consultants, we are often asked to perform a one-off assessment of existing environments. These requests involve basically three phases: data collection, analysis and results presentation. In order to reduce the time spent on low-value tasks, we introduced automation to turn collected data into a deliverable and to assist consultants with the analysis and recommendations.
In this session we are going to share the process details, the technologies involved, and the benefits introduced to our organization.
Audience Takeaways/Take Back to Work:
- Discover a way to construct / manipulate dynamic documents
- Learn about the benefits of introducing automation into documentation processes
- Hear a few lessons learned during the development of the solution
PostgreSQL performance varies between releases. Every new version comes with added features and performance improvements that help the growing adoption of PostgreSQL. These ongoing developments and improvements drive us to plan upgrades of our PostgreSQL environments, but that's not always an easy task. A few common concerns when it comes to upgrading PostgreSQL are extended downtime, converting tables partitioned using triggers to declarative partitions, the search for the safest options to upgrade, and the effort required by the upgrade process itself. In this talk we'll discuss the variety of options available that can help you upgrade your PostgreSQL servers in the best possible way.
This talk includes:
1. The most significant performance improvements in recent PostgreSQL versions including PostgreSQL 11 and 10.
2. A summary of features that have been implemented with every new version since PostgreSQL 9.1.
3. What is it that you need to consider before upgrading your PostgreSQL server.
4. What are the options you have available to help you upgrade your PostgreSQL server.
5. What are the solutions available to minimize the downtime during upgrades.
6. A list of parameters you need to consider in particular when upgrading PostgreSQL to 10 or 11.
This talk covers some of the challenges we sought to address by creating a Kubernetes Operator for Percona XtraDB Cluster, as well as a look into the current state of the Operator, a brief demonstration of its capabilities, and a preview of the roadmap for the remainder of the year. Find out how you can deploy a 3-node PXC cluster in under 6 minutes, how you can handle providing self-service databases on the cloud in a cloud-vendor agnostic way, and ask the Product Manager questions and provide feedback on what challenges you'd like us to solve in the Kubernetes landscape.
This talk will focus on the self-managed nature of Uber's database monitoring and how we've leveraged our open source time series database M3DB to support massive multi-region scale and high cardinality monitoring.
We'll cover how we monitor applications, databases, and their interactions, and how we automatically set up application-specific dashboards and alerts. This includes the ability to alert on metrics like P99 latency and slow queries for a given application at the per-table and per-query level.
Of course, automated and fine-grained monitoring requires the ability to ingest, persist, and query massive amounts of high cardinality time series data. We'll talk about the architecture of M3DB and how we've leveraged it at Uber to scale our monitoring systems to billions of unique time series and tens of millions of data points per second.
We'll conclude the talk with an overview of our Prometheus and Kubernetes integrations, explaining how you can start leveraging M3DB for your own workloads. Finally, we'll give a brief overview of our plans to evolve M3DB into a general purpose, horizontally scalable event store.
Modern relational databases are all tied together by a well-defined standard for SQL, the latest version of which was published in 2016. But not all SQL implementations follow the standard, and even when they do, they often embellish it, and users tend to use the features they see available to them. As a result, migration between databases tends to be a challenging task. In this talk I will give some examples of database migrations I have participated in over the last twenty years and what issues we ran into with each. I will also talk in a more general sense about:
- Database-only migration vs. application refactoring: which is likely to have the better ROI?
- Analysis tools to help perform migration, like Amazon's DMS, SQLines, and DBconvert
- Database-agnostic replication
- What it means to get support in the open source world, and how to get it.
All major considerations and decisions made by the MySQL Query Optimizer may be recorded in an Optimizer Trace. While EXPLAIN will show the query execution plan the Query Optimizer has decided to use, MySQL's Optimizer Trace will tell WHY the optimizer selected this plan.
This presentation will introduce you to the inner workings of the MySQL Query Optimizer by showing you examples with Optimizer Trace. We will cover the main phases of the MySQL optimizer and its optimization strategies, including query transformations, data access strategies, the range optimizer, the join optimizer, and subquery optimization. We will also show how optimizer trace gives you insight into the cost model and how the optimizer does cost estimations.
By attending this presentation you will learn how you can use information from Optimizer Trace to write better performing queries. The presentation will also cover tools that can be used to help process the vast amount of information in an optimizer trace.
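As a taste of the workflow covered, the optimizer trace is enabled per session and read back from `INFORMATION_SCHEMA`; the traced query below is just a placeholder example:

```sql
SET optimizer_trace = "enabled=on";
SELECT * FROM t1 JOIN t2 ON t1.a = t2.a WHERE t1.b > 10;  -- the statement to trace
SELECT TRACE FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;      -- JSON trace of the decisions
SET optimizer_trace = "enabled=off";
```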
Most enterprises today own information of critical value, such as intellectual property, customers' personal data, or private financial data. This type of data should never be exposed to unauthorized malicious access.
Our session covers the best security practices for a MariaDB deployment, the latest security-related features in MariaDB Server, as well as general information related to potential threats in enterprise systems and our recommended defense mechanisms.
Subjects covered in this session:
- Potential threats and protection mechanisms
- Secure installation with mysql_secure_installation
- At-rest and in-transit data encryption
* MariaDB TLS support
* Securing client-server communication
* Securing data exchange in Replication and Galera Cluster
* Data at Rest and Binlog Encryption
- User Management best practices
* Password validation plugins
* User Account Locking
* Expiration of User Passwords
* Blocking user accounts with --max-password-errors
- External authentication with PAM and Kerberos
- Role-based Access Control
- Monitoring activity using the MariaDB Audit Plugin
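A few of the topics above map directly onto server configuration. The excerpt below is an illustrative sketch only: the paths and values are example assumptions, not recommendations for every deployment.

```ini
# Illustrative my.cnf excerpt (MariaDB 10.4-era options)
[mariadb]
# In-transit encryption (TLS)
ssl_cert = /etc/my.cnf.d/certificates/server-cert.pem
ssl_key  = /etc/my.cnf.d/certificates/server-key.pem
ssl_ca   = /etc/my.cnf.d/certificates/ca.pem

# Block an account after repeated authentication failures
max_password_errors = 10

# Activity monitoring via the MariaDB Audit Plugin
plugin_load_add      = server_audit
server_audit_logging = ON
```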
Redundancy and high availability are the basis for all production deployments. With MongoDB, this can be achieved by deploying a replica set. In this talk we'll explore how MongoDB replication works and what the components of a replica set are. Using examples of wrong deployment configurations, we will highlight how to properly run replica sets in production, whether on-premises or in the cloud.
- How MongoDB replication works
- Replica set components/deployment topologies
- Examples of wrong deployment configurations
- Hidden nodes, Arbiter nodes, Priority 0 nodes
- Availability zones and HA in a single region
- Monitoring replica sets status
GTIDs were introduced to solve replication problems and improve database consistency.
When transactions accidentally occur on a replica, this introduces GTIDs on that replica that don't exist on the master. If, on a master failover, this replica becomes the new master, and the binlogs containing the errant GTIDs have already been purged, replication breaks on the replicas of this new master, because the missing GTIDs can't be retrieved from its binlogs.
This presentation will talk about GTIDs: how to detect errant GTIDs on a replica (before the corresponding binlogs are purged) and how to look up the corresponding transactions in the binlogs. I'll give some examples of transactions that could happen on a replica without originating from the primary node, explain how this is possible, and share some tips on how to avoid it.
Basic understanding of MySQL database replication is assumed.
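To make the detection idea concrete, the sketch below models errant-GTID detection as plain set arithmetic: GTIDs executed on the replica but absent from the master are errant. This is a simplified model for illustration; on real servers you would compare the two servers' `@@gtid_executed` values, e.g. with MySQL's `GTID_SUBTRACT()` function.

```python
def parse_gtid_set(gtid_set):
    """Parse 'uuid:1-5,uuid2:1-3' (intervals may repeat as uuid:1-5:7-9)
    into {uuid: set of transaction numbers}."""
    result = {}
    for part in gtid_set.replace("\n", "").split(","):
        if not part:
            continue
        uuid, _, ranges = part.partition(":")
        txns = set()
        for rng in ranges.split(":"):
            lo, _, hi = rng.partition("-")
            txns.update(range(int(lo), int(hi or lo) + 1))
        result[uuid] = txns
    return result

def errant_gtids(replica_executed, master_executed):
    """Return {uuid: sorted txns} present on the replica but not on the master."""
    replica = parse_gtid_set(replica_executed)
    master = parse_gtid_set(master_executed)
    return {uuid: sorted(txns - master.get(uuid, set()))
            for uuid, txns in replica.items()
            if txns - master.get(uuid, set())}
```

Run before the master's binlogs are purged, a non-empty result pinpoints exactly which transactions to hunt down in the replica's binlogs.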
From the very beginning, TiDB was designed with the combination of Hybrid Transactional and Analytical Processing (HTAP) workloads in mind. However, since TiKV stores data in a row-oriented fashion, you may have been left wondering how well suited it is for analytics.
In this talk, I will introduce TiFlash, a native extension of TiDB that offers column-oriented storage to speed up heavy-duty OLAP queries. I will then go over the architecture and some of the design decisions, and show how you will be able to use it for your next-generation data processing needs.
In this presentation, we'll cover the following areas:
1. RDBMS history and landscape (Oracle and MySQL) at PayPal
2. Current and future use cases of MySQL in PayPal
-- back office
-- internally facing applications
-- 3rd party applications
-- PayPal site applications
3. PayPal's effort to make MySQL a first-class data store. We'll cover the active/active and active/passive architectural choices, managing a large number of application-to-database connections, enforcing security, backup and recovery, etc.
4. Outlook of MySQL in PayPal
In the past few years, Postgres has advanced a lot in terms of features, performance, and scalability for many core systems. However, one of the problems that many enterprises still complain about is that database size increases over time, which is commonly referred to as bloat. EnterpriseDB is leading a community effort to build a new storage system for Postgres called zheap, which will provide better control over bloat. This session will discuss:
* The objectives of the initiative including better control over â€œbloatâ€, eliminating write amplification in most cases and reduction in on-disk storage size of Postgres databases
* Technical architecture of the new storage system including contrasting the new design with the current implementation in Postgres today
* Current status of the project, including the introduction of a new storage management interface, undo handling, and when zheap will be released as part of Postgres.
Near the end of 2018, Amazon announced they would be completely off Oracle by the end of 2019, replacing Oracle's database products with their own AWS-based services. Interestingly, Amazon's move to run entirely on its own, internally operated data services is consistent with a desire voiced by many large companies we've spoken to over the past 18 months. A major impediment for these companies to move forward, however, is that they want their databases to operate in their own data centers, be open source, and be deployable and operable as a Cloud-Native, Database-as-a-Service (DBaaS) offering. How can a company deploy and operate data services that satisfy the requirements they currently address with their proprietary database portfolio? In Amazon's case, they are moving forward by replacing proprietary database software with a combination of AWS DynamoDB and Aurora for transactions. For data warehousing, Amazon is using a combination of AWS Redshift and Athena. Are there existing alternative open-source projects for each of these components? In the case of DynamoDB and Aurora, yes: the Apache Cassandra and MySQL/MariaDB projects can be used for this purpose. For data warehousing, the PrestoDB and Apache Impala projects can be used in place of Athena and Redshift. All of these depend on a combination of the following additional data services: local caching, event streams, and a REST-based object store. Enter Rockset's RocksDB-Cloud. For the past year, we have been working to enable MariaDB and Percona's MySQL distribution to extend their respective distributions of Facebook's MyRocks storage engine. Each is configured to load the RocksDB-Cloud library, which is binary-compatible with RocksDB, extending it with support for local caching as well as maintaining a consistent image over an event stream and an Object Store "bucket." An instance of MariaDB/MySQL operates against local storage (aka the "cache").
The instance's WAL is mapped onto an event stream, and a consistent copy of its LSM-tree is maintained in the bucket. In 2019 we are looking to extend this approach to Cassandra and Postgres using Facebook's Rocksandra and the PGRocks projects, respectively. Why Intel? The local cache and event stream capabilities of the architecture described above map nicely onto Intel's upcoming Optane DC Persistent Memory product. The use of a REST-based object store affords many opportunities to cost-optimize warm data that needs to remain online but is accessed with decreasing frequency over time. As for data warehousing, AWS's Redshift and Athena exploit Amazon's S3 object store. The idea here is similar in that the REST-based object store presents an S3-compatible interface to Impala and PrestoDB and exploits the solution's local caching capability. What other technologies enable this solution? First and foremost is the availability of low-cost, relatively low-latency, high-bandwidth networking that can be employed as a scalable cluster interconnect with full bisection bandwidth. Secondly, an orchestration/scheduling framework based on Kubernetes (k8s) has emerged as a key enabler. We combine k8s with cluster-wide software-defined network (SDN) and software-defined storage (SDS) implementations to fully exploit the resources available within the physical cluster. Finally, we employ a Vitess-based Database-as-a-Service (DBaaS) framework to manage databases operating over the cluster. In this talk, we'll briefly review the architecture. We'll then discuss the current implementation with particular focus on the trade-offs we made for the local cache, the WAL/event-stream, and the REST-based object store. We'll end the talk with a discussion of plans for 2019 and beyond.
Have you struggled with identifying issues in MySQL?
Come listen to how Verisure experienced an issue that was identified and resolved using PMM (Percona Monitoring and Management). Verisure was able to find the offending query/tuning parameter, make modifications, and then observe the impact over time thanks to PMM.
As more and more people move to PostgreSQL from Oracle, a pattern of mistakes is emerging. They can be caused by the tools being used or simply by not understanding how PostgreSQL differs from Oracle. In this talk, we will discuss the top mistakes people generally make when moving to PostgreSQL from Oracle and what the correct course of action is.
Optimizing MySQL performance and troubleshooting MySQL problems are two of the most critical and challenging tasks for MySQL DBAs. The databases powering your applications need to handle heavy traffic loads while remaining responsive and stable so that you can deliver an excellent user experience. Further, DBAs are also expected to find cost-efficient means of solving these issues. In this presentation, we will discuss how you can optimize and troubleshoot MySQL performance and demonstrate how Percona Monitoring and Management (PMM) enables you to solve these challenges using free and open source software. We will look at specific, common MySQL problems and review the essential components in PMM that allow you to diagnose and resolve them.
When your SQL query reaches the DBMS, it's the optimizer's job to decide how to execute it so that you get the result as fast as possible. To make this decision, the optimizer can examine the actual table data, but with multi-gigabyte and terabyte tables, the only practical solution is to use various data statistics that were collected in advance. The better the statistics and the more precisely they describe the actual data, the better the plan will be, because the optimizer's image of reality will be closer to the actual reality.
In this talk you'll learn which data statistics MariaDB and MySQL can collect, which statements collect them, how to tell the optimizer to use them (it won't necessarily do so automatically!), and how they can make your queries many times faster.
And, of course, when not to use indexes, because up-to-date statistics are enough.
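To see why statistics matter, here is a toy illustration in plain Python (not server internals; the data is made up): with only a distinct-value count, an optimizer must assume a uniform distribution, while frequency statistics of the kind a histogram stores capture skew and yield a far better row estimate.

```python
from collections import Counter

# Skewed column: value 0 dominates, values 1..9 are rare.
column = [0] * 9_100 + list(range(1, 10)) * 100  # 10,000 rows, 10 distinct values

def uniform_estimate(col, value):
    """Naive estimate: rows / distinct values, assuming uniformity."""
    return len(col) / len(set(col))

# Frequency statistics, akin to what a histogram bucket stores.
freq = Counter(column)

def histogram_estimate(value):
    """Estimate using the collected frequencies."""
    return freq.get(value, 0)

actual = column.count(0)
print(actual, uniform_estimate(column, 0), histogram_estimate(0))
# Actual: 9100 matching rows; the uniform guess is 1000, the
# frequency-based estimate is 9100 -- an order of magnitude better.
```

With a 9x underestimate, an optimizer might choose an index lookup where a table scan would be far cheaper; accurate statistics avoid exactly that kind of mistake.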
In this presentation, we will discuss how to create custom rules when the default rules are not enough for the application.
Have you needed to grant a more permissive rule to a user just because that user wanted to run a specific command?
Also, we will discuss how to use views to hide fields from users when we don't want them to read the entire collection. If you have concerns about security, come to this talk.
There are a few ways to take a backup. Among the most used tools are Percona XtraBackup, MariaBackup, and MySQL Enterprise Backup.
In this talk, the audience will get an in-depth overview of:
- Differences between the tools
- Comparison of features
- Which tool works with which MySQL/MariaDB flavor
- Supported Storage Engines
Technology is ever evolving. New open source databases pop up and fade away.
Even the role of a database administrator changes as DBaaS adoption becomes the status quo.
The burden of maintaining your skills and knowledge is on you.
Charles Duhigg, the author of The Power of Habit, writes "If you believe you can change - if you make it a habit - the change becomes real."
Through anecdotes of how a global managed services provider like Pythian enables learning for its employees, this session will provide strategies to make learning a habit.
These strategies will help you thrive in your career, whether you are an individual contributor, or a leader hoping to enable learning for your team.
Integrating the most suitable highly available, multi-master, non-clustered, freely accessible relational database management system (RDBMS) solution for large-scale environments is a challenging task; it resembles evaluating marathon champions trained by different Olympic coaches.
This presentation will show the analogies and differences of MySQL Group Replication and PostgreSQL Bi-Directional Replication (BDR), two freely accessible, highly available, multi-master, non-clustered RDBMS products that have been successfully employed in large-scale environments requiring high availability.
Both solutions have been demonstrated, through validated prototypes, to be successful, satisfying data integrity, reliability, and scalability requirements with slightly different strategies and different upcoming pathways.
When it comes to choosing a distributed streaming platform for real-time data pipelines, everyone knows the answer: Apache Kafka! And when it comes to deploying applications at scale without needing to integrate different pieces of infrastructure yourself, the answer nowadays is increasingly Kubernetes. However, as with all great things, the devil is truly in the details. While Kubernetes does provide all the building blocks that are needed, a lot of thought is required to truly create an enterprise-grade Kafka platform that can be used in production. In this technical deep dive, Viktor will go through the challenges and pitfalls of managing Kafka on Kubernetes as well as the goals and lessons learned from the development of the Confluent Operator for Kubernetes.
Loki is a horizontally-scalable, highly-available log aggregation system inspired by Prometheus. It is designed to be very cost effective and easy to operate, as it does not index the contents of the logs, but rather labels for each log stream.
Loki initially targets Kubernetes logging, using Prometheus service discovery to gather labels and metadata about log streams. By using the same index and labels as Prometheus, Loki enables you to easily switch between metrics and logs, enhancing observability and streamlining the incident response process, a workflow we have built into the latest version of Grafana.
In this talk we will discuss the motivation behind Loki, its design and architecture, and what the future holds. It's early days, but so far the response to the project has been overwhelming, with more than 4.5k GitHub stars and over 12 hours at the top spot on Hacker News.
Loki is an open source project, Apache licensed.
With the amount of stored data growing quickly, analytics has turned into a very intensive workload that hits our MySQL servers very hard.
During the last few years we have seen a bunch of new engines designed to digest big portions of data and help with analytical queries. Now there is a new contender in the arena that claims to offer MySQL Compatible, High Impact, and Open Source analytics.
During the course of this session we will go through each of these topics and try to answer:
- Compatible: how hard is it to take data out of MySQL, and how much lag may it suffer on a real workload?
- High Impact: near real-time query response and a low cost of entry with high ROI?
- Open Source: open source only?
Moreover, we will compare the solution against some other popular products already in the market like ColumnStore, Cassandra, Clickhouse and we'll see how TiDB behaves in comparison.
MySQL, as a database, thrives on a filesystem, but not all filesystems are equal! On Linux, ZFS is gaining attention, and there are good reasons for that, especially if you happen to also run MySQL. In this talk, I'll describe the main characteristics and features of ZFS and draw parallels with the architecture of InnoDB. From easy backups to compression and improved caching, you'll see that MySQL has a lot to gain from ZFS. I'll discuss the configurations of both MySQL and ZFS so they play well together and perform at their best. Finally, cost-saving MySQL/ZFS reference architectures using both bare-metal and cloud servers will be presented and reviewed.
Make the most of your ColumnStore columnar analytics engine!
Deep dive into best practices for columnar engines in general, the best use cases for columnar, and tips and tricks for both ColumnStore and other analytics engines.
Analytics without data ingestion
Indexing a no-index engine
How and when to use cross engine joins
Enabling low latency data ingestion
Row+column hybrid approaches
Optimizing data loads
Splitting columns for performance
FoundationDB is a distributed database designed to handle large volumes of structured data across clusters of commodity servers. It organizes data as an ordered key-value store and employs ACID transactions for all operations. The Document Layer is a stateless micro-server on top of FoundationDB that allows management of JSON documents at large scale. It exposes the traits of FoundationDB through the document data model, such as fully ordered documents, consistent indexes, and serializable transactions. It does all this while maintaining wire compatibility with the MongoDB® API. As the compatibility is done at the wire level, existing MongoDB® tools and drivers work seamlessly. In this talk, we explore how FoundationDB's core strengths play well with the document data model to make it an easy-to-use and reliable database.
Since version 8.0.14, MySQL supports LATERAL derived tables, sometimes called the "for each" loop of SQL. What are they? How do they work? Why do you need them? What can they do? How can you use them? Should you use them? What is all this talk about "for each" loops?
In this session we'll explain the concepts, look at examples of how and when to use this new feature, and talk about how LATERAL derived tables are optimized and executed by MySQL.
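As a taste of the "for each" semantics, here is a sketch in plain Python with hypothetical data: for every row of the outer table, the lateral derived table is re-evaluated with that row's columns in scope, as in finding the top earner per department.

```python
# "For each" semantics of a LATERAL derived table, sketched in plain Python.
# Rough SQL equivalent (hypothetical tables):
#   SELECT d.name, t.emp, t.salary
#   FROM departments d,
#        LATERAL (SELECT e.name AS emp, e.salary FROM employees e
#                 WHERE e.dept_id = d.id
#                 ORDER BY e.salary DESC LIMIT 1) t;

departments = [{"id": 1, "name": "Eng"}, {"id": 2, "name": "Sales"}]
employees = [
    {"dept_id": 1, "name": "Ada", "salary": 120},
    {"dept_id": 1, "name": "Lin", "salary": 140},
    {"dept_id": 2, "name": "Sam", "salary": 90},
]

result = []
for d in departments:  # for each row of the outer table...
    # ...the "lateral subquery" is re-evaluated with d's columns in scope.
    matches = [e for e in employees if e["dept_id"] == d["id"]]
    top = max(matches, key=lambda e: e["salary"], default=None)
    if top is not None:
        result.append((d["name"], top["name"], top["salary"]))

print(result)
# -> [('Eng', 'Lin', 140), ('Sales', 'Sam', 90)]
```

The point of the feature is that, without LATERAL, the derived table in the FROM clause could not reference `d.id` at all; the server is free to optimize away the naive row-by-row evaluation shown here.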
Would you like MySQL Protocol Compatibility and near real time dashboard analytics with that?
We are hoarding data in the hopes of extracting meaningful information from it to make smart business decisions. But we also resort to inflating costs just to satisfy this need (or not). While there have been a number of open source roll-your-own options, they have also been operationally costly and sometimes stressful to maintain, not to mention the shortcomings of each. If you have traditionally been storing your data in MySQL and have been enduring running your queries on the same set of servers, fret no more: ClickHouse might just be what you need. In this presentation, we will discuss how to deploy, design, and maintain a ClickHouse analytics platform that continuously reads data from your MySQL servers in near real time*.
* Depending on whether you have transformations or a complex schema.
Or perhaps I should say this is my PostgreSQL "Hello World".
In this presentation I will illustrate the ways, the accidents, and the surprises I had in my journey as a MySQL DBA implementing a spectacular solution with PostgreSQL as a total newbie.
I will start from the basics, covering my journey in:
- Basic configuration
- Security definition
- Creating a database and tables
- The magic behind indexes
As modern organizations have rapidly embraced containers in recent years, stateful applications like databases have proven tougher to transition into this brave new world than other workloads. When persistent state is involved, more is required both of the container orchestration system and of the stateful application itself to ensure the durability and availability of the data.
This talk will walk through my experiences trying to reliably run CockroachDB, the open source distributed SQL database, on Kubernetes, optimize its performance, and help others do the same in their heterogeneous environments. We'll look at what kinds of stateful applications can most easily be run in containers, which Kubernetes features and usage patterns are most helpful for running them, and many, many pitfalls I encountered along the way. Finally, we'll ponder what's missing and what the future may hold for running databases in containers.
Do you already run stock PMM in your environment and want to learn how to extend the PMM platform? Come learn about:
1. Dashboard Customizations
How to create custom dashboards from existing graphs, or build cross-server dashboards
2. External Exporters - Monitor any service, anywhere!
From adding an exporter and viewing the data in Data Exploration to deploying a working dashboard
3. Working with custom queries (MySQL and PostgreSQL)
Execute SELECT statements against your database and store the results in Prometheus
Build Dashboards relevant to your environment
4. Customizing Exporter Options
Enable deactivated functionality that applies to your environment
5. Using Grafana Alerting
How to set up channels (SMTP, Slack, etc)
How to configure thresholds and alerts
6. Using MySQL/PostgreSQL Data Source
Execute SELECT statements against your database and plot your application metrics
The Player Accounts team at Riot Games needed to consolidate the player account infrastructure and provide a single, global accounts system for the League of Legends player base. To do this, they migrated hundreds of millions of player accounts into a consolidated, globally replicated composite database cluster in AWS. This provided higher fault tolerance and lower-latency access to account data. In this talk, we discuss this effort to migrate eight disparate database clusters into AWS as a single composite MySQL database cluster replicated in four different AWS regions, provisioned with Terraform, and managed and operated with Ansible.
This talk will briefly overview the evolution of the player accounts services from legacy isolated datacenter deployments to a globally replicated database cluster fronted by our account services and outline some of the growing pains and experiences that got us to where we are today.
In this talk, we discuss the use of the open source MySQL Community Edition and Percona Server projects and Intel's Cascade Lake server platform as the primary building blocks for hosting a Database-as-a-Service (DBaaS) deployment. Data demonstrating the advantage of this configuration will be presented. Emphasis will be on deployment of an offering that provides developers with self-service, on-demand delivery of databases that run over shared infrastructure in a multi-tenant environment. Demand for DBaaS from a diverse span of market segments has driven the large cloud service providers to invest in the buildout of such offerings.
Today, these offerings are widely available from multiple cloud providers. Until recently, however, organizations wanting to provide DBaaS from within their own data centers were left without good options. This demand is set to be addressed with the emergence of several open source projects. A second, intersecting trend is the imminent release of Intel's next-generation Cascade Lake XEON platform and its support for byte-addressable, persistent memory via the Intel® Optane™ DC Persistent Memory product.
Our talk will dive into the intersection of these two trends, starting with an overview of the Cloud Native DBaaS model. Next, a concrete description of a deployment model using Percona's MySQL and MyRocks distributions will be introduced. This model will be supported by a two-fold, data-driven discussion on performance and density. First, a performance characterization for both the InnoDB and MyRocks storage engines is presented with the focus on comparing/contrasting the use of NVMe vis-a-vis Intel® Optane™ DC Persistent Memory. For the latter, we include Intel® Optane™ DC Persistent Memory volumes in fsdax as well as sector modes. Secondly, data on database instance density using a single Intel Cascade Lake server outfitted with NVMe vis-a-vis Intel® Optane™ DC Persistent Memory will be presented. The talk will conclude with a discussion of the results and how they have influenced our plans for further work in the MySQL CE/Percona Server open source projects going forward.
MySQL and MariaDB have a long list of old, potentially trivial bugs that are annoying if you hit them. No one's really bothered to fix them, so why not you? It doesn't take a large amount of C/C++ knowledge to pick out an old bug and build and test a MySQL/MariaDB patch to fix it.
I'll talk about what makes a sane choice of bug, how to use the existing mtr test framework to help, small test cases, and packaging up and submitting your changes.
By doing this, you'll pay it forward for all the positive MySQL/MariaDB experiences you've had.
MySQL JSON Support
Open source license rejections
MySQL Group Replication
Distros dropping MongoDB package
This year has seen a good deal of changes and challenges in the MongoDB Space. But where are things going? We are going to cover the most common questions I get asked even today.
What are my risks if I keep using it?
Is MySQL/Postgres Proxy, replication, and HA finally catching up?
Why is JSON support not the only thing to consider?
Will MongoDB lose my data?
Past these I will be discussing the ecosystem as a whole, and how I see the next few years shaping out. While this talk is most helpful for business leaders and architects, for DBAs and engineers it will also help you decide what to focus on in your career for the emerging technology future.
MySQL Shell is the new client for MySQL. It understands both the classic MySQL protocol and X Protocol, and it allows you to send JavaScript, Python, or SQL commands to the server. In this talk I will first show you how to configure the Shell for a nice, good-looking experience with MySQL, and show some of the basic commands and objects. After this brief overview I will show how it's possible to extend the Shell, and how I hacked it to create an Innotop clone inside the MySQL Shell. This session is a live demo with extra explanation, and the code will be shared during and after the presentation.
The goal of the talk is also to show the Community how to hack the Shell and to call for contributions and feature requests.