ProxySQL is an open source proxy for MySQL that provides HA and high performance with no changes to the application, using several built-in features and integration with clustering software. Those are only a few of the features you'll learn about in this hands-on tutorial.
We at Rentalcars.com have been using Couchbase since 2013. In this tutorial, I will show how it works and how you can perform various administrative operations without any downtime.
The tutorial will be a hands-on lab, carried out in a VirtualBox environment, sources for which will be provided at the beginning of the tutorial. Attendees will be expected to be equipped with a laptop that is running a 64-bit OS.
The session will be laid out like this:
1. Concept of an Engagement Database
2. Couchbase Architecture Overview
3. Planning a new cluster
4. Installing a new Cluster
5. Expanding an existing Cluster
6. Multi-Dimensional Scaling
7. Designating specific roles to nodes
8. Data Modelling & Working with Data
9. Creating Map-Reduce Views
10. Working with N1QL, the SQL super-set for Couchbase
11. Troubleshooting Exercises
InnoDB is the most commonly used storage engine for MySQL and Percona Server for MySQL. It is the focus of most of the storage engine development by the MySQL and Percona Server for MySQL development teams.
In this tutorial, we will look at the InnoDB architecture, including new feature developments for InnoDB in MySQL 5.7 and Percona Server for MySQL 5.7. We will explain how to use InnoDB in your database environment to get the best application performance and provide specific advice on server configuration, schema design, application architecture and hardware choices.
This tutorial has been updated from previous versions to cover new MySQL 5.7 and Percona Server for MySQL 5.7 InnoDB features.
Laurie Coffin welcomes everyone to the Percona Live Europe Open Source Database Conference.
Join Peter Zaitsev, CEO of Percona, as he discusses the growth and adoption of open source databases and tools and Percona’s commitment to remaining an unbiased champion of the open source database ecosystem. At Percona, we see a lot of compelling open source projects and trends that we think the community will find interesting. Following Peter’s keynote we will have a round of lightning talks from projects that we think are stellar and deserve to be highlighted.
How can you optimize database performance if you can’t see what’s happening? Percona Monitoring and Management (PMM) is a free, open source platform for managing and monitoring MySQL, MariaDB, MongoDB and ProxySQL performance. PMM uses Metrics Monitor (Grafana + Prometheus) for visualization of data points, along with Query Analytics, to help identify and quantify non-performant queries and provide thorough time-based analysis to ensure that your data works as efficiently as possible. Michael Coburn will provide a brief demo of PMM.
The inability to control the traffic sent to MySQL is one of the worst nightmares for a DBA. Scaling out and high availability are only buzzwords if the application doesn't support such architectures. ProxySQL is able to create an abstraction layer between the application and the database: controlling traffic at this layer hides the complexity of the database infrastructure from the application, allowing both HA and scale out. The same layer is able to protect the database infrastructure from abusive traffic, acting as a firewall and cache, and rewriting queries.
Cloudflare operates multiple DNS services that handle over 100 billion queries per day for over 6 million internet properties. We collect and aggregate logs for these queries for customer analytics, DDoS attack analysis and ad-hoc debugging. I'll briefly cover how we securely and reliably ingest these log events, and use ClickHouse as an OLAP system to both serve customer real-time analytics and other queries.
Drawing from our own experience at GitHub, we argue that open sourcing your database infrastructure/tooling is not only a good decision but a smart business decision that may reward you in unexpected ways. Here are our observations.
A major objective of creating MyRocks at Facebook was replacing InnoDB as our main storage engine, with more space optimisations, and without big migration pains. We have made good progress and we extended our goals to cover more use cases. In this keynote, we will share our MyRocks production deployment status and our MyRocks development plans.
From its humble beginnings in 2012, the Prometheus monitoring system has grown a substantial community with a comprehensive set of integrations. This talk will provide an overview of the core ideas behind Prometheus and its feature set.
This talk is intended to give an overview of the feature differences between Percona Server for MySQL, Oracle MySQL Community Edition and MariaDB Server. We'll also dive into how some of these features are useful to a database developer or DBA in troubleshooting issues, increasing query performance and improving reliability.
ProxySQL is a very powerful platform that allows us to manipulate and manage our connections and queries in a simple but effective way.
Historically, MySQL has fallen short in sharding capabilities. This significant gap often caused developers to implement sharding at the application level, or DBAs/SAs to move on to another solution. ProxySQL has an elegant and simple solution that allows us to implement MySQL sharding capabilities without the need to perform significant (or any) changes in the code.
This brief presentation will illustrate how to successfully configure and use ProxySQL to perform sharding. We will cover everything from very simple approaches based on connection user/IP/port to complicated ones that need to read values inside queries.
Please note this presentation requires at least some basic ProxySQL knowledge.
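To make the routing idea concrete, here is a minimal Python sketch of rule-based shard selection: rules are checked in order, and a connection is sent to a hostgroup based on its user, its port, or a value found inside the query text. This is an illustration of the concept only, not ProxySQL's implementation or configuration syntax; the `Rule` class, the rule values, the hostgroup numbers and the `/* shard=eu */` comment convention are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """One hypothetical routing rule; a field left as None matches anything."""
    hostgroup: int
    username: Optional[str] = None
    port: Optional[int] = None
    pattern: Optional[str] = None  # regex searched for in the query text


# Hypothetical rules, evaluated in order; the first matching rule wins.
RULES = [
    Rule(hostgroup=10, username="app_shard_a"),           # shard by user
    Rule(hostgroup=20, port=6034),                        # shard by port
    Rule(hostgroup=30, pattern=r"/\*\s*shard=eu\s*\*/"),  # shard by query content
]
DEFAULT_HOSTGROUP = 0


def route(username: str, port: int, query: str) -> int:
    """Return the hostgroup for a query, or the default if no rule matches."""
    for rule in RULES:
        if rule.username is not None and rule.username != username:
            continue
        if rule.port is not None and rule.port != port:
            continue
        if rule.pattern is not None and not re.search(rule.pattern, query):
            continue
        return rule.hostgroup
    return DEFAULT_HOSTGROUP
```

The user and port rules need no knowledge of the SQL being sent, which is why they are the simple end of the spectrum; the pattern rule has to inspect every query, which is the more complicated case.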
Docker is becoming more mainstream and adopted by users as a method to package and deploy self-sufficient applications in primarily stateless Linux containers. It's a great toolset on top of OS-level virtualization (LXC, a.k.a. containers) and plays well in the world of microservices.
However, Docker containers are transient by default. If a container is destroyed, all data created is also lost. For a stateful service like a database, this is a major headache to say the least.
There are a number of ways to provide persistent storage in Docker containers. In this presentation, we will talk about how to set up a persistent data service with Docker that can be torn down and brought up across hosts and containers.
We will touch upon orchestration tools, shared volumes, data-only-containers, security and configuration management, multi-host networking, service discovery and implications on monitoring when we move from host-centric to role-centric services with shorter life cycles.
Elasticsearch is a distributed, RESTful search and analytics engine built on top of Apache Lucene. After the initial release in 2010 it has become the most widely used full-text search engine, but it is not stopping there.
The revolution happened and now it is time for evolution. We dive into the following questions:
How did numbers and metrics become first class data in a search engine?
How do shard allocations (which were hard to debug even for us) work and how can you find out what is going wrong with them?
How can you search efficiently across clusters and why did it take two implementations to get this right?
What are the current problems around resiliency and strictness, and what are their solutions?
Why are types finally disappearing and how are we avoiding upgrade pains as much as possible?
How can upgrades be improved so that nobody is stuck on old or even ancient versions?
Attendees learn both about new and upcoming features as well as the motivation and engineering challenges behi
Getting data out of your traditional database stores into another database type can be problematic, especially if you want to do it in real-time.
Using Tungsten Replicator it's possible to move data from your existing Oracle and MySQL stores into a variety of targets, including Elasticsearch, Kafka and Hadoop.
In this session we'll look at the mechanics of each process and how to combine the core replication technology with filters and deployment models to enable complex data movement and concentration.
Replication is one of the features that made MySQL a popular RDBMS. It is easy to set up, and by default it allows read-write access on both the master and slave. It also makes it easy to create complicated deployments, such as circular replication.
By default, MySQL Replication is asynchronous, but it has a semi-sync replication plugin. Since version 5.7 it supports multi-master slaves. All these features enable a quick start, but there is also a huge risk of making the wrong decision.
In this session, I will demonstrate why one or another replication solution can fail with data loss or perform slowly. I will show methods that will help you to diagnose and resolve these issues.
This session uses built-in, then command-line tools, because knowledge of how they work is essential for effective troubleshooting.
Slack is embarking on a major migration of the MySQL infrastructure at the core of our service to use Vitess' flexible sharding and management instead of our simple application-based shard routing and manual administration. This effort is driven by the need for an architecture that scales to meet the growing demands of our largest customers and features, under the pressure to maintain a stable and performant service that executes billions of MySQL transactions per hour. This talk will present the driving motivations behind the change, why Vitess won out as the best option, and how we went about laying the groundwork for the switch. Finally, we will discuss some challenges and surprises (both good and bad) found during our initial migration efforts, and suggest some ways in which the Vitess ecosystem can improve that will aid future migration efforts.
Prometheus is an open source time series database and monitoring system. It is very simple to use and has no external dependencies. It has a powerful query language to retrieve and evaluate metrics.
However, the Prometheus storage engine is designed mainly for keeping short-term data. This gap can be filled by the InfluxDB time-series database, which can store the long-term data and perform downsampling.
Learn the pros and cons of those time-series databases and how to hook up Prometheus with InfluxDB for the long-term storage of your metrics, maintain retention and get trends in system performance with Grafana.
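As a rough illustration of what downsampling means for long-term storage, the Python sketch below averages raw samples into fixed-width time buckets, so that older data can be kept at a coarser resolution. It shows the general idea only and is not how InfluxDB or Prometheus implement it; the `downsample` function and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def downsample(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets.

    Returns a list of (bucket_start, mean_value) pairs sorted by bucket start.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of the bucket it falls into.
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical raw samples scraped every 15 seconds: (unix_seconds, value).
raw = [(0, 1.0), (15, 3.0), (30, 5.0), (60, 10.0), (75, 20.0)]
per_minute = downsample(raw, 60)
print(per_minute)  # [(0, 3.0), (60, 15.0)]
```

Averaging is only one possible aggregate; keeping the minimum and maximum per bucket as well preserves spikes that a mean would smooth away.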
MongoDB and Elasticsearch are both NoSQL "databases", or more correctly NoSQL data stores that are often compared and contrasted on a head-to-head basis.
But if comparing that way, one could easily miss out on the opportunity to use both together as individual and independent data stores that serve specific purposes to deliver the best overall solution for your application flow and performance needs.
In this talk, Kimberly will discuss the overall aspects of each technology, best use cases, the strengths and weaknesses of each, scaling, and provide examples for each with details for the underlying technology with architectural information and basic functioning of these two data stores.
Join her as she offers opinions on when best to use them separately as independent data stores, and when to combine the two to get the absolute performance often needed by today's applications and the large amounts of data they require.
Come hear how performance improves 3x-10x in the latest Percona XtraDB Cluster 5.7, along with other security changes!
We will highlight:
* Dramatically improved OLTP concurrency throughput under multiple threads
* The value of disabling binlogs in PXC 5.7, and the tradeoffs to consider
* How Percona XtraDB Cluster performance compares to MariaDB Cluster and InnoDB Group Replication
* Integration with ProxySQL using proxysql-admin
Grafana is the leading open-source graph and dashboard builder for visualizing time series and is a great tool for monitoring databases. Learn how to create dashboards and graphs in Grafana and how to use them to gain insight into the behaviour of your systems.
I will be demoing the new MySQL data source in Grafana that can be used to visualize any data that you have in your database and I will round off the session with a sneak peek of the upcoming major release - Grafana 5.0.0.
The presentation will be a real-life study on how we utilize ProxySQL for connection pooling, database failover and load balancing the communication between our (third party) PHP-application and our master-master MySQL-cluster.
Also, we will show monitoring and statistics using Percona Monitoring and Management (PMM).
ProxySQL is a very powerful tool, with extended capabilities, and we want to show that it's possible to utilize this to gain functionality (seamless database backend switch) and correct problems (applications missing connection pooling).
SRE is becoming quite the ubiquitous term, but what about DBRE? Let's explore the paths to this craft and how to culturally evolve and support it. Focus on organizational scale, self-service, and force multipliers in recoverability, observability, availability, security, release management, and infrastructure.
- Comparison of old and new paradigms
- What is DBRE and reliability engineering
- The path to DBRE as a career
Paradigm shifts causing this
- Polyglot persistence
- Cloud and virtualization
- Infrastructure as code
- Continuous delivery
- Protect the data
- Self-service for scale
- Elimination of toil
- Databases are not special snowflakes
- Reduce the barriers between software and DB ops
DBRE core competencies
- Datastore decision making
- Data distribution
- Failover and availability
- Scaling patterns
- Build guard rails
- Build for services and people
Load balancing MySQL connections and queries using HAProxy has been popular in the past years. Recently however, we have seen the arrival of MaxScale, MySQL Router, ProxySQL and now also Nginx as a reverse proxy.
For which use cases do you use them and how well do they integrate in your environment? This session aims to give a solid grounding in load balancer technologies for MySQL and MariaDB.
We will review the main open-source options available: from application connectors (php-mysqlnd, jdbc) and TCP reverse proxies (HAProxy, Keepalived, Nginx) to SQL-aware load balancers (MaxScale, ProxySQL, MySQL Router).
We will also look into the best practices for backend health checks to ensure load balanced connections are routed to the correct nodes in several MySQL clustering topologies. You'll gain a good understanding of how the different options compare, and enough knowledge to decide which ones to explore further.
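A backend health check of the kind described above boils down to a decision function: given what is known about a node, may a connection be routed to it? The Python sketch below is one hedged way to express that for a simple primary/replica topology. The `BackendStatus` fields, the lag threshold and the `can_serve` function are assumptions made for illustration and do not correspond to the checks of any particular load balancer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackendStatus:
    """Hypothetical snapshot of one backend, as a health check might collect it."""
    reachable: bool
    read_only: bool                           # True on replicas
    replication_lag_seconds: Optional[float]  # None if unknown or not a replica


MAX_LAG_SECONDS = 10.0  # hypothetical threshold for serving reads from a replica


def can_serve(status: BackendStatus, wants_write: bool) -> bool:
    """Decide whether a connection may be routed to this backend."""
    if not status.reachable:
        return False
    if wants_write:
        # Writes must only ever reach a writable node.
        return not status.read_only
    if status.read_only:
        # A replica serves reads only if its lag is known and acceptable.
        lag = status.replication_lag_seconds
        return lag is not None and lag <= MAX_LAG_SECONDS
    return True  # a writable primary can also serve reads
```

Treating an unknown lag as unhealthy is a deliberately conservative choice: a replica whose replication has stopped often reports no lag value at all.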
In this presentation I take a deep dive into the open source storage engine inside InfluxDB. More than just a single storage engine, InfluxDB is two engines in one: the first for time series data and the second, an index for metadata. I'll delve into the optimizations for achieving high write throughput, compression and fast reads for both the raw time series data and the metadata.
In this session we will discuss a new way of compressing data in the Percona XtraDB storage engine, compare it with existing InnoDB table compression from both a performance and data size point of view, and show how we can significantly increase the compression ratio using predefined dictionaries. This talk will cover the most typical usage scenarios for DBAs and reveal some design internals for developers.
In this talk you will discover how you can fully manage your MySQL setup in Puppet, including creation of users, databases, and even replication setup.
Puppet is one of the leading open source automation systems, but somehow lots of people are still wary of using it to manage their databases. We will show how we are using it internally and for our customers to maintain multiple identical MySQL stacks across environments over time.
The topic of this presentation is how to use Elasticsearch (ES) in order to speed up otherwise slow MySQL analytic queries. For this, we will look at an actual business case for a ride-sharing application, with the actual trip data stored in MySQL.
First, we will examine the ES requirements and explain how to import this data into ES and sync both databases.
Some specifics of Elasticsearch will be discussed, notably the Query DSL as well as the differences with MySQL and how to convert SQL queries to the Elasticsearch DSL.
Then, we will show how to design analytic queries in ES using aggregation and geolocation, giving some real world examples (e.g. average earnings per driver with pickups in the Brooklyn area) and performance comparisons with the equivalent queries in MySQL.
Finally, we will discuss how to scale this implementation and make it highly available using ES clustering and sharding features.
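The "average earnings per driver with pickups in the Brooklyn area" example can be sketched as an Elasticsearch Query DSL body, built here as a plain Python dictionary. The field names (`pickup_location`, `driver_id`, `earning`) and the bounding-box coordinates are assumptions made for illustration; the real schema from the talk is not given in this abstract.

```python
import json


def avg_earning_per_driver_query(top=40.74, left=-74.05, bottom=40.57, right=-73.83):
    """Build a Query DSL body: average earning per driver for pickups in a box.

    Roughly comparable to the SQL
        SELECT driver_id, AVG(earning) FROM trips WHERE <pickup in box> GROUP BY driver_id
    The default coordinates are only an approximate box around Brooklyn.
    """
    return {
        "size": 0,  # return aggregation results only, no individual trips
        "query": {
            "bool": {
                "filter": {
                    "geo_bounding_box": {
                        "pickup_location": {  # hypothetical geo_point field
                            "top_left": {"lat": top, "lon": left},
                            "bottom_right": {"lat": bottom, "lon": right},
                        }
                    }
                }
            }
        },
        "aggs": {
            "per_driver": {
                "terms": {"field": "driver_id"},  # GROUP BY driver_id
                "aggs": {"avg_earning": {"avg": {"field": "earning"}}},
            }
        },
    }


body = json.dumps(avg_earning_per_driver_query(), indent=2)
print(body)
```

The resulting JSON would be sent as the body of a search request against the index holding the trip documents. Note that a `terms` aggregation returns only the top buckets by default, so covering every driver needs a larger `size` or a paginated aggregation.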
You have been asked to administer a Cassandra installation. In this talk, I will take you through basic Cassandra operations from the point of view of a MySQL DBA.
In the beginning, I was simply looking for "where is the equivalent of a .my.cnf?". Over time, that expanded to "how can I be sure this is healthy?", "Why are the users complaining about timeouts connecting?", and "How do I make this cluster replicate into another datacenter?". I will cover all of this and more.
Hopefully, my frustrations will save you some time and pain.
The EMBL-EBI is an international scientific organisation using open source and commercial technologies: in the database backend, open source SQL and NoSQL DBMSs such as PostgreSQL, MySQL and MongoDB are in use alongside the well consolidated Oracle RDBMS.
Throughout the years the overall number of instances has grown notwithstanding a program of retirement of any instance reaching the end of its lifecycle: on this basis the investigation of the costs related to support, maintenance, operations across all database technologies is of strategic importance for the institute.
We will discuss the activity we conducted over the last 12 months that led to the migration of ca. 30 Oracle instances to suitable open source alternative platforms. Some POC and side investigations have been dedicated to solving specific issues.
We will illustrate the overall approach, use cases, tools, technical challenges, adopted solutions and lessons learned, in the context of the evolving EMBL-EBI infrastructure.
Did you know that Percona Monitoring and Management (PMM) ships with support for AWS RDS and Aurora out of the box? It does!
In this session we'll discuss:
* How to configure PMM (metrics and queries) against RDS and Aurora using an EC2 instance
* How to configure PMM against CloudWatch metrics
* Configuration parameters for RDS/Aurora/CloudWatch for maximum visibility in PMM
* Interesting components of the Metrics Monitor interface - what are the dashboards, what are the key graphs
* Overview of Query Analytics (QAN) against RDS/Aurora
On the roadmap for Q4 2017 is agent-less monitoring of RDS/Aurora and tighter CloudWatch/Prometheus integration!
Prometheus is a monitoring system with a custom time series database at its core. Prometheus 2.0 features the 3rd major iteration of this database. This talk will look at how it has evolved, and how it fits into the goal of doing metrics-based monitoring.
This talk will cover the Amazon Migration Tool: its possibilities, the potential pitfalls to consider prior to migrating, and a high-level overview of its functionality.
Are you running on Amazon, or planning to migrate there? In this talk, we are going to cover the different technologies for running databases on Amazon Cloud environments.
We will focus on the operational aspects, benefits and limitations for each of them.
There are lots of database systems out there, and every year their number grows. Many of them were created with a great idea in mind. Sharding a relational database, implementing git capabilities in a database, designing data and queries around the actor model, building a single piece of software which is both a database and an application server... these are just some of these ideas. I will talk about my favourite ones.
In the last two decades, both researchers and vendors have built advisory tools to assist database administrators (DBAs) in various aspects of system tuning and physical design. But these tools are incomplete because they still require humans to make the final decisions about any changes to the database and are reactionary measures that fix problems after they occur. What is needed is a truly "self-driving" database management system (DBMS) that is able to optimize the system for the current workload and predict future workload trends so that the system can prepare itself accordingly. It enables new optimizations that are not possible today because the complexity of managing these systems has surpassed the abilities of human experts. In this talk, I present Peloton, the first self-driving DBMS that we are building at CMU. Peloton's autonomic capabilities are now possible due to advancements in deep learning, as well as improvements in hardware and adaptive database architectures.
This lightning talk is a technical pitch for why you should use OpenPOWER for MySQL/MariaDB/Percona Server or even MongoDB and, if you want, PostgreSQL. It explains how a fully open source stack can be built from the machine up to the application. It will also explain why POWER is a better architecture for running bigger databases.
000webhost handles millions of user queries on close to a million unique databases. In this talk we will present the obstacles we encountered at such scale. We will show why we moved away from the traditional webserver->dbserver to a more dynamic architecture using HAProxy, ProxySQL and MariaDB in LXC containers. We will present our model, query routing logic and the outcome from the collaboration with ProxySQL developer Rene.
Laurie Coffin welcomes everyone to the Percona Live Europe Open Source Database Conference.
You may know Continuent Tungsten for our highly advanced MySQL replication tool or for our state-of-the-art MySQL clustering solution, Tungsten Clustering. Our solutions are used by leading SaaS vendors, e-commerce, financial services and telco customers. But there are more, many more, Tungsten deployments out there. Tungsten Replicator is also an Oracle replication solution, the "Oracle GoldenGate without the price tag". Tungsten Replicator can also be used for real-time data loading into analytics, from MySQL and Oracle into Cassandra, Elasticsearch, Kafka, Redshift and Vertica. And there could be more... How about Tungsten Backup? Using the power of the Tungsten Transaction History Log (THL), we may create the ultimate continuous backup solution with flexible point-in-time recovery. Would you be interested, especially for free? What about the ultimate proxy, a stand-alone Tungsten Connector? To support our Clustering solution, Continuent has developed one of the most advanced proxies available. Could it be time to unleash our Connector for public use?
A Q&A with the Authors of the newly released O'Reilly title: Database Reliability Engineering. Join Laine and Charity as they discuss their new book, "Database Reliability Engineering", which focuses on designing and operating resilient database systems and uses open-source engines such as MySQL, PostgreSQL, MongoDB, and Cassandra as examples throughout.
Pepper.com is purposely different than other platforms that list daily deals. Around the clock, the community seeks and finds the best offers in fashion, electronics, traveling and much more. With 500 million page views, more than 25 million users and over 70,000 user-submitted deals per month across communities in the United States, Europe and Asia, Pepper has quickly become the largest community deal platform worldwide. The minute-by-minute back and forth with customers and the application – tracking new postings, rankings, messages, etc. – means database responsiveness and uptime are crucial to maintaining an excellent user experience. Pavel will describe how Pepper optimizes their database performance to make sure their web applications remain responsive and meet users’ expectations.
When Intel launched the Xeon Scalable Processors in July 2017 the database benchmark used was HammerDB. HammerDB is an open source graphical benchmarking tool that enables the comparison between both open source and commercial databases on multiple platforms for OLTP and Query based workloads.
This presentation takes a real-world example of comparing MariaDB with a commercial database on Linux on Intel to show how to understand the benchmarks used and how to tune, configure and present findings on both performance and cost in a clear and concise way to evaluate the move to an open source database platform.
Based on the findings, this session will share key learnings on current optimal platforms and storage technologies for databases, as well as the Intel focus on applying technologies such as FPGA, SSD and non-volatile memory to open source database acceleration.
Insights and previews will also be given into ongoing HammerDB development.
At the Wikimedia Foundation (host of Wikipedia and many other open collaborative projects) we work on a limited budget, donated by our many generous donors. Like many other companies that are not Facebook- or Google-sized, we have to do more with less, both in terms of budget and our small number of Ops, in order to serve the over 400 thousand requests per second and the 1200 million monthly users. We made several mistakes (and had a few successes) along the road regarding architecture and hardware decisions, especially for the database-distributed components, storage model, hardware chosen, server size, technology adoption, etc. Now we want to share those with you.
As service providers, one of our responsibilities is helping clients understand what causes contributed to a production downtime incident, and how to prevent them (as much as possible) from happening again. We do this with Incident Reports, and one common recommendation we make is to have a historical monitoring system in place. All our clients have point-in-time monitoring solutions in place, solutions that can alert them when a system is down or behaving in unacceptable ways. But historical monitoring is still not common, and we believe a lot of companies can benefit from deploying it.
In most cases, we have recommended Percona Monitoring and Management (PMM) as a good, open source solution for this problem. In this session, we will talk about the reasons why we recommend PMM as a way to prevent incidents, and also to investigate their possible causes when one has happened.
At LifeStreet we needed to scale our real time ad analytics platform to multiple petabytes. We evaluated and used a number of open source and commercial solutions, but they were not efficient enough or too expensive. When Yandex released ClickHouse as open source we quickly realized its potential, and started our implementation project. It was a long way but it finally worked out great.
In this presentation I will talk about our experiences from an application developer's viewpoint: what worked well and not so well, and what challenges we had to overcome. I will also share the best practices for building a large-scale platform based on ClickHouse.
PostgreSQL has had JSON support since 2012 with 9.2, and JSONB since 9.4, but is it fast enough to beat MongoDB? In this talk, we compare the performance of using schemaless documents with both PostgreSQL and MongoDB for high performance workloads.
Today, the world of IOT is still in a primordial stage: several vendors offer "platforms" hoping to cover all the aspects of IOT projects in an attempt to simplify the flow and management of IOT data.
FogLAMP is the effort of organizations active in IOT, IIOT (Industrial Internet of Things) and Fog Computing to provide a fully open source stack operating from the Edge and integrated with other Cloud and Enterprise solutions. FogLAMP's main objective is to simplify the management of IOT data, whether this data is simply stored and forwarded, or consumed and analyzed at the Edge.
We will talk about what FogLAMP is and how it works. We will explore the pluggable architecture and how the modularity of the product allows developers and architects to build IOT projects. We will also see FogLAMP in action in a demo with sensor data collected from the Edge, analyzed locally and pushed to OSIsoft PI System, which is used to collect, analyze and visualize time-series data.
Anyone looking for a high availability master-slave management solution for MySQL may come across ProxySQL and Orchestrator. This combination of products solves many problems, but still requires some manual labour when the configuration changes, when there is a network split and other scenarios. In this talk I will discuss the standard architecture, the solutions it provides and what it's missing. I will then share an automation solution developed at Wix, that solves those problems using Consul, to combine everything together.
We've migrated our platform (Kinja) from a datacenter-based approach to AWS, including migration of standalone MySQL hosts to RDS/Aurora.
I'd like to talk about our findings and the problems we hit during this transition, and share hands-on experience of how your thinking has to change when you decide to move to a managed database service, because it works quite differently from what you are used to.
I'd like to show you our best practices, some characteristics of Aurora, and a few of the utilities that we had to create to make daily operations possible.
ClickHouse is an open source column-oriented database management system capable of real time generation of analytical data reports using SQL queries.
Acting as a data bridge between MySQL protocol and ClickHouse protocol, ProxySQL now enables MySQL clients to execute queries in ClickHouse through it.
In this session we will show how to configure ClickHouse as a backend for ProxySQL, and how a MySQL client (for example in PHP) will be able to execute data reports in ClickHouse.
Instead of using ETL Tools, which consume tons of memory on their own system, you will learn how to do ETL jobs directly in and with a database: PostgreSQL.
PostgreSQL's Management of External Data (SQL/MED) support is also known as Foreign Data Wrappers (FDW). With FDW, there is almost no limit to the external data that you can use directly inside a PostgreSQL database.
This talk will show you how to use them with examples accessing several data sources.
When storing time-series data, many developers start with some well-trusted system like Postgres, but as their data hits a certain scale, give up its query power and ecosystem by migrating to a NoSQL or other "modern" time-series architecture.
In this talk, I describe why this trade-off is unnecessary, and how we've built TimescaleDB, an efficient, scalable time-series database engineered up from Postgres. The nature of time-series workloads--appending data about recent events--presents different demands than transactional (OLTP) workloads. We've architected our time-series database to take advantage of and embrace these differences.
TimescaleDB improves insert rates by 15X over Postgres, even on a single node. By right-sizing chunks, it avoids the "performance cliff" Postgres experiences once table sizes reach 50+ million rows, while offering compelling complex query performance improvements. TimescaleDB is packaged as a Postgres extension, released under the Apache
MariaDB has had its first 10.3 alpha release. This talk will go into what new features the MariaDB team has planned for this release.
Notable features include the "AS OF" syntax, as well as a subset of PL/SQL syntax that Oracle supports.
I will walk through Yandex's development of ClickHouse, and how its iterative approach to organizing data storage has resulted in a powerful and extremely fast open source system.
Yes, you read that right. Microsoft loves MySQL and PostgreSQL! Azure Database for MySQL and PostgreSQL are Microsoft's first foray into OSS databases in Azure as fully-managed PaaS offerings. Come and learn about the platform architecture that powers Azure Database for MySQL and PostgreSQL in Azure and where Microsoft is headed next in this space!
ProxySQL is a widely used technology for MySQL load balancing and query routing. But there is one unique thing that makes it really outstanding: with ProxySQL you can connect to other databases using MySQL protocol.
We will walk through real customer use cases where applications used different database technologies through a single MySQL protocol and ProxySQL.
At a blistering pace and for a variety of reasons, companies are migrating their on-premise database infrastructures to cloud-based solutions: to save costs on hardware, tame the impact of disaster recovery or even to improve security. Zalando is no exception, and more than two years ago we migrated our first production services to AWS.
In addition to fully managed database services like RDS and Aurora, Amazon offers a wide spectrum of EC2 instances with different performance and price. Without a lot of experience in running cloud databases it's not easy to make the right choice, and as a result you will either have poor database performance or you will overpay for over-provisioned resources.
In this talk I will explain why we decided to run most of our databases on EC2 Instances instead of RDS, how we chose EC2 Instance types and EBS Volume sizes, which AWS CloudWatch metrics MUST be monitored (and why), what problems we hit and how to avoid them.
ClickHouse is an open source DBMS for high-performance analytics, originally developed at Yandex for the needs of the Yandex.Metrica web analytics system. It is capable of storing petabytes of data and processing billions of rows per second per server, all while ingesting new data in real time.
I will talk about architectural decisions we made with ClickHouse, their consequences from the point of view of an application developer and how to determine if ClickHouse is a good fit for your use case.
I will cover the following topics:
* Overview of storage engine and query execution engine.
* Data distribution and distributed query processing.
* Replication and where it sits on the consistency-availability spectrum.
This talk is intended to give an overview of the feature differences between Percona Server for MongoDB and MongoDB Community Edition. We'll also dive into how some of these features are useful to a database developer or DBA in troubleshooting issues, increasing query performance, improving reliability, and improving security.
Everyone already knows the Jsonb data type: one of PostgreSQL's most attractive features that allows efficient work with semi-structured data without sacrificing strong consistency and ability to use all the power of proven relational technology. But what exactly is inside Jsonb? Are there any caveats, and how can you accidentally bring down performance?
We will discuss all these questions together with advantages and disadvantages of using Jsonb in different situations in comparison with other solutions and existing standards. I'll show some important best practices about how to write compact queries to work with Jsonb, and avoid common mistakes/performance problems.
Database management systems (DBMSs) are the most important component of any data-intensive application. They can handle large amounts of data and complex workloads. But they're difficult to manage because they have hundreds of configuration "knobs" that control factors such as the amount of memory to use for caches and how often to write data to storage. Organizations often hire experts to help with tuning activities, but experts are prohibitively expensive for many.
In this talk, I will present OtterTune, a new tool that can automatically find good settings for a DBMS's configuration knobs. OtterTune differs from other DBMS configuration tools because it leverages knowledge gained from tuning previous DBMS deployments to tune new ones. Our evaluation shows that OtterTune recommends configurations that are as good as or better than ones generated by existing tools or a human expert.
Percona XtraDB Cluster is a very robust, high-performing and widely used solution that answers high availability needs. But deploying the cluster over a geographically dispersed area can be very challenging.
This presentation will briefly discuss the right approach to successfully deploying Percona XtraDB Cluster when you need to cover multiple geographical sites, near and far.
- What is Percona XtraDB Cluster and what happens in a set of nodes during commit
- Clarify what geo-dispersed means
- What to keep in mind
- How to correctly measure metrics
- Use sync the right way (sync/async)
- Use tools like replication_manager
"It's just a log, right?" How hard can it be, how can you possibly mess this up?
Wrong. Logs can impact your reliability, performance and quality of sleep in a million ways small and large. In this session we'll cover some of the lessons every engineer should know (and often learns the hard way), such as why good logging solutions are so expensive, why treating your logs as strings can be costly and dangerous, how logs can impact code efficiency and add/fix/change race conditions in your code. And what's the difference between a log line and an event, anyway?
We'll talk about how to craft a good, helpful log line or event and how to spot a bad one. We'll also talk about trends in debugging for complex systems, like the drive for structured logs/events and what comes next.
We all know those conference talks that bleat on about doing the right thing at the right time. This talk aims to reveal some of the anti-best practices to illustrate how some installations of MySQL are doomed from day 0. Infrastructure choice, queries and everything in between deserve special attention if you really want to fail fast.
This talk will cover:
- Picking the worst hardware you can for your mission critical database
- Schema over-engineering
- Query disasters
- Split brain scenarios we all want to see
- Replication disaster zone
- Highly available fails
- Accumulate these tips for instant dismissal
This talk is intended to give a basic overview of the encryption requirements of several current compliance standards (EU GDPR, PCI DSS, HIPAA/HITRUST, and SOC II TSP) and how the "at rest" encryption component can be met in a technology-agnostic way.
Have you wanted to deploy some cool new database to production, and simply can't make use of transparent data encryption? Come to this talk to find out how to use LUKS/dm-crypt to perform at-rest encryption, and where this fits into your overall compliance stance.
Icinga is a popular open source successor of Nagios that checks hosts and services, and notifies you of their statuses. But covering availability is not enough for comprehensive database monitoring. On top of that, you need metrics for performance and growth to deal with your scaling needs. Adding conditional behaviours and configuration in Icinga is not just intuitive, but also intelligently adaptive at runtime. This makes it easy to deal with a bunch of different database flavours at once.
Whether we are talking about MySQL and its variants, PostgreSQL, MongoDB, or any other open source database, Icinga is able to give you all the needed information. The talk will give you a detailed introduction to Icinga's abilities and show practical guidelines for successful database monitoring in a live demo.
This talk is an unbiased look at understanding the high-level uses and differences between open source databases. We'll see what relational, document, key-value and columnar databases are meant for, and when you should avoid them.
Cloudflare operates multiple DNS services that handle over 100 billion queries per day for over 6 million internet properties. We collect and aggregate logs for these queries for customer analytics, DDoS attack analysis and ad-hoc debugging. Due to the scale at which we operate, we've had to be creative in our implementation. In this talk, I'll go into more detail on the architecture we use for log ingestion and insertion into a ClickHouse cluster, as well as how we aggregate the data over time for longevity. I'll also touch on the tools we use downstream of ClickHouse to visualize and analyze the data ad-hoc.