This hands-on tutorial is intended to help you navigate the steps toward becoming a MongoDB DBA.
You will start by making sure you know about all the reasons to use MongoDB, and what a typical topology looks like for a replica set - including setting up your first one!
We will not talk about sharding in this session (please see the afternoon one for that). Instead, the focus will be on a few areas:
- What is MongoDB
- How different is MongoDB from MySQL?
- The (most) common MongoDB topologies
- CRUD: data management
- Schema design patterns
- Replica Set upgrade
- Securing your setup
- Common issues: How to detect, verify and address them using logs, Percona Toolkit and PMM
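As a taste of the replica-set setup covered above, here is a sketch of the configuration document used to initiate a three-member set; the hostnames and the set name `rs0` are placeholder assumptions, not values from the tutorial:

```python
# Sketch of initiating a three-member replica set; hostnames and the
# set name "rs0" are placeholder assumptions.
rs_config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "mongo1.example.com:27017"},
        {"_id": 1, "host": "mongo2.example.com:27017"},
        {"_id": 2, "host": "mongo3.example.com:27017"},
    ],
}

# Against a live mongod started with --replSet rs0, this document would be
# passed to rs.initiate() in the shell (or the replSetInitiate admin command).

def majority(n: int) -> int:
    """Votes needed for a set of n members to elect a primary."""
    return n // 2 + 1

print(majority(len(rs_config["members"])))  # 2 -- why 3 members tolerate 1 failure
```

Three members is the common minimum because a majority (two votes) survives the loss of any single node.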
In this training session, we will combine an extended deep dive on the Aurora architecture with a hands-on lab using Aurora PostgreSQL. As part of the deep dive we will cover the unique features and changes that work together to produce improved scalability, availability and durability.
In addition, you will get hands-on experience creating an Aurora PostgreSQL cluster, configuring that cluster for high availability, read scaling and point-in-time recovery, and cloning your cluster. We will also walk through failover, promotion and best practices around parameter settings based on application workload.
This hands-on tutorial is for when a single replica set is not enough. As some people may not have attended the morning session, there will be talks on backup types and import/export patterns; however, these will be extended to also consider sharding.
We will also talk about multiple aspects of sharding and using the right engine for your workload. If you want to know the basics about MongoDB you should also attend the morning session.
What to expect in this session:
- Types of sharding and which MongoDB uses
- Rules for picking a good shard key
- MongoDB engines and how they matter as much as the shard key itself
- Example of a multi-DC, data-locality cluster (with GDPR and EU privacy as motivations)
- Example of a multi-region cluster for DR
- How to fix things when you pick a bad shard key
- Implementing schema rules to prevent bad actors
- Backup considerations with sharding
- When do you scale up vs. scale out?
- Troubleshooting a cluster: it's easier than you think when you know the ropes
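As a rough illustration of why shard key choice matters, the sketch below shows hashed sharding spreading a monotonically increasing key evenly; the hash function and shard count are illustrative assumptions, not MongoDB's actual implementation:

```python
import hashlib

N_SHARDS = 4  # illustrative shard count

def shard_for(key):
    """Route a document by hashing its shard key (a sketch of hashed
    sharding; MongoDB's real hash function differs)."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % N_SHARDS

# A monotonically increasing key would pile onto one shard under range
# sharding; hashing spreads the same keys roughly evenly.
counts = [0] * N_SHARDS
for doc_id in range(10_000):
    counts[shard_for(doc_id)] += 1

print(counts)  # each shard holds roughly 2,500 documents
```

The trade-off, covered in the session, is that hashed keys sacrifice efficient range queries for even write distribution.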
Postgres 10 was one of the most significant releases in years, adding several long-awaited features: Logical Replication, an overhauled Partitioning System, enhanced multi-server management options, improved parallel query support and better support for multi-column statistics (among other changes).
This tutorial will be split into two parts. The first will provide an overview of where the project is today and a look at many of the new features that came out in Postgres 10. For part two, we will provide a live demonstration of logical replication, and discuss various trade-offs between this new replication option and the other existing solutions.
Attendees are encouraged to bring a laptop with Postgres 10 installed and to play along as we walk through our live demonstrations. This will also give you a good basis for setting up your own experiments and getting hands-on experience with these new features.
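As a taste of what the demonstration covers, the basic logical replication setup in Postgres 10 looks roughly like this; the table, object names and connection string are placeholders:

```sql
-- On the publisher (assumes a table "orders" exists and wal_level = logical):
CREATE PUBLICATION my_pub FOR TABLE orders;

-- On the subscriber (the same table definition must already exist there):
CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=publisher.example.com dbname=shop user=repl'
    PUBLICATION my_pub;
```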
In the last 20 years, researchers and vendors have built advisory tools to assist DBAs in tuning and physical design. Most of this previous work is incomplete because these tools require humans to make the final decisions about any database changes, and because they are reactionary measures that fix problems only after they occur. What is needed for a "self-driving" DBMS are components that are designed for autonomous operation. This will enable new optimizations that are not possible today because the complexity of managing these systems has surpassed the abilities of humans.
In this talk, I present the core principles of an autonomous DBMS based on reinforcement learning. These are necessary to support ample data collection, fast state changes, and accurate reward observations. I will discuss techniques on how to build a new autonomous DBMS or retrofit an existing one. Our work is based on our experiences at CMU from developing an automatic tuning service (OtterTune) and our self-driving DBMS (Peloton).
Braze is a lifecycle engagement platform used by consumer brands to deliver great customer experiences to over 1 billion monthly active users. In this talk, co-founder and CTO Jon Hyman will go over multiple production use cases Braze uses for buffering data to Redis for efficient real-time processing.
Braze processes more than a third of a trillion pieces of data each month when generating time series analytics for its customers. Jon will describe how each of these events gets buffered to Redis hashes, and some to Redis sets, before ultimately being flushed to Braze's analytics database hundreds of thousands of times per minute. This talk will also discuss how Redis sets are the cornerstone of Canvas, Braze's user journey orchestration product used by brands such as OKCupid, Postmates, and Microsoft.
Lastly, Jon will cover how Braze has written its own application-based sharding for Redis in order to support the millions of operations per second that Braze needs to handle its daily volume.
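Application-based sharding of this kind is often built on a consistent-hash ring; the sketch below is a generic illustration under that assumption, not Braze's actual implementation:

```python
import bisect
import hashlib

class ShardRing:
    """A minimal consistent-hash ring -- a generic sketch of
    application-based sharding, not Braze's actual scheme."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets many virtual points on the ring for an even spread.
        self._ring = sorted(
            (self._hash(f"{node}:{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [h for h, _ in self._ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.sha1(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        """Walk clockwise from the key's hash to the next node point."""
        i = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[i][1]

ring = ShardRing(["redis-1", "redis-2", "redis-3"])
print(ring.node_for("user:12345"))  # the same key always maps to the same shard
```

The virtual-node trick matters at this scale: when a shard is added, only the keys on its ring segments move, rather than nearly everything.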
Many administrators responsible for databases confront two clashing phenomena:
· Data is coming at increasingly higher rates (from an expanding number of sources)
· The time required to process transactions and analyze data is rapidly shrinking
The most common approaches to address these issues and speed up databases are to deploy new hardware and refactor code. At times, however, these approaches are not viable - particularly in the short term - due to implementation risks, cost, and timelines.
In this session, you will learn:
· How parallelism in the I/O layer impacts performance, particularly in database servers
· How interrupt-based I/O limits throughput in systems with high core count
· The connection between I/O waits and CPU context switches
· The impact of parallelizing I/O on solving these problems
· Cloud-based VMs, storage cost, and database performance
· A software-based alternative to mitigating I/O problems
For the cloud environment, we want a MySQL cluster to perform failover and elect a new master node automatically by itself, without third-party middleware. So we built the Raft protocol into MySQL.
In the MySQL-Raft version, every cluster usually has three nodes, one master and two slaves, but more nodes are supported. When the master node goes down, the cluster elects a new master via the Raft protocol, and uses Flashback to roll back committed transactions if needed, to make sure all of the nodes are consistent.
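The election rule at the heart of such a design can be sketched in toy form; this is a simplified illustration of Raft's vote-granting logic, not the production implementation described in the talk:

```python
# Toy sketch of Raft's election rule used when the master fails: a candidate
# wins with a majority of votes, and a node only votes for a candidate whose
# log is at least as up to date as its own.
def grant_vote(voter_log, candidate_log):
    """Raft's comparison: (last entry's term, log length) decides up-to-dateness."""
    v = (voter_log[-1] if voter_log else 0, len(voter_log))
    c = (candidate_log[-1] if candidate_log else 0, len(candidate_log))
    return c >= v

def elect(candidate, nodes):
    """nodes maps each live node to its log of entry terms."""
    votes = sum(grant_vote(log, nodes[candidate]) for log in nodes.values())
    return votes > len(nodes) // 2  # strict majority, self-vote included

# Three-node cluster; the old master "a" is down, "b" and "c" survive.
logs = {"b": [1, 1, 2], "c": [1, 1]}
print(elect("b", logs))  # True: "b" has the most up-to-date surviving log
print(elect("c", logs))  # False: "b" refuses to vote for a less complete log
```

This is also why Flashback is needed in the real system: a deposed master may hold committed-but-unreplicated transactions that must be rolled back to match the new leader's log.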
Azure provides fully managed, enterprise-ready community MySQL and PostgreSQL services that are built for developers and devops engineers. These services use the community database technologies you love and enable you to focus on your apps instead of on management and administration burdens. In this session, we will walk you through service capabilities such as built-in high availability, security, and elastic scaling of performance that allow you to optimize your time and save costs. We will demonstrate how the service integrates with the broader Azure platform, enabling you to deliver innovative new experiences to your users. The talk will cover best practices and real customer examples to demonstrate the benefits and how you can easily migrate your databases to the managed service.
The earliest relational databases were monolithic on-premises systems that were powerful and full-featured. Fast forward to the Internet and NoSQL: BigTable, DynamoDB and Cassandra. These distributed systems were built to scale out for ballooning user bases and operations. As more and more companies vied to be the next Google, Amazon, or Facebook, they too "required" horizontal scalability.
But in a real way, NoSQL and even NewSQL have forgotten single-node performance, which matters wherever scaling out isn't an option. And single-node performance is important because it allows you to do more with much less. With a smaller footprint and a simpler stack, overhead decreases and your application can still scale.
In this talk, we describe TimescaleDB's methods for single-node performance. The nature of time-series workloads and how data is partitioned allows users to elastically scale up even on single machines, which provides operational ease and architectural simplicity, especially in cloud environments.
Keeping data safe is the top responsibility of anyone running a database. Learn how the Google Cloud SQL team protects against data loss. Cloud SQL is Google's fully managed database service that makes it easy to set up and maintain MySQL databases in the cloud. In this session, we'll dive into Cloud SQL's storage architecture to learn how we check data down to the disk level. We will also discuss MySQL checksums and the infrastructure Cloud SQL uses to verify that checksums for data files are accurate without affecting the performance of the database.
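Per-block checksum verification of the kind described can be sketched as follows; the block size and the choice of CRC32 are illustrative assumptions, not Cloud SQL's actual on-disk format:

```python
import zlib

BLOCK = 4096  # placeholder block size; real systems checksum at page level

def block_checksums(data):
    """CRC32 per block -- the kind of per-page checksum a storage layer
    can persist and later re-verify (a sketch, not Cloud SQL's format)."""
    return [zlib.crc32(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK)]

original = bytes(range(256)) * 64          # 16 KiB of sample "data file"
expected = block_checksums(original)       # stored alongside the data

corrupted = bytearray(original)
corrupted[5000] ^= 0xFF                    # flip one bit inside block 1
actual = block_checksums(bytes(corrupted))

bad = [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]
print(bad)  # [1] -- only the corrupted block fails verification
```

Because the check is per block, verification can pinpoint damage and run incrementally in the background rather than rescanning whole files.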
ClickHouse is an open source analytical DBMS. It is capable of storing petabytes of data and processing billions of rows per second per server, all while ingesting new data in real-time.
I will talk about ClickHouse's internal design and the unique implementation details that allow us to achieve maximum performance in query processing and data storage efficiency.
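One such design detail is the column-oriented layout; this toy sketch only illustrates the idea that an aggregate needs to touch one dense column rather than every wide row (it says nothing about ClickHouse's actual vectorized engine):

```python
# Sketch of why a column-oriented layout helps analytics: an aggregate
# scans one contiguous array instead of walking through wide rows.
rows = [{"id": i, "value": i % 10, "tag": "x"} for i in range(100_000)]

# Row store: aggregate by visiting every (wide) row.
row_sum = sum(r["value"] for r in rows)

# Column store: the same data kept as one dense array per column.
columns = {"value": [i % 10 for i in range(100_000)]}
col_sum = sum(columns["value"])

print(row_sum == col_sum)  # True -- same answer, far less data touched
```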
Accelerating MySQL with Just-In-Time (JIT) compilation is emerging as a quick and easy way to achieve greater efficiencies with MySQL. In this talk, I'll go over the benefits and caveats of using Dynimizer, a binary-to-binary JIT compiler, with MySQL workloads. I'll discuss how to identify situations where JIT compilation can help, how to get set up and running, and go over benchmark results along with other performance metrics. We'll also peek under the hood and take a look at what's happening at a lower level.
ClickHouse is a very fast, feature-rich open source analytics DBMS with multi-petabyte scale. It gained a lot of attention over the last year, thanks to excellent benchmark results, conference talks and the first successful projects.
After the initial wave of early adopters, a second wave is coming: many companies have started to consider ClickHouse as their analytics backend. In this talk I'll review the state of ClickHouse's worldwide adoption, share insights about the business problems ClickHouse helps to solve efficiently, highlight possible implementation challenges and discuss best practices.
Spider is a storage engine for MySQL and MariaDB for database sharding. This storage engine is 10 years old and has been bundled in MariaDB 10.0 and later. Spider supports the following:
- Dividing huge data sets across multiple servers (sharding)
- Distributing heavy access loads across multiple servers (sharding)
- Joining tables across multiple servers (cross-shard join)
- Parallel processing across multiple servers (parallel processing)
- Using a group of databases as a single database (federation)
- Designing and controlling data redundancy per table and partition (redundancy)
- Designing and controlling fault tolerance per table and partition (fault tolerance)
- Parallel full-text search across multiple servers (full-text search)
- Parallel spatial search across multiple servers (spatial search)
- Distributing NoSQL access across multiple servers (NoSQL)
In this session, I will introduce Spider and the latest Spider 3.3 (GA) features.
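A sharded Spider table might be defined roughly like this; the hostnames, credentials and exact connection syntax are placeholder assumptions and vary by Spider version:

```sql
-- Sketch: a table sharded across two backend servers by hash of id.
-- Hostnames and credentials are placeholders.
CREATE TABLE sales (
    id INT NOT NULL,
    amount DECIMAL(10, 2),
    PRIMARY KEY (id)
) ENGINE=SPIDER
  COMMENT='wrapper "mysql", table "sales"'
  PARTITION BY HASH (id) (
    PARTITION p0 COMMENT='host "backend1", port "3306", user "spider"',
    PARTITION p1 COMMENT='host "backend2", port "3306", user "spider"'
  );
```

Queries against `sales` are then transparently routed to the backend that owns each partition.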
This year the Cassandra team at Instagram has been working on a very interesting project to make Apache Cassandra's storage engine pluggable, and has implemented a new RocksDB-based storage engine for Cassandra. The new storage engine improves the performance of Apache Cassandra significantly, making Cassandra 3-4 times faster in general, and even 100 times faster in some use cases.
In this talk, we will describe the motivation, the different approaches we considered, the high-level design of the solution we chose, and performance metrics from benchmark and production environments.
The database team at GitHub is tasked with keeping the data available and with maintaining its integrity. Our infrastructure automates away much of our operations, but automation requires trust, and trust is gained by testing. This session highlights three examples of infrastructure testing automation that help us sleep better at night:
- Backups: scheduling backups; making backup data accessible to our engineers; auto-restores and backup validation. What metrics and alerts we have in place.
- Failovers: how we continuously test our failover mechanism, orchestrator. How we set up a failover scenario, what defines a successful failover, and how we automate away the cleanup. What we do in production.
- Schema migrations: how we ensure that gh-ost, our schema migration tool, which keeps rewriting our (and your!) data, does the right thing. How we test new branches in production without putting production data at risk.
Time-series data is now everywhere and increasingly used to power core applications. It also creates a number of technical challenges: to ingest high volumes of data; to ask complex queries over recent and historical time intervals; to perform time-centric analysis and data management. And this data doesn't exist in isolation: entries are often joined against other relational data to ask key business questions.
In this talk, I offer an overview of how we engineered TimescaleDB, a new open-source database designed for time-series workloads and built as an extension to PostgreSQL, in order to simplify time-series application development. Unlike most time-series newcomers, TimescaleDB supports full SQL while achieving fast ingest and complex queries. This enables developers to avoid today's polyglot architectures and their corresponding operational and application complexity.
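The core idea of routing rows into right-sized time partitions can be sketched as follows; the chunk interval and epoch here are illustrative assumptions, not TimescaleDB's actual defaults or implementation:

```python
from datetime import datetime, timedelta, timezone

CHUNK = timedelta(days=7)  # placeholder chunk interval
EPOCH = datetime(2018, 1, 1, tzinfo=timezone.utc)

def chunk_for(ts):
    """Route a row to its time chunk -- a sketch of the hypertable idea:
    one logical table backed by many small time-bounded partitions."""
    n = (ts - EPOCH) // CHUNK
    start = EPOCH + n * CHUNK
    return (start, start + CHUNK)

ts = datetime(2018, 1, 10, tzinfo=timezone.utc)
start, end = chunk_for(ts)
print(start.date(), end.date())  # 2018-01-08 2018-01-15
```

Because inserts land mostly in the newest chunk and queries over a time range only open the chunks that overlap it, both ingest and pruning stay fast as the table grows.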
Like many modern DBAs, we lean more toward development in our daily activities: we write more software and do fewer routine tasks.
On the way to fully automating our development workflow we faced many problems. This session covers solutions to them:
* Git flow adaptation for highly restrictive compliance requirements;
* Unit testing. What to mock and how;
* Surviving dependencies hell;
* Packaging Python code.
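For the unit-testing point, here is a minimal sketch of mocking the database connection; the `failover_needed` helper is hypothetical, invented purely to show which boundary to mock:

```python
import unittest
from unittest import mock

def failover_needed(conn):
    """Hypothetical helper, invented for illustration: decide whether to
    fail over based on replication lag reported by the connection."""
    lag = conn.query("SHOW SLAVE STATUS")["Seconds_Behind_Master"]
    return lag is None or lag > 30

class TestFailover(unittest.TestCase):
    def test_lagging_replica_triggers_failover(self):
        # Mock the DB connection -- the I/O boundary -- not the logic itself.
        conn = mock.Mock()
        conn.query.return_value = {"Seconds_Behind_Master": 120}
        self.assertTrue(failover_needed(conn))
        conn.query.assert_called_once_with("SHOW SLAVE STATUS")

    def test_healthy_replica_is_left_alone(self):
        conn = mock.Mock()
        conn.query.return_value = {"Seconds_Behind_Master": 2}
        self.assertFalse(failover_needed(conn))
```

Run with `python -m unittest`; the decision logic gets tested without any live database.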
As a distributed key-value storage engine, TiKV supports strong data consistency, automatic horizontal scalability, and ACID transactions. Many users now run TiKV directly in production as a replacement for other key-value stores; some of them have even scaled TiKV to 100+ nodes.
In this talk, I will explain how we make this possible. The details include, but are not limited to:
1. Why did we choose RocksDB as the backend storage engine? How to optimize it?
2. How to use the Raft consensus algorithm to support data consistency and horizontal scalability?
3. How to support distributed transactions?
4. How to use Prometheus to monitor the system and troubleshoot issues?
5. How to test TiKV to verify its correctness and guarantee its stability?
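The range-based sharding underlying TiKV's horizontal scalability can be sketched as follows; the region boundaries below are made up for illustration, and the real system also replicates each region via Raft:

```python
import bisect

# Sketch: the key space is split into contiguous ranges ("regions");
# routing a key means finding the region whose range contains it.
# Boundaries here are made up for illustration.
region_starts = [b"", b"g", b"n", b"t"]  # region i covers [start_i, start_i+1)
region_ids = [1, 2, 3, 4]

def region_for(key: bytes) -> int:
    """Binary-search the sorted start keys for the covering region."""
    i = bisect.bisect_right(region_starts, key) - 1
    return region_ids[i]

print(region_for(b"apple"), region_for(b"kiwi"), region_for(b"zebra"))  # 1 2 4
```

When a region grows too large it splits into two smaller ranges, which is how the cluster scales out simply by moving regions between nodes.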