How many times have we seen this headline lately: “X million records leaked in data breach”? The answer is, too many!
In fact, leaks of “millions of records” have become so frequent that larger news outlets no longer cover the “smaller” ones as often as they did a few years ago. These days it’s all about whether a leak exposed hundreds of millions, or billions, of records.
There is a lot of confusion about why, or even how, these things happen. Some of it stems from the words chosen to describe these events.
Let’s look at the word “breach”, for example. The definition of a breach is “To make a gap in and break through (a wall, barrier, or defense).” This implies that someone did something bad or nefarious to penetrate the database or application defenses. They “hacked” a breach open, exploited it, and snuck away with your data. This leads many people to adopt mitigation strategies such as detecting potential hackers, strengthening defenses, and layering multiple detection mechanisms and barriers.
In contrast, a leak is defined as accidentally losing contents, especially liquid or gas, through a hole or crack. The image that comes to mind here is that your database or application has a bug or a hole in it, and data is just spilling out, like a leaky pipe losing water. The fear is that some enterprising hacker stumbles upon this data lying around and starts charging money for it on the dark web. Popular mitigation strategies here are similar to those above: a reliance on detection and defense (i.e., making sure that only encrypted data can leak). Of course, testing and early detection of problems in code and systems play a big role here too.
Another image that pops into people’s minds when they hear “leak” is that someone inside their organization is sharing something they shouldn’t. For example, you might see in the news that “someone within the government leaked this information to the press”. An individual opened up the floodgates, giving away information they shouldn’t have. The idea that someone trusted can take a backup of a database, copy it to the cloud, and start selling access is a fear that many companies have.
I submit, however, that the words “leak” and “breach” are not accurate. They actually send people down the wrong path, which means they can fail to prevent or fix the real issues. Most database security issues are not the result of hackers exploiting bugs or punching their way through defenses. Nor do they come from a bug or hole dumping data where it should not be. They are also not the nefarious undertakings of an inside man.
So What Is Causing These Issues?
Put simply, the vast majority of database leaks we hear about are caused by database setups and configurations that were not done with security in mind. Time after time we see the words “unsecured”, “publicly available”, and “cloud” peppered throughout these press reports.
Here is a common scenario: Timmy the developer needs to get a new app up ASAP. He builds a test environment in his favorite cloud and launches a POC (proof of concept). He deploys his favorite database using something readily available. He uses the default setup because it’s not in production and he just needs something quick and dirty. The POC is a huge hit, so the team adds onto it and quickly moves it to production. Nobody goes back to “productionize” the database environment or to plug the holes and gaps in the system, and that omission is often the cause of future issues.
Of course, there are other scenarios that leave databases exposed and subject to theft. But almost all of them are the result of overlooked or missed steps caused by human error.
Many of these issues can be traced back to:
- Old and under-maintained software. Have you not updated your software in the last five years?!
- Un-configured or misconfigured database settings (most databases are not secure out of the box).
- Basic care and maintenance of the database being ignored.
- Front door access left wide open (because security is hard).
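Several of the items above can be checked mechanically before a POC is promoted to production. The following is a minimal sketch in Python, not a real auditing tool: the setting names (`bind_address`, `auth_enabled`, `admin_password`, `tls_enabled`) and the list of default passwords are hypothetical and do not correspond to any specific database's configuration format. It also assumes that a missing setting should be treated as insecure, in line with the point that most databases are not secure out of the box.

```python
from typing import Dict, List

# Hypothetical examples of default passwords; not tied to any real product.
DEFAULT_PASSWORDS = {"", "admin", "password", "root"}


def audit_config(config: Dict[str, object]) -> List[str]:
    """Return human-readable findings for a hypothetical database config.

    Assumption: any setting that is absent is treated as insecure.
    """
    findings = []
    # Listening on every interface exposes the database beyond the local host.
    if config.get("bind_address", "0.0.0.0") in ("0.0.0.0", "::"):
        findings.append("listens on all network interfaces")
    # Without authentication, anyone who can connect can read the data.
    if not config.get("auth_enabled", False):
        findings.append("authentication is disabled")
    # An empty or well-known password is effectively no password.
    if config.get("admin_password", "") in DEFAULT_PASSWORDS:
        findings.append("admin password is empty or a known default")
    # Unencrypted connections can be read by anyone on the network path.
    if not config.get("tls_enabled", False):
        findings.append("connections are not encrypted in transit")
    return findings


# A quick-and-dirty POC setup versus one that was revisited for production.
poc_config = {"bind_address": "0.0.0.0", "admin_password": "admin"}
hardened_config = {
    "bind_address": "127.0.0.1",
    "auth_enabled": True,
    "admin_password": "a-long-unique-secret",
    "tls_enabled": True,
}

for name, cfg in (("poc", poc_config), ("hardened", hardened_config)):
    print(name, audit_config(cfg))
```

With these two example configurations, the POC setup produces all four findings and the hardened one produces none. A real check would read the actual configuration of the database in use and cover far more settings.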
Of course, no one sets out to mismanage their databases, but the problems above are drastically affected by:
- Underqualified and under-trained staff being responsible for maintaining the database infrastructure. Tooling, automation, and push-button deployments are becoming the new normal, which can lead to bad setups that miss even the most basic security steps (like a password reset). The industry is moving towards empowering developers, but database setup, security, and performance are not normally top priorities for application developers.
- Overworked staff spread too thin to keep things even remotely up to date. Some studies suggest that 80% of companies are running one or more outdated or buggy releases. Companies are now running hundreds or thousands of databases in their data centers. It is easy to miss a few databases when you are dealing with them at scale.
- A misunderstanding of who is responsible for security in the cloud. Shared responsibility is a critical concept to fully understand.
- More, more, more. The sheer amount of data and number of databases out there means incidents will inevitably increase in number.
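When a team is responsible for hundreds of databases, one way to avoid missing a few is to keep an inventory that can be checked automatically. The sketch below illustrates the idea in Python under stated assumptions: the `Instance` record, the product names, and the version numbers are all made up for illustration, and versions are assumed to be plain dotted numbers.

```python
from typing import Dict, List, NamedTuple, Tuple


class Instance(NamedTuple):
    """One database in a hypothetical inventory."""
    name: str
    product: str
    version: str  # assumed to be dotted and numeric, e.g. "4.2.8"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "4.2.8" into (4, 2, 8) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def find_outdated(
    inventory: List[Instance], minimum_versions: Dict[str, str]
) -> List[str]:
    """Return names of instances running below the minimum version.

    Instances whose product has no known minimum are also returned,
    so that nothing is silently skipped.
    """
    outdated = []
    for inst in inventory:
        minimum = minimum_versions.get(inst.product)
        if minimum is None:
            outdated.append(inst.name)  # unknown product: needs manual review
        elif parse_version(inst.version) < parse_version(minimum):
            outdated.append(inst.name)
    return outdated


# Made-up inventory and policy, for illustration only.
inventory = [
    Instance("orders-prod", "exampledb", "3.6.2"),
    Instance("orders-test", "exampledb", "4.2.8"),
    Instance("legacy-poc", "otherdb", "1.0"),
]
minimum_versions = {"exampledb": "4.0.0"}

print(find_outdated(inventory, minimum_versions))
```

Flagging products with no defined minimum is a deliberate choice in this sketch: a forgotten POC database is exactly the kind of instance that tends to fall outside any policy.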
Until companies get a grip on robust initial database setup and configuration, keep their software updated, and prioritize database maintenance, significant data breaches will continue to hit the headlines.
Join Stephen Thorn and Michal Nosek, Percona Technical Experts, on Wed, August 19, 2020, at 11:00 AM EDT for their upcoming webinar, MongoDB Encryption at Rest. This hands-on workshop will walk through the process of setting up data-at-rest encryption in Percona Server for MongoDB (PSMDB). Data-at-rest encryption is one of the methods used to secure database deployments against unauthorized data access. It’s also commonly required for enterprise-grade database deployments because of various regulations and compliance requirements.