PostgreSQL: Bye-Bye MD5 Authentication. What’s Next?

Introduction

MD5 has long been the most popular algorithm for hashing passwords in PostgreSQL and other database systems. It is simple to implement and adds little overhead or latency to authentication, which is why it has been the preferred method. However, a discussion thread in the PostgreSQL community has signalled that support for MD5 authentication will be discontinued, which is a big change. In this blog post, I will cover the basics of MD5 authentication, its dark side, its implementation in PostgreSQL, and a potential solution in the form of scram-sha-256.

Background

It is not unusual to see entries like the following in pg_hba.conf, but they will soon be history.

host all all 0.0.0.0/0 md5
host all all 127.0.0.1/32 md5
host all all 192.168.0.0/24 md5
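
To find such entries before they become a problem, a small script can scan the file. The following is a minimal sketch, not an official tool: `find_md5_entries` and `SAMPLE_HBA` are hypothetical names, and the parser assumes simple unquoted fields with no `include` directives.

```python
# Auth method keywords that may appear in pg_hba.conf (not exhaustive).
KNOWN_METHODS = {
    "trust", "reject", "scram-sha-256", "md5", "password", "gss", "sspi",
    "ident", "peer", "ldap", "radius", "cert", "pam", "bsd",
}

# Made-up sample content used only for this illustration.
SAMPLE_HBA = """\
# TYPE  DATABASE  USER  ADDRESS         METHOD
local   all       all                   peer
host    all       all   0.0.0.0/0       md5
host    all       all   127.0.0.1/32    scram-sha-256
host    all       all   192.168.0.0/24  md5
"""


def find_md5_entries(hba_text):
    """Return (line_number, line) pairs whose auth method is md5."""
    hits = []
    for number, raw in enumerate(hba_text.splitlines(), start=1):
        # Drop comments and surrounding whitespace (assumes no '#' in quotes).
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # local: TYPE DB USER METHOD; host*: TYPE DB USER ADDRESS [MASK] METHOD
        first_candidate = 3 if fields[0] == "local" else 4
        method = next(
            (f for f in fields[first_candidate:] if f in KNOWN_METHODS), None
        )
        if method == "md5":
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    for number, line in find_md5_entries(SAMPLE_HBA):
        print(f"line {number}: {line}")
```

In practice, the text would come from the real pg_hba.conf of the instance rather than from the sample string.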

In future versions, PostgreSQL may issue warnings or even reject authentication requests with an error. MD5 usage will be phased out gradually. The following plan was proposed in the community (Reference link).

1. In v18, continue to support MD5 passwords, but place several notes in the documentation and release notes that unambiguously indicate that MD5 password support is deprecated and will be removed in a future release.
2. In v19, allow upgrading with MD5 passwords and allow authenticating with them, but disallow creating new ones (i.e., restrict/remove password_encryption and don’t allow setting pre-hashed MD5 passwords).
3. In v20, allow upgrading with MD5 passwords, but disallow using them for authentication. Users would only be able to update these passwords to SCRAM-SHA-256 after upgrading.
4. In v21, disallow upgrading with MD5 passwords. At this point, there should be no remaining MD5 password support in Postgres.

Note: At the time of writing, this is a proposal and not a confirmed plan. However, MD5 support is expected to be removed in a future version.

Before understanding MD5 in PostgreSQL, we will review its basics.

MD5

Precursor – MD4

Type – 128-bit hash function

The algorithm receives a text or file as input and produces a 128-bit hash (a 32-character hexadecimal string) as output. Because it accepts files as input, it is also used for calculating checksums (md5sum).

In the MD5 hashing process, the input is padded and divided into 512-bit blocks, which are passed to the hashing function that ultimately produces a 32-digit hexadecimal number. The image below describes the process in detail.

MD5 is not reversible; no program or method can unhash an MD5 hash. In other words, it is not possible to generate the original text or message from the available hash value. Whenever we use MD5 for authentication, the supplied password is converted to an MD5 hash and compared with the stored MD5 value. If the hashes match, authentication succeeds; otherwise, access is rejected.
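
The properties described above can be sketched in a few lines of Python using the standard `hashlib` module. This is only an illustration: the helper names (`md5_hex`, `md5_block_count`, `check_password`) are made up for this post, and the comparison shown is the generic hash-and-compare idea, not PostgreSQL's actual code path.

```python
import hashlib
import hmac


def md5_hex(text):
    """Return the 32-character hex digest (128 bits) of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def md5_block_count(message_length_bytes):
    """Number of 512-bit (64-byte) blocks MD5 processes for a message.

    MD5 appends one 0x80 byte, zero padding, and an 8-byte length field,
    so the padded message is the smallest multiple of 64 bytes that fits
    the message plus 9 bytes.
    """
    return (message_length_bytes + 8) // 64 + 1


def check_password(candidate, stored_hash):
    """Hash the supplied password and compare it with the stored hash."""
    return hmac.compare_digest(md5_hex(candidate), stored_hash)


if __name__ == "__main__":
    stored = md5_hex("s3cret")              # what would be kept on disk
    print(len(stored))                      # always 32 hex characters
    print(md5_block_count(55), md5_block_count(56))
    print(check_password("s3cret", stored), check_password("guess", stored))
```

Whatever the input length, the digest stays at 32 hex characters; only the number of blocks processed grows.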

Wow. Password cracking does not work here, so it is immune to attacks.

Right? Unfortunately, no! MD5 is not immune to attacks. Let us see why.

Collision attacks in MD5

As we saw earlier, MD5 is a hashing algorithm that generates a hex digest. Every hashing algorithm has a concept called collision, which means different inputs generating the same hash value. MD5 is no exception. If someone discovers another string that matches my password's hash, I may as well consider my data lost. The first weakness of this kind, a collision in MD5's compression function, was reported in 1996.

Sounds horrible, doesn't it? Indeed, it is. But this is expected.

How?

This is because any text is converted to a fixed 32-digit hexadecimal number, while there is no limit on the size of the input text.

percona@XYZ ~ $ md5 -s "Wishing you clarity, strength, and steady progress today, turning challenges into opportunities and efforts into success."
d686cad09daac517b2859305181067ac

percona@XYZ ~ $ md5 -s "May today bring fresh energy, clear thoughts, and quiet confidence. Take one step at a time, trust your experience, learn from setbacks, and celebrate progress. Consistent effort, patience, and curiosity will guide you forward, helping you create meaningful results and lasting impact in work and life for years to come."
2f61acd9c2bd4397406b5b056c0b7edc

As we can see, texts of different lengths produce hashes of the same fixed size. The input can range from a few bytes to gigabytes, yet there are only 2^128 possible hashes, so many different strings must share the same hash. It is impossible to enumerate how many strings generate a given hash; it could be thousands, billions, or more.
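
The "more inputs than outputs" argument is easier to see with a smaller output space. The sketch below keeps only the first 4 hex digits (16 bits) of each MD5 digest, so a collision is guaranteed within 65,537 distinct inputs. The function names and the `password-N` candidates are invented for this illustration; a collision on the truncated value says nothing about how hard a full 128-bit collision is.

```python
import hashlib
from itertools import count


def truncated_md5(text, hex_digits=4):
    """First `hex_digits` hex characters of the MD5 digest of `text`."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:hex_digits]


def find_truncated_collision(hex_digits=4):
    """Return two different strings whose truncated digests are equal."""
    seen = {}  # truncated digest -> first string that produced it
    for i in count():
        candidate = f"password-{i}"
        prefix = truncated_md5(candidate, hex_digits)
        if prefix in seen:
            return seen[prefix], candidate
        seen[prefix] = candidate


if __name__ == "__main__":
    first, second = find_truncated_collision()
    print(first, second, truncated_md5(first))
```

With only 16 bits of output, a match typically appears after a few hundred candidates, well before the guaranteed limit.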

To understand it more clearly, refer to the example below of two different strings (TEXTCOLLBYfGiJUETHQ4hAcKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak and TEXTCOLLBYfGiJUETHQ4hEcKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak) that generate the same hash. The strings were taken from the page.

percona@XYZ ~ $ md5 -s TEXTCOLLBYfGiJUETHQ4hAcKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak
faad49866e9498fc1719f5289e7a0269
percona@XYZ ~ $ md5 -s TEXTCOLLBYfGiJUETHQ4hEcKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak
faad49866e9498fc1719f5289e7a0269

However, changing the letter in the same place to a different value produces a string (TEXTCOLLBYfGiJUETHQ4hCcKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak) that generates a different hash.

percona@XYZ ~ $ md5 -s TEXTCOLLBYfGiJUETHQ4hCcKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak
8c88b1b008b35ed2f3f4100e7faea8c7
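
The same check can be reproduced on any platform with Python's `hashlib`, which avoids differences between the `md5` and `md5sum` command-line tools. This is a small sketch using the three strings from this section; the variable and function names are made up for the example.

```python
import hashlib

# The three strings differ only in the character at index 21.
PREFIX = "TEXTCOLLBYfGiJUETHQ4h"
SUFFIX = "cKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak"
STRING_A = PREFIX + "A" + SUFFIX
STRING_E = PREFIX + "E" + SUFFIX
STRING_C = PREFIX + "C" + SUFFIX


def md5_hex(text):
    """Return the MD5 hex digest of an ASCII string."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


def differing_positions(left, right):
    """Indexes at which two equal-length strings differ."""
    return [i for i, (a, b) in enumerate(zip(left, right)) if a != b]


if __name__ == "__main__":
    print("A and E differ at:", differing_positions(STRING_A, STRING_E))
    print("A:", md5_hex(STRING_A))
    print("E:", md5_hex(STRING_E))
    print("C:", md5_hex(STRING_C))
    print("A and E collide:", md5_hex(STRING_A) == md5_hex(STRING_E))
```

Only the specially constructed pair collides; an arbitrary one-character change, as in the third string, gives an unrelated digest.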

Predicting the original string is not as simple as it looks at first. As of now, recovering the input of an arbitrary MD5 hash by brute force is not practical even with GPUs, as it would take years. However, some experts expect quantum computers to become powerful enough to crack MD5 within the next 10 years.

Most attackers these days use dictionary attacks or other simple methods, and they succeed because access is just one hash match away.

PostgreSQL and MD5

From what we saw in the last section, you must be convinced that MD5 is an unreliable hashing method. So, shouldn't it have been removed long ago?

Also, isn't it too late to make a bold decision on its discontinuation? Many PostgreSQL systems might already have been affected by hacks and crashes, and subjected to data theft.

Should we stop using PostgreSQL and move to another database, if it's not too late?

Well, it’s time for a litmus test. Let us test the fear.

postgres=# create user md5_test with password 'TEXTCOLLBYfGiJUETHQ4hAcKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak';
CREATE ROLE
postgres=# \q

postgres@XYZ:~$ export PGPASSWORD=TEXTCOLLBYfGiJUETHQ4hEcKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak
postgres@XYZ:~$ psql -U md5_test -h localhost postgres
psql: error: connection to server at "localhost" (127.0.0.1), port 5432 failed: FATAL:  password authentication failed for user "md5_test"
connection to server at "localhost" (127.0.0.1), port 5432 failed: FATAL:  password authentication failed for user "md5_test"

What! All in vain. What we have learnt so far makes no sense.

Wait, refrain from making quick judgments! PostgreSQL does not store the same hash that we obtained in the previous section (faad49866e9498fc1719f5289e7a0269).

postgres=# select rolname, rolpassword from pg_authid where rolname='md5_test';
 rolname  |             rolpassword             
----------+-------------------------------------
 md5_test | md5c5e4bbac9b29d19bf08a6fbb87abe3f0
(1 row)

The PostgreSQL community was aware of this issue and addressed it. In PostgreSQL, the MD5 authentication string is derived from a combination of the password and the username: the username is appended to the password before hashing, and the result is prefixed with the literal 'md5' (Reference link).

'md5' + md5(password + username)

We can validate the formula as below.

percona@XYZ ~ % export USER=md5_test
percona@XYZ ~ % export PASSWORD=TEXTCOLLBYfGiJUETHQ4hAcKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak
percona@XYZ ~ % echo md5`md5 -s $PASSWORD$USER`
md5c5e4bbac9b29d19bf08a6fbb87abe3f0
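
The same formula can be written as a portable Python function. This is a minimal sketch assuming UTF-8 encoded credentials; `pg_md5_stored` is a name chosen for this post, not a PostgreSQL or driver API.

```python
import hashlib


def pg_md5_stored(password, username):
    """Build the value PostgreSQL keeps in rolpassword for an MD5 password.

    The username acts as a per-role salt: it is appended to the password
    before hashing, and the hex digest is prefixed with the literal 'md5'.
    """
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


if __name__ == "__main__":
    password_a = "TEXTCOLLBYfGiJUETHQ4hAcKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak"
    password_e = "TEXTCOLLBYfGiJUETHQ4hEcKSMd5zYpgqf1YRDhkmxHkhPWptrkoyz28wnI9V0aHeAuaKnak"
    stored_a = pg_md5_stored(password_a, "md5_test")
    stored_e = pg_md5_stored(password_e, "md5_test")
    print(stored_a)
    print(stored_e)
    print("same stored value:", stored_a == stored_e)
```

Running it with the two colliding strings shows whether the collision survives once the username is appended, which is what the failed login above indicates it does not.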

So, is MD5 authentication in PostgreSQL safe against this collision? Unequivocally, yes.

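However, a leaked MD5 hash can still leave the database vulnerable: MD5 is cheap to compute, so it is not very difficult for attackers to find a matching password by guessing, and the stored hash alone is enough to answer the server's MD5 challenge (a pass-the-hash attack). So, the deprecation of MD5 is a wise decision.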
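
The risk of a leaked hash can be sketched as follows. The example assumes the MD5 challenge-response described in the PostgreSQL protocol documentation, where the server sends a 4-byte salt and the client replies with 'md5' + md5(md5(password + username) + salt). The function names are hypothetical and no network traffic is involved.

```python
import hashlib
import os


def md5_hex(data):
    """MD5 hex digest of a bytes value."""
    return hashlib.md5(data).hexdigest()


def stored_hash(password, username):
    """Value kept in pg_authid.rolpassword for an MD5 password."""
    return "md5" + md5_hex((password + username).encode("utf-8"))


def response_from_password(password, username, salt):
    """What a regular client computes when it knows the password."""
    inner = md5_hex((password + username).encode("utf-8"))
    return "md5" + md5_hex(inner.encode("ascii") + salt)


def response_from_stored_hash(stored, salt):
    """What an attacker can compute from a leaked rolpassword alone."""
    inner = stored[3:]  # strip the 'md5' prefix, keep the 32 hex characters
    return "md5" + md5_hex(inner.encode("ascii") + salt)


if __name__ == "__main__":
    salt = os.urandom(4)  # the server picks a fresh salt per connection
    leaked = stored_hash("s3cret", "md5_test")
    honest = response_from_password("s3cret", "md5_test", salt)
    forged = response_from_stored_hash(leaked, salt)
    print("responses identical:", honest == forged)
```

Because the response depends only on the stored hash and the salt, the stored value is effectively a password equivalent.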

Solution: scram-sha-256

PostgreSQL supports various methods of authentication.

GSSAPI
RADIUS
LDAP
Cert
PAM
Password (md5, password, scram-sha-256)

Some methods, such as peer (not for a client-server model), trust, and ident, are not recommended unless absolutely unavoidable, as they make databases more open to intruders.

An additional layer of authentication is always preferred, as long as it sits in a secure network. We should keep in mind that it could become a single point of exposure.

Those who don't want an additional tier of authentication may use scram-sha-256. It is stronger than md5.

scram-sha-256

In version 10, the community introduced a new method for password authentication: scram-sha-256. A major advantage is its 256-bit hash length. It also has additional security features that make it more robust. The picture below shows the stored string along with a description of every part.

All the parts are used for authentication. At the time of authentication, the server sends the salt and iteration count to the client; the client derives its client key from the password and sends back a proof computed from it. The server checks the proof against the stored key and allows access to the client.
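
The stored string can be rebuilt in Python to see how its parts fit together. This is a sketch that follows the SCRAM key derivation from RFC 5802 and the `SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>` layout; it assumes an ASCII password (PostgreSQL additionally normalises passwords with SASLprep, which is skipped here), and `scram_verifier` and `parse_verifier` are illustrative names.

```python
import base64
import hashlib
import hmac
import os


def scram_verifier(password, salt=None, iterations=4096):
    """Build a SCRAM-SHA-256 verifier string for an ASCII password."""
    if salt is None:
        salt = os.urandom(16)
    # Key stretching: PBKDF2 with HMAC-SHA-256 (called Hi() in RFC 5802).
    salted = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()

    def b64(raw):
        return base64.b64encode(raw).decode("ascii")

    return (
        f"SCRAM-SHA-256${iterations}:{b64(salt)}"
        f"${b64(stored_key)}:{b64(server_key)}"
    )


def parse_verifier(verifier):
    """Split a verifier string into its labelled parts."""
    scheme, rest = verifier.split("$", 1)
    iter_and_salt, keys = rest.split("$", 1)
    iterations, salt_b64 = iter_and_salt.split(":", 1)
    stored_b64, server_b64 = keys.split(":", 1)
    return {
        "scheme": scheme,
        "iterations": int(iterations),
        "salt": base64.b64decode(salt_b64),
        "stored_key": base64.b64decode(stored_b64),
        "server_key": base64.b64decode(server_b64),
    }


if __name__ == "__main__":
    verifier = scram_verifier("s3cret")
    print(verifier)
    print(parse_verifier(verifier)["iterations"])
```

Neither the password nor the client key appears in the stored string; only values derived from them do.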

The image below shows the operations of scram-sha-256.

scram-sha-256 doesn't send plaintext passwords over the network, and authentication requires active participation from the client as well. These features make scram-sha-256 more reliable and secure.
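
The client's participation can be sketched with the proof calculation from RFC 5802. This is a simplified model, not a protocol implementation: `auth_message` is a placeholder for the concatenated handshake messages, nonces and channel binding are left out, and all function names are invented for the example.

```python
import hashlib
import hmac
import os


def hmac_sha256(key, message):
    return hmac.new(key, message, hashlib.sha256).digest()


def xor_bytes(left, right):
    return bytes(a ^ b for a, b in zip(left, right))


def derive_keys(password, salt, iterations):
    """Return (client_key, stored_key) derived from the password."""
    salted = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    client_key = hmac_sha256(salted, b"Client Key")
    return client_key, hashlib.sha256(client_key).digest()


def client_proof(password, salt, iterations, auth_message):
    """Client side: prove knowledge of the password without sending it."""
    client_key, stored_key = derive_keys(password, salt, iterations)
    client_signature = hmac_sha256(stored_key, auth_message)
    return xor_bytes(client_key, client_signature)


def server_verifies(stored_key, auth_message, proof):
    """Server side: recover the client key and compare its hash."""
    client_signature = hmac_sha256(stored_key, auth_message)
    recovered_client_key = xor_bytes(proof, client_signature)
    recovered_stored_key = hashlib.sha256(recovered_client_key).digest()
    return hmac.compare_digest(recovered_stored_key, stored_key)


if __name__ == "__main__":
    salt, iterations = os.urandom(16), 4096
    _, stored_key = derive_keys("s3cret", salt, iterations)  # kept by server
    auth_message = b"placeholder-for-handshake-messages"
    good = client_proof("s3cret", salt, iterations, auth_message)
    bad = client_proof("guess", salt, iterations, auth_message)
    print(server_verifies(stored_key, auth_message, good))
    print(server_verifies(stored_key, auth_message, bad))
```

The server only stores the hash of the client key, so a leaked stored key is not enough to compute a valid proof, unlike the MD5 case.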

Since PostgreSQL 14, the default password encryption method is scram-sha-256. Users on older versions should switch to scram-sha-256 or upgrade PostgreSQL to the latest version.
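
Before switching, it helps to know which roles still carry MD5 hashes. The sketch below classifies values as they appear in pg_authid.rolpassword; it assumes the rows have already been fetched (for example with the query shown earlier), the SCRAM value in the sample is a made-up placeholder rather than a real verifier, and the function names are hypothetical.

```python
import re

# 'md5' followed by 32 lowercase hex digits.
MD5_PATTERN = re.compile(r"md5[0-9a-f]{32}")
# SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>, base64 fields.
SCRAM_PATTERN = re.compile(
    r"SCRAM-SHA-256\$\d+:[A-Za-z0-9+/=]+\$[A-Za-z0-9+/=]+:[A-Za-z0-9+/=]+"
)


def classify_rolpassword(value):
    """Return 'md5', 'scram-sha-256', 'none', or 'unknown'."""
    if value is None:
        return "none"
    if MD5_PATTERN.fullmatch(value):
        return "md5"
    if SCRAM_PATTERN.fullmatch(value):
        return "scram-sha-256"
    return "unknown"


def roles_needing_reset(rows):
    """Names of roles whose stored password is still an MD5 hash."""
    return [name for name, value in rows if classify_rolpassword(value) == "md5"]


if __name__ == "__main__":
    sample_rows = [
        ("md5_test", "md5c5e4bbac9b29d19bf08a6fbb87abe3f0"),
        ("app_user", "SCRAM-SHA-256$4096:c2FsdA==$c3RvcmVk:c2VydmVy"),
        ("nologin_role", None),
    ]
    print(roles_needing_reset(sample_rows))
```

A role flagged here needs its password set again after password_encryption is switched to scram-sha-256, because the server cannot convert an MD5 hash on its own.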

Downsides of scram-sha-256

Although scram-sha-256 is an answer to pass-the-hash attacks, it can still cause trouble on occasion.

Authentication takes longer than with MD5, as it takes 6 rounds of communication to finish the process.
It performs a series of CPU-intensive operations. As a result, CPU usage spikes.
While using PgBouncer in transaction mode, the latency spike becomes evident.
Using a pooler is recommended when connection churn is very high, because establishing connections increases CPU utilisation.
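
The CPU cost mentioned above comes mostly from the key-stretching step. The rough sketch below compares a single MD5 hash with the PBKDF2 derivation SCRAM uses, at 4096 iterations; in a SCRAM login this derivation is carried out by the client. Absolute numbers depend entirely on the machine, the function names are made up, and this measures hashing only, not network round trips.

```python
import hashlib
import os
import time


def time_md5_hash(password, username, repeats):
    """Seconds spent computing the PostgreSQL-style MD5 hash `repeats` times."""
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return time.perf_counter() - start


def time_scram_key_derivation(password, repeats, iterations=4096):
    """Seconds spent on the PBKDF2 step of SCRAM-SHA-256 `repeats` times."""
    salt = os.urandom(16)
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.pbkdf2_hmac(
            "sha256", password.encode("utf-8"), salt, iterations
        )
    return time.perf_counter() - start


if __name__ == "__main__":
    repeats = 100  # stands in for 100 new connections
    md5_seconds = time_md5_hash("s3cret", "app_user", repeats)
    scram_seconds = time_scram_key_derivation("s3cret", repeats)
    print(f"md5:   {md5_seconds:.6f} s for {repeats} hashes")
    print(f"scram: {scram_seconds:.6f} s for {repeats} derivations")
```

The gap is deliberate: the same cost that slows down each new connection also slows down every password guess, which is why reusing connections through a pooler is the usual mitigation.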

Despite these drawbacks, it offers enhanced security and a multi-step exchange, which makes it trustworthy. To study scram-sha-256 in detail, kindly refer to the page.

Another question: is it possible to crack scram-sha-256? We certainly cannot rule out the possibility. In theory, every software program is susceptible to attacks. But if we took these words in letter and spirit, we would need to go back to pens and notebooks, or even stone carving, to store our data. In fact, the majority of systems keep running without any security breaches, but that alone cannot be the argument for accepting it, even though it is well designed and more reliable. Now, what?

Since even the most secure databases in the world are vulnerable to attacks, we need to weigh the number of incidents and the probability of such occurrences. As long as no incidents are reported, my database is safe. This might sound diplomatic or compromising, but it is the most convenient position.

Conclusion

MD5 has been very popular, but it has many flaws that make it less reliable and more vulnerable. In PostgreSQL, MD5 authentication was implemented in a way that works around the issues found in the plain MD5 algorithm. However, the possibility of getting hacked cannot be ruled out, so phasing it out gradually is inevitable. scram-sha-256 is a more robust, more secure, and preferable option compared to MD5.

Those who still rely on MD5 authentication are strongly advised to move to scram-sha-256 or another reliable method. To ensure that your database is up to date and secure, kindly connect with Percona PostgreSQL experts.
