Data masking lets you hide sensitive fields (emails, credit-card numbers, job titles, etc.) while keeping data realistic for reporting, support, or testing. It is particularly useful when you collaborate with external entities and need to share your data for development reasons. You also need to protect your data and keep your customers’ privacy safe. Last but not least, data masking is required to achieve compliance with data privacy regulations like GDPR, HIPAA, and CCPA.

Percona Server for MySQL 8.4 includes a data masking component with ready-made masking and synthetic-data generation functions you can use directly in SQL. Below, I show a small, practical workflow: install/enable the component, create a sample table, create a view/stored procedure that presents masked data to a restricted user, and dump/export masked data.

For my tests, I’m using Percona Server for MySQL 8.4.

The examples use the Percona masking functions such as mask_inner(), mask_pan(), gen_rnd_email(), and dictionary helpers; these are part of the Percona masking functionality.

Install the data masking component

Here is the minimal install sequence for the component: create the internal masking_dictionaries table and then install the component. Also note, some dictionary functions require the MASKING_DICTIONARIES_ADMIN privilege. Example:

Following MySQL best practices, many admins disable the root user, which previously caused these functions to stop working. The server now uses the built-in mysql.session user to execute dictionary queries.

For this to work, you need to grant the mysql.session user SELECT, INSERT, UPDATE, and DELETE privileges on the masking_dictionaries table.

Example data: Create a table and insert some rows

Simple masking examples

Let’s see how to mask common fields.

Percona Server for MySQL provides many functions. A few useful ones:

  • mask_inner(str, margin_left, margin_right [, mask_char]) — masks the inner part of a string.
  • mask_outer(str, left_mask, right_mask [, mask_char]) — masks both ends.
  • mask_pan(str) / mask_pan_relaxed(str) — mask primary account numbers (credit cards).
  • gen_rnd_email() — generates a randomized email (useful for synthetic exports)
  • dictionary functions + masking_dictionary_term_add() — let you supply dictionaries for generated terms.

There are many more. See the functions reference for the full list here: https://docs.percona.com/percona-server/8.4/data-masking-function-list.html

Example: Mask email but keep the domain

We want to show something like a****@example.com. Use substring functions plus mask_inner():

If you don’t need to keep the domain, another function is available for generating a completely random email address: gen_rnd_email([name_size, surname_size, domain])

Example: Mask credit card (PAN), leaving the last four digits

Use mask_pan():

Example: Mask job title with synthetic dictionary terms

You can populate a dictionary called jobs and then use gen_dictionary(‘jobs’):

In the same way, you can define, for example, a dictionary of types of berries:

You are free to create whatever dictionary you like. The only limit is your creativity.

Percona Server for MySQL adds dictionary support and helper functions. From version 8.4.4, an in-memory cache has been added to avoid querying the dictionary table constantly. So, if you modify the dictionary table directly, you need to run the following command to resync the cache.

Create a view that shows masked data

Here is a common case when a user has access to the database, but they’re not granted direct access to the base tables. In such a case, we may create a view and grant the user access only to the view.

A view is a simple, robust way to let a low-privileged user query masked results while protecting the real table.

Reconnect to the database using the new masked_user credentials.

Let’s run queries now against the created view and the base table. The user can see masked/synthetic values only. This pattern keeps the real table protected while letting downstream users run queries and reports.

Can we use stored procedure for masking data?

Sure, you can. Similarly to the view, you can also rely on a stored procedure for masking your data. We can implement the same masking as before and provide only the EXECUTE grant for the stored procedure to the unprivileged user.

Reconnect to the database using the masked_user credentials and call the procedure.

Is it preferable to use a view or a stored procedure? It depends on the goal you have.

Using a view, you can permit the user to run SELECT statements on top of the view, adding filter conditions or doing JOINs with other tables. Using the stored procedure instead, you are permitting the user to get only a fixed result, all the time, the full masked table, or whatever the result provided by the stored procedure’s code is. In order to provide more flexibility, you can eventually add input parameters to the stored procedure to implement filter conditions, sorting, and limiting the result, and so forth. You can put a lot of complexity in the procedure’s logic if you like, but in the end, the view should provide more flexibility.

Dump/export masked data

We can not use mysqldump directly for dumping the values from the view. Indeed, mysqldump can only dump the definition of the view (the CREATE VIEW statement). Data is not dumped from a VIEW

The right approach here is to use SELECT … INTO OUTFILE from the view.

If you prefer CSV or other flat files:

Make sure secure_file_priv and file system permissions allow writing to that path, and the masked_user has the FILE grant. If not, add the grant to the user and retry:

Use mysqldump with proxysql

To be honest, there’s a way we can use mysqldump for dumping masked data. Instead of dumping directly from MySQL Server, you can point mysqldump to ProxySQL, which will perform the data masking on the fly.

ProxySQL can be configured to match incoming queries and replace them with modified or different SQL code. For example, we can replace the typical query executed by mysqldump SELECT /*!40001 SQL_NO_CACHE */ * FROM tablename with a custom one using the masking functions we need.

The goal of this article is not to explain ProxySQL; the assumption is that you know how it works and how to configure and use it. We also assume ProxySQL is already installed on our system.

For more details about ProxySQL, look at the official documentation on the website: http://www.proxysql.com

In the ProxySQL configuration, we can simply create a query rule for rewriting on-the-fly the mysqldump’s query for the customers table.

Then we can run mysqldump connecting through the proxy as follows:

You can take a look now inside the dump file and see how INSERTs have been generated as masked.

In this case, the dump is also completely transparent because we did not rename the fields returned for the table.

Additional tips & caveats

  • Privilege model: grant only what’s needed. If you give masked_user access to the base table, masking via view becomes meaningless. The dictionary administration functions require MASKING_DICTIONARIES_ADMIN.
  • Determinism: functions like gen_rnd_email() and gen_dictionary() may generate different results across queries; design accordingly when you need reproducible masked exports.
  • Caching: Percona Server 8.4 introduced a dictionary term cache to speed dictionary lookups. If you change dictionary tables directly (not via provided functions), the cache can become out of sync — use the provided masking_dictionaries_flush() or configure the cache flush settings. The system variable dictionaries_flush_interval_seconds  can be used to set the number of seconds between updates to the internal dictionary cache to match changes in the dictionaries table
  • Static vs dynamic masking: the approach above is dynamic masking (real rows unchanged; masked on SELECT). For test environments, you may prefer static masking (create a masked copy of the dataset). In such a case, you need to rely on exporting data as described using the SELECT … INTO OUTFILE statement for all the tables you have to recreate as masked. Eventually consider using mysqldump with ProxySQL.
  • Replication: if you have a replicated environment, the real data is replicated as usual and not masked on the replicas. Keep in mind that data masking only deals with SELECT statements; the real data is untouched.
  • Challenges: If you have more tables to be masked, you have to define the masking view or stored procedure for every single table. In the case of the dump, you have to manage the export one table at a time, unless you’re using a ProxySQL based solution.
    Another potential challenge is when a field used in JOIN conditions must be masked. In such a case, even the field in the joined table must be masked exactly the same way. Unfortunately, current functions do not support this case, and you have to implement custom code to manage it. Anyway, we hope it is not a very frequent case.

Conclusion

Data Masking is nowadays an important task for securing data and complying with privacy regulations. Percona Server for MySQL 8.4 provides a specific component with functions to implement data masking with minimal effort. You can safely share the data with your external partners to develop your business, securing your customers’ privacy.

For more details, look at the documentation:


MySQL 8.0 reaching end of life in 2026 is a big moment for your database plans. If your priority is staying safe and stable on 8.0, see how Percona’s EOL support can keep your environment protected.

 

Expert Support for Your Approach to MySQL 8.0 EOL

Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments