WebSiteHome

Data Masking

This feature was implemented in Arpia MyData Storage for MySQL 8.0.17-8.

The Data Masking plugin is a free and Open Source implementation of the MySQL’s data masking plugin. Data Masking provides a set of functions to hide sensitive data with modified content.

Data masking can have either of the characteristics:

  • Generation of random data, such as an email address
  • De-identify data by transforming the data to hide content

The data masking functions have the following categories:

  • General purpose
  • Special purpose
  • Generating Random Data with Defined characteristics
  • Using Dictionaries to Generate Random Data

General purpose

The general purpose data masking functions are the following:

FunctionDescription
mask_inner(string, margin1, margin2 [, character])Returns a result where only the inner part of a string is masked. A different masking character can be specified.
mask_outer(string, margin1, margin2 [, character])Masks the outer part of the string. The inner section is not masked. A different masking character can be specified.

Examples

An example of mask_inner:

mysql> SELECT mask_inner('123456789', 1, 2);

+-----------------------------------+
| mask_inner('123456789', 1, 2)     |
+-----------------------------------+
|1XXXXXX89                          |
+-----------------------------------+

mysql> SELECT mask_outer('123456789', 2, 2);

+------------------------------------+
| mask_outer('123456789', 2, 2).     |
+------------------------------------+
| XX34567XX                          |
+------------------------------------+

Special Purpose

The special purpose data masking functions are as follows:

ParameterDescription
mask_pan(string)Masks the Primary Account Number (PAN) by replacing the string with an “X” except for the last four characters. The PAN string must be 15 characters or 16 characters in length.
mask_pan_relaxed(string)Returns the first six numbers and the last four numbers. The rest of the string is replaced by “X”.
mask_ssn(string)Returns a string with only the last four numbers visible. The rest of the string is replaced by “X”.

Examples

An example of mask_pan.

mysql> SELECT mask_pan (gen_rnd_pan());

An example of mask_pan_relaxed:

+------------------------------------+  
| mask_pan(gen_rnd_pan())            |  
+------------------------------------+  
| XXXXXXXXXXX2345                    |  
+------------------------------------+

mysql> SELECT mask_pan_relaxed(gen_rnd_pan());

+------------------------------------------+
| mask_pan_relaxed(gen_rnd_pan())          |
+------------------------------------------+
| 520754XXXXXX4848                         |
+------------------------------------------+

An example of mask_ssn:

mysql> SELECT mask_ssn('555-55-5555');

+-------------------------+
| mask_ssn('555-55-5555') |
+-------------------------+
| XXX-XX-5555             |
+-------------------------+

Generate random data for specific requirements

These functions generate random values for specific requirements.

ParameterDescription
gen_range(lower, upper)Generates a random number based on a selected range and supports negative numbers.
gen_rnd_email()Generates a random email address. The domain is example.com.
gen_rnd_pan([size in integer])Generates a random primary account number. This function should only be used for test purposes.
gen_rnd_us_phone()Generates a random U.S. phone number. The generated number adds the 1 dialing code and is in the 555 area code. The 555 area code is not valid for any U.S. phone number.
gen_rnd_ssn()Generates a random, non-legitimate US Social Security Number in an AAA-BBB-CCCC format. This function should only be used for test purposes.

Examples

An example of gen_range(lower, upper):

mysql> SELECT gen_range(10, 100);

+--------------------------------------+
| gen_range(10,100)                    |
+--------------------------------------+
| 56                                   |
+--------------------------------------+

An example of gen_range(lower, upper) with negative numbers:

mysql> SELECT gen_range(-100,-80);

+--------------------------------------+
| gen_range(-100,-80)                  |
+--------------------------------------+
| -91                                  |
+--------------------------------------+

An example of gen_rnd_email():

mysql> SELECT gen_rnd_email();

+---------------------------------------+
| gen_rnd_email()                       |
+---------------------------------------+
| [email protected]                  |
+---------------------------------------+

An example of mask_pan(gen_rnd_pan()):

mysql> SELECT mask_pan(gen_rnd_pan());
An example of gen_rnd_us_phone():

+-------------------------------------+
| mask_pan(gen_rnd_pan())             |
+-------------------------------------+
| XXXXXXXXXXXX4444                    |
+-------------------------------------+

mysql> SELECT gen_rnd_us_phone();

+-------------------------------+
| gen_rnd_us_phone()            |
+-------------------------------+
| 1-555-635-5709                |
+-------------------------------+

An example of gen_rnd_ssn():

mysql> SELECT gen_rnd_ssn()

+-----------------------------+
| gen_rnd_ssn()               |
+-----------------------------+
| 995-33-5656                 |
+-----------------------------+

Use dictionaries to generate random terms

Use a selected dictionary to generate random terms. The dictionary must be loaded from a file with the following characteristics:

  • Plain text
  • One term per line
  • Must contain at least one entry

Copy the dictionary files to a directory accessible to MySQL. Percona Server for MySQL* 8.0.21-12 enabled using the secure-file-priv option for gen_dictionary_load(). The secure-file-priv option defines the directories where gen_dictionary_load() loads the dictionary files.

📘

Percona Server for MySQL 8.0.34 deprecates the gen_blacklist() function. Use gen_blocklist() instead

ParameterDescriptionReturns
gen_blacklist(str, dictionary_name, replacement_dictionary_name)Replaces a term with a term from a second dictionary. Deprecated in Percona Server for MySQL 8.0.34.A dictionary term
gen_blocklist(str, dictionary_name, replacement_dictionary_name)Replaces a term with a term from a second dictionary.A dictionary term
gen_dictionary(dictionary_name)Randomizes the dictionary termsA random term from the selected dictionary.
gen_dictionary_drop(dictionary_name)Removes the selected dictionary from the dictionary registry.Either success or failure
gen_dictionary_load(dictionary path, dictionary name)Loads a file into the dictionary registry and configures the dictionary name. The name can be used with any function. If the dictionary is edited, you must drop and then reload the dictionary to view the changes.Either success or failure

Example

An example of gen_blocklist():

mysql> SELECT gen_blocklist('apple', 'fruit', 'nut');

+-----------------------------------------+
| gen_blocklist('apple', 'fruit', 'nut')  |
+-----------------------------------------+
| walnut                                  |
+-----------------------------------------+

An example of gen_dictionary():

mysql> SELECT gen_dictionary('trees');

+--------------------------------------------------+
| gen_dictionary('trees')                          |
+--------------------------------------------------+
| Norway spruce                                    |
+--------------------------------------------------+

An example of gen_dictionary_drop():

mysql> SELECT gen_dictionary_drop('mytestdict')

+-------------------------------------+
| gen_dictionary_drop('mytestdict')   |
+-------------------------------------+
| Dictionary removed                  |
+-------------------------------------+

An example of gen_dictionary_load(path, name):

mysql> SELECT gen_dictionary_load('/usr/local/mysql/dict-files/testdict', 'testdict');

+-------------------------------------------------------------------------------+
| gen_dictionary_load('/usr/local/mysql/mysql/dict-files/testdict', 'testdict') |
+-------------------------------------------------------------------------------+
| Dictionary load successfully                                                  |
+-------------------------------------------------------------------------------+

Using Data Masking Function in Arpia Platform

Use the data masking functions in a columns in a node in the knowledge grid using SQL Columns.

Using the data masking functions: