Data masking doesn't have to be that hard: 5 things to know

Reiner Kappenberger Director, Voltage Data Security, CyberRes

Since going into effect in May 2018, the European Union's General Data Protection Regulation (GDPR) has required that all personal data held by a third party be protected. The definition of "protected" varies, from "not retaining sensitive data" (read "anonymization") to "encrypting data so that the content cannot be compromised following a breach."

However, for companies that want most of the benefits of encryption while retaining the functionality of the original data, the law describes another option. Scrambling the data in a format-preserving way, variously called "pseudonymization," "tokenization," "format-preserving encryption," or "data masking," protects the data while still allowing companies to analyze some of its characteristics.

In many ways, data masking delivers similar benefits to network segmentation. If an attacker manages to gain access to one user's systems, network segmentation prevents easy movement to other systems and limits the breach to a small subset of devices. Data masking works similarly. When you deliver only masked data to most workers, a breach of one system does not give the attacker unfettered access to all the data.

However, implementation of data masking is littered with potential pitfalls. Here are five issues to consider.

1. Not as easy as it sounds

Data masking can seem easy, but several challenges make a secure yet usable implementation difficult. One is simply coverage: it is hard to ensure that all data is masked and that no database has escaped notice.

There are a variety of techniques you can use to mask data. Static data masking lets you create a copy of a database that has random values that preserve the format of expected data. However, the resulting dataset does not retain the informational characteristics of the original data. A variant of this type of data masking—deterministic data masking—replaces data with alternate values so that every instance of a particular value has the same substitute. While this preserves the ability to analyze the data, it is sometimes possible to unmask some of it.
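
To make the difference between random and deterministic substitution concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a production technique: the names `DEMO_KEY`, `mask_random`, and `mask_deterministic` are hypothetical, and the keyed-hash approach shown is a one-way stand-in, not reversible format-preserving encryption.

```python
import hashlib
import hmac
import secrets
import string

# Hypothetical demo key; a real system would fetch this from a key manager.
DEMO_KEY = b"demo-key-not-for-production"


def _alphabet_for(ch: str) -> str:
    """Return the alphabet to draw a substitute from, or '' to keep the character."""
    if ch.isdigit():
        return string.digits
    if ch.isalpha():
        return string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
    return ""  # punctuation and spaces are kept so the format survives


def mask_random(value: str) -> str:
    """Random substitution: the format is kept, but repeated inputs will not match."""
    out = []
    for ch in value:
        alphabet = _alphabet_for(ch)
        out.append(secrets.choice(alphabet) if alphabet else ch)
    return "".join(out)


def mask_deterministic(value: str, key: bytes = DEMO_KEY) -> str:
    """Deterministic substitution: the same input always yields the same substitute."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    for i, ch in enumerate(value):
        alphabet = _alphabet_for(ch)
        if alphabet:
            # Pick a replacement from the keyed digest (slightly biased; fine for a sketch).
            out.append(alphabet[digest[i % len(digest)] % len(alphabet)])
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(mask_random("123-45-6789"))         # different on every run
    print(mask_deterministic("123-45-6789"))  # stable for a given key
```

Because the deterministic version maps equal inputs to equal outputs, masked tables can still be joined and counted. The same property means that very common values stay recognizable by their frequency, which is one way some masked data can be unmasked.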

Finally, dynamic data masking changes data on the fly, as it is accessed from the database, so that workers can use and analyze the data in many of the same ways as unmasked data. In the Forrester Wave for Dynamic Data Masking Solutions, the research company highlights the importance of having the correct policies for dynamic data masking, both to protect the data and to preserve employees' ability to use it.
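
A rough Python sketch of the idea follows. The stored record is never changed; masking is applied at read time according to a policy keyed on the reader's role. The role names, field names, and the `UNMASKED_FIELDS_BY_ROLE` policy table are all assumptions made up for illustration, not any vendor's API.

```python
# Hypothetical policy: the fields each role may see unmasked.
UNMASKED_FIELDS_BY_ROLE = {
    "fraud_analyst": {"name", "card_number"},
    "support_agent": {"name"},
}


def mask_keep_last4(value: str) -> str:
    """Replace letters and digits with '*', leaving separators and the last four characters."""
    if len(value) <= 4:
        return "*" * len(value)  # too short to reveal any part safely
    head, tail = value[:-4], value[-4:]
    return "".join("*" if ch.isalnum() else ch for ch in head) + tail


def read_record(record: dict, role: str) -> dict:
    """Return a view of the record masked for the given role; the original is untouched."""
    allowed = UNMASKED_FIELDS_BY_ROLE.get(role, set())  # unknown roles see nothing unmasked
    return {
        field: value if field in allowed else mask_keep_last4(value)
        for field, value in record.items()
    }


if __name__ == "__main__":
    stored = {"name": "Ada Lovelace", "card_number": "4111-1111-1111-1111"}
    print(read_record(stored, "support_agent"))
    print(read_record(stored, "fraud_analyst"))
```

Keeping the policy in one table, rather than scattered through application code, is the design choice that makes it practical to review who can see what.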

2. Know what data you have

Data discovery and the classification of information are critically important and form the foundation for setting a data-masking policy.

Overall, current privacy laws—including the GDPR and the California Consumer Privacy Act (CCPA)—focus on a simple idea: If there is no reason to keep the data around, then companies should get rid of it. For that to work, companies need to know what data they have. Data discovery is a key component of creating a data-masking environment. Without knowing what data is currently produced, why it is needed, and where it resides, companies cannot adequately protect the information.
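
As a small illustration of what a first discovery pass might look like, the Python sketch below scans rows for values that resemble common kinds of personal data and reports which columns contain them. The patterns and the `discover_sensitive_columns` helper are simplified assumptions for this example; real discovery tools use far richer classifiers and will catch cases these regular expressions miss.

```python
import re
from collections import defaultdict

# Deliberately simple, hypothetical patterns; expect false positives and misses.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
}


def discover_sensitive_columns(rows):
    """Map each column name to the sorted list of sensitive-data labels found in it."""
    findings = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            text = str(value)
            for label, pattern in PATTERNS.items():
                if pattern.search(text):
                    findings[column].add(label)
    return {column: sorted(labels) for column, labels in findings.items()}


if __name__ == "__main__":
    sample = [
        {"id": 1, "contact": "ada@example.com", "note": "SSN 123-45-6789 on file"},
        {"id": 2, "contact": "n/a", "note": "paid with 4111 1111 1111 1111"},
    ]
    print(discover_sensitive_columns(sample))
```

Note that the free-text `note` column is where the surprises are in this sample, which is typical: sensitive values often sit in fields that were never meant to hold them.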

3. Close all the side doors

Next, focus on your employees. Users strive to be productive, so policies that get in the way will drive them to bypass security measures. Any implementation of data masking must close both the back doors and the side doors. Otherwise, workers will find them, use them, and put your data at risk.

Access to unmasked data should be limited to small cadres of privileged users, dependent on their role. Restricting access can also help reduce the cost of managing the masked data.

4. Keep data functionality

Most data masking, especially format-preserving encryption, gives companies the benefit of using the masked data for a limited subset of analysis functions. Does your sales team need to know which products consumers are viewing the most? Does your marketing team need to know the characteristics of users that viewed a particular ad? Such questions can be answered without unmasking the users involved.
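
The Python sketch below shows how such questions might be answered over masked data. It assumes user identifiers are replaced with deterministic pseudonyms before analysts see the events; the `pseudonymize` helper, the demo key, and the event layout are hypothetical choices for this example, and a keyed hash is used here as a simple stand-in for format-preserving encryption.

```python
import hashlib
import hmac
from collections import Counter, defaultdict

# Hypothetical demo key; in practice it would live in a key manager.
DEMO_KEY = b"demo-key-not-for-production"


def pseudonymize(user_id: str) -> str:
    """Deterministic pseudonym: equal user IDs map to equal tokens."""
    return hmac.new(DEMO_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()[:12]


def mask_events(events):
    """Replace the user ID in each event with its pseudonym."""
    return [{"user": pseudonymize(e["user"]), "product": e["product"]} for e in events]


def views_per_product(masked_events):
    """Count views per product; no user identity is needed at all."""
    return Counter(e["product"] for e in masked_events)


def distinct_viewers_per_product(masked_events):
    """Count distinct viewers per product using only the pseudonyms."""
    viewers = defaultdict(set)
    for e in masked_events:
        viewers[e["product"]].add(e["user"])
    return {product: len(users) for product, users in viewers.items()}


if __name__ == "__main__":
    raw = [
        {"user": "alice@example.com", "product": "widget"},
        {"user": "alice@example.com", "product": "widget"},
        {"user": "bob@example.com", "product": "widget"},
        {"user": "alice@example.com", "product": "gadget"},
    ]
    masked = mask_events(raw)
    print(views_per_product(masked))
    print(distinct_viewers_per_product(masked))
```

The distinct-viewer count works only because the pseudonyms are deterministic; with purely random substitutes, every event would appear to come from a different user.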

The promise of retaining the utility of data is what makes different types of data masking beneficial to your business. As you evaluate different technologies, make sure that they deliver the value your business needs. For cloud-native companies, the technologies also need to work with your cloud providers.

5. Data minimization can reduce costs

Finally, reducing the amount of private data that your company manages can save money in many ways. A smaller volume of data reduces the amount of overhead that data security teams need to spend on managing sensitive data. Managing keys and secrets, for example, is a major headache for many technologies. Ultimately, data you no longer hold requires no keys to manage.

Do the right thing on data retention

One more thing: Recent analyses have found that the more data a company stores, especially in a private data center, the greater the cost and environmental impact. Conserving energy by reducing data retention helps both your company's bottom line and the environment.

The Forrester Wave for Dynamic Data Masking Solutions is a good guide for comparing different data-masking technologies. 
