Pseudonymization, Anonymization and the GDPR

News reports keep surfacing about EU regulators imposing massive fines on tech giants such as Google and Facebook for violations of the GDPR. For this reason, I’d like to discuss ways organizations can avoid making those unwanted headlines.

Recap—Which Organizations are Subject to the GDPR?

First, let’s quickly review how organizations find themselves subject to the GDPR. The GDPR’s requirements generally apply to anyone processing the personal data of individuals located within the EU. Personal data includes any information about an identified or identifiable person. An identifiable person is a person who can be identified, directly or indirectly, by reference to an identifier such as a name, an identification number, location data, or an online identifier.

To process personal data, the GDPR requires organizations to implement appropriate technical and organizational measures, which can include data protection safeguards such as anonymization and pseudonymization.

What is Anonymization?

Under the GDPR, an organization may implement the safeguard technique of anonymization. Recital 26 of the GDPR describes anonymized data as “personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.” Data that has been anonymized falls outside the requirements of the GDPR because it no longer identifies the individual and therefore no longer poses a risk to their rights and freedoms.

Let’s Take an Example of Anonymization

A private university wants to learn how many of its former students transferred to another university and which universities they chose. For this purpose, the university contacts all of its students who transferred over the last 10 years by email and asks them to participate in an online survey. To keep the responses anonymous, the survey does not include questions concerning name, email address, date of birth, or the year they graduated. It also does not record the IP addresses of the participants.

Furthermore, to avoid identifying former students who transferred to uncommon schools, the university places those responses in a single category labeled “other schools,” so that no answer can be used to single out an individual. By limiting the data collected to what is absolutely necessary to carry out the survey, the university makes the likelihood of re-identification extremely small. Thus, the anonymization is successful, and the GDPR does not apply to the survey responses.
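The grouping step in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `group_rare_schools`, the school names, and the threshold of five respondents are all hypothetical, and the right threshold depends on the dataset and the re-identification risk.

```python
from collections import Counter

def group_rare_schools(responses, min_count=5, other_label="other schools"):
    """Replace any school named by fewer than min_count respondents with other_label.

    min_count is an assumed threshold chosen for illustration only.
    """
    counts = Counter(responses)
    return [school if counts[school] >= min_count else other_label
            for school in responses]

# Hypothetical survey answers: two schools are common, two appear only once.
responses = (["State U"] * 6 + ["City College"] * 5
             + ["Tiny Arts Academy", "Remote Polytechnic"])

grouped = group_rare_schools(responses)
print(Counter(grouped))
```

Note that the combined “other schools” category can itself end up very small, in which case the survey owner would still need to decide whether to publish or suppress it.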

However, it’s important to note that the act of anonymizing personal data is itself a form of further processing. Under the GDPR, this means that the purpose of the anonymization must be compatible with the purpose for which the data was originally collected, unless the organization has a separate lawful basis for anonymizing the data for an incompatible purpose.

Also, because the standard for anonymizing data is so high, organizations may consider imposing contractual terms on the party receiving the anonymized data to reduce the risk of re-identification, for example by restricting how the data may be used or by requiring reasonable security measures.

What is Pseudonymization?

Under the GDPR, an organization may also implement the safeguard technique of pseudonymization. Article 4(5) of the GDPR defines pseudonymization as the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that the organization:

- Keeps the additional information separately and securely; and
- Uses technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person.

Pseudonymization reduces the linkability of data to its data subjects by separating the data from direct identifiers. However, it does not prevent re-identification—which is the key difference between it and anonymization.

How Might Pseudonymization Play Out in Practice?

Consider, for example, an organization that comprises entities A and B. Entity A collects personal data about which of its products each user buys, while entity B receives the collected data for profiling, specifically to determine purchasing behavior.

But before the users’ personal data is provided to entity B, entity A pseudonymizes it by removing direct identifiers such as names and addresses and replacing them with reference numbers. The key that links each reference number to a customer is stored separately with the organization’s data protection officer, who safeguards it and does not disclose it to entity B. As a result, entity B cannot link the data to the respective customers on its own, which makes the data pseudonymized.
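To make the separation concrete, here is a minimal Python sketch of that step. It is an assumption-laden illustration, not a prescribed method: the function `pseudonymize`, the field names `name`, `address`, `product`, and `ref`, and the sample records are all hypothetical, and the reference numbers are simply random tokens from Python’s standard `secrets` module.

```python
import secrets

def pseudonymize(records, direct_identifiers=("name", "address")):
    """Split records into pseudonymized rows and a separately held lookup table."""
    key_table = {}      # reference number -> direct identifiers; never sent to entity B
    ref_by_person = {}  # ensures the same person always gets the same reference number
    pseudonymized = []
    for record in records:
        identity = tuple(record[field] for field in direct_identifiers)
        if identity not in ref_by_person:
            ref = secrets.token_hex(8)  # random reference, not derived from the identity
            ref_by_person[identity] = ref
            key_table[ref] = dict(zip(direct_identifiers, identity))
        # Copy every field except the direct identifiers, then attach the reference.
        row = {k: v for k, v in record.items() if k not in direct_identifiers}
        row["ref"] = ref_by_person[identity]
        pseudonymized.append(row)
    return pseudonymized, key_table

# Hypothetical purchase records collected by entity A.
records = [
    {"name": "Ana Lopez", "address": "1 Main St", "product": 5},
    {"name": "Ana Lopez", "address": "1 Main St", "product": 6},
    {"name": "Ben Okoye", "address": "2 Oak Ave", "product": 5},
]

shared_with_b, held_by_dpo = pseudonymize(records)
```

In this sketch only `shared_with_b` would be passed to entity B, while `held_by_dpo` stays with the data protection officer. Anyone holding both can re-identify the customers, which is why the data is pseudonymized rather than anonymized.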

Without access to any direct identifiers, entity B analyzes the pseudonymized data, which remains personal data under the GDPR, and learns that two of the organization’s products, numbers 5 and 6, tend to be purchased together. This is valuable information for entity B because the organization may now decide to market those products as a bundle or recommend one when a shopper has placed the other in their shopping cart. This example highlights how businesses can gain helpful marketing insights while, at the same time, protecting the rights and freedoms of their customers.
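An analysis like entity B’s needs nothing more than the reference numbers and product numbers. The sketch below assumes, hypothetically, that each pseudonymized row has a `ref` field and a `product` field; the function `co_purchase_counts` and the sample rows are invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

def co_purchase_counts(rows):
    """Count how many reference numbers bought each pair of products."""
    baskets = defaultdict(set)
    for row in rows:
        baskets[row["ref"]].add(row["product"])
    pair_counts = Counter()
    for products in baskets.values():
        # Sorting makes (5, 6) and (6, 5) count as the same pair.
        pair_counts.update(combinations(sorted(products), 2))
    return pair_counts

# Hypothetical pseudonymized rows as entity B might receive them.
rows = [
    {"ref": "a1", "product": 5}, {"ref": "a1", "product": 6},
    {"ref": "b2", "product": 5}, {"ref": "b2", "product": 6},
    {"ref": "c3", "product": 5}, {"ref": "c3", "product": 9},
]

top_pair, times = co_purchase_counts(rows).most_common(1)[0]
print(top_pair, times)  # (5, 6) 2
```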

As organizations continue implementing state-of-the-art technologies such as artificial intelligence, which tend to collect and analyze as much data as possible, they will need to embed safeguards like anonymization and pseudonymization to demonstrate compliance with the GDPR. And as we have seen, these methods can reduce the risk of re-identification while preserving the value of data assets.

If you have any questions about how we can help your organization achieve GDPR compliance, contact our team today.