Microservices and GDPR

Well, there are two words you don’t often see together. What does an application development methodology have to do with European data protection laws? In fact, rather a lot. Let’s start with a quick recap of GDPR.

The General Data Protection Regulation (GDPR) is surprisingly easy to understand and is set out with excellent clarity in the regulation itself. As architects, we should appreciate that the regulation is founded on a set of seven principles:

Lawfulness, fairness and transparency
Purpose limitation
Data minimisation
Accuracy
Storage limitation
Integrity and confidentiality (security)
Accountability

GDPR is even straightforward enough to capture in a couple of sentences:

Companies that hold personal data must collect only the minimum data needed for the purposes stated. They must ensure that personal data is kept accurate, held securely and in confidence, and they must be transparent about the data they hold.

If you want to know the details, then the UK Information Commissioner’s Office is a good place to start.

So, let’s look at the implications for an application that holds some personally sensitive data. Here’s a list of five of the more important requirements that can be derived from GDPR:

The application must be able to surface all information pertaining to a given individual so that it can be given to that individual (in the UK this is called a subject access request).
The company must be able to validate that the information held is accurate, that it is only the minimum necessary, and that it is pertinent to the stated collection purposes.
The company must ensure that any personal data is retained only for the minimum time required for the main purpose, and that the data is deleted after that time.
The company must secure the data, including all copies and backups, so that the risk of disclosure is minimised. It must also ensure that data integrity is maintained and that customer data is not lost.
The company must be able to audit all sensitive data and its use to ensure that it complies with the company’s policies.

So, how does this impact our microservices architecture?

There are two schools of thought in microservice architectures regarding data persistence. One approach is that each microservice has its own persistent storage which is, in essence, private to the microservice. The service communicates through a self-specified API and by no other means. The second approach is that data is held outside the specific microservices in a separate data persistence service.

As with so many of these situations, and despite much heated discussion between the schools of thought, the reality is that both of these models have their place in a microservice architecture. Even an individual microservice may need to use both types of persistence: internal for more transient data, and external for data that needs to be permanently retained. (It’s actually a little more complex than this, and I’ll come back to it in a later article.)

And this is where GDPR comes in.

We cannot expect every microservice development team to be fully conversant with GDPR and all its implications. If we burden those teams with all the GDPR requirements above, then this will severely limit their agility to innovate and respond to business demands. Also, microservice development teams are not usually experts in data security and data resilience, and it would be a significant duplication of effort and a waste of resources to expect every microservice team to build the full GDPR-compliant functionality described above.

The simple answer here is to provide a single, shared persistence service designed specifically for sensitive data. It provides data-centric APIs that any other service can call to store and retrieve sensitive data. Then, inside that service, all of the complexity of security, data integrity, data audit, data retention, data access methods and so on can be dealt with by an expert team that focuses solely on those requirements. The team would also be the point of control for the data schemas. This has an additional side-benefit: customer data is among the most frequently shared and exchanged data, and a well-defined customer data schema for exchanging customer data between microservices is a necessity.

This also allows the data-centric microservice to choose appropriate technology for the long-term persistence of data to deliver an efficient and effective data service to all other client microservices.
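As a rough illustration of what such a shared service might look like, here is a minimal in-memory sketch. Everything in it is an assumption made for the example: the class name `SensitiveDataService`, its method names and the record layout are hypothetical, and a real service would sit behind a network API with encryption, access control and durable storage. The point is only that store, subject access, retention and audit can live in one place.

```python
from datetime import datetime, timedelta, timezone


class SensitiveDataService:
    """Hypothetical in-memory stand-in for the shared sensitive data service."""

    def __init__(self):
        self._records = []   # each record is a dict, see store()
        self.audit_log = []  # (timestamp, caller, action, subject_id)

    def _audit(self, caller, action, subject_id):
        # Every access to personal data is recorded for later audit.
        stamp = datetime.now(timezone.utc)
        self.audit_log.append((stamp, caller, action, subject_id))

    def store(self, caller, subject_id, purpose, payload, retain_for, now=None):
        # The purpose and retention period travel with the data itself.
        now = now or datetime.now(timezone.utc)
        self._records.append({
            "subject_id": subject_id,
            "purpose": purpose,
            "payload": dict(payload),
            "expires_at": now + retain_for,
        })
        self._audit(caller, "store", subject_id)

    def subject_access_request(self, caller, subject_id):
        # Surface everything held about one individual.
        self._audit(caller, "subject_access_request", subject_id)
        return [
            {"purpose": r["purpose"], "data": dict(r["payload"])}
            for r in self._records
            if r["subject_id"] == subject_id
        ]

    def purge_expired(self, now=None):
        # Delete records whose retention period has passed; return the count.
        now = now or datetime.now(timezone.utc)
        before = len(self._records)
        self._records = [r for r in self._records if r["expires_at"] > now]
        return before - len(self._records)
```

A functional microservice would then only ever call `store` and the read methods, and would never need to know how retention or audit are implemented.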

This deals with the more obvious customer sensitive data, which we have externalised from the functional microservices, but it has not dealt with the entire challenge. In many cases, even transitory data can be regarded as sensitive. Examples might be the text messages that you send, a list of films that you have watched, websites you have visited or even products that you have looked at on an e-commerce site. All of these are potentially sensitive data under GDPR.

If the data is truly transitory and held for a very brief time for a specific purpose, then many of the requirements (e.g. subject access requests, backup requirements etc.) do not apply. However, there is still a need to ensure that every service complies with GDPR, and this can only really be done with strong data policies and governance that are applied across all developments that use customer data.
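One way such a policy could be made concrete inside a functional microservice is a store that forgets entries after a fixed time. The sketch below is an assumption for illustration, not something GDPR prescribes: the class name `TransientStore`, the `ttl_seconds` parameter and the injectable `clock` are all hypothetical.

```python
import time


class TransientStore:
    """Hypothetical per-service store that forgets entries after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._items = {}     # key -> (expiry time, value)

    def put(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired data is removed on access rather than returned.
            del self._items[key]
            return None
        return value

    def sweep(self):
        # Remove every expired entry; return how many were dropped.
        now = self._clock()
        expired = [k for k, (exp, _) in self._items.items() if now >= exp]
        for key in expired:
            del self._items[key]
        return len(expired)
```

A governance rule could then be as simple as requiring every service that touches customer data to use a store like this with an agreed maximum lifetime.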

A final point is on the secondary use of data. While all of the data examples above (texts, films, websites and products) may be transitory from a functional perspective, they potentially have huge value beyond that use. All of them build up a demographic view of the customer that has value to the company, and potentially an external value through monetisation of the data. In this case, and if there is customer consent, that data is no longer transient; it needs to be moved to a permanent data store for the secondary purposes. The functional microservice should delete that data from its internal transitory stores and pass the responsibility to a dedicated sensitive data store.
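The handover described here can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `hand_over_for_secondary_use` is hypothetical, both stores are plain dictionaries keyed by subject, and `consents` is an assumed mapping from subject to the set of purposes they have agreed to.

```python
def hand_over_for_secondary_use(subject_id, transient_store, permanent_store,
                                consents, purpose):
    """Move a subject's transient data to the permanent store if consent exists.

    All arguments are hypothetical stand-ins: the stores are dicts keyed by
    subject_id, and consents maps subject_id -> set of consented purposes.
    Returns True if the data was handed over, False otherwise.
    """
    if subject_id not in transient_store:
        return False
    if purpose not in consents.get(subject_id, set()):
        # No consent for this purpose: the data stays transient.
        return False
    # Write to the permanent store first, then delete the transient copy,
    # so that a failure part-way through does not lose the data.
    permanent_store.setdefault(subject_id, []).append(
        {"purpose": purpose, "data": transient_store[subject_id]}
    )
    del transient_store[subject_id]
    return True
```

Recording the purpose alongside the data keeps the secondary use tied to the consent that justified it.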

I hope that you have found this article interesting. If you have any thoughts, please leave a comment or contact me via LinkedIn.

Simon Griffiths

Focusing on Data, Architecture and AI
