Executive Perspective: Privacy Ops Meets DataOps
As privacy regulations proliferate, pressure builds for organizations to implement scalable plans for compliance. Matthew Carroll, co-founder and CEO of Immuta, discusses how this impacts data management and what companies can do as cloud migration grows.
- By James E. Powell
- November 13, 2020
Upside: As companies migrate their data to the cloud, what are the biggest challenges they face? How do most enterprises respond (and is it working)?
Matthew Carroll: One of the biggest challenges companies face as they move to the cloud is fragmentation of systems and processes. Often, they’ll have to maintain some portion of their legacy technologies as they are overhauling their processes for the new systems. Although the cloud makes it easier to abstract data workflows from the underlying hardware, that abstraction has to be done right to realize its value.
At the same time, companies are trying to avoid vendor lock-in because they don’t own the infrastructure and don’t want to be at the whims of their cloud provider. Changes to pricing models or the emergence of new technologies can quickly incentivize switching vendors. That means data management and governance processes need to be as loosely tied to the specifics of the implementation as possible.
Finally, and perhaps most importantly, companies have to seriously dig into who they’re trusting with their data, both internally and externally. A huge driver of the cloud migration is ease of access and interoperability, which can be a double-edged sword when it comes to data privacy and security. The pressure is really on companies to have a thorough understanding of applicable regulations and a scalable plan for following them.
The need for data engineers is on the rise: why? What is their role within a data-driven organization and what challenges do they face right now?
Data engineers are the folks who “get it done” in the data ecosystem. With the move to cloud and distributed computing technologies, their value to the organization has skyrocketed. The modern data engineer has to marry a solid understanding of computing principles with knowledge of how to build a platform end users can take advantage of. They are much more tied to creating business value than they have been in the past, and are often tasked with building the data platforms that drive an organization forward.
Data engineers are positioned to tackle the issues we just discussed: portability between clouds and set-up of scalable access controls and compliance methods. They are an extremely important asset to the data-driven organization. The engineering work is not always as glamorous as the modeling and analytics work done by data scientists, but it fuels the broader organization in a more foundational way.
What is PrivacyOps and how does it help companies comply with industry regulations?
PrivacyOps is emerging because privacy considerations can no longer be an afterthought in an organization’s software development lifecycle -- they need to be tightly integrated. There is pressure on organizations to prove they are taking responsibility for personal data and acting in compliance with regulations, and it’s only going to increase.
The real opportunity that the emergence of PrivacyOps presents is bringing security and privacy processes together, and standardizing best practices that need to be implemented across organizations. It’s far too easy for engineering, analytics, and compliance teams to talk over each other. Bringing these domains together through software will help to set expectations across the industry about protecting the privacy of data assets. Techniques such as k-anonymization, for example, are practiced by some of the best teams in healthcare, but they are hardly commonplace, despite being relatively easy to implement.
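To make the k-anonymization idea concrete: a dataset is k-anonymous when every combination of quasi-identifier values (the attributes that could be linked back to a person, such as age band or partial ZIP code) appears in at least k records. A minimal sketch of that check, with hypothetical column names and records, might look like:

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records."""
    groups = Counter(
        tuple(record[col] for col in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in groups.values())

# Hypothetical patient records whose quasi-identifiers have already
# been generalized into bands (a common k-anonymization step).
records = [
    {"age_band": "30-39", "zip3": "021", "diagnosis": "flu"},
    {"age_band": "30-39", "zip3": "021", "diagnosis": "asthma"},
    {"age_band": "40-49", "zip3": "100", "diagnosis": "flu"},
    {"age_band": "40-49", "zip3": "100", "diagnosis": "cold"},
]

print(is_k_anonymous(records, ["age_band", "zip3"], k=2))  # True
print(is_k_anonymous(records, ["age_band", "zip3"], k=3))  # False
```

In practice, achieving k-anonymity means iterating: if the check fails, columns are generalized further (wider age bands, shorter ZIP prefixes) or outlier rows are suppressed until every group reaches size k.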
When it comes to compliance, what are data engineers’ biggest challenges?
To deliver compliant analytics, you need data engineers who can reliably ship the data from place to place while implementing the appropriate transformations. However, what actually needs to be done is often not very clear to the engineering team. Data scientists want as much data as possible; compliance teams are pushing to minimize the data footprint. Regulations are in flux and imprecise. Figuring out, in practice, how to balance these different demands is a huge challenge for engineers, especially when considering the challenges of cloud migration.
Fundamentally, it’s about defining data policies that can actually be implemented by engineers. Laws do not come packaged with a direct translation to the data and technology systems on which they will be enforced. We have found that attribute-based access controls (ABAC) are the best way to interface between these different groups, while still providing actionable policy that data engineers can implement. ABAC has the further advantage of being future-proof. If you define your policies in terms of technology-agnostic and data-agnostic attributes, then you can implement them across clouds and across databases.
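The core of ABAC is that access decisions are computed from attributes of the user and the resource rather than from hard-coded roles or database grants, which is what makes the policies portable across systems. A minimal sketch, with entirely illustrative attribute names and a single hypothetical rule, could look like:

```python
# Each policy rule maps required user attributes and resource
# attributes to a permitted action. All names here are illustrative,
# not Immuta's actual policy language.
POLICIES = [
    {
        "description": "GDPR-trained analysts may read de-identified EU data",
        "user": {"role": "analyst", "training": "gdpr"},
        "resource": {"region": "eu", "deidentified": True},
        "action": "read",
    },
]

def is_allowed(user, resource, action):
    """Grant access if any rule's attribute requirements are all
    satisfied by the requesting user and the target resource."""
    for rule in POLICIES:
        if (action == rule["action"]
                and all(user.get(k) == v for k, v in rule["user"].items())
                and all(resource.get(k) == v for k, v in rule["resource"].items())):
            return True
    return False

user = {"role": "analyst", "training": "gdpr", "name": "dana"}
table = {"region": "eu", "deidentified": True}
print(is_allowed(user, table, "read"))   # True
print(is_allowed(user, table, "write"))  # False
```

Because the rule refers only to attributes ("role", "region"), not to a specific database or cloud, the same policy can be enforced wherever those attributes can be resolved, which is the future-proofing property described above.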
Where do you see data governance, state and federal privacy regulations and compliance headed in 2020 and beyond? What’s just over the horizon that we haven’t heard much about yet? How will this impact data teams?
Well, we will certainly continue to see new regulations. One area I think we’ll hear a lot more about is quantitative privacy guarantees. The conversation has already started with the confusion around what “anonymized data” really means in the context of GDPR, and the only real resolution is for the field to come up with acceptable standards regarding privacy guarantees that can be made to data subjects. I think that terms such as differential privacy and k-anonymization, which sound unfamiliar to most practitioners now, will eventually be a part of data science training programs.
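Differential privacy is one example of the quantitative guarantees mentioned above: instead of promising that data is "anonymized," it bounds how much any individual's record can shift a released statistic. A minimal sketch of the classic Laplace mechanism (assuming a numeric query with known sensitivity, and not any particular vendor's implementation) is:

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Release true_value plus Laplace noise with scale
    sensitivity/epsilon -- the core of an epsilon-differentially-
    private numeric release."""
    scale = sensitivity / epsilon
    # Sample Laplace(0, scale) via the inverse CDF.
    # u is in [-0.5, 0.5); u == -0.5 is vanishingly unlikely.
    u = random.random() - 0.5
    sign = 1 if u >= 0 else -1
    noise = -scale * sign * math.log(1 - 2 * abs(u))
    return true_value + noise

# Releasing a count of 100 with sensitivity 1 (one person changes
# the count by at most 1) and a privacy budget of epsilon = 1.
noisy_count = laplace_mechanism(100.0, sensitivity=1.0, epsilon=1.0)
```

Smaller epsilon means more noise and a stronger privacy guarantee; the quantitative trade-off is exactly the kind of standard the field will need to agree on.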
How are data teams building/adjusting their systems to account for data compliance laws (GDPR, HIPAA, CCPA, etc.)? Any useful tools, practices, or workflows to assist in meeting the requirements?
A major issue is standardizing how to implement these regulations, regardless of system, and being able to communicate those practices around the company. Attribute-based access control, which I mentioned earlier, is one very transparent way to create flexible and compliant policies for data access. We are also seeing the emergence of “purpose” constraints around how organizations can use certain data. This adds to the challenges of sharing data internally and externally -- it will soon no longer be acceptable to simply give people database credentials and tell them to have at it.
These policies have to be automated, because they have to scale. The data engineer’s challenge is one of fragmentation. They have to make sure their data is secure across multiple clouds, multiple databases, many different user bases and contexts. You can’t achieve the level of scale and precision needed without automating policy.
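The purpose constraints described above can be sketched as a simple automated gate: each dataset carries the purposes it may legally be used for, and every access must declare a purpose that is checked before credentials are ever exercised. All dataset and purpose names here are hypothetical:

```python
# Hypothetical purpose registry: each dataset lists the purposes
# it may be used for under the applicable regulation or consent.
ALLOWED_PURPOSES = {
    "patient_visits": {"treatment", "billing"},
    "marketing_emails": {"marketing"},
}

def check_purpose(dataset, declared_purpose):
    """Raise PermissionError unless the declared purpose is on the
    dataset's allowed list; returns True when access may proceed."""
    allowed = ALLOWED_PURPOSES.get(dataset, set())
    if declared_purpose not in allowed:
        raise PermissionError(
            f"Purpose '{declared_purpose}' not permitted for {dataset}")
    return True

print(check_purpose("patient_visits", "billing"))  # True
```

Because the check runs automatically on every request, it scales across clouds and user bases in a way that manually handing out database credentials never could.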
Describe your product/solution and the problem it solves for enterprises.
Immuta accelerates self-service access to and control of sensitive data. Our Automated Data Governance platform creates trust across data engineering, security, legal, compliance, and business teams so they can work together to ensure timely access to critical data with minimal risk, while adhering to global data privacy regulations including GDPR, CCPA, and HIPAA. Immuta’s automated, scalable, no-code approach makes it easy for users to access the data they need, when they need it, while protecting sensitive information and ensuring customer privacy. Enterprises mitigate risk, save time, and increase data usage efficiency with Immuta’s dynamic and easily implemented data protection methods.
James E. Powell is the editorial director of TDWI, including research reports, the Business Intelligence Journal, and the Upside newsletter. You can contact him via email.