TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

Executive Q&A: Protecting Data Privacy, Ensuring Trust Across Multiple Cloud Services

Data privacy is important to maintain compliance, customer loyalty, and brand reputation, but privacy failures and lack of compliance are eroding consumer trust. Okera’s co-founder and CEO Nong Li offers advice on how everyone can protect PII and use it effectively and responsibly.

By Upside Staff
June 9, 2022

Nong Li, co-founder and CEO of Okera, shares how employees, customers, and partners can use data responsibly and avoid inappropriately accessing data that is confidential, personally identifiable, or regulated.

For Further Reading:

Building Customer Trust in Your Data Policies

Executive Q&A: Data Governance and Compliance

4 Principles for a Successful Multicloud Strategy with Multicloud Data Services

Upside: What trends are increasing the need to properly leverage and protect sensitive data? Is it more than just the rising number of serious data breaches?

Nong Li: With cloud proliferation, and as we enter the realm of Web 3.0, organizations are realizing the importance of protecting and responsibly using consumer data and personally identifiable information (PII). Protecting sensitive data has become increasingly difficult -- and important. Organizations must help data teams deliver business value faster and more confidently while permitting data security and privacy teams to validate the appropriate security mandates and ensure compliance with data privacy regulations.

Today, there are two trends amplifying the need for organizations to take a careful look at how they are safeguarding the data customers entrust to them. The first is the constant threat of data breaches, which you mention, because they can truly damage a business and erode customer confidence. The other trend is the proliferation and evolution of data privacy regulations around the world. These regulations exist because citizens demand privacy.

The two trends go hand-in-hand. Governments want to make sure citizens’ rights are protected, but they also want a healthy economy and domestic security. When businesses are careless with their customers’ or employees’ personal data, and that data is stolen by a foreign actor, it has repercussions beyond the individuals and the business. It has become a national security issue.

As more organizations undertake digital transformations, they struggle to balance responsible data use with automated data access control that meets their data governance policies. How can enterprises leverage consumer data and still protect data (especially PII)?

Enterprises can use consumer data while still protecting confidential, personally identifiable, and regulated data. Most of them just don’t know how to do it yet. Businesses spend millions of dollars every year on data analytics software, infrastructure, and personnel, and then get stuck because the data set they need contains sensitive fields. They’re simply told they can’t use it. End of project.

Fortunately, new approaches to authorizing who can access what data, when, and in what format are now available to help companies in exactly this position. When an analyst queries a data set that contains email messages or Social Security numbers, for example, that request can be dynamically modified to filter the data entirely, mask or “x” out the data, or tokenize it so analysts can work with a unique representation of the data but not know the actual email or government ID.
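As a rough illustration of the three options described above (filter, mask, tokenize), here is a minimal Python sketch. It is not Okera's implementation; the `POLICY` table, the role name, the column names, and the demo key are all hypothetical, and "filter" is interpreted here as dropping a column from the result.

```python
import hashlib
import hmac

# Hypothetical policy: for each role, the action to take on each sensitive column.
# Columns not listed for a known role are returned unchanged.
POLICY = {
    "analyst": {"email": "tokenize", "ssn": "mask", "notes": "filter"},
}

# Assumption for the sketch only: a real system would load this from a managed secret store.
SECRET_KEY = b"demo-only-key"


def mask(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with 'x'."""
    if len(value) <= visible:
        return "x" * len(value)
    return "x" * (len(value) - visible) + value[-visible:]


def tokenize(value: str) -> str:
    """Return a keyed, deterministic token.

    The same input always yields the same token, so analysts can still
    count or join on the column without seeing the real value.
    """
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def apply_policy(row: dict, role: str) -> dict:
    """Rewrite one result row according to the caller's role."""
    rules = POLICY.get(role)
    if rules is None:
        return {}  # unknown roles see nothing (deny by default)
    out = {}
    for column, value in row.items():
        action = rules.get(column, "allow")
        if action == "filter":
            continue  # drop the column entirely
        if action == "mask":
            out[column] = mask(value)
        elif action == "tokenize":
            out[column] = tokenize(value)
        else:
            out[column] = value
    return out
```

Calling `apply_policy(row, "analyst")` on a row with `name`, `email`, `ssn`, and `notes` keys would return the name unchanged, a 16-character token in place of the email, an "x"-masked Social Security number showing only the last four digits, and no `notes` key at all.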

This is game-changing for organizations. Establishing trust in your organization’s data can be a challenge, but it’s critical to developing a data-driven approach to achieve or maintain a market-leading position. Organizations that are able to successfully build a community of trust can deliver results faster and more efficiently, which benefits both the consumer and the company.

How does an organization protect data when it’s spread among multiple cloud services?

The previous answer was about dynamically enforcing data access policies. Here, we’re talking about needing data access policies that are universal. This is a new approach and a significant departure from the current status quo. You’re not going to get multicloud data security from the hyperscalers such as AWS, Azure, or Google, but that doesn’t mean the big cloud providers aren’t providing strong building blocks.

A “universal” approach to data authorization is truly that -- it simplifies the creation of data authorization policies in a way that’s abstract but leverages the analytics and computing platforms provided by each cloud provider for the fastest possible enforcement. The trick is to have one universal policy definition language and multiple patterns to make sure those policies are enforced across all your data platforms.
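One way to picture "one policy definition, multiple enforcement patterns" is the Python sketch below. It is an illustration under stated assumptions, not any vendor's policy language: the `Policy` and `ColumnRule` types, the action names, and both enforcement functions are hypothetical, and table and column names are assumed to be trusted identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnRule:
    column: str
    action: str  # "allow", "mask", or "deny"


@dataclass(frozen=True)
class Policy:
    table: str
    rules: tuple  # tuple of ColumnRule; unlisted columns are never exposed


def to_sql_view(policy: Policy, view_name: str) -> str:
    """Pattern 1: push the policy down to a SQL engine as a view definition."""
    select_items = []
    for rule in policy.rules:
        if rule.action == "allow":
            select_items.append(rule.column)
        elif rule.action == "mask":
            select_items.append(f"'xxxx' AS {rule.column}")
        # "deny" columns are simply left out of the view
    columns = ", ".join(select_items)
    return f"CREATE VIEW {view_name} AS SELECT {columns} FROM {policy.table}"


def enforce_in_process(policy: Policy, rows: list) -> list:
    """Pattern 2: apply the same policy to rows already loaded in memory."""
    result = []
    for row in rows:
        clean = {}
        for rule in policy.rules:
            if rule.column not in row:
                continue
            if rule.action == "allow":
                clean[rule.column] = row[rule.column]
            elif rule.action == "mask":
                clean[rule.column] = "xxxx"
        result.append(clean)
    return result
```

The point of the design is that the policy is written once and stays abstract, while each platform gets an enforcement path suited to it: a warehouse can run the generated view natively, and a service that already holds rows in memory can apply the same rules without a round trip.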

GDPR has been around for years. What’s taking enterprises so long to comply? What’s getting in their way?

The truth is that it’s hard. Anytime you do something for the first time, it’s a challenge. Think of what it must be like for businesses that were not regulated before GDPR. Companies that are used to working with healthcare or financial data, for example, have a lot of muscle memory; for them, compliance is almost routine. They know how to hire compliance personnel and understand how the compliance role functions within the larger organization.

Now think about other companies -- from manufacturers of durable goods to internet companies that are all about innovation and speed. For them, it’s a whole different story. It’s very disruptive and most likely very annoying. They’ll have to get this sorted out rather than fight it in the courts because, as I mentioned earlier, we all need to know our private data is properly protected -- not just for our own personal contentment but because data can be weaponized against individuals, companies, and even nations.

The attention and emphasis on data management and governance around machine learning has certainly intensified. Mature data science teams are looking for functionality around data lineage, privacy, risk management, and access control. Why is this important? What’s driving this emphasis, and how do data lineage and risk management teams ensure compliance when leveraging machine learning?

The emphasis on data management and governance for machine learning has intensified for two reasons: trust and enablement. People don’t like black boxes. Executives don’t need to know how a model works, but the fancy new models now make automated decisions and, in many ways, run day-to-day business operations. Executives want to know that the appropriate effort and processes are in place to make sure those models are trustworthy and won’t land them in jail. With better management and governance, there is more attention paid to reducing illegal bias, for example.

Then there’s enablement. When you have good governance practices and you trust your data, you can automate more, which simplifies operations and accelerates innovation. Basically, you spend less money and get more from your data.

How are AI and ML used for compliance? For governance? Are these technologies effective or are they still unproven for safeguarding PII?

I think everyone recognizes that using AI and ML for compliance is no longer just theoretical. Automation can be tricky, but not using any automation isn’t going to work because there is too much data and the effort involved is too great. However, the reality is that AI/ML won’t be perfect.

To really address the problem, there needs to be a balance between a curated approach, which may only touch 10 percent of the data, and automated approaches, which have shown great progress over time. The toolbox today can’t just be one or the other. Not every organization can afford the same level of automated tools, but doing something is certainly better than doing nothing.
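The balance between curated and automated approaches can be sketched in a few lines of Python. This is a simplified illustration, not a production classifier: the `CURATED_TAGS` table and column names are hypothetical, and plain regular expressions stand in for whatever automated detection (including ML) an organization actually uses.

```python
import re

# Simple automated detectors; real tools would use far richer rules or models.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical tags reviewed by a data steward; these always win over automation.
CURATED_TAGS = {"customer_email": "email"}


def detect_tag(samples):
    """Return the first sensitive-data tag found in the sample values, or None."""
    for value in samples:
        if EMAIL_RE.search(value):
            return "email"
        if SSN_RE.search(value):
            return "ssn"
    return None


def classify(columns):
    """Map each column name to (tag, source), preferring curated tags."""
    tags = {}
    for name, samples in columns.items():
        if name in CURATED_TAGS:
            tags[name] = (CURATED_TAGS[name], "curated")
            continue
        tag = detect_tag(samples)
        tags[name] = (tag, "automated") if tag else (None, "unreviewed")
    return tags
```

Recording where each tag came from keeps the two approaches honest: curated tags cover the small share of data a person has reviewed, automated tags cover the rest imperfectly, and "unreviewed" columns show where neither has reached yet.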
