Why Data Protection Requires a New Level of Resolution
Graph analysis is the next step toward a future perspective of security where the focus is always on the data.
- By Howard Ting
- September 9, 2022
“Enhance,” says the investigator, squinting at a blurry surveillance image or video as the technician zooms in and the clue becomes clearer. It’s a popular TV trope, but that’s not how things work in the real world. Sure, there are inventive ways we can reinterpret data or see it from another angle, but we will always be limited by the source data.
If we want to get better answers, we don’t just need more data, we need better data. Thankfully, new advancements in data-flow tracing and graph analysis are enabling organizations to directly protect their most valuable information, regardless of what the content is or how it is used.
Better Data for Better Understanding
Graph analysis of data flow can be a powerful tool. To protect data, organizations need a detailed and complete graph that captures and connects every movement or action performed on every piece of data. This requires a shift from an event-based perspective -- looking at each action in isolation -- to a flow-based perspective that encompasses all the complex interactions, movements, and relationships that actually define what data means to a business.
Understanding this flow is crucial to understanding what is really happening (or has happened) to a piece of data. Graph analysis allows organizations to connect the dots across large numbers of these flows to get the true enterprise-wide context of the data and the data risk. Every day, a single user will perform dozens or even hundreds of similar actions. An enterprise will have hundreds of users, machines, and applications with data flows moving between (or within) them. Data will exist over long periods of time, so we will need to be able to connect all of these data flows across all of these entities. The graph provides the superset of all this data so we can pull the thread in any direction to get to the answers we need.
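To make the shift from events to flows concrete, here is a minimal Python sketch. It is an illustration only, not any vendor's implementation: the `Event` fields, the location strings, and the `build_flow_graph` helper are all hypothetical names chosen for this example. It shows how three actions that look unremarkable in isolation become one connected path once each event is stored as an edge from its source to its destination.

```python
from collections import defaultdict
from typing import NamedTuple


class Event(NamedTuple):
    """One observed action on data (all field names are illustrative)."""
    timestamp: int
    user: str
    action: str
    source: str
    destination: str


def build_flow_graph(events):
    """Link isolated events into an adjacency map: source -> [(destination, event)]."""
    graph = defaultdict(list)
    # Process events in time order so each edge list reads chronologically.
    for event in sorted(events, key=lambda e: e.timestamp):
        graph[event.source].append((event.destination, event))
    return graph


# Hypothetical events, deliberately listed out of order.
events = [
    Event(3, "alice", "upload", "laptop:/tmp/notes.txt", "saas:chat/message-17"),
    Event(1, "alice", "download", "crm:customers.csv", "laptop:/home/alice/customers.csv"),
    Event(2, "alice", "copy-paste", "laptop:/home/alice/customers.csv", "laptop:/tmp/notes.txt"),
]

graph = build_flow_graph(events)
```

Viewed one at a time, the upload of a notes file to a chat application carries no obvious risk. Followed through the graph, the same upload is the last hop of a path that began in the customer database.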
Analyzing Data at Scale
Once organizations have a deep, rich graph, they need to analyze it at enterprise scale. To truly understand the risk to a piece of data or from a user’s action, organizations may need to connect data flows in any direction, over any number of steps, across many devices and applications, and over long periods of time. This is different from most uses of graph analysis today, which are often designed for relatively short walks between a small number of data points.
In the context of protecting data, organizations should care very much about analyzing at great depth. They may need to trace potentially trillions of paths in parallel in order to find any risks or actions forbidden by policy. To understand the risk to a particular file or the sprawl and exposure of a particular type of data -- two very common scenarios security teams face -- organizations need parallel, multivector graph analysis that goes well beyond what traditional graph databases and tools can process quickly. The ability to process incredibly complex graphs quickly can redefine what is possible for enterprise data protection.
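As a rough illustration of what connecting flows "over any number of steps" means, the sketch below walks a small, hypothetical flow map with a plain breadth-first search and no depth limit. The `flows` dictionary and the `downstream_locations` function are assumptions made for this example. It runs in memory on a single thread, so it shows only the shape of the question (where has this file spread?), not the parallel, enterprise-scale processing described above.

```python
from collections import deque


def downstream_locations(flows, origin):
    """Return every location reachable from origin, over any number of hops."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        current = queue.popleft()
        for nxt in flows.get(current, ()):
            if nxt not in seen:  # guard against cycles and repeated copies
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(origin)  # report only the places the data has spread to
    return seen


# Hypothetical flow map: location -> locations that received data from it.
flows = {
    "hr:salaries.xlsx": ["laptop:salaries.xlsx"],
    "laptop:salaries.xlsx": ["laptop:summary.docx", "usb:backup.zip"],
    "laptop:summary.docx": ["mail:outbound-42"],
    "wiki:lunch-menu": ["laptop:menu.pdf"],
}

sprawl = downstream_locations(flows, "hr:salaries.xlsx")
```

Here `sprawl` includes the outbound email even though it sits three hops away from the original spreadsheet, which is the kind of connection a short, fixed-depth walk would miss.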
Graph Analysis Is the Future of Data Security
By enabling security teams to solve problems that were previously impossible or prohibitively complex, this kind of graph analysis goes far beyond simply building the next generation of data loss prevention. Ultimately, it is the next step toward a future perspective of security where the focus is always on the data. Graph analysis enables organizations to:
- Control every derivative of sensitive data. There are numerous ways that an organization loses control over sensitive data today. By continually tracking data and context across every action, organizations can now control every copy or derivative of sensitive data automatically.
- Find all users and apps with the most intellectual property. As users work with new applications and developers add integrations between applications, it can become almost impossible to keep track of which applications contain a certain data set. A full graph analysis can instantly answer if a particular SaaS application contains sensitive data or show all the locations where that data is stored in the enterprise.
- Immediately assess data impact of security incidents. For most organizations, it is very difficult to know what data was impacted after a security incident (such as an intrusion or a malware infection). A full data-flow graph means that security teams can take any asset in the enterprise and instantly determine what sensitive data was on the machine or application.
- Create better, simpler security policies. By doing all the heavy lifting in the background, a data-flow graph alleviates the need for endlessly complex policies. Additionally, if a data leak does occur, staff can instantly trace the incident to its root cause to improve policies.
- Protect data based on business workflows. Although organizations may differ on what data they consider important, the workflows around important data are often quite similar. By automatically tracking the flow of data, an organization can see these patterns and likewise see if data is leaking outside of its normal orbit.
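The incident-impact item in the list above can be sketched as a walk in the opposite direction: start from the compromised asset and follow the flows upstream to see which sensitive origins ever reached it. The code below is a hypothetical Python illustration. The `asset:path` naming convention, the edge list, and the `sensitive_data_on_asset` function are assumptions made for the example, not a description of a real product.

```python
from collections import defaultdict


def sensitive_data_on_asset(edges, sensitive_origins, asset):
    """Return the sensitive origins whose data reached any location on `asset`."""
    # Reverse the edges so the walk can run from a location back to its sources.
    upstream = defaultdict(set)
    for source, destination in edges:
        upstream[destination].add(source)

    def on_asset(location):
        # Assumed convention for this sketch: locations look like "asset:path".
        return location.split(":", 1)[0] == asset

    # Start from every known location that lives on the asset.
    stack = [loc for edge in edges for loc in edge if on_asset(loc)]
    seen = set(stack)
    while stack:
        current = stack.pop()
        for parent in upstream.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen & set(sensitive_origins)


# Hypothetical data flows, each recorded as (source, destination).
edges = [
    ("git:payments-service", "laptop-7:src.zip"),
    ("laptop-7:src.zip", "drive:shared/src.zip"),
    ("crm:customers.csv", "laptop-3:customers.csv"),
    ("wiki:lunch-menu", "laptop-7:menu.pdf"),
]
sensitive = {"git:payments-service", "crm:customers.csv"}

impacted = sensitive_data_on_asset(edges, sensitive, "laptop-7")
```

In this toy data set, an infection on `laptop-7` implicates the payments source code but not the customer list, which only ever reached `laptop-3`.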
By following the full history and interaction of all data in the enterprise, organizations can greatly extend their visibility into data risk and enforce real-time policies to mitigate that risk and prevent loss. Building complete, risk-based context for every piece of data fuels high-definition graphs that provide new types of analysis for organizations to solve modern security problems.
About the Author
Howard Ting joined Cyberhaven as CEO in June 2020. In the past decade, he has played a critical role in scaling Palo Alto Networks and Nutanix to over $1B in sales, generating massive value for customers, employees, and shareholders. Howard has also served in GTM and product roles at Redis Labs, Zscaler, Microsoft, and RSA Security.