Putting AI to Work Protecting Your Data (Part 2 of 2)
Cybersecurity is a top priority for enterprises, but it often seems they’re losing an ever-changing battle. How can artificial intelligence help?
- By Upside Staff
- August 14, 2018
AI technology is attracting the attention of enterprises large and small, and given the current state of cybersecurity, it’s no wonder AI is being applied to keeping data safe. To learn more, we spoke with Grant Wernick, CEO of Insight Engines. In the first part of our conversation, we discussed the nature of cybersecurity issues, how enterprises are tackling them, the impact of a shortage of skilled security professionals, and how hackers and attackers are gaining the upper hand. In this, the second part of our conversation, we discuss how AI technology can solve many of these problems.
Upside: AI is often cited as a technology that can solve many enterprise problems. What’s its role in cybersecurity, and is AI enough?
Grant Wernick: As I mentioned [in part 1], AI and machine learning (ML) can help improve cybersecurity efforts in anomaly detection across specific, structured, extremely vast data feeds. One of AI’s biggest advantages is how quickly an AI model can process and analyze large amounts of data. However, the real value of AI comes when it produces results that empower users to apply their intuition and creativity to think further about the data and question it. This is when the true power of AI is realized in cybersecurity.
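To make the idea of anomaly detection over a structured feed concrete, here is a minimal Python sketch. It is an illustration only, not a method described in the interview: it assumes a hypothetical feed of hourly failed-login counts and flags outliers with a modified z-score based on the median absolute deviation. The function name, the sample numbers, and the 3.5 threshold are all assumptions.

```python
from statistics import median


def flag_anomalies(counts, threshold=3.5):
    """Return the indexes whose modified z-score exceeds the threshold."""
    med = median(counts)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        return []  # no spread at all, so nothing stands out
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return [i for i, c in enumerate(counts)
            if 0.6745 * abs(c - med) / mad > threshold]


# Hypothetical hourly counts of failed logins for a single account.
hourly_failed_logins = [4, 6, 5, 7, 5, 4, 6, 250, 5, 6]
suspicious_hours = flag_anomalies(hourly_failed_logins)
print(suspicious_hours)
```

A script like this only surfaces the unusual hour. Deciding whether it was a brute-force attempt or a misconfigured service is still left to the analyst.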
What is intelligence augmentation (IA) and how does it differ from AI?
We humans are designed to think and handle imprecise logic and association very well. When we ask a question, we are capable of seeing the results from different points of view, and our brains make connections machines can’t because we understand context. A natural language approach to querying allows people to ask questions of data the way we ask questions of Google. The first question inspires another, and another, and so on. Then people can dig through the results to view the information from different angles and better understand what's going on. In this way, the AI or the machine is feeding the humans the information they need and augmenting human intelligence -- this is intelligence augmentation (IA).
AI and ML are often used interchangeably. Can you explain how they’re related and/or how they are different?
AI has been around for a long time and continues to evolve. We've seen work in the field of AI moving away from increasingly complex calculations to focus more on mimicking human decision-making processes and completing tasks in human ways. An example we all understand today is self-driving cars. The car is programmed to act and react as a model human driver would. This is actually machine learning (ML), which is a subset of AI, and in this case the machine learning models are constantly being retrained around new data coming in from dozens of sensors on the car.
Natural language processing (NLP) is another form of AI and, depending on the application, it can draw on many different types of ML. Usually it’s implemented with semi-supervised ML, where humans and machines work together to make natural language models smarter. The kind of NLP we specialize in is called “natural language search,” and the purpose of this kind of technology is to make it easy for humans to interact with machines more naturally. In our case, we see the human factor as paramount to winning the cybersecurity battle, with the technology serving humans, not the other way around.
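As a rough illustration of what “natural language search” means in practice, the Python sketch below turns a plain-English question into a filter over log records. It is a toy under stated assumptions, not how any real product works: it uses keyword matching rather than a trained language model, and the log records, field names, and phrase table are all hypothetical.

```python
import re

# Hypothetical log records; the field names are illustrative only.
LOGS = [
    {"user": "alice", "action": "login_failed", "src_ip": "10.0.0.5"},
    {"user": "bob", "action": "login_ok", "src_ip": "10.0.0.9"},
    {"user": "alice", "action": "login_failed", "src_ip": "10.0.0.7"},
]

# Phrases a person might type, mapped to the field value they imply.
ACTION_PHRASES = {
    "failed login": "login_failed",
    "successful login": "login_ok",
}


def parse_question(question):
    """Turn a plain-English question into a dict of field filters."""
    text = question.lower()
    filters = {}
    for phrase, action in ACTION_PHRASES.items():
        if phrase in text:
            filters["action"] = action
    # "for <name>" or "by <name>" is read as a user filter.
    match = re.search(r"\b(?:for|by)\s+(\w+)", text)
    if match:
        filters["user"] = match.group(1)
    return filters


def search(question, logs=LOGS):
    """Return the records that satisfy every filter found in the question."""
    filters = parse_question(question)
    return [r for r in logs if all(r.get(k) == v for k, v in filters.items())]


results = search("Show me failed logins for Alice")
follow_up = search("Show me successful logins by bob")
```

The point of the sketch is the interaction pattern: one question returns records, and those records prompt the next question, with no query language to learn.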
What was your “ah-ha” moment when you started to explore the application of NLP search to the security issues faced by CSOs and CISOs?
We have years of experience with NLP. In fact, my cofounder Jacob Perkins wrote one of the de facto reference books in the space (Python 3 Text Processing with NLTK 3 Cookbook) and we launched another natural language search company in the consumer space before Insight Engines. In late 2015, we were looking at areas where there was a lot of data, accessibility was difficult, and the need to garner insights from that data was high.
Our research led us to machine data (log data) and the growing need to use that data to fight cybercrime. Log data stores today are messy, searching them is complex, and there is so much value buried within the machine data. In fact, today only about one percent of machine data is even utilized by most organizations. We decided that there was a real problem that we had the unique capability to solve. Three years later, we are well on our way to being the company that evolves the way people extract meaning from their logs.
We saw a great need to apply this to cybersecurity. With security operations, incident response investigations are always fluid. New information is usually still being gathered while many users (from analysts to C-suite executives) all want answers as soon as possible. During these incidents, there is a clear need to synthesize and distill all data sources into clear, factual information as soon as possible. However, this process takes significant time and tends not to scale effectively -- one or two expert analysts usually are the bottleneck in these efforts.
By empowering all users to efficiently ask questions of their data simultaneously and obtain clear results quickly, these expert analysts can focus on asking deeper questions, rather than acting as simple data translators. In this way, we see how an NLP interface can transform an otherwise lean team into one that scales gracefully.
The lack of well-structured data -- not just data stored in one place -- is frequently cited as a barrier to effective threat detection. Why? What is the state of the industry on that front?
Most organizations today are overwhelmed by all of the machine data they have and dump it into various data stores, often creating “data swamps.” All of this unstructured machine data makes it hard for the organization’s threat defenders to put the data to use. For security teams to be most effective, they need to have a good understanding of their machine data, where it sits, where it’s coming from, and what might be missing. Having an automated assessment of their machine data, with visualizations to show them the structured and unstructured parts of their data, enables security teams to optimize and expand their coverage.
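One simple way to picture an automated assessment of structured versus unstructured machine data is the Python sketch below. It is a minimal illustration under assumptions, not a description of any vendor’s tooling: it treats a line as structured if it parses as a JSON object or contains at least two key=value pairs, and the source names and sample lines are hypothetical.

```python
import json
import re
from collections import Counter

KEY_VALUE = re.compile(r"\b\w+=\S+")


def classify_line(line):
    """Label a raw log line as 'json', 'key_value', or 'unstructured'."""
    try:
        if isinstance(json.loads(line), dict):
            return "json"
    except ValueError:
        pass  # not valid JSON; fall through to the key=value check
    # Assumption: two or more key=value pairs count as structured.
    if len(KEY_VALUE.findall(line)) >= 2:
        return "key_value"
    return "unstructured"


def assess(lines_by_source):
    """Return, per source, the share of lines that parse as structured."""
    report = {}
    for source, lines in lines_by_source.items():
        labels = Counter(classify_line(line) for line in lines)
        structured = labels["json"] + labels["key_value"]
        report[source] = structured / len(lines) if lines else 0.0
    return report


# Hypothetical sample of raw lines grouped by the system that emitted them.
sample = {
    "firewall": [
        "action=deny src=10.0.0.5 dst=10.0.0.9",
        "action=allow src=10.0.0.7 dst=10.0.0.9",
    ],
    "app": [
        '{"level": "error", "msg": "login failed"}',
        "something went wrong, see ticket",
    ],
}
report = assess(sample)
print(report)
```

A per-source ratio like this is the kind of number a visualization could plot, showing which feeds are ready to search and which still need parsing work.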
What suggestions do you have for organizations looking to optimize their data for improved cybersecurity initiatives as well as data-driven business initiatives?
For security teams to be successful, they need to get their data in good shape -- that’s priority #1. Taking the time to dig into your data store to figure out what is there is essential. Think of it as a data classification exercise, enabling security teams to better balance security measures and protect the company’s most critical information assets while enabling business innovation. Once your data is in good shape you can get more advanced and do a lot more.
How can newer tools enable both nontechnical users and experts alike to contribute to cybersecurity operations?
Security teams need to question the way things have always been done. They need to challenge log store search and the SIEM framework to get ahead of the attackers. They need to utilize new technologies such as ML and natural language search to get more from their data and look around the dark corners of their data store. This approach is a game-changer. By embracing a culture of data curiosity and continuous learning across the entire team, your security team can be inspired to investigate deeper and faster. This contributes to greater job success, job satisfaction, and talent retention.