Summary
- Information security risks of generative AI can be classified into two major categories: information leakage and vulnerability introduction
- Information leakage can arise from public data, prompt leakage and personal data exposure
- Vulnerability introduction arises from quality issues prevalent in AI-generated code
- Recent incidents make these risks tangible rather than merely hypothetical
- Mitigations for these risks are possible through proper governance, which may include:
  - Subscribing to closed APIs
  - Building an API facade to aid auditability and access control
  - Self-hosting one or more internal open source LLMs
  - Using domain-oriented LLMs to create better, fit-for-purpose solutions while minimizing data leakage
  - Strengthening CI/CD practices and implementing proven quality-first engineering practices as a prerequisite to generative AI usage
  - Revisiting policies pertaining to releasing publicly-visible data
Generative AI, particularly Large Language Models (LLMs) such as ChatGPT, offers immense potential for solving a wide variety of business problems, from document creation to code generation. The impressive output of these models is possible in part because they've been trained on immense data sets, i.e. a meaningful fraction of the entire public internet, and because they continuously learn from user reactions over time.
Generative AI is inherently interactive: users enter a prompt, usually text-based, and the model constructs an answer based on the many billions of parameters it has computed using its massive training corpus. Effective usage looks like a back-and-forth affair, with the user correcting the algorithm's output to improve it for the next round.
These algorithms can introduce information security risk in two novel ways: information leakage and vulnerability introduction. In addition, any use of a new tool, API, service, or cloud provider can increase your attack surface, but since this is not unique to generative AI, that topic is largely outside the scope of this article.
Information leakage
Information leakage is a significant security risk when using generative AI solutions. I'll break the risk down into three categories: public data, prompt leakage and personal data exposure. For each, I'll briefly summarize the risk and discuss strategies for mitigation.
Public data
LLMs leverage large data sets, largely drawn from the public internet. This includes blog posts, corporate websites, training manuals, forum posts, news articles, support tickets and so on. It's likely that your company's website has been scraped and used to help train the model; recently, the Washington Post offered a search engine for Google's C4 dataset, allowing you to search by domain and see what percentage of the dataset's tokens it contributes.
This means that information you've hosted online is likely part of the model. This may not be restricted to the information a casual user can find on your homepage; it can include detailed material that may appear deep inside the archives of your site, or even information that has since been taken offline. There's little hope of undoing this; website scraping is a calculated risk every business takes. Nevertheless, earlier scraping efforts were never able to synthesize results in such a powerful way. The power of LLMs changes the risk calculus of making information freely public on the internet.
Prompt leakage
The second information security threat is more subtle and potentially more damaging. Because many LLM products seek to improve based on user input, the models can also train on that input. This means that any proprietary information included in prompts can end up embedded in the model and leak to other users, including potential competitors. In fact, it's suspected this has already happened: Amazon recently issued an internal warning to employees about ChatGPT, having seen the tool generate code suspiciously similar to restricted, internal source code.
This leakage applies to text as well as code. For example, if you ask ChatGPT to suggest slogans for your upcoming advertising campaign, it may embed information about your product into the model and make it available to others prematurely. Code, data models, or requirements fed into the algorithm can reflect internal proprietary information that would be harmful in the hands of a threat actor, revealing details of forthcoming product launches, infrastructure, or business strategies.
Personal data exposure
Another information leakage risk comes from the exposure of personal data. Even if you prohibit feeding personal data from your systems directly into generative AI tools, the category of personal data is broad. For instance, are customer service or sales representatives using ChatGPT to write emails to customers? If so, you're at risk of exposing personal data, however minor or inconsequential it may seem. OpenAI recently suffered a data breach in which users could see parts of other users' interaction history. If those prompts included protected personal data, this could constitute a significant data privacy exposure for the firms involved.
"Any proprietary information used in the prompt phase of the model development can end up embedded in the model and leak to other users, including potential competitors."
Vulnerability introduction
Another risk occurs when vulnerabilities, bugs, and exploits emerge from AI-generated code. This issue arises from a combination of factors. First, because LLMs are trained on a large swath of the internet, they learn from every coding pattern in the training set, including code that may be incorrect, inefficient, obsolete, or insecure. And since the training data is historical, the model may miss the latest changes to a language or tool that patch security vulnerabilities.
Second, the LLM has no ability to reason. It cannot qualitatively assess whether a pattern is correct; it can only provide statistically likely outputs. It has no capacity to judge whether the code it emits follows best practices or introduces security flaws. The burden falls on the developer to scrutinize code they did not create, which carries its own risk given the lapses in attention people exhibit when reviewing the output of automated tools.
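To make the risk concrete, consider the kind of pattern a coding assistant can reproduce from its training data. The snippet below is a hypothetical illustration, not output from any particular tool: the first function builds a SQL query by string formatting, a classic injection vulnerability that appears in countless older tutorials, while the second uses a parameterized query.

```python
# Hypothetical illustration of an insecure pattern an assistant can reproduce
# from its training data. Table and column names are invented for the example.
import sqlite3

def find_user_insecure(conn: sqlite3.Connection, username: str):
    # Looks plausible and appears in many older tutorials, but a username
    # like "x' OR '1'='1" changes the meaning of the query (SQL injection).
    query = f"SELECT id, email FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # Parameterized query: the driver treats the input strictly as data.
    query = "SELECT id, email FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()
```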
Research supports this risk assessment. A study of GitHub Copilot showed that 40% of the 1,689 programs generated with the tool included potential vulnerabilities and exploits. A more recent study of Copilot, ChatGPT and Amazon CodeWhisperer showed each tool produced incorrect code at high rates. The cost-benefit calculus of using LLMs to generate code must factor in the elevated risk that comes from using their outputs.
Mitigations
The good news is that careful governance and implementation can mitigate many of these security risks. Groups like GitHub (Copilot) and OpenAI (ChatGPT) are aware of these information security risks and conscious that they can hinder the success of their products. The developer community, too, has explored mitigations, and while we're unlikely to find a panacea that eliminates the risk, there are ways to avoid, mitigate and manage it.
Closed APIs
One option for preventing information leakage is to use "closed" API models. While many LLM implementations can learn dynamically from the information you contribute, several companies are introducing subscription-based offerings that promise your inputs, be they code, text, or images, are not used to improve the model further. Each of the Big Three cloud providers is introducing generative AI offerings, and many also provide data processing guarantees that your input data does not leave the region, helping with regulatory compliance.
Presently, however, it's difficult to externally validate these guarantees; you have to take the provider at their word, which may not satisfy regulators or your legal team, particularly as it pertains to any breach of customer data. It's difficult to prove a negative, so developing a strategy that verifies no data is leaking through is a challenge. There is hope that in the future, AI tooling providers will offer validated APIs that ensure data security.
In terms of best practice, you should avoid sending personal data to these APIs unless you have obtained explicit user consent to do so (and can prove consent was given). Personal data can, of course, exist in your data warehouse or data lake, but it can also leak in through customer service channels, shipping and logistics, or sales. Essentially, any time an employee might use an LLM to help craft a response to a customer, you should be aware of the risk of leaking personal data. If you haven't done so already, now is the time to implement corporate-wide policies that govern the use of LLMs with personal data.
With these risks in mind, combined with a desire for transparency in how LLMs are being used in the organization, one possible approach is to restrict direct access to the external APIs and replace it with a facade: an internally-hosted service. This service can scrub personal data, flag misuse and abuse, offer internal auditability and simplify cost management. Instead of calling the external LLM's API, users call an internal API which validates the request and passes it through to the LLM. The facade API should, of course, be appropriately monitored, with users authenticated through your cloud's single sign-on.
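As an illustration, here is a minimal sketch of what such a facade could look like. It assumes a Python service built with FastAPI and httpx; the endpoint, environment variables and PII-scrubbing rules are hypothetical placeholders rather than a reference to any particular vendor's API, and a production version would need far more thorough redaction, authentication and monitoring.

```python
# Illustrative sketch of an internal LLM facade: scrub obvious personal data,
# log the request for auditability, then forward the prompt to the external
# provider. All endpoint names and regexes are placeholders.
import logging
import os
import re

import httpx
from fastapi import FastAPI, Header, HTTPException
from pydantic import BaseModel

app = FastAPI()
audit_log = logging.getLogger("llm_audit")

LLM_API_URL = os.environ.get("LLM_API_URL", "https://llm.example.com/v1/complete")
LLM_API_KEY = os.environ.get("LLM_API_KEY", "")  # stays server-side, never handed to users

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

class PromptRequest(BaseModel):
    prompt: str

def scrub_pii(text: str) -> str:
    """Redact obvious personal data before it leaves the internal network."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    return PHONE_RE.sub("[REDACTED_PHONE]", text)

@app.post("/internal/llm")
async def proxy(request: PromptRequest, x_user: str = Header(...)) -> dict:
    # In practice this header would be set by your SSO gateway after authentication.
    if not x_user:
        raise HTTPException(status_code=401, detail="unauthenticated")

    clean_prompt = scrub_pii(request.prompt)
    audit_log.info("user=%s prompt_chars=%d", x_user, len(clean_prompt))

    async with httpx.AsyncClient() as client:
        response = await client.post(
            LLM_API_URL,
            headers={"Authorization": f"Bearer {LLM_API_KEY}"},
            json={"prompt": clean_prompt},
            timeout=30.0,
        )
    response.raise_for_status()
    return response.json()
```

Because the provider credentials live only inside the facade, individual users never hold API keys, and every prompt that leaves the network passes through the same scrubbing and audit path.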
There is a downside to this approach, however. Since the models cannot learn from your inputs, they cannot adapt as easily to your needs. You're essentially at the mercy of the model, and as it learns from other users, there’s no guarantee it learns in a direction beneficial to you. As the model changes over time without your influence, its output may become less relevant and the validity of the outputs for your use case may degrade. This can be mitigated by developing your own solution.
Locally-hosted models
While using closed APIs and facade patterns can help reduce information leakage risk, they can't address the risk of vulnerability introduction, and they don't help models adapt to your internal knowledge base. Another option would be to host your own LLM or generative AI solution.
Massive models with hundreds of billions of parameters are extremely expensive to train and update, which puts developing them out of reach for most organizations. However, recent improvements in transfer learning and AI development tooling have made it easier to host your own LLM. Solutions like Meta's LLaMA and the derivatively-named Stanford project Alpaca are much smaller models; Alpaca, for instance, was fine-tuned for as little as $600 in cloud computing cost. Some solutions are small enough to be trained on a single MacBook.
The downside of this approach is that the resulting models are somewhat lower quality, although qualitative analyses suggest their performance is still sufficient to be fit for purpose.
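As a rough illustration of the self-hosting option, the sketch below loads a model checkpoint from local disk with the Hugging Face transformers library and generates a completion entirely on infrastructure you control. The model path and prompt are placeholders; any open model you are licensed to run could sit behind them.

```python
# Minimal sketch of querying a self-hosted model with the Hugging Face
# transformers library. The checkpoint path is a placeholder for whatever
# open model you have downloaded inside your own infrastructure; nothing
# in this flow calls out to an external API.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "/models/internal-llm"  # local directory, illustrative

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH)

prompt = "Summarize our returns policy for a customer email:"
inputs = tokenizer(prompt, return_tensors="pt")

# Generation happens entirely on hardware you control.
output_ids = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```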
Locally-hosted models can adapt to your own knowledge base; in time, they may become even better than the public models. The now-infamous "we have no moat" leaked memo from Google suggests the same. It's entirely possible that the cat is out of the bag with regards to generative AI, and in the very near future, we'll see continuous improvements in tooling and technology that simplify the development of bespoke models.
Using a locally-hosted solution will significantly reduce the risk of information leakage, including personal data leakage, as none of the data leaves your walled garden. However, it does not fully eliminate the risk. For instance, a standard practice in data modeling is that domain events should not expose the internal data model or domain logic. In part, this is to prevent tight coupling of your systems and services, which can ossify your software architecture, but it's also a best practice to prevent metadata leakage. Internal metadata and data models can contain confidential information that you don't want to expose to the entire business. Data Mesh architectures address this problem explicitly by ensuring that data and metadata can only be accessed through a data product's output ports.
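As a simple sketch of the output-port idea, the hypothetical data product below exposes only its published, consumer-safe fields; the internal record type, with personal data and domain-specific routing logic, never crosses the boundary. All names are invented for illustration.

```python
# Illustrative sketch of a data product output port: consumers (including any
# organization-wide LLM tooling) see only the published view, never the
# internal model or metadata. All names are hypothetical.
from dataclasses import dataclass

@dataclass
class _InternalOrderRecord:
    order_id: str
    customer_email: str          # personal data, stays inside the domain
    warehouse_routing_code: str  # internal domain logic, stays inside the domain
    total_cents: int

class OrdersDataProduct:
    """Exposes only published, consumer-safe fields via its output port."""

    def __init__(self, records: list[_InternalOrderRecord]):
        self._records = records

    def output_port(self) -> list[dict]:
        return [
            {"order_id": r.order_id, "total_cents": r.total_cents}
            for r in self._records
        ]
```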
Introducing organization-wide LLMs can break this isolation in the same way that open API models can. If teams push domain-specific logic or data models to the AI, it's possible for the AI to leak this information outside the domain. Given the decreasing cost and complexity of hosting your own generative AI, it may soon be a best practice for each domain to have its own bespoke model. We're not there yet: effectively training and hosting local models is still a technical challenge, but a solvable one. We've been successfully building machine learning platforms that run domain-specific models for a few years now, using Continuous Delivery for Machine Learning (CD4ML) principles. In fact, we think continuous delivery principles are more important than ever for ensuring the development of secure, high-quality software.
"The best way to prevent vulnerabilities from being introduced is to ensure you have a robust review process."
Continuous delivery practices
The best way to prevent vulnerabilities from being introduced is to ensure you have a robust review process. At Thoughtworks, we're strong proponents of pair programming, and we believe that having two people engaged in the end-to-end software development process is crucial to creating high-quality code. AI-assisted programming introduces some interesting questions: is AI a substitute for a pairing partner? Does the programmer become a code reviewer? Given the quality and security risks demonstrably present in AI-generated code, I'm not yet ready to assign personhood to AI. As mentioned before, an LLM cannot exercise judgment.
Instead, doubling down on test-driven development and pairing will be critical to ensure that AI-generated code is well-reviewed. In my experience with AI-driven code, I've found that the tools are most effective when I already know what "good" looks like. I'm able to tune and modify the outputs when needed, and I can engineer good prompts to get the right output quickly and efficiently.
Nevertheless, some of these productivity gains are offset if I have to spend a lot of time testing the code, looking for edge cases and unexpected behaviors. The best solution is to automate as many of these checks as possible with a robust continuous delivery pipeline. Unit tests, integration tests, behavior checks, load and performance tests, smoke tests, dependency checks, security scans and more can all be integrated into a good CI/CD pipeline. This has always been true, but AI-generated code makes it increasingly important. The last thing you want is to ship flawed, untested or insufficiently-tested code that nobody in the organization understands. In a way, AI-generated code's biggest productivity gain may come from the fact that it forces us to put more focus on quality engineering.
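As a small example of what this looks like in practice, the pytest sketch below exercises a hypothetical AI-suggested function against the edge cases a generated implementation is most likely to miss; running such tests on every commit shifts part of the review burden from the developer's attention to the pipeline. The function and test names are purely illustrative.

```python
# Sketch of the kind of automated check that catches flaws in AI-generated
# code before it ships. parse_discount_code stands in for a function a coding
# assistant might suggest; the tests pin down edge cases and run in CI.
import pytest

def parse_discount_code(code: str) -> int:
    """Return the percentage discount encoded in a code like 'SAVE15'."""
    if not code.startswith("SAVE"):
        raise ValueError(f"unrecognized code: {code!r}")
    percent = int(code[len("SAVE"):])
    if not 0 < percent <= 50:
        raise ValueError(f"discount out of range: {percent}")
    return percent

def test_valid_code():
    assert parse_discount_code("SAVE15") == 15

@pytest.mark.parametrize("bad_code", ["", "SAVE", "SAVE999", "FREE100", "SAVE-5"])
def test_rejects_malformed_or_abusive_codes(bad_code):
    with pytest.raises(ValueError):
        parse_discount_code(bad_code)
```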
There's another element to this, which pertains to which teams are best positioned to take advantage of AI-generated code. I believe that high-performing and elite teams (as defined in Accelerate) will benefit most from the technology, because they already have robust paths to production, well-tested rollback and roll-forward practices, and effective monitoring and observability. These teams are more likely to know what good looks like and have experience deploying code multiple times a day. Underperforming teams may not benefit from AI at all; in the worst cases, they may do more damage with AI than without. Thus, the modern engineering manager should consider the CI/CD pipeline one of the most potent risk mitigation tools for AI-generated code. I'd argue it's the most effective intervention available in software development today, and the first place to start your AI transformation.
Gating public data
For years, many firms have offered customer service through public fora and other forms of information exchange. This can help reduce duplicate requests by making the information available to anyone else who needs it, but it does mean that some of this information has been scraped and used to train LLMs. Data scraping has always been a risk, but the large-scale synthesis capabilities that LLMs possess might make this approach riskier in the future. Businesses should re-evaluate their policies regarding what they publish online.
The question regarding information disclosure is no longer "can my competitor see this and learn from it?" but rather "can my competitor use this information to train an AI that will help them beat me?" In the past, businesses may have been barred by law or ethics from exploiting such information to create a competitive advantage. With LLMs, companies may not even be aware that they're doing so, and the opaque nature of the algorithms means there is no paper trail to prove it's happening.