As more organizations move toward the adoption of generative AI, Google wants us all to be more concerned about security. To that end, on Thursday the tech giant released its Secure AI Framework (SAIF), meant to be a sort of security roadmap, if a somewhat thinly sketched one for the time being.
But if you’re imagining this is a scheme for averting the sort of existential AI peril Elon Musk is always talking about, think smaller and more immediate.
Here’s a summary of the framework’s six “core elements”:
Elements 1 and 2 are about expanding an organization’s existing security framework to include AI threats in the first place.
Element 3 is about integrating AI into your defense against AI threats, which rather disturbingly calls to mind a nuclear arms race, whether that was intentional or not.
Element 4 is about the security benefits of uniformity in your AI-related “control frameworks.”
Elements 5 and 6 are about constantly inspecting, evaluating, and battle-testing your AI applications to make sure they can withstand attacks, and aren’t exposing you to unnecessary risk.
It looks like for now, Google mostly just wants organizations to bring elementary cybersecurity ideas to bear around AI. As Google Cloud’s info security chief Phil Venables told Axios, “Even while people are searching for the more advanced approaches, people should really remember that you've got to have the basics right as well.”
But there are already some new and unique security concerns cropping up in the here-and-now with generative AI applications like ChatGPT.
For instance, security researchers have identified one potential risk: “prompt injections,” a bizarre form of AI exploitation in which a malicious command directed at an unsuspecting AI chatbot plugin lies in wait in some block of text. When the AI scans the prompt injection, it changes the nature of the command given to the AI. It’s sort of like hiding a sinister mind-control spell in the text on Ron Burgundy’s teleprompter. Weird, right?
And prompt injections are just one of the new types of threats Google specifically says it hopes to help curb. Others include:
“Stealing the model,” a possible way of tricking a translation model into giving up its secrets.
“Data poisoning,” in which a bad actor sabotages the training process with intentionally faulty data.
Constructing prompts that can extract the potentially confidential or sensitive verbatim text that was originally used to train a model.
Google’s blog post about SAIF says the framework is being adopted by, well, Google. As for what the release of a “framework” means for the wider world, it could come to basically nothing, but it could also be adopted as a standard. For example, the US government’s National Institute of Standards and Technology (NIST) released a more general framework for cybersecurity in 2014. That was aimed at protecting critical infrastructure from cyberattacks, but it’s also highly influential, and recognized as the gold standard in cybersecurity by the majority of IT professionals surveyed about it.
Google, however, isn’t the US government, which calls into doubt just how authoritative its framework will be in the eyes of Google’s AI rivals, such as OpenAI. But in security, it looks like Google is trying to lead from the front in the AI space, instead of racing to play catch-up. Perhaps earning back some of the clout it lost in the earlier phases of the AI race is what the release of SAIF is really about.