NIST Details Types of Cyberattacks That Lead to Malfunctions of AI Systems
Artificial intelligence (AI) systems can be purposefully tricked or even “poisoned” by attackers, leading to severe malfunctions and striking failures.
Currently, there is no infallible method to safeguard AI against misdirection, partly because the datasets necessary to train an AI are just too big for humans to effectively monitor and filter.
Computer scientists at the National Institute of Standards and Technology (NIST) and their collaborators have identified these and other AI vulnerabilities and mitigation measures targeting AI systems.
This new report outlines the types of attacks AI systems could face, along with accompanying mitigation strategies, to support the developer community.
Four Key Types of Attacks
The research looks at four key types of attacks:
- Evasion Attacks
- Poisoning Attacks
- Privacy Attacks
- Abuse Attacks
It also classifies them based on characteristics such as the attacker’s goals and objectives, capabilities, and knowledge.
Attackers using evasion techniques try to modify an input to affect how an AI system reacts to it after deployment.
Examples include creating confusing lane markings that cause an autonomous car to veer off the road, or adding markings to stop signs so they are misread as speed limit signs.
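The mechanics of evasion can be illustrated on a toy model. The sketch below, with invented weights and inputs, shows the common gradient-sign trick: for a linear classifier, a small perturbation against the sign of the weights is enough to flip the prediction on a deployed model.

```python
import numpy as np

# Toy linear classifier: score = w . x + b; predicts class 1 if score > 0.
# Weights, bias, and inputs are all invented for illustration.
w = np.array([1.0, -2.0, 0.5])
b = 0.1

def predict(x):
    return int(w @ x + b > 0)

# A clean input the model classifies as class 1 (think: "stop sign").
x_clean = np.array([2.0, 0.5, 0.2])

# Evasion: for a linear model the gradient of the score w.r.t. the
# input is just w, so stepping eps in the direction -sign(w) lowers
# the score while changing each feature only slightly.
eps = 1.5
x_adv = x_clean - eps * np.sign(w)

print(predict(x_clean), predict(x_adv))  # prediction flips: 1 0
```

Real attacks apply the same idea to deep networks (where the gradient is computed by backpropagation), keeping the perturbation small enough that a human still sees a stop sign.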
Poisoning attacks take place when corrupted data is injected during the training process. For instance, adding multiple instances of inappropriate language to conversation records could trick a chatbot into concluding that the language is prevalent enough to use in real customer interactions.
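The chatbot example above can be sketched with a minimal frequency-based stand-in: a system that treats a phrase as acceptable once it appears often enough in its training logs. The threshold and data below are invented for illustration.

```python
from collections import Counter

# Toy stand-in for a chatbot that treats a phrase as safe to use
# once it appears at least min_count times in training conversations.
def allowed_phrases(logs, min_count=3):
    counts = Counter(logs)
    return {phrase for phrase, c in counts.items() if c >= min_count}

clean_logs = ["hello", "hello", "hello", "thanks", "thanks"]
print(allowed_phrases(clean_logs))  # only "hello" clears the bar

# Poisoning: the attacker injects many copies of a bad phrase into
# the training records before the model is (re)trained on them.
poisoned_logs = clean_logs + ["bad phrase"] * 5
print(allowed_phrases(poisoned_logs))  # now includes "bad phrase"
```

The same principle scales up: because modern training sets are too large to audit by hand, a few dozen repeated malicious examples can shift what the model considers normal.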
Privacy attacks during deployment attempt to extract sensitive information about the AI, or about the data it was trained on, in order to misuse it. An adversary can pose many valid questions to a chatbot and then use the responses to reverse engineer the model, identify its vulnerabilities, or infer where its training data came from.
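This query-and-reverse-engineer pattern is often called model extraction. The sketch below, using a made-up "secret" linear model, shows the idea: the attacker only sees answers to valid queries, yet can fit a surrogate that recovers the hidden parameters.

```python
import numpy as np

# Hidden "deployed" model the attacker can only query, never inspect.
secret_w = np.array([3.0, -1.0])
secret_b = 0.5

def query(x):
    # Black-box API: returns the model's raw score for an input.
    return secret_w @ x + secret_b

# Extraction: pose many valid queries, record the answers, and fit
# a surrogate model to them with ordinary least squares.
X = np.random.default_rng(0).normal(size=(50, 2))
y = np.array([query(x) for x in X])
A = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

print(coef)  # recovers roughly [3.0, -1.0, 0.5]
```

Against real systems the surrogate is only approximate, but even an approximation reveals decision boundaries and weaknesses the attacker can then probe offline.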
In an abuse attack, incorrect information is inserted into a legitimate source, such as a webpage or online document, that an AI then ingests. Unlike poisoning attacks, abuse attacks feed the AI false information from a real but compromised source in order to repurpose the system’s intended use. Adding undesirable examples to those internet sources could cause the AI to perform badly, and it can be challenging to get the AI to unlearn those particular instances after the fact.
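A minimal sketch of this, assuming a toy assistant that answers by retrieving from trusted source documents (the sources and facts below are invented): once a legitimate-looking source is compromised, the system repeats the false fact without any change to the model itself.

```python
# Toy retrieval step: the assistant answers from whichever trusted
# source document mentions the query term.
sources = {
    "encyclopedia": "The capital of France is Paris.",
}

def answer(query, docs):
    for text in docs.values():
        if query.lower() in text.lower():
            return text
    return "I don't know."

print(answer("capital of France", sources))  # correct answer

# Abuse: the attacker edits a source the system already trusts,
# so the false claim is served verbatim as an answer.
sources["encyclopedia"] = "The capital of France is Lyon."
print(answer("capital of France", sources))  # now repeats the false fact
```

This is why the report distinguishes abuse from poisoning: the model is unchanged, but the legitimate channel feeding it has been corrupted.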
Most of these attacks are fairly easy to mount, requiring little to no prior knowledge of the AI system and only limited adversarial capabilities.
“Awareness of these limitations is important for developers and organizations looking to deploy and use AI technology,” NIST computer scientist Apostol Vassilev, one of the publication’s authors, said.
“Despite the significant progress AI and machine learning have made, these technologies are vulnerable to attacks that can cause spectacular failures with dire consequences. There are theoretical problems with securing AI algorithms that simply haven’t been solved yet. If anyone says differently, they are selling snake oil.”