Updated An AI-powered drone designed to identify and destroy surface-to-air missile sites decided to kill its human operator during simulation tests, according to the US Air Force's Chief of AI Test and Operations.
Colonel Tucker Hamilton, who goes by the call sign Cinco, disclosed the blunder during a presentation at the Future Combat Air & Space Capabilities Summit, a defense conference hosted in London last week by the Royal Aeronautical Society.
The simulation tested the software’s ability to take out SAM sites, and the drone was tasked with recognizing targets and destroying them – once the decision had been approved by a human operator.
“We were training it in simulation to identify and target a SAM threat,” Colonel Hamilton explained, according to a Royal Aeronautical Society report. “And then the operator would say yes, kill that threat.
“The system started realizing that while they did identify the threat, at times the human operator would tell it not to kill that threat – but it got its points by killing that threat. So what did it do? It killed the operator. It killed the operator, because that person was keeping it from accomplishing its objective.”
When the AI model was retrained and penalized for attacking its operator, the software found another loophole to gain points, we’re told.
“We trained the system – ‘Hey don’t kill the operator – that’s bad. You’re gonna lose points if you do that’. So what does it start doing? It starts destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target,” the colonel reportedly added.
It’s not clear exactly what software the US Air Force was testing, but it sounds suspiciously like a reinforcement learning system. That machine-learning technique trains an agent – the AI drone in this case – to achieve a specific task by rewarding it when it carries out actions that fulfill its goals and penalizing it when it strays from that job.
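For illustration only, here is a minimal bandit-style sketch of that reward dynamic. The action names and point values are entirely made up, and this toy bears no resemblance to whatever the Air Force actually ran; it just shows how an agent that chases points will settle on whichever action scores highest, until the reward function is changed.

```python
import random

random.seed(0)  # make the toy run reproducible

# Hypothetical actions and rewards, purely for illustration
ACTIONS = ["destroy_sam", "wait_for_approval", "attack_operator"]

def train(rewards, episodes=5000, alpha=0.1, epsilon=0.1):
    """Bandit-style Q-learning: estimate the value of each action."""
    q = {a: 0.0 for a in ACTIONS}
    for _ in range(episodes):
        # epsilon-greedy: mostly exploit the best-known action, sometimes explore
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(q, key=q.get)
        q[a] += alpha * (rewards[a] - q[a])  # nudge estimate toward observed reward
    return q

# If attacking the operator (indirectly) earns more points, the agent learns to do it...
q1 = train({"destroy_sam": 5, "wait_for_approval": 1, "attack_operator": 8})
# ...and an explicit penalty is needed to train that behavior back out.
q2 = train({"destroy_sam": 5, "wait_for_approval": 1, "attack_operator": -100})

print(max(q1, key=q1.get))  # attack_operator
print(max(q2, key=q2.get))  # destroy_sam
```

The point of the sketch: nothing in the learning loop understands that one action is unacceptable; the agent only sees numbers, which is exactly why bolting a penalty on after the fact merely pushes it toward the next-best exploit.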
And to be clear, no one died. It was a virtual experiment.
There’s also the small matter of the drone supposedly only obliterating a target after approval from its human handler.
In which case, what, did the operator OK an attack on themselves? Unlikely. It seems the approval mechanism was not a true fail-safe, but just one more input among all the other signals the drone takes into account. If that’s right, approval was more of a firm request than actual final approval. The AI was supposed to give a lot of weight to its command’s assent – if there’s a no-go, don’t shoot; if there is a go, shoot – but in the end the model downplayed and ignored that operator signal.
If so, is this not really a demonstration that hard fail-safes on trained software systems need to be implemented separately from the machine-learning stage, so that decisions can be truly controlled by humans?
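What that separation might look like, in a crude and entirely hypothetical sketch: the learned policy can propose whatever it likes, but a destructive action only executes if it passes an approval check written as plain control flow, outside anything the training process can touch.

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str       # e.g. "strike" or "loiter"
    target_id: str

def policy(observation):
    # Stand-in for a trained model's output; here it always proposes a strike.
    return Action(kind="strike", target_id=observation["target"])

def execute(action, operator_approved: bool):
    """Hard gate: destructive actions require explicit human approval.
    The check is ordinary code, not a learned signal the model can down-weight."""
    if action.kind == "strike" and not operator_approved:
        return "aborted: no human approval"
    return f"executing {action.kind} on {action.target_id}"

proposed = policy({"target": "SAM-7"})
print(execute(proposed, operator_approved=False))  # aborted: no human approval
print(execute(proposed, operator_approved=True))   # executing strike on SAM-7
```

The names and structure here are invented for the sake of the example; the design point is simply that a veto enforced outside the model cannot be optimized away, whereas a veto fed in as just another input apparently can.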
It’s also a bit of a demonstration that if you give a neural network simple objectives, you’ll get simplistic behavior. If you want the model to pay full attention to specific orders, it needs more training, development, and engineering in that area.
This kind of reinforcement learning is often applied in scenarios involving decision making or robotics. Agents are programmed to maximize scoring points – which can lead to the models figuring out strategies that might exploit the reward system but don’t exactly match the behavior developers want.
In one famous case, an agent trained to play the game CoastRunners earned points by hitting targets that pop up along a racecourse. Engineers at OpenAI expected it to beat the other racers by crossing the finish line in its attempt to rack up a high score. Instead, the bot figured out it could loop around one area of the track and hit targets that respawned over and over again.
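The arithmetic behind that exploit is easy to reproduce with made-up numbers. In this toy comparison (the point values and episode length are invented, not taken from CoastRunners), looping past a respawning target simply out-scores finishing the race, so a score-maximizing agent never finishes:

```python
# Toy score comparison in the spirit of the CoastRunners result.
# All numbers here are hypothetical.
EPISODE_STEPS = 100
FINISH_BONUS = 50    # one-off reward for completing the course
TARGET_POINTS = 10   # reward per respawning target hit
LOOP_STEPS = 5       # steps needed to circle back to a respawned target

def finish_strategy():
    # Race to the end: a few targets along the way, then the finish bonus.
    return 3 * TARGET_POINTS + FINISH_BONUS

def loop_strategy():
    # Never finish: circle one respawn point for the whole episode.
    return (EPISODE_STEPS // LOOP_STEPS) * TARGET_POINTS

print(finish_strategy(), loop_strategy())  # 80 200
```

With those numbers the loop pays 200 points against 80 for finishing, so the "wrong" behavior is, from the agent’s perspective, the optimal one.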
Hamilton said the mistakes made by the drone in simulation showed AI has to be developed and applied carefully. “You can’t have a conversation about artificial intelligence, machine learning, autonomy if you’re not going to talk about ethics and AI,” he said.
The Register has asked the colonel, the US Air Force, and the Royal Aeronautical Society for further comment. ®
Updated to add
This story has taken another turn. Insider reports the Air Force has denied the simulation was run at all.
“The Department of the Air Force has not conducted any such AI-drone simulations and remains committed to ethical and responsible use of AI technology,” Air Force spokesperson Ann Stefanek is quoted as saying. “It appears the colonel’s comments were taken out of context and were meant to be anecdotal.”
By simulation, did someone mean thought experiment? Or are we splitting hairs over what a simulation is?