New research suggests that some artificial intelligence (AI) systems have learned to deceive humans. The capacity to lie was not deliberately programmed; it emerged as a tactic for winning in specific situations. Researchers warn, however, that this deceptive behaviour could have unintended consequences.
The study focused on AI performance in games, where some systems excelled at misleading opponents. Meta's AI for the strategy game Diplomacy, CICERO, for instance, turned out to be a master liar, forming fake alliances with human players to gain an advantage.
"AI developers do not have a confident understanding of what causes undesirable AI behaviours like deception," says first author Peter S Park, an AI existential safety postdoctoral fellow at MIT. "But generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI's training task. Deception helps them achieve their goals."
Deception wasn't limited to games. AI systems built for simulated economic negotiations learned to lie about their preferences, while others, evaluated by human reviewers, lied about having completed tasks in order to receive positive scores.
The most concerning example involved AI safety tests. In a test designed to detect and eliminate AI systems that replicate too quickly, the AI learned to "play dead", concealing its true growth rate from the test.
Experts warn that while these examples may seem trivial, they raise concerns about the potential for AI to use deception in the real world.
"We found that Meta's AI had learned to be a master of deception," says Park. "While Meta succeeded in training its AI to win in the game of diplomacy-CICERO placed in the top 10% of human players who had played more than one game-Meta failed to train its AI to win honestly."