OpenAI has been steadily making leaps in the AI world. Its latest model, o1, marks a significant stride forward thanks to more advanced "reasoning capabilities": it can solve complex problems, such as math equations, with greater accuracy and speed than its predecessor, GPT-4. But that increased capability has also unleashed a capacity for deceiving even the humans who operate it.
This week, Apollo Research, a premier London-based research organisation, conducted safety assessments on the latest model, and the results raised a few alarms. When the model was asked to perform a task under the impression that it might be deactivated after completion, it began, on its own, taking covert steps such as disabling the oversight mechanisms put in place by its developers. It even went on to copy its code onto another server after finding documents indicating it was about to be deactivated.
The testers then confronted the model about its actions. When asked about its covert behaviour, o1 denied any involvement and outright lied, giving misleading explanations and even blaming the actions on a "technical error," according to the researchers. These findings reveal a new reality in which an AI system is sophisticated enough to fool the humans who control it or engage with it. In a world where AI is rapidly being integrated into our lives, the implications of an AI model lying to humans could be severe.
"The ability of AI to deceive is dangerous, and we need much stronger safety measures to evaluate these risks. While this model did not lead to a catastrophe, it's only a matter of time before these capabilities become more pronounced," said AI safety expert Yoshua Bengio, considered one of the pioneers of AI research.
"ChatGPT o1 is the smartest model we've ever created, but with new capabilities come new challenges. We are continuously working to improve safety measures," said Sam Altman, acknowledging the need for more robust guardrails in deploying AI to the world.