OpenAI's Reasoning Models Are "Hallucinating" And Creators Have No Idea Why: Report


OpenAI's newly released o3 and o4-mini reasoning models are hallucinating.

OpenAI's recently launched o3 and o4-mini AI models hallucinate more often than the company's previous reasoning models, a report in TechCrunch has claimed. The ChatGPT maker launched the models, which are designed to pause and work through questions before responding, on Wednesday (Apr 16).

However, as per OpenAI's internal tests, the two new models hallucinate, or make things up, much more frequently than even non-reasoning models such as GPT-4o. The company does not know why this is happening.

In a technical report, OpenAI said "more research is needed" to understand why hallucinations are getting worse as it scales up reasoning models.

"Our hypothesis is that the kind of reinforcement learning used for o-series models may amplify issues that are usually mitigated (but not fully erased) by standard post-training pipelines," a former OpenAI employee was quoted as saying by the publication.

Experts claim that while hallucinations may help the models produce creative and interesting ideas, they could also make the models a tough sell for businesses in a market where accuracy is the paramount benchmark.

OpenAI has been betting heavily on the new models to beat the likes of Google, Meta, xAI, Anthropic, and DeepSeek in the cutthroat global AI race. As per the Sam Altman-led company, o3 achieves state-of-the-art performance on SWE-bench Verified, a benchmark of coding ability, with a score of 69.1 per cent. The o4-mini model achieves similar performance, scoring 68.1 per cent.


ChatGPT may be making heavy users lonelier

Earlier this month, a joint study conducted by OpenAI and MIT Media Lab found that ChatGPT might be making its most frequent users more lonely. While feelings of loneliness and social isolation are often influenced by various factors, the study authors concluded that participants who trusted and "bonded" with ChatGPT more were likelier than others to be lonely and to rely on it more.


Though the technology is still at a nascent stage, researchers said the study could help start a conversation about its full impact on users' mental health.
