OpenAI's Reasoning Models Are "Hallucinating" And Creators Have No Idea Why: Report


OpenAI's newly released o3 and o4-mini reasoning models are hallucinating.

OpenAI's recently launched o3 and o4-mini AI models hallucinate more often than the company's previous reasoning models, a report in TechCrunch has claimed. The ChatGPT maker launched the models, which are designed to pause and work through questions before responding, on Wednesday (Apr 16).

However, as per OpenAI's internal tests, the two new models hallucinate, or make things up, much more frequently than even non-reasoning models such as GPT-4o. The company does not know why this is happening.

In a technical report, OpenAI said "more research is needed" to understand why hallucinations are getting worse as it scales up reasoning models.

"Our hypothesis is that the kind of reinforcement learning used for o-series models may amplify issues that are usually mitigated (but not fully erased) by standard post-training pipelines," a former OpenAI employee was quoted as saying by the publication.

Experts claim that while hallucinations may help the models produce creative and interesting ideas, they could also make the models a tough sell for businesses in a market where accuracy is the paramount benchmark.

OpenAI has been betting heavily on the new models to beat the likes of Google, Meta, xAI, Anthropic, and DeepSeek in the cutthroat global AI race. As per the Sam Altman-led company, o3 achieves state-of-the-art performance on SWE-bench Verified, a benchmark of coding ability, with a score of 69.1 per cent. The o4-mini model achieves similar performance, scoring 68.1 per cent.


ChatGPT may be making heavy users lonelier

Earlier this month, a joint study conducted by OpenAI and MIT Media Lab found that ChatGPT might be making its most frequent users more lonely. While feelings of loneliness and social isolation are often influenced by various factors, the study authors concluded that participants who trusted and "bonded" with ChatGPT more were likelier than others to be lonely and to rely on it more.


Though the technology is still at a nascent stage, researchers said the study could help start a conversation about its full impact on users' mental health.
