On 31 May, OpenAI announced work to improve ChatGPT’s mathematical problem-solving capabilities, with the aim of reducing instances of artificial intelligence (AI) hallucinations. OpenAI described reducing hallucinations as an important step towards developing aligned AI.
In March, the introduction of GPT-4, the latest model behind ChatGPT, pushed AI further into the mainstream. However, generative AI chatbots have long struggled with factual accuracy, sometimes generating false information, commonly referred to as “hallucinations”. OpenAI announced its efforts to reduce these hallucinations in a post on its website.
AI hallucinations are cases where artificial intelligence systems generate outputs that are factually incorrect, misleading or unsupported by real-world data. They can take various forms, such as stating false information, inventing non-existent events or people, or giving inaccurate descriptions of certain subjects.
OpenAI studied the effectiveness of two types of feedback: “outcome supervision” and “process supervision”. Outcome supervision gives feedback based on the final result, while process supervision gives feedback on each step in a chain of thought. OpenAI evaluated feedback models trained with each approach on math problems, generating multiple solutions and selecting the one ranked highest by each feedback model.
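The selection procedure described above can be illustrated with a small Python sketch. It is not OpenAI’s implementation: the two scorers are hand-written, hypothetical stand-ins for trained feedback models, the names `toy_answer_scorer`, `toy_step_scorer` and `best_of_n` are invented for illustration, and multiplying per-step scores into a solution score is an assumption of this sketch.

```python
from math import prod
from typing import Callable, List

# A solution is a list of reasoning steps; the last step states the final answer.
Solution = List[str]


def toy_answer_scorer(final_step: str) -> float:
    """Hypothetical stand-in for an outcome feedback model: looks only at the final answer."""
    return 0.9 if final_step == "answer: 18" else 0.1


def toy_step_scorer(step: str) -> float:
    """Hypothetical stand-in for a process feedback model: scores one step at a time."""
    if step.startswith("answer:"):
        return 0.9
    left, right = step.split("=")
    a, op, b = left.split()
    value = int(a) + int(b) if op == "+" else int(a) * int(b)
    return 0.9 if value == int(right) else 0.1


def outcome_score(solution: Solution) -> float:
    """Outcome supervision: only the end result is judged."""
    return toy_answer_scorer(solution[-1])


def process_score(solution: Solution) -> float:
    """Process supervision: every step is judged (product of step scores, an assumption)."""
    return prod(toy_step_scorer(step) for step in solution)


def best_of_n(candidates: List[Solution], score: Callable[[Solution], float]) -> Solution:
    """Pick the highest-ranked candidate under the given scoring function."""
    return max(candidates, key=score)


# Two candidate solutions to "3 * (2 + 4)": both end with the right answer,
# but the first reaches it through incorrect arithmetic.
flawed = ["2 + 4 = 5", "3 * 5 = 18", "answer: 18"]
sound = ["2 + 4 = 6", "3 * 6 = 18", "answer: 18"]
candidates = [flawed, sound]

picked_by_outcome = best_of_n(candidates, outcome_score)
picked_by_process = best_of_n(candidates, process_score)
```

In this toy setup the outcome scorer cannot tell the two candidates apart, because both end with the same answer, while the process scorer ranks the solution with sound intermediate steps higher.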
After a thorough analysis, the research team found that process supervision performed better because it encouraged the model to follow a human-approved process. Outcome supervision, by contrast, proved harder to scrutinize consistently.
OpenAI recognizes that process supervision has implications beyond mathematics and that further research is needed to understand its effects in other domains. It raised the possibility that, if the observed results hold in broader contexts, process supervision may offer a better combination of performance and alignment than outcome supervision. To facilitate research, the company has publicly released its complete process supervision dataset, inviting exploration and study in this area.
While OpenAI did not give explicit examples of what prompted its study of hallucinations, two recent incidents illustrate the problem in real-life scenarios.
In one incident, attorney Steven Schwartz in Mata v. Avianca Airlines acknowledged relying on the chatbot as a research resource. However, the information provided by ChatGPT turned out to be entirely fabricated.
OpenAI’s ChatGPT is not the only AI system to hallucinate. During a demonstration of its chatbot technology in March, Microsoft’s Bing AI chatbot analyzed earnings reports and generated incorrect figures for companies such as Gap and Lululemon.