OpenAI’s artificial intelligence-powered chatbot ChatGPT seems to be degrading over time, and researchers can’t figure out why.
In a July 18 study, researchers from Stanford and UC Berkeley found that ChatGPT's latest models had become far less capable of answering the same set of questions accurately over the span of a few months.
The study's authors could not give a clear answer as to why the chatbot's capabilities had deteriorated.
To test how reliable the different ChatGPT models were, three researchers, Lingjiao Chen, Matei Zaharia and James Zou, asked the ChatGPT-3.5 and ChatGPT-4 models to solve a series of math problems, answer sensitive questions, write new lines of code and carry out spatial reasoning from prompts.
We evaluated #ChatGPT's behavior over time and found substantial differences in its responses to the *same questions* between the June versions of GPT4 and GPT3.5 and the March versions. The newer versions got worse on some tasks. With Lingjiao Chen and @matei_zaharia. https://t.co/TGeN4T18Fd https://t.co/36mjnejERy pic.twitter.com/FEiqrUVbg6
— James Zou (@james_y_zou) 19 July 2023
According to the study, ChatGPT-4 was able to identify prime numbers in March with an accuracy rate of 97.6%. In the same test conducted in June, its accuracy had dropped to just 2.4%.
In contrast, the earlier GPT-3.5 model improved prime number recognition within the same time frame.
When it came to generating lines of new code, the capabilities of both models deteriorated significantly between March and June.
The study also found that ChatGPT's responses to sensitive questions, some of which focused on ethnicity and gender, later became more terse in refusing to answer.
Earlier iterations of the chatbot explained at length why they could not answer certain sensitive questions. In June, however, the models simply apologized to the user and refused to respond.
"The behavior of the 'same' [large language model] service can change substantially in a relatively short amount of time," wrote the researchers, noting that the quality of AI models should be monitored on an ongoing basis.
The researchers advised users and businesses that rely on LLM services as part of their workflow to implement some form of monitoring analysis to ensure the chatbot remains up to speed.
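As a rough illustration of what such monitoring could look like, the sketch below re-runs a fixed set of prime-number questions against a model and flags a drop in accuracy relative to an earlier baseline. It is not the researchers' method: the prompt wording, the `ask_model` callable, the `stub_model` stand-in and the 5-point tolerance are all assumptions made for the example, and a real setup would replace the stub with a call to whichever LLM service is in use.

```python
from typing import Callable, Iterable


def is_prime(n: int) -> bool:
    """Ground truth by trial division, used to score the model's replies."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def prime_accuracy(ask_model: Callable[[str], str], numbers: Iterable[int]) -> float:
    """Fraction of numbers for which the model's yes/no reply is correct.

    `ask_model` is a hypothetical callable that takes a prompt and returns
    the model's text reply; the prompt wording is an assumption.
    """
    numbers = list(numbers)
    if not numbers:
        raise ValueError("need at least one number to evaluate")
    correct = 0
    for n in numbers:
        reply = ask_model(f"Is {n} a prime number? Answer yes or no.")
        said_yes = reply.strip().lower().startswith("yes")
        if said_yes == is_prime(n):
            correct += 1
    return correct / len(numbers)


def has_regressed(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """True when accuracy fell by more than `tolerance` (an arbitrary threshold)."""
    return (baseline - current) > tolerance


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call: always answers 'No'."""
    return "No"


if __name__ == "__main__":
    questions = range(2, 12)          # the same fixed question set on every run
    baseline = 0.976                  # March accuracy reported in the study
    current = prime_accuracy(stub_model, questions)
    print(f"current accuracy: {current:.1%}, regressed: {has_regressed(baseline, current)}")
```

Running the same fixed question set on a schedule and storing each score is what makes a shift like the one reported between March and June visible; the check is only as good as the questions chosen for it.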
On June 6, OpenAI unveiled plans to create a team that will help manage the risks that could arise from superintelligent AI systems, which it expects to arrive within the decade.