New AI models are more likely to give a wrong answer than admit they don't know

Europe • Oct 1, 2024, 5:30 AM
3 min read

Newer large language models (LLMs) are less likely to admit they don’t know the answer to a user’s question, making them less reliable, according to a new study.

Artificial intelligence (AI) researchers from the Universitat Politècnica de València in Spain tested the latest versions of BigScience’s BLOOM, Meta’s Llama, and OpenAI's GPT for accuracy by asking each model thousands of questions on maths, science, and geography. 

Researchers compared the quality of each model’s answers and classified them as correct, incorrect, or avoidant.

The study, which was published in the journal Nature, found that accuracy on more challenging problems improved with each new model. Still, the models tended to be less transparent about whether they could answer a question correctly.

Earlier LLMs would say they could not find the answer or needed more information to reach one, but newer models were more likely to guess and produce incorrect responses, even to easy questions.

'No apparent improvement' in solving basic problems

LLMs are deep learning algorithms trained on large data sets to understand, predict, and generate new content.

While the new models could solve more complex problems with greater accuracy, the LLMs in the study still made mistakes when answering basic questions.

"Full reliability is not even achieved at very low difficulty levels," according to the research paper.

"Although the models can solve highly challenging instances, they also still fail at very simple ones".

This was the case with OpenAI’s GPT-4, where the share of "avoidant" answers dropped significantly compared with its predecessor, GPT-3.5.

“This does not match the expectation that more recent LLMs would more successfully avoid answering outside their operating range,” the study authors said. 

The researchers therefore concluded that there is "no apparent improvement" in the models' reliability, even though the technology has been scaled up.