As AI chatbots powered by large language models (LLMs), such as ChatGPT and Bard, become more ubiquitous and people grow increasingly comfortable with them, the way we interact with them is changing. Generative AI is no longer the preserve of data scientists on the cutting edge of technology – deftly demonstrated just a few weeks ago when US Senator Richard Blumenthal opened a hearing with OpenAI CEO Sam Altman with a recording of his own voice. The Senator hadn’t written the words or recorded the audio: both had been generated by AI.
Sam Altman’s appearance before Congress was timely. Current AI regulations are alarmingly insufficient, not least when you consider the slate of globally significant elections coming up in the next 18 months, with the potential to be exploited by AI-supercharged malevolent actors. We have already seen cases of AI generating mis- and disinformation. While this is often easy to identify as false – such as the instance of ChatGPT declaring that Elon Musk had died in a Tesla accident – it is not always so obvious. Musk is high-profile, and constantly reminds us of his presence, so people quickly realised he had not died. But false statements generated by AI chatbots about less prominent people are far more likely to be believed, as in the case of Australian mayor Brian Hood, whom ChatGPT falsely named as being implicated in a foreign bribery scandal – when he was actually the whistleblower.
These LLM-generated false statements are, perhaps generously, being called “hallucinations”, but the reality is that the information always originates from one or more sources. There is a very real risk that LLM tools can mistakenly mass-produce, and spread, libellous information. Language proficiency is key to these tools, so it follows that at some point there will be a language-related offence, such as libel, defamation, bribery, or coercion, committed by an AI chatbot. Indeed, the first defamation case has recently been brought against OpenAI, with a man in the US alleging that ChatGPT generated a fake legal summary accusing him of fraud and embezzlement.
Identifying the information sources used by LLMs
To understand where the hallucinations are coming from, we need to take a look at how LLMs work. They are trained by collecting vast amounts of data, which is then organised and processed using a neural network architecture called a transformer, which identifies patterns in the way words are structured, the way phrases relate to one another, and the way language is constructed. Most importantly, transformers can then make (usually) accurate predictions about what words should come next. This is similar to the autocomplete function we are familiar with, but on a far greater scale.
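The autocomplete analogy can be made concrete with a toy sketch. A real transformer learns these patterns with billions of parameters rather than a simple lookup table, but the prediction step it performs is conceptually similar to this bigram model (the tiny corpus here is invented purely for illustration):

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the billions of web pages an LLM is trained on.
corpus = "the cat sat on the mat so the cat slept".split()

# Count, for each word, which words follow it -- a drastically simplified
# stand-in for the statistical patterns a transformer learns.
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequently observed next word, like autocomplete."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" -- it follows "the" twice, "mat" only once
```

Note that the model only knows which words tend to follow which: it has no notion of whether a continuation is true, which is one way to see why a system built on prediction alone can produce fluent but false statements.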
However, at its core an LLM does not understand what it is saying, nor does it think like a human. It will always attempt to give an answer, and lacks the ability to admit that it does not know. The models are also unable to differentiate clearly between the need to be factually accurate and the drive to produce something creative. This is partly why, and how, hallucinations occur.
The datasets behind LLMs
The companies creating these tools have not been especially forthcoming about the datasets they use, but we do have some information. ChatGPT partly uses a dataset called Common Crawl, a publicly available corpus of billions of web pages. Bard, which is based on a language model called LaMDA, was trained on a dataset called Infiniset, which comprises about 12.5% C4 (a specially filtered version of Common Crawl) and 12.5% Wikipedia. The remaining 75% consists of words scraped from the internet that Google does not give specifics on, describing them only in vague terms such as “non-English web documents” and “dialogs data from public forums” (the latter making up 50% of the whole).
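Tallying the stated composition of Infiniset shows just how little of the training data Google has actually specified. The fractions below are the ones reported above; the category labels are paraphrased, and “other web documents (unspecified)” is simply the unexplained remainder:

```python
# Reported composition of Infiniset, the dataset behind LaMDA/Bard.
infiniset = {
    "C4 (filtered Common Crawl)": 0.125,
    "Wikipedia": 0.125,
    "dialogs data from public forums": 0.50,
    "other web documents (unspecified)": 0.25,
}

specified = infiniset["C4 (filtered Common Crawl)"] + infiniset["Wikipedia"]
print(f"Precisely specified sources: {specified:.0%}")      # 25%
print(f"Vaguely described sources:  {1 - specified:.0%}")   # 75%
```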
A reasonable guess is that Reddit and Stack Overflow could be the main “public forums” (both have announced their intention to start charging for access to their data, which hints at their importance). Crucially, however, the precise source of information is not known, nor is it even necessarily consistent – LLM tools will frequently present different answers to the same question. Because answers can be generated from multiple blended sources, conflicts and contradictions can arise.
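One reason the same question can yield different answers is that generation is typically stochastic: the model samples from a probability distribution over possible next tokens rather than always taking the single most likely one, with a “temperature” setting controlling how adventurous the sampling is. A minimal sketch (the vocabulary and probabilities here are invented):

```python
import random

# Hypothetical next-token distribution for a prompt such as "The mayor was ..."
next_token_probs = {"cleared": 0.5, "charged": 0.3, "promoted": 0.2}

def sample_next(probs, temperature=1.0, rng=random):
    """Sample a token; higher temperature flattens the distribution,
    making less likely (possibly false) continuations more common."""
    if temperature == 0:  # greedy decoding: always the single most likely token
        return max(probs, key=probs.get)
    # Reweight by temperature, then draw proportionally to the new weights.
    weights = {t: p ** (1.0 / temperature) for t, p in probs.items()}
    r = rng.random() * sum(weights.values())
    for token, weight in weights.items():
        r -= weight
        if r <= 0:
            return token
    return token  # guard against floating-point rounding
```

With temperature 0 the answer is always the same; at higher temperatures repeated queries will scatter across the alternatives, which is exactly the inconsistency users observe.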
Minimising damage and reducing risk
Hallucinations will continue to be a major headache for both the developers and the users of LLM tools, not least because of the jeopardy they could create. Recently, a lawyer was forced to admit that he had used ChatGPT to generate a legal brief – it turned out that the cases cited by the bot did not actually exist.
The solution remains unclear, with Alphabet CEO Sundar Pichai saying that the question of whether hallucinations can be solved “is a matter of intense debate”. One possible remedy is greater process supervision – a form of reinforcement learning from human feedback in which humans approve each step in an LLM’s chain of thought, rather than only the final answer – but how effective this is, and how well it scales, is not yet clear.
Hallucinations underline the current limitations of the technology and the reality that these tools are not yet intelligent in the way humans are. Given the threats they pose to governments, companies and individuals around the world, however, a greater understanding of how LLM-powered tools generate information can only benefit those seeking to monitor the content they produce and minimise the reputational damage it could inflict.
© Digitalis Media Ltd.