Friday, August 4, 2023

"ChatGPT Isn’t ‘Hallucinating.’ It’s Bullshitting."

From Undark, June 4:

Artificial intelligence models will make mistakes. We need more accurate language to describe them. 

Artificial intelligence hallucinates. So we are told by news headlines, think pieces, and even the warning labels on AI websites themselves. It’s by no means a new phrase. As early as the 1980s, the term was used in the literature on natural language processing and image enhancement, and in 2015 no article on the acid phantasmagoria of Google’s DeepDream could do without it. Today, large language models such as ChatGPT and Bard are said to “hallucinate” when they make incorrect claims not directly based on material in their training sets.

The term has a certain appeal: It uses a familiar concept from human psychiatry as an analogy for the falsehoods and absurdities that spill forth from these computational machines. But the analogy is a misleading one. That’s because hallucination implies perception: It is a false sense impression that can lead to false beliefs about the world. In a state of altered consciousness, for example, a person might hear voices when no one is present, and come to believe that they are receiving messages from a higher power, an alien intelligence, or a nefarious government agency. 

A large language model, however, does not experience sense impressions, nor does it have beliefs in the conventional sense. Language that suggests otherwise serves only to encourage the sort of misconceptions that have pervaded popular culture for generations: that instantiations of artificial intelligence work much like our brains do.

If not “hallucinate,” then what? If we wanted to stick with the parlance of psychiatric medicine, “confabulation” would be a more apt term. A confabulation occurs when a person unknowingly produces a false recollection, as a way of backfilling a gap in their memory. Used to describe the falsehoods of large language models, this term marches us closer to what actually is going wrong: It’s not that the model is suffering errors of perception; it’s attempting to paper over the gaps in a corpus of training data that can’t possibly span every scenario it might encounter.

But the terms “hallucination” and “confabulation” both share one big problem: As used in medicine, they each refer to states that arise as a consequence of some apparent malfunction in an organism’s sensory or cognitive machinery. (Importantly, perspectives on what hallucinations and confabulations are — and how they manifest — are profoundly shaped by cultural and social factors.)

The “hallucinations” of large language models are not pathologies or malfunctions; rather they are direct consequences of the design philosophy and design decisions that went into creating the models. ChatGPT is not behaving pathologically when it claims that the population of Mars is 2.5 billion people — it’s behaving exactly as it was designed to. By design, it makes up plausible responses to dialogue based on a set of training data, without having any real underlying knowledge of things it’s responding to. And by design, it guesses whenever that dataset runs out of advice.

A better term for this behavior comes from a concept that has nothing to do with medicine, engineering, or technology. When AI chatbots flood the world with false facts, confidently asserted, they’re not breaking down, glitching out, or hallucinating. No, they’re bullshitting.

Bullshitting? The philosopher Harry Frankfurt, who was among the first to seriously scrutinize the concept of bullshit, distinguishes between a liar, who knows the truth and tries to lead you in the opposite direction, and a bullshitter, who doesn’t know or care about the truth one way or the other. A recent book on the subject, which one of us co-authored, describes bullshit as involving language intended to appear persuasive without regard to its actual truth or logical consistency. These definitions of bullshit align well with what large language models are doing: The models neither know the factual validity of their output, nor are they constrained by the rules of logical reasoning in the output that they produce....