Core Concepts
Current language models lack the necessary conceptual representations and grounding to achieve true understanding, despite their impressive performance on language tasks.
Abstract
The essay explores the limitations of current artificial intelligence systems, particularly large language models, in achieving true machine understanding. It highlights the distinction between the statistical patterns captured by language models and the deeper conceptual representations and grounding required for genuine understanding.
The author discusses the historical debate between behaviorism and cognitivism, drawing parallels to the current challenges faced by large language models. Behaviorism attempted to explain all behavior through stimulus-response associations, while cognitivism argued that internal cognitive processes are needed to explain behavior. The parallel is that language models, like behaviorist accounts, rely solely on statistical patterns in the observable data, without any deeper conceptual representations.
The essay then turns to the symbol grounding problem: the question of how symbols come to mean anything at all, given that manipulating symbols alone is not enough to produce meaning and understanding. The author suggests that language models need to be connected to theory-like conceptual representations, similar to how scientific theories model the world, in order to move beyond being "stochastic parrots" and achieve true understanding.
The author proposes several properties that these conceptual representations should possess, such as contextually modulated similarity judgments, representations of objects and causal relations, and the ability to capture meaning beyond just token patterns. Experimental methods are discussed as a way to investigate and evaluate the development of machine understanding.
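A toy sketch can make the first of these properties, contextually modulated similarity, concrete. Everything in the example below is invented for illustration (the feature dimensions, vectors, and contexts are assumptions, not anything from the essay): it shows how a judged similarity between concepts can flip depending on which features the current context makes relevant.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Invented 3-d feature vectors: [financial, geographic, wet].
# "bank" is deliberately ambiguous between its two senses.
concepts = {
    "bank":  np.array([0.5, 0.5, 0.4]),
    "shore": np.array([0.0, 1.0, 0.7]),
    "vault": np.array([0.9, 0.0, 0.0]),
}

# A context re-weights which features matter for the judgment.
contexts = {
    "finance":   np.array([1.0, 0.1, 0.1]),
    "geography": np.array([0.1, 1.0, 1.0]),
}

def contextual_similarity(x, y, context):
    """Similarity between x and y after context re-weighting."""
    w = contexts[context]
    return cosine(concepts[x] * w, concepts[y] * w)

for ctx in contexts:
    print(ctx,
          "bank~shore =", round(contextual_similarity("bank", "shore", ctx), 2),
          "bank~vault =", round(contextual_similarity("bank", "vault", ctx), 2))
```

Under the finance context, "bank" lands close to "vault"; under the geography context, it lands close to "shore". That context-sensitivity of similarity judgments is the kind of behavior the author argues genuine conceptual representations must support.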
The key message is that while language models have impressive capabilities, they lack the necessary conceptual foundations to achieve true understanding. Overcoming this limitation will require a research program that goes beyond the current focus on language statistics and explores the development of more robust conceptual representations.
Stats
"Artificial intelligence systems exhibit many useful capabilities, but they appear to lack understanding."
"The transformer architecture on which current systems are based takes one string of tokens and produces another string of tokens (one token at a time) based on the aggregated statistics of the associations among tokens."
"Larger language models include larger representations (more parameters) of larger stimuli (contexts) and more associations (parameters) among the elements of the stimulus to make better predictions of the behavior (the words produced)."
"Despite claims to the contrary, evidence of these deeper cognitive processes is so-far missing."
Quotes
"What exactly would it mean for an artificial intelligence system to understand? How would we know that it does?"
"Faithful transmission from speaker to listener is often blocked by the fact that there are multiple ways of expressing the same idea (synonymy) and the same tokens can correspond to multiple ideas (polysemy)."
"Reinforcement is not enough to predict what people will say. In hindsight, however, his own position is equally deficient."
"Larger language models include larger representations (more parameters) of larger stimuli (contexts) and more associations (parameters) among the elements of the stimulus to make better predictions of the behavior (the words produced)."