The article compares the lossy compression format used by a Xerox photocopier with large language models such as ChatGPT. Both reproduce their source material from a compressed representation, which can introduce inaccuracies or, in the case of language models, hallucinations. The lossy-compression analogy offers a way to think about how large language models work and raises the question of whether they truly understand the information they process.
In 2013, workers at a German construction company noticed that copies made on a Xerox photocopier did not match the original. Computer scientist David Kriesel investigated and traced the discrepancies to the machine's lossy compression format: modern photocopiers scan a document digitally and compress the resulting image before printing it.
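The article explains the difference between lossless and lossy compression. Lossless compression restores the original exactly and is typically used where accuracy is essential, such as text files and computer programs. Lossy compression restores only an approximation and is typically used where exactness matters less, such as photos, audio, and video. The lossy compression in the Xerox photocopiers was a problem precisely because its inaccuracies were subtle and not immediately noticeable.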
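The distinction can be made concrete with a minimal sketch. It assumes Python's standard `zlib` module as the lossless codec and a hypothetical toy quantizer (`lossy_roundtrip`, not from the article) as the lossy one; neither is what a photocopier actually runs.

```python
import zlib


def lossless_roundtrip(data: bytes) -> bytes:
    """Compress with DEFLATE, then decompress: the output equals the input."""
    return zlib.decompress(zlib.compress(data))


def lossy_roundtrip(samples: list[int], step: int = 16) -> list[int]:
    """Toy lossy codec (illustrative assumption): keep only the bucket index."""
    codes = [s // step for s in samples]          # compressed form, fewer distinct values
    return [c * step + step // 2 for c in codes]  # reconstruct at the bucket midpoint


text = b"14.13 21.11 17.42 " * 50
restored = lossless_roundtrip(text)       # identical to `text`

samples = [0, 7, 100, 101, 200, 255]
approx = lossy_roundtrip(samples)         # close to `samples`, but not equal
max_error = max(abs(a - b) for a, b in zip(samples, approx))
```

The lossless path returns every byte unchanged, while the lossy path returns values that are near the originals (within half a bucket here) but cannot be told apart from them without the source in hand.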
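The Xerox photocopiers used JBIG2, a compression format for black-and-white images, in a lossy mode. To save space, the copier identifies regions of the image that look similar and stores a single copy for all of them, so the output stays sharp and readable even when it is wrong. The article draws on this behavior to compare the photocopier with large language models like ChatGPT, which likewise produce fluent output from a compressed representation of their source.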
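The substitution behavior described above can be sketched with a toy symbol dictionary. This is not the JBIG2 codec: the 3x5 glyph bitmaps, the Hamming-distance test, and the names `compress_page` and `decompress_page` are all assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count the pixels that differ between two equal-sized glyph bitmaps."""
    return sum(x != y for x, y in zip(a, b))


def compress_page(glyphs: list[str], threshold: int) -> tuple[list[str], list[int]]:
    """Store each glyph once; reuse a stored glyph if it is 'close enough'."""
    dictionary: list[str] = []
    refs: list[int] = []
    for glyph in glyphs:
        for i, stored in enumerate(dictionary):
            if hamming(glyph, stored) <= threshold:
                refs.append(i)            # substitute the stored look-alike
                break
        else:
            dictionary.append(glyph)      # genuinely new symbol
            refs.append(len(dictionary) - 1)
    return dictionary, refs


def decompress_page(dictionary: list[str], refs: list[int]) -> list[str]:
    """Rebuild the page from dictionary references."""
    return [dictionary[i] for i in refs]


# Hypothetical 3x5 bitmaps, flattened row by row ('#' = ink, '.' = blank).
SIX = "###" "#.." "###" "#.#" "###"
EIGHT = "###" "#.#" "###" "#.#" "###"   # differs from SIX by one pixel
ONE = ".#." "##." ".#." ".#." "###"

page = [SIX, EIGHT, ONE]
exact = decompress_page(*compress_page(page, threshold=0))   # lossless
merged = decompress_page(*compress_page(page, threshold=1))  # EIGHT becomes SIX
```

With a threshold of zero the round trip is exact. With a threshold of one, the 8 is silently replaced by a crisp 6: nothing in the output looks blurry, which is why this kind of error is hard to spot.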
The article likens ChatGPT to a blurry JPEG of all the text on the Web: it retains much of the information, but only as an approximation, so it can produce hallucinations or incorrect responses. The article explores whether such large language models truly understand the content they process or merely offer statistical approximations of it.
The article discusses the relationship between text compression and understanding, using examples drawn from arithmetic and economic theory: the better the principles behind a text are understood, the more compactly the text can be stored. Large language models identify statistical correlations in text, which raises the question of whether this amounts to comprehension or only to statistical analysis.
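The arithmetic example can be illustrated with a rough sketch. It assumes a synthetic corpus of addition facts and uses `zlib` as a stand-in for a general-purpose compressor with no knowledge of arithmetic; the corpus, the seed, and the variable names are illustrative assumptions, not taken from the article.

```python
import random
import zlib


def make_corpus(pairs: list[tuple[int, int]]) -> bytes:
    """Render each operand pair as a line of the form 'a + b = c'."""
    return "\n".join(f"{a} + {b} = {a + b}" for a, b in pairs).encode()


rng = random.Random(0)  # fixed seed so the corpus is reproducible
pairs = [(rng.randrange(1000), rng.randrange(1000)) for _ in range(2000)]
corpus = make_corpus(pairs)

# A compressor that knows nothing about addition must encode every sum.
generic = zlib.compress(corpus, 9)

# A compressor that "understands" addition stores only the operands;
# the rule itself lives in make_corpus, which regenerates the sums.
operands_only = "\n".join(f"{a} {b}" for a, b in pairs).encode()
rule_based = zlib.compress(operands_only, 9)

decoded_pairs = [
    (int(a), int(b))
    for a, b in (line.split() for line in operands_only.decode().splitlines())
]
rebuilt = make_corpus(decoded_pairs)  # identical to `corpus`
```

Both encodings reproduce the corpus exactly, but the one that carries the rule of addition needs fewer bytes, since the sums no longer have to be stored. This is the sense in which knowing the principle behind a text permits better compression.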
Source: www.newyorker.com, 02-09-2023 (by Cond...)
https://www.newyorker.com/tech/annals-of-technology/chatgpt-is-a-blurry-jpeg-of-the-web