Core Concepts
Explanations, whether generated by humans or machines, can lead to over-reliance on incorrect AI predictions when the explanations are perceived as helpful, which highlights the dilemma of explaining erroneous AI predictions.
Abstract
The study explored the effectiveness of human-generated and machine-generated explanations for a text generation task, specifically question answering using the SQuAD v1.1 dataset.
Key highlights:
- 156 human-generated explanations (free text and saliency-based) were collected and analyzed. Human saliency maps had little overlap with machine-generated saliency maps, and human text explanations mostly copied or paraphrased the source text.
- In a large human-participant study (N=136), the correctness of AI predictions had a strong, significant effect on all measures (performance, time, quality, helpfulness, and mental effort). Machine saliency maps were significantly less helpful than human saliency maps.
- Participants trusted text extractions more than ChatGPT explanations. Measures of explanation satisfaction, trust in the AI, and explanation helpfulness were negatively correlated with performance scores.
- The findings highlight the dilemma of machine explanations: "good" explanations for incorrect AI predictions can lead to over-reliance on AI, resulting in decreased performance. Participants may try to match their intuition of relevance with saliency maps, leading to explanation confirmation bias.
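The reported lack of overlap between human and machine saliency maps can be made concrete by treating each map as the set of tokens marked salient and scoring their agreement. A minimal sketch, assuming a Jaccard-style overlap metric (the paper does not specify the exact measure, and the token sets below are invented for illustration):

```python
# Hypothetical sketch: compare a human saliency map with a machine one by
# treating each as a set of salient tokens and computing Jaccard overlap.
# The metric and example tokens are assumptions for illustration only.

def jaccard_overlap(human_tokens: set[str], machine_tokens: set[str]) -> float:
    """Return |intersection| / |union|; 0.0 means no shared salient tokens."""
    if not human_tokens and not machine_tokens:
        return 0.0
    return len(human_tokens & machine_tokens) / len(human_tokens | machine_tokens)

# Invented example: one shared token out of seven distinct tokens overall.
human = {"Temujin", "1206", "unite", "tribes"}
machine = {"1206", "Merkits", "Naimans", "rule"}
print(round(jaccard_overlap(human, machine), 3))  # 1/7 -> 0.143
```

A low score like this, aggregated over many question-passage pairs, would correspond to the "little overlap" finding above.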
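The negative correlation between subjective ratings (satisfaction, trust, helpfulness) and performance is the kind of result typically checked with a rank correlation. A self-contained sketch of Spearman's rho on invented data, assuming no tied values (the paper's actual statistical procedure is not specified here):

```python
# Hypothetical sketch: Spearman rank correlation between a subjective rating
# (e.g. perceived helpfulness) and task performance. All values are invented;
# ties are not handled, for brevity.

def spearman(xs: list[float], ys: list[float]) -> float:
    """Spearman's rho via the rank-difference formula, assuming no ties."""
    n = len(xs)
    def rank(v: list[float]) -> list[int]:
        order = sorted(range(n), key=lambda i: v[i])
        r = [0] * n
        for pos, i in enumerate(order):
            r[i] = pos + 1
        return r
    rx, ry = rank(xs), rank(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

helpfulness = [5, 4, 3, 2, 1]            # higher perceived helpfulness ...
performance = [0.2, 0.4, 0.5, 0.7, 0.9]  # ... paired with lower scores
print(spearman(helpfulness, performance))  # perfectly inverted ranks -> -1.0
```

A rho below zero, as in this toy example, is what "negatively correlated with performance scores" means operationally: participants who rated explanations more favorably tended to score worse.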
Quotes
"good" explanations for incorrect AI predictions can lead to over-reliance on AI resulting in decreased performance
participants may try to match their intuition of relevance with saliency maps themselves as a heuristic mechanism to evaluate explanations