The paper discusses the limitations of current NLP methodology, which relies heavily on the "overhearing" paradigm, for modeling human dialogue strategies and grounding mechanisms. It argues that the prevailing supervised learning approach, in which dialogue models are trained to react to conversational histories produced by others, fails to capture the interactive and collaborative nature of human communication.
The authors provide evidence that human decisions on meta-communicative acts, such as requesting clarification, exhibit significant variability that is difficult to model using data-driven techniques. They present a pilot study showing low agreement among overhearers in predicting when a clarification request should be made, even when provided with the same dialogue context and scene information.
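Agreement among annotators in such pilot studies is typically quantified with a chance-corrected statistic. As an illustration (the paper's actual metric and data are not given here), a minimal sketch of Fleiss' kappa on invented overhearer judgments:

```python
# Hedged sketch: Fleiss' kappa, a standard chance-corrected agreement
# statistic that could quantify "low agreement among overhearers".
# The counts below are invented for illustration, not the paper's data.

def fleiss_kappa(counts):
    """counts[i][j] = number of raters assigning item i to category j.
    Assumes every item is rated by the same number of raters."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_total = n_items * n_raters
    n_cats = len(counts[0])

    # Proportion of all assignments falling in each category.
    p_j = [sum(row[j] for row in counts) / n_total for j in range(n_cats)]

    # Observed pairwise agreement on each item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_i) / n_items      # mean observed agreement
    p_e = sum(p * p for p in p_j)   # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)

# Toy example: 5 dialogue turns, 4 overhearers, two labels
# ("request clarification" vs "continue"); even 2-2 splits pull kappa
# toward zero, mirroring the variability the pilot study reports.
counts = [
    [2, 2],
    [3, 1],
    [1, 3],
    [2, 2],
    [4, 0],
]
print(round(fleiss_kappa(counts), 3))  # -> 0.028, near-chance agreement
```

Values near zero indicate agreement no better than chance, which is the kind of result that motivates the paper's skepticism about learning such decisions from overheard data.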
The paper emphasizes the need to move beyond the overhearing paradigm and explore alternative setups, such as reinforcement learning or hybrid approaches, that can better account for the interactive and adaptive nature of human dialogue. It also calls for more studies on the variability of human grounding acts and its impact on modeling human dialogue strategies.