The study investigates how contextual information affects the automatic evaluation of machine-translated chats. Comparing reference-based and reference-free metrics, it shows that incorporating context can significantly improve the accuracy of translation quality assessment, and it argues that evaluation methods must be adapted to the distinctive characteristics of chat conversations.
Unstructured chat conversations pose challenges for traditional translation quality metrics, which were designed for structured texts such as news articles. The study examines how strongly quality assessment relies on contextual information and proposes a new metric, CONTEXT-MQM, that incorporates bilingual context to improve evaluation accuracy.
Furthermore, it analyzes error types and severity levels in chat translations, demonstrating that added context improves correlation with human judgments for most error types. It also examines how noisy or partial context degrades metric performance, underscoring that contextual information must be complete and relevant for accurate evaluation.
Additionally, the research explores LLM-based contextual quality estimation for chat translation, introducing CONTEXT-MQM as a metric that outperforms existing approaches. The results suggest that leveraging context enhances LLM-based evaluation and improves error detection in imperfect translations.
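The summary does not reproduce the paper's actual prompt, but the general idea of context-augmented, MQM-style LLM evaluation can be sketched as assembling the preceding bilingual chat turns into the evaluation prompt alongside the segment under assessment. The function name and prompt template below are hypothetical illustrations, not the authors' implementation:

```python
def build_context_mqm_prompt(context_turns, source, translation):
    """Assemble a hypothetical MQM-style evaluation prompt that prepends
    bilingual chat context to the segment being evaluated.

    context_turns: list of (speaker, source_text, translated_text) tuples
    for the preceding turns of the conversation.
    """
    # Render each prior turn as "source -> translation" so the LLM sees
    # the bilingual conversation history, not just the current segment.
    context_block = "\n".join(
        f"[{speaker}] {src} -> {tgt}" for speaker, src, tgt in context_turns
    )
    return (
        "You are an annotator identifying translation errors (MQM).\n"
        "Conversation context (source -> translation):\n"
        f"{context_block}\n\n"
        f"Source segment: {source}\n"
        f"Translated segment: {translation}\n"
        "List each error with its span, category, and severity "
        "(minor/major/critical)."
    )


# Example: the ambiguous pronoun in the current segment can only be
# judged correctly with the earlier turns in view.
prompt = build_context_mqm_prompt(
    [("customer", "Mi pedido no llegó.", "My order did not arrive.")],
    "¿Puede reenviarlo?",
    "Can you resend it?",
)
```

The resulting string would then be sent to an LLM; the key design point the study highlights is that the context block must be complete and relevant, since noisy or partial context hurts metric performance.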
Overall, the study offers valuable insights into improving translation quality assessment for chat conversations through effective use of contextual information.