Core Concepts
Machine Translation Quality Estimation (QE) has evolved from handcrafted features to Large Language Models (LLMs), bringing both new capabilities and new challenges to the field.
Abstract
This paper surveys the evolution of Machine Translation Quality Estimation (MTQE), covering datasets, annotation methods, shared tasks, methodologies, open challenges, and future research directions. It traces the transition from handcrafted features to deep learning and LLM-based methods, categorizes existing approaches, and discusses the advantages and limitations of each.
I. Introduction
Importance of QE in MT development.
Evolution from traditional evaluation metrics to QE techniques.
Significance of QE in real-world applications.
II. Data, Annotation Methods, and Shared Tasks for Quality Estimation
Overview of datasets like MLQE-PE and WMT2023 QE.
Annotation methods including HTER (Human-targeted Translation Edit Rate), DA (Direct Assessment), and MQM (Multidimensional Quality Metrics).
Categorization of shared tasks into word-level, sentence-level, document-level, and explainable QE.
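As a concrete illustration of one of these annotation schemes: HTER measures the minimum number of word-level edits needed to turn the MT output into its human post-edited version, normalized by the length of the post-edit. A minimal sketch (it omits TER's block-shift edit, so this is a simplification of the full metric):

```python
def _edit_distance(a, b):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, 1):
        cur = [i]
        for j, wb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (wa != wb)))   # substitution
        prev = cur
    return prev[-1]

def hter(mt_output: str, post_edit: str) -> float:
    """HTER = edits(MT output -> post-edit) / |post-edit| (no shift edits here)."""
    hyp, ref = mt_output.split(), post_edit.split()
    return _edit_distance(hyp, ref) / len(ref)
```

For example, `hter("a cat sat", "the cat sat")` is 1/3: one substitution over a three-word post-edit.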
III. Methods of Quality Estimation
A. Handcrafted Features Based Methods
QuEst framework for feature extraction.
QuEst++ for word-level and document-level QE.
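QuEst-style handcrafted features are shallow statistics of the source/target pair that are fed to a standard regressor such as an SVM. A minimal sketch with a few illustrative black-box features — the feature names below are our own, and QuEst itself defines many more, including language-model probabilities:

```python
def quest_style_features(source: str, target: str) -> dict:
    """A few shallow, language-independent QE features (illustrative subset)."""
    src, tgt = source.split(), target.split()
    return {
        "src_len": len(src),                              # number of source tokens
        "tgt_len": len(tgt),                              # number of target tokens
        "len_ratio": len(tgt) / max(len(src), 1),         # length ratio, a classic QE signal
        "src_avg_tok_len": sum(map(len, src)) / max(len(src), 1),
        "tgt_punct": sum(not t.isalnum() for t in tgt),   # rough punctuation count
    }
```

A feature vector like this, extracted for each sentence pair, is exactly what a QuEst-era pipeline would pass to its machine-learning module.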
B. Deep Learning Based Methods
Classic deep learning approaches that learn feature representations automatically rather than relying on manual feature engineering.
QUETCH model: a feed-forward deep neural network for word-level QE.
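QUETCH frames word-level QE as binary OK/BAD classification: for each target word, embeddings of the word, its context window, and aligned source words are concatenated and passed through a small feed-forward network. A toy sketch in plain Python with made-up weights, just to show the data flow (a real model learns embeddings and weights from annotated data):

```python
import math

def quetch_style_score(window_vec, w_hidden, w_out):
    """One forward pass: concatenated window embedding -> tanh hidden -> sigmoid.

    window_vec: concatenated embeddings of the target word, its context,
    and aligned source words (here just a flat list of floats).
    """
    hidden = [math.tanh(sum(x * w for x, w in zip(window_vec, row)))
              for row in w_hidden]
    logit = sum(h * w for h, w in zip(hidden, w_out))
    return 1.0 / (1.0 + math.exp(-logit))   # probability the word is OK

# toy dimensions: 4-dim window vector, 2 hidden units (weights invented)
w_hidden = [[0.5, -0.2, 0.1, 0.3], [-0.4, 0.6, 0.2, -0.1]]
w_out = [0.7, -0.3]
p_ok = quetch_style_score([0.1, 0.4, -0.2, 0.3], w_hidden, w_out)
label = "OK" if p_ok >= 0.5 else "BAD"
```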
C. Large Language Models Based Methods
GEMBA: zero-shot prompting of LLMs for direct prediction of translation quality scores.
EAPrompt: combining chain-of-thought (CoT) prompting with error analysis (EA) for better performance.
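A GEMBA-style zero-shot setup largely reduces to formatting a scoring prompt and parsing a number back out of the model's reply. A minimal sketch; the prompt wording paraphrases the GEMBA idea rather than reproducing its exact template, and `call_llm` is a hypothetical stand-in for whatever LLM client is used:

```python
import re

def build_qe_prompt(src_lang, tgt_lang, source, translation):
    """GEMBA-style direct-assessment prompt (paraphrased, not the exact template)."""
    return (
        f"Score the following translation from {src_lang} to {tgt_lang} "
        f"on a continuous scale from 0 to 100, where 0 means no meaning "
        f"preserved and 100 means a perfect translation.\n"
        f"{src_lang} source: {source}\n"
        f"{tgt_lang} translation: {translation}\n"
        f"Score:"
    )

def parse_score(reply: str) -> float:
    """Extract the first number from the model's reply; clamp to [0, 100]."""
    match = re.search(r"\d+(\.\d+)?", reply)
    if match is None:
        raise ValueError(f"no score found in reply: {reply!r}")
    return min(max(float(match.group()), 0.0), 100.0)

# usage with a hypothetical client:
# reply = call_llm(build_qe_prompt("German", "English", "Guten Morgen", "Good morning"))
# score = parse_score(reply)
```

EAPrompt extends this idea by first asking the model to enumerate translation errors (CoT-style) and only then deriving a score from that error analysis.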
IV. Findings
Challenges like data scarcity and interpretability issues.
Lack of standardized evaluation metrics.
Need for more focus on word-level and document-level QE methods.
V. Conclusion
Summarizes the progress made in MTQE over the years and highlights the importance of leveraging LLMs for future advancements.
References
"BLEU: a method for automatic evaluation of machine translation," Papineni et al., 2002.
"METEOR: An automatic metric for MT evaluation with improved correlation with human judgments," Banerjee et al., 2005.
"A study of translation edit rate with targeted human annotation," Snover et al., 2006.