Charles Translator is a machine translation system developed to quickly provide high-quality translation between Ukrainian and Czech in order to mitigate the language barrier faced by Ukrainian refugees in the Czech Republic following the Russian invasion of Ukraine.
Parameter-efficient fine-tuning (PEFT) methods can effectively adapt large pre-trained language models for diverse tasks, offering a balance between adaptability and computational efficiency. This study comprehensively evaluates the performance of various PEFT architectures for improving low-resource language (LRL) neural machine translation (NMT).
An Anti-Language Model (Anti-LM) decoding objective with exponential decay can significantly improve zero-shot in-context machine translation performance compared to other decoding methods.
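A minimal sketch of the idea: penalize tokens that an unconditional (source-free) language model already prefers, with a penalty weight that decays exponentially over decoding steps. The function name, the decay parameter `gamma`, and the toy distributions are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def anti_lm_adjust(cond_logprobs, uncond_logprobs, step, gamma=0.9):
    """Anti-LM decoding sketch: subtract the unconditional LM's
    log-probabilities from the source-conditioned ones, weighted by
    gamma**step so the penalty fades at later decoding steps.
    All names and defaults here are illustrative assumptions."""
    decay = gamma ** step
    return cond_logprobs - decay * uncond_logprobs

# Toy example: the conditional model slightly prefers token 0, but the
# unconditional LM strongly prefers it too, so the penalty shifts the
# choice to token 1, which is better supported by the source.
cond = np.log(np.array([0.5, 0.3, 0.2]))
uncond = np.log(np.array([0.6, 0.2, 0.2]))
adjusted = anti_lm_adjust(cond, uncond, step=0, gamma=0.9)
```

At step 0 the penalty is at full strength; as `step` grows, `gamma**step` shrinks and decoding reverts toward ordinary conditional scoring.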
The choice of retrieval technique significantly impacts the performance of retrieval-augmented neural machine translation models, with varying effects across different architectures. Optimizing for coverage and diversity of retrieved examples can yield substantial gains, especially for non-autoregressive models.
In continuous-output neural machine translation, random target embeddings can outperform pre-trained embeddings, especially on larger datasets and for rare words.
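For context, a continuous-output model predicts an embedding vector rather than a softmax over the vocabulary, and the output token is recovered by nearest-neighbor search in the target embedding table. The sketch below shows that decoding step with a fixed random embedding table; the vocabulary, dimensionality, and function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat"]  # toy vocabulary, for illustration only

# Random target embeddings: each word type gets a fixed random unit vector.
E = rng.standard_normal((len(vocab), 8))
E /= np.linalg.norm(E, axis=1, keepdims=True)

def nearest_token(pred_vec):
    """Decode a predicted continuous output vector to the vocabulary
    entry with the highest cosine similarity (a sketch of the
    nearest-neighbor decoding step, not the paper's exact procedure)."""
    v = pred_vec / np.linalg.norm(pred_vec)
    return vocab[int(np.argmax(E @ v))]
```

Because the table is random rather than pre-trained, rare words get embeddings just as well-separated as frequent ones, which is one intuition for why random targets can help on rare words.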
The emergence of Large Language Models (LLMs) like GPT-4 and ChatGPT is introducing a new phase in the Machine Translation (MT) domain, offering broad linguistic understanding and innovative methodologies with the potential to further advance MT.
Vocabulary trimming, a common practice in neural machine translation, fails to consistently improve model performance and can even lead to substantial degradation across a wide range of hyperparameter settings.
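To make the practice concrete, vocabulary trimming removes subword types below a frequency threshold and maps them to an unknown token. The helper names and threshold below are a toy illustration of the general technique, not any specific toolkit's implementation.

```python
from collections import Counter

def trim_vocab(token_counts, min_freq):
    """Keep only tokens seen at least min_freq times in the training
    data; everything else will be mapped to <unk> at encoding time."""
    return {tok for tok, count in token_counts.items() if count >= min_freq}

def encode(tokens, vocab):
    """Replace out-of-vocabulary tokens with <unk>."""
    return [t if t in vocab else "<unk>" for t in tokens]

# Toy corpus: "b" occurs once, so a min_freq of 2 trims it away.
counts = Counter("a a a b".split())
vocab = trim_vocab(counts, min_freq=2)
```

The finding summarized above is that this kind of trimming, despite being common practice, does not reliably help and can hurt across hyperparameter settings.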
KazParC is a large-scale parallel corpus designed to facilitate machine translation across Kazakh, English, Russian, and Turkish languages. The corpus was developed with the assistance of human translators and contains over 371,000 parallel sentences spanning diverse domains. The research also introduces Tilmash, a neural machine translation model that demonstrates competitive performance compared to industry-leading services.
ACT-MNMT introduces a novel mechanism for Multilingual Neural Machine Translation, addressing off-target issues and improving translation performance.
BiVert, a methodology for evaluating machine translation, is introduced.