Developing a High-Quality Vietnamese-English Medical Machine Translation Dataset and Evaluating Translation Models
This paper introduces MedEV, a high-quality Vietnamese-English parallel corpus of 358.7K sentence pairs in the medical domain, and conducts a comprehensive empirical investigation into improving the performance of neural machine translation models in that domain.