The authors propose a novel approach to address the challenges of document-level literary translation, focusing on the Chinese-English language pair. Their methodology involves a three-stage training process:
Continual Pre-training using Extensive Monolingual Literary Data: The authors adapt a general-purpose large language model (LLM) into a specialized Literary LLM by using monolingual literary data in both Chinese and English. This step enhances the model's understanding of nuanced language, stylistic elements, and narrative structures.
Continual Pre-training with Aligned Chinese-English Interlinear Text Format Literary Documents: The authors then strengthen the model's cross-lingual translation capabilities by continuing pre-training on literary documents in an interlinear text format, in which aligned Chinese and English sentences are interleaved. This step enables the model to better map the syntactic and semantic structures between Chinese and English (a sketch of one plausible construction of this format follows the three stages below).
Supervised Fine-Tuning with Context-aware and Style-related Instructions: In the final stage, the authors conduct supervised fine-tuning using context-aware and style-related instructions, tailored to the challenges of semantic coherence and stylistic consistency in literary translation (an illustrative instruction format is also sketched below).
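The summary does not spell out the interlinear format used in the second stage. Below is a minimal sketch of one plausible construction, assuming sentence-aligned Chinese-English pairs that are simply interleaved; the function name and the <zh>/<en> tags are illustrative placeholders, not the paper's actual markup.

```python
def build_interlinear_document(zh_sentences, en_sentences):
    # Interleave aligned Chinese and English sentences into one training
    # document: one plausible reading of "interlinear text format".
    # The <zh>/<en> tags are illustrative, not the paper's markup.
    assert len(zh_sentences) == len(en_sentences), "sentences must be aligned 1:1"
    lines = []
    for zh, en in zip(zh_sentences, en_sentences):
        lines.append(f"<zh> {zh}")
        lines.append(f"<en> {en}")
    return "\n".join(lines)

doc = build_interlinear_document(
    ["他推开门，雪落了进来。", "屋里的炉火早已熄灭。"],
    ["He pushed the door open, and snow drifted in.",
     "The fire in the stove had long since gone out."],
)
print(doc)
```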
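Likewise, the exact instruction template used in the third stage is not given here. A hypothetical context-aware, style-related fine-tuning example, with all prompt wording and field names as assumptions rather than the authors' template, could be assembled like this:

```python
def build_sft_example(source_sentence, prior_translations, style_notes, reference):
    # Assemble one supervised fine-tuning example. The prompt wording, the
    # use of preceding translations as context, and the style notes are
    # illustrative assumptions, not the authors' exact template.
    context = "\n".join(prior_translations[-3:])  # a few preceding sentences
    prompt = (
        "Translate the following Chinese sentence into English.\n"
        f"Preceding translation context:\n{context}\n"
        f"Style requirements: {style_notes}\n"
        f"Chinese: {source_sentence}\nEnglish:"
    )
    return {"prompt": prompt, "completion": " " + reference}

example = build_sft_example(
    "风停了，院子里静得出奇。",
    ["He pushed the door open, and snow drifted in."],
    "literary register; preserve the quiet, restrained tone",
    "The wind had died down, and the courtyard was strangely still.",
)
print(example["prompt"])
```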
Additionally, the authors propose an Incremental Decoding framework that treats the translation of each sentence as one step in a continuous process, conditioning on the translations of preceding sentences as well as previously translated sentences that are similar in content and style. This ensures that the translated text maintains a cohesive flow and a consistent style throughout the entire document.
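The Incremental Decoding framework is described only at a high level, so the loop below is a sketch under stated assumptions: translate_fn stands in for the fine-tuned Literary LLM, and retrieve_similar for whatever content/style retrieval the authors use; both are hypothetical callables, not APIs from the paper.

```python
from typing import Callable, List

def incremental_decode(
    source_sentences: List[str],
    translate_fn: Callable[[str, List[str], List[str]], str],
    retrieve_similar: Callable[[str, List[str]], List[str]],
    context_window: int = 3,
) -> List[str]:
    # Translate a document sentence by sentence, conditioning each step on
    # the most recent translations and on previously translated sentences
    # that are similar in content or style.
    translations: List[str] = []
    for src in source_sentences:
        recent = translations[-context_window:]
        similar = retrieve_similar(src, translations)
        translations.append(translate_fn(src, recent, similar))
    return translations
```

Feeding in the most recent translations keeps adjacent sentences coherent, while the retrieved similar sentences are intended to keep terminology and register consistent across distant parts of the document.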
The authors' experiments demonstrate significant improvements in both sentence-level and document-level BLEU scores, highlighting the effectiveness of their proposed framework in addressing the complexities of document-level literary translation.
Source: Yuanchang Lu..., arxiv.org, 09-26-2024, https://arxiv.org/pdf/2409.16539.pdf