A Comprehensive Analysis of Prompt Engineering for Open-Source Large Language Models in Machine Translation and Summarization Evaluation
A systematic exploration of prompt engineering shows that open-source large language models can be effective evaluators of machine translation and summarization, but their performance is highly sensitive to even minor prompt variations, which underscores the need for careful prompt design and selection.