The paper introduces MuChin, the first open-source benchmark for evaluating the performance of large language models (LLMs) in understanding and describing Chinese music in colloquial language.
The key highlights are:
Motivation: Existing music description datasets either exhibit a semantic gap between algorithmic and human understanding or rely solely on expert annotations, so they fail to capture the perspectives of the general public. MuChin aims to address both shortcomings.
Benchmark Design: MuChin includes tasks for textual music description, lyric generation, and automatic annotation. It utilizes a multi-person, multi-stage quality assurance process to ensure high-precision annotations from both professionals and amateurs.
Dataset Creation: The authors developed the Caichong Music Annotation Platform (CaiMAP) and built the Caichong Music Dataset (CaiMD), a comprehensive dataset with multi-dimensional, high-quality music annotations aligned with public perception.
Experiments: The paper analyzes the discrepancies between professionals and amateurs in music description, and demonstrates the effectiveness of the CaiMD dataset in fine-tuning LLMs for music-related tasks. It also evaluates the performance of existing music understanding models on the MuChin benchmark.
Significance: MuChin provides a new perspective on evaluating the capabilities of LLMs in the music domain, requiring models to not only extract basic music attributes but also align with the public's musical perceptions and describe music in a colloquial manner.
Key insights distilled from: Zihao Wang, S... (arxiv.org, 04-03-2024)
https://arxiv.org/pdf/2402.09871.pdf