Accurately identifying the human values expressed in sentences is crucial for understanding a speaker's tendencies. The article explores fine-tuning and prompt tuning on the Human Value Detection 2023 dataset. Existing datasets have limitations, which led to the proposal of Touché23-ValueEval, a dataset with more diverse arguments. Most participating teams tried classification methods, but the top performance remains at an average F1 score of 0.56. The project compares prompt tuning with fine-tuning and evaluates the capabilities of pre-trained language models (PLMs) aligned with reinforcement learning from human feedback (RLHF).
Key insights distilled from a paper by Pingwei Sun on arxiv.org, 03-18-2024
https://arxiv.org/pdf/2403.09720.pdf