CoTBal: Comprehensive Task Balancing for Multi-Task Visual Instruction Tuning
Core Concepts
The CoTBal algorithm improves multi-task visual instruction tuning by balancing inter-task contributions and intra-task difficulties.
Summary
CoTBal introduces a novel approach to optimize multi-task visual instruction tuning by considering inter-task contributions and intra-task difficulties. The algorithm assigns task weights based on these factors, leading to improved overall performance while ensuring task balance. Experimental results show that CoTBal outperforms existing methods, demonstrating its effectiveness in enhancing model performance across various visual tasks.
Statistics
To mitigate this issue, we propose a novel Comprehensive Task Balancing (CoTBal) algorithm for multi-task visual instruction tuning of LMMs.
Experiments show that our CoTBal leads to superior overall performance in multi-task visual instruction tuning.
Specifically, we propose a Generic Task Weighting (GTW) paradigm where losses are task-specific weighted and averaged at the token level.
Tasks achieving near-optimal performance with a limited dataset are relatively simpler, while those requiring the full dataset for optimal performance have greater inherent learning difficulties.
The training loss is obtained by averaging the cross-entropy losses calculated across all valid tokens.
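The token-level weighting described above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the sample layout (`task`, `logits`, `targets`), the `ignore_index` convention for non-valid tokens, and the choice to normalize by the count of valid tokens are all assumptions made for illustration. With every task weight set to 1, it reduces to the plain average of cross-entropy over valid tokens.

```python
import math


def token_cross_entropy(logits, target):
    """Cross-entropy of one token, using a numerically stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def weighted_token_loss(samples, task_weights, ignore_index=-100):
    """Average task-weighted cross-entropy over all valid tokens.

    Hypothetical layout: each sample is a dict with a "task" name,
    per-token "logits" (list of lists), and per-token "targets".
    Tokens whose target equals ignore_index are treated as not valid.
    Assumption: the weighted sum is divided by the number of valid tokens.
    """
    total = 0.0
    n_valid = 0
    for sample in samples:
        weight = task_weights[sample["task"]]
        for logits, target in zip(sample["logits"], sample["targets"]):
            if target == ignore_index:
                continue  # skip tokens that do not contribute to the loss
            total += weight * token_cross_entropy(logits, target)
            n_valid += 1
    return total / n_valid if n_valid else 0.0


# Toy batch: two tasks, uniform logits over a two-token vocabulary.
batch = [
    {"task": "captioning", "logits": [[0.0, 0.0], [0.0, 0.0]], "targets": [0, 1]},
    {"task": "vqa", "logits": [[0.0, 0.0], [0.0, 0.0]], "targets": [1, -100]},
]
loss = weighted_token_loss(batch, {"captioning": 2.0, "vqa": 1.0})
```

Because the weight is applied per token rather than per sample, a task with longer responses contributes in proportion to its token count, which is the behavior a token-level average implies.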
Quotes
"Experiments show that our CoTBal leads to superior overall performance in multi-task visual instruction tuning."
"To mitigate this issue, based on the mixture of LoRA experts, Gou et al. (2023) utilizes distinct experts to learn conflicting tasks."
"Our experiments demonstrate that CoTBal outperforms existing methods, significantly improving overall performance while ensuring task balance."