Multimodal Fusion with Pre-Trained Model Features for Robust Affective Behavior Analysis In-the-wild
This paper presents a multimodal fusion approach that leverages features extracted from pre-trained models to achieve strong performance on the Valence-Arousal Estimation and Expression Recognition tasks of the Aff-Wild2 dataset.