AUFormer enhances facial Action Unit (AU) detection through its Parameter-Efficient Transfer Learning (PETL) paradigm, Mixture-of-Knowledge-Expert (MoKE) collaboration mechanism, and MDWA-Loss.
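To make the imbalance-aware loss idea concrete, here is a minimal sketch of a generic weighted asymmetric multi-label loss with margin truncation. This is an illustrative stand-in, not AUFormer's exact MDWA-Loss: the function name, the per-AU weights `pos_weights`, the focusing exponent `gamma_neg`, and the `margin` parameter are all assumptions chosen for the example.

```python
import numpy as np

def weighted_asymmetric_loss(probs, labels, pos_weights, gamma_neg=2.0, margin=0.05):
    """Illustrative asymmetric loss for multi-label AU detection (not the paper's exact form).

    probs:       (N, K) predicted AU activation probabilities
    labels:      (N, K) binary ground-truth AU labels
    pos_weights: (K,)   per-AU weights to counter class imbalance (assumed scheme)
    """
    probs = np.clip(probs, 1e-7, 1 - 1e-7)
    # Positive term: standard log-likelihood, up-weighted for rare AUs.
    pos_loss = pos_weights * labels * np.log(probs)
    # Negative term: shift probabilities down by a margin so that very easy
    # negatives contribute (almost) nothing, then apply a focusing factor
    # that further down-weights easy negatives.
    shifted = np.clip(probs - margin, 1e-7, 1 - 1e-7)
    neg_loss = (1 - labels) * (shifted ** gamma_neg) * np.log(1 - shifted)
    return -np.mean(pos_loss + neg_loss)
```

The asymmetry (heavier treatment of positives than negatives) is the key lever against the strong positive/negative imbalance typical of AU datasets, where most AUs are inactive in most frames.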
Combining temporal convolution with GPT-2 improves AU detection accuracy by integrating audio-visual data for a more nuanced understanding of emotional expression.
AUFormer introduces a Parameter-Efficient Transfer Learning (PETL) paradigm for AU detection, achieving state-of-the-art performance without relying on additional data.
A novel contrastive learning framework incorporates both self-supervised and supervised signals to learn discriminative features for accurate facial AU detection, addressing challenges such as class imbalance and noisy labels.
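The supervised side of such a framework can be sketched with a SupCon-style objective (Khosla et al.): embeddings of samples sharing a label are pulled together while all other pairs are pushed apart. This is a generic sketch under assumed inputs (L2-normalizable `features`, integer `labels`, a `temperature` hyperparameter), not the specific loss from the summarized paper.

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """SupCon-style loss sketch: per-anchor mean log-probability of its positives.

    features: (N, D) embedding vectors
    labels:   (N,)   integer class labels (same label => positive pair)
    """
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = feats @ feats.T / temperature
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    # Positive mask: same label, excluding the anchor itself.
    mask = np.equal.outer(labels, labels) & not_self
    # Log-softmax over all non-self pairs, with max-subtraction for stability.
    logits = sim - sim.max(axis=1, keepdims=True)
    exp = np.exp(logits) * not_self
    log_prob = logits - np.log(exp.sum(axis=1, keepdims=True))
    # Average over positives for anchors that have at least one positive.
    pos_counts = mask.sum(axis=1)
    valid = pos_counts > 0
    loss = -(mask * log_prob).sum(axis=1)[valid] / pos_counts[valid]
    return loss.mean()
```

A self-supervised term (e.g. treating two augmentations of the same face as positives) can reuse the same function by assigning each augmentation pair a shared pseudo-label, which is one way such frameworks combine the two signals.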