Addresses challenges in facial expression recognition through semi-supervised pretraining and temporal modeling, improving recognition performance.
A3lign-DFER introduces a new paradigm for dynamic facial expression recognition, enhancing alignment and achieving state-of-the-art results.
Demonstrates improved facial expression classification performance in the ABAW competition through an innovative approach that integrates MAE-Face with Fusion Attention.
Employs multi-task, multi-modal self-supervised learning to obtain rich representations for facial expression recognition from in-the-wild video data without requiring expensive annotations.
A dual-branch adaptive distribution fusion framework is proposed to address the ambiguity problem in facial expression recognition by mining class distributions of emotions and adaptively fusing them with label distributions of samples.
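The core fusion idea can be illustrated as a convex combination of a sample's label distribution with a class-level emotion distribution. This is a minimal sketch, not the paper's actual dual-branch architecture; the function name, the fixed 3-emotion setup, and the scalar `alpha` (which the paper would instead predict adaptively per sample) are all illustrative assumptions.

```python
import numpy as np

def adaptive_fuse(label_dist, class_dist, alpha):
    """Fuse a sample's label distribution with its class-level emotion
    distribution. alpha in [0, 1] weights the label branch; in the actual
    framework this weight would be predicted adaptively per sample."""
    fused = alpha * label_dist + (1 - alpha) * class_dist
    return fused / fused.sum()  # renormalize to guard against numeric drift

# Hypothetical example: a one-hot "happy" label over 3 emotions,
# softened by a mined class-level prior.
label = np.array([1.0, 0.0, 0.0])
prior = np.array([0.6, 0.3, 0.1])
fused = adaptive_fuse(label, prior, alpha=0.7)
# fused is a valid distribution that softens the hard label toward the prior
```

Softening hard labels this way is one standard route to handling annotation ambiguity: samples whose appearance is ambiguous receive probability mass on neighboring emotion classes rather than a single one-hot target.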
xLSTM-FER, a novel architecture based on Extended Long Short-Term Memory (xLSTM), offers a computationally efficient and highly accurate method for recognizing student facial expressions, outperforming existing CNN and ViT-based approaches.
GReFEL, a novel facial expression learning framework, leverages Vision Transformers and a geometry-aware reliability balancing module to improve accuracy and mitigate biases stemming from imbalanced datasets in facial expression recognition.