The paper introduces a Quadruplet Cross Similarity (QCS) Network for facial expression recognition (FER) that uses cross-similarity attention to refine features, maximizing inter-class differences while minimizing intra-class differences, and achieves state-of-the-art results on several FER benchmarks without relying on additional landmark information or external training data.
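To make the intra-/inter-class objective concrete, here is a minimal quadruplet-style margin loss sketch. This is not the QCS cross-similarity attention itself, only an illustration of the "pull same-class embeddings together, push different-class embeddings apart" goal; the embedding size, margins, and function name are made up for the example.

```python
import torch
import torch.nn.functional as F

def quadruplet_margin_loss(anchor, positive, negative1, negative2,
                           margin1: float = 1.0, margin2: float = 0.5):
    """Toy quadruplet objective (not the paper's cross-similarity attention):
    minimize intra-class distance (anchor vs. positive) while keeping
    inter-class distances (anchor vs. negatives, negative vs. negative) large."""
    d_ap = F.pairwise_distance(anchor, positive)      # intra-class distance
    d_an = F.pairwise_distance(anchor, negative1)     # inter-class distance
    d_nn = F.pairwise_distance(negative1, negative2)  # distance between two negatives
    return (F.relu(d_ap - d_an + margin1).mean()
            + F.relu(d_ap - d_nn + margin2).mean())

# Usage with made-up 128-d embeddings for a batch of 4 quadruplets.
a, p, n1, n2 = (torch.randn(4, 128) for _ in range(4))
print(quadruplet_margin_loss(a, p, n1, n2).item())
```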
This paper introduces AffectNet+, an improved facial expression dataset utilizing "soft-labels" to represent the presence of multiple emotions in a single image, addressing limitations of traditional "hard-label" approaches in capturing the complexity of human emotions.
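As a concrete illustration of the soft-label idea (not AffectNet+'s actual annotation or training pipeline), the sketch below contrasts hard-label and soft-label supervision with standard PyTorch cross-entropy; the class count and example probabilities are invented for the demo.

```python
import torch
import torch.nn.functional as F

num_classes = 8                          # hypothetical expression categories
logits = torch.randn(2, num_classes)     # model outputs for 2 images

# Hard label: each image is forced into exactly one emotion class.
hard_targets = torch.tensor([1, 3])
hard_loss = F.cross_entropy(logits, hard_targets)

# Soft label: an image can carry several emotions with different strengths,
# e.g. 70% of one class and 30% of another, encoded as a probability vector.
soft_targets = torch.zeros(2, num_classes)
soft_targets[0, 1], soft_targets[0, 4] = 0.7, 0.3
soft_targets[1, 3], soft_targets[1, 2] = 0.6, 0.4

# torch.nn.functional.cross_entropy accepts class probabilities as targets
# (PyTorch >= 1.10), so the soft distribution can supervise the model directly.
soft_loss = F.cross_entropy(logits, soft_targets)
print(hard_loss.item(), soft_loss.item())
```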
ARBEx is a novel framework that leverages a Vision Transformer and a reliability balancing mechanism to improve the accuracy and robustness of facial expression recognition by addressing challenges such as poor class distributions, bias, and uncertainty.
GReFEL, a novel facial expression learning framework, leverages Vision Transformers and a geometry-aware reliability balancing module to improve accuracy and mitigate biases stemming from imbalanced datasets in facial expression recognition.
xLSTM-FER, a novel architecture based on Extended Long Short-Term Memory (xLSTM), offers a computationally efficient and highly accurate method for recognizing student facial expressions, outperforming existing CNN and ViT-based approaches.
A dual-branch adaptive distribution fusion framework is proposed to address the ambiguity problem in facial expression recognition by mining class distributions of emotions and adaptively fusing them with label distributions of samples.
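A minimal sketch of the adaptive fusion idea follows. The paper learns its fusion weights; the per-sample confidence heuristic, tensor shapes, and function name below are assumptions made purely for illustration.

```python
import torch

def adaptive_fuse(label_dist: torch.Tensor,
                  class_dist: torch.Tensor) -> torch.Tensor:
    """Blend a sample's (possibly ambiguous) label distribution with a mined
    class-level emotion distribution. The weight here is a simple confidence
    heuristic (max probability of the label distribution); the actual method
    learns its fusion weights rather than using this fixed rule."""
    confidence = label_dist.max(dim=1, keepdim=True).values   # in [1/C, 1]
    fused = confidence * label_dist + (1.0 - confidence) * class_dist
    return fused / fused.sum(dim=1, keepdim=True)             # renormalize

# Toy example: 3 samples over 7 expression classes.
label_dist = torch.softmax(torch.randn(3, 7), dim=1)
class_dist = torch.softmax(torch.randn(3, 7), dim=1)
print(adaptive_fuse(label_dist, class_dist))
```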
This work employs multi-task, multi-modal self-supervised learning to learn rich representations for facial expression recognition from in-the-wild video data without requiring expensive annotations.
An approach integrating MAE-Face with Fusion Attention, demonstrating improved facial expression classification performance in the ABAW competition.
A3lign-DFER introduces a new paradigm for dynamic facial expression recognition, enhancing alignment and achieving state-of-the-art results.
This work addresses challenges in facial expression recognition through semi-supervised pretraining and temporal modeling, improving recognition performance.