Core Concepts
GReFEL is a facial expression learning framework that combines Vision Transformers with a geometry-aware reliability balancing module to improve accuracy and mitigate biases arising from imbalanced facial expression recognition datasets.
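The core idea of fusing local patch features with global context via cross-attention can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy (untrained weights, no projection matrices or multi-head logic), not GReFEL's actual implementation: local tokens act as queries that attend over global tokens.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(local_tokens, global_tokens):
    """Queries come from local patch features and attend over global
    features. Shapes: local (L, d), global (G, d). Illustrative only:
    a real ViT would apply learned Q/K/V projections first."""
    d = local_tokens.shape[-1]
    scores = local_tokens @ global_tokens.T / np.sqrt(d)  # (L, G)
    weights = softmax(scores, axis=-1)                    # rows sum to 1
    return weights @ global_tokens                        # fused (L, d)

# Toy usage: 4 local patch tokens, 2 global tokens, dim 8.
local = np.random.default_rng(0).normal(size=(4, 8))
global_t = np.random.default_rng(1).normal(size=(2, 8))
fused = cross_attention(local, global_t)
print(fused.shape)  # (4, 8)
```

Each output row is a convex combination of the global tokens, so the fused features carry global context at every local position.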
Stats
GReFEL achieves an accuracy score of 68.02% on AffectNet, 72.48% on Aff-Wild2, and 92.47% on RAF-DB, outperforming baseline models like POSTER++.
On FER+, FERG-DB, and JAFFE datasets, GReFEL achieves accuracy scores of 93.09%, 98.18%, and 96.67% respectively, surpassing all other models tested.
GReFEL's Davies-Bouldin score is 1.969, compared to 1.990 for LA-Net and 2.534 for SCN; a lower score indicates better cluster separation.
GReFEL achieves a Calinski-Harabasz score of 1227.8, compared to 1199.5 for LA-Net and 915.2 for SCN; a higher score indicates better-defined clusters.
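Both clustering metrics above are standard and available in scikit-learn. The sketch below shows how they would be computed on feature embeddings; the synthetic data is a hypothetical stand-in for learned expression features, not the paper's actual embeddings.

```python
import numpy as np
from sklearn.metrics import davies_bouldin_score, calinski_harabasz_score

# Synthetic stand-in: two well-separated "expression" clusters in 8-D.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=0.5, size=(50, 8))
cluster_b = rng.normal(loc=5.0, scale=0.5, size=(50, 8))
X = np.vstack([cluster_a, cluster_b])
labels = np.array([0] * 50 + [1] * 50)

db = davies_bouldin_score(X, labels)      # lower is better
ch = calinski_harabasz_score(X, labels)   # higher is better
print(f"Davies-Bouldin: {db:.3f}, Calinski-Harabasz: {ch:.1f}")
```

Well-separated clusters yield a small Davies-Bouldin score and a large Calinski-Harabasz score, which is why GReFEL's values (1.969 and 1227.8) compare favorably to LA-Net's and SCN's.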
Quotes
"By integrating local and global data using the cross-attention ViT, our approach adjusts for intra-class disparity, inter-class similarity, and scale sensitivity, leading to comprehensive, accurate, and reliable facial expression predictions."
"Our model outperforms current state-of-the-art methodologies, as demonstrated by extensive experiments on various datasets."