Facial expression recognition (FER) models can be made more interpretable by aligning their internal feature representations with spatial cues derived from facial action units, without requiring additional manual annotations.
Leveraging spatial action unit cues to train deep models that can perform accurate facial expression recognition while providing interpretable decisions.