Using Vision Transformer (ViT) and Transformer models for emotion recognition across three tasks: valence-arousal (VA) estimation, facial expression recognition, and action unit (AU) detection.