The author examines the importance of the image modality in hyperbole detection and evaluates several fusion methods. Pre-trained multimodal models are found to be ineffective for this task.
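
The summary does not say which fusion methods were evaluated. As a rough illustration of what "fusion" can mean here, the sketch below shows two generic, widely used strategies: early fusion (concatenating per-modality feature vectors) and late fusion (averaging per-modality predictions). The function names, the `text_weight` parameter, and the toy values are all hypothetical and are not taken from the work being summarized.

```python
from typing import List


def early_fusion(text_feat: List[float], image_feat: List[float]) -> List[float]:
    """Early fusion: concatenate per-modality feature vectors into one joint vector."""
    return list(text_feat) + list(image_feat)


def late_fusion(p_text: float, p_image: float, text_weight: float = 0.5) -> float:
    """Late fusion: weighted average of per-modality hyperbole probabilities."""
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be in [0, 1]")
    return text_weight * p_text + (1.0 - text_weight) * p_image


# Toy, made-up values for illustration only (not results from the source).
text_feat = [0.2, 0.7, 0.1]   # hypothetical text embedding
image_feat = [0.9, 0.4]       # hypothetical image embedding

# The joint vector would normally be fed to a single classifier.
joint = early_fusion(text_feat, image_feat)

# Here each modality has its own classifier; only their outputs are combined.
score = late_fusion(p_text=0.8, p_image=0.4, text_weight=0.75)
```

In a setup like this, comparing a text-only score against the fused score is one simple way to gauge how much the image modality adds.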