HYDRA is a dynamic compositional visual reasoning framework that integrates a planner, RL agent, and reasoner to enhance reasoning capabilities.
Models struggle with complex visual reasoning tasks, but a new dataset and model propose solutions.
HYDRA is a multi-stage dynamic compositional visual reasoning framework designed for reliable and incrementally progressive general reasoning.
Addressing the need for complex visual reasoning, VISREAS dataset introduces unanswerable questions to enhance model performance.