Supervised Knowledge Retrieval and Reasoning for Effective Visual Question Answering
Supervised retrieval of relevant knowledge from external knowledge bases and scene graphs, combined with multi-hop reasoning, can significantly improve performance on knowledge-based visual question answering tasks.
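
As a rough illustration of the idea, the following minimal sketch retrieves knowledge-base facts connected to scene-graph entities and then follows multi-hop paths to reach an answer. Everything in it is an assumption made for illustration: the toy triples, the function names `retrieve_facts` and `multi_hop_paths`, and the reachability filter, which merely stands in for a retriever trained with supervision. It is not the method described in this work.

```python
from collections import deque

# Hypothetical toy data: (subject, relation, object) triples.
SCENE_GRAPH = [
    ("man", "holding", "racket"),
    ("man", "standing_on", "court"),
]
KNOWLEDGE_BASE = [
    ("racket", "used_for", "tennis"),
    ("tennis", "is_a", "sport"),
    ("banana", "is_a", "fruit"),  # irrelevant to the scene; should be filtered out
]


def retrieve_facts(scene_triples, kb_triples, hops=2):
    """Keep KB triples whose subject is reachable from scene entities.

    Assumption: this rule-based filter is a stand-in for a learned,
    supervised retriever that would score each fact for relevance.
    """
    frontier = {e for s, _, o in scene_triples for e in (s, o)}
    kept = []
    for _ in range(hops):
        new = [t for t in kb_triples if t[0] in frontier and t not in kept]
        kept.extend(new)
        frontier |= {o for _, _, o in new}
    return kept


def multi_hop_paths(start, triples, max_hops=2):
    """Breadth-first search returning every path of 1..max_hops edges from start."""
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) == max_hops:
            continue  # do not extend paths beyond the hop limit
        for s, r, o in triples:
            if s == node:
                new_path = path + [(s, r, o)]
                paths.append(new_path)
                queue.append((o, new_path))
    return paths


# Toy question: "What is the thing the man is holding used for?"
facts = retrieve_facts(SCENE_GRAPH, KNOWLEDGE_BASE)
paths = multi_hop_paths("man", SCENE_GRAPH + facts, max_hops=2)
# Pick the endpoints of paths whose final edge is a "used_for" relation.
answers = [p[-1][2] for p in paths if p[-1][1] == "used_for"]
print(answers)
```

The answer here is only reachable by chaining a scene-graph edge (man, holding, racket) with a retrieved knowledge-base edge (racket, used_for, tennis), which is the sense in which the reasoning is multi-hop. In a supervised setting, the filter in `retrieve_facts` would be replaced by a model trained on labeled relevant facts.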