Retrieval-Augmented Generation for Multi-Modal Large Language Models: Introducing UniRAG
UniRAG is a retrieval-augmentation technique for multi-modal large language models (MM-LLMs). At inference time, it retrieves examples relevant to the input and adds them to the prompt as few-shot examples, which significantly improves model performance on both image captioning and image generation.
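As a rough illustration of the idea, the sketch below retrieves the nearest examples for a query and places them ahead of the query as few-shot examples in a captioning prompt. It is a minimal sketch under stated assumptions, not UniRAG's implementation: the function names (`retrieve_top_k`, `build_caption_prompt`), the two-dimensional toy embeddings, the image identifiers, and the prompt layout are all hypothetical, and a real system would use a learned multi-modal retriever and pass actual images to the model.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical index entry: (embedding, image id, caption).
Entry = Tuple[Sequence[float], str, str]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(query_vec: Sequence[float], index: List[Entry], k: int = 2) -> List[Tuple[str, str]]:
    """Return the k (image id, caption) pairs most similar to the query."""
    ranked = sorted(index, key=lambda e: cosine(query_vec, e[0]), reverse=True)
    return [(image_id, caption) for _, image_id, caption in ranked[:k]]


def build_caption_prompt(query_image_id: str, examples: List[Tuple[str, str]]) -> str:
    """Place retrieved pairs before the query as few-shot examples."""
    blocks = [f"Image: <{image_id}>\nCaption: {caption}" for image_id, caption in examples]
    # The final block leaves the caption open for the model to complete.
    blocks.append(f"Image: <{query_image_id}>\nCaption:")
    return "\n\n".join(blocks)


# Toy data (assumed for illustration only).
TOY_INDEX: List[Entry] = [
    ([1.0, 0.0], "img_dog", "A dog running on a beach."),
    ([0.9, 0.1], "img_puppy", "A puppy playing in the sand."),
    ([0.0, 1.0], "img_city", "A city skyline at night."),
]

examples = retrieve_top_k([1.0, 0.05], TOY_INDEX, k=2)
prompt = build_caption_prompt("img_query", examples)
print(prompt)
```

For image generation the same pattern would apply in the other direction: the query is a caption, and the retrieved caption-image pairs precede it as examples. The value of `k` here is arbitrary and would need tuning in practice.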