Efficient Knowledge Caching for Retrieval-Augmented Generation (RAG) Systems
RAGCache is a novel multilevel dynamic caching system that stores the key-value tensors of retrieved documents so they can be reused rather than recomputed, minimizing redundant computation in Retrieval-Augmented Generation (RAG) systems.
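
To make the idea concrete, the following is a minimal sketch of a two-tier cache for per-document key-value state. It is not RAGCache's implementation: the class `TwoLevelKVCache`, the helper `kv_for_documents`, the `compute_kv` callback, the plain LRU eviction policy, and the choice of two tiers are all assumptions made for illustration. Because a document's key-value tensors in a causal model depend on the tokens that precede it, the sketch keys each entry by the ordered prefix of document IDs rather than by a single ID.

```python
from collections import OrderedDict
from typing import Any, Callable, List, Optional, Sequence, Tuple

Key = Tuple[str, ...]  # ordered prefix of document IDs (illustrative choice)


class TwoLevelKVCache:
    """Toy two-tier LRU cache: a small fast tier backed by a larger slow tier.

    Hypothetical stand-in for, e.g., GPU memory backed by host memory.
    """

    def __init__(self, fast_capacity: int, slow_capacity: int) -> None:
        self.fast: "OrderedDict[Key, Any]" = OrderedDict()
        self.slow: "OrderedDict[Key, Any]" = OrderedDict()
        self.fast_capacity = fast_capacity
        self.slow_capacity = slow_capacity

    def get(self, key: Key) -> Optional[Any]:
        if key in self.fast:
            self.fast.move_to_end(key)  # mark as most recently used
            return self.fast[key]
        if key in self.slow:
            value = self.slow.pop(key)
            self.put(key, value)  # promote back to the fast tier
            return value
        return None

    def put(self, key: Key, value: Any) -> None:
        self.fast[key] = value
        self.fast.move_to_end(key)
        if len(self.fast) > self.fast_capacity:
            # Demote the least recently used fast entry to the slow tier.
            old_key, old_value = self.fast.popitem(last=False)
            self.slow[old_key] = old_value
            self.slow.move_to_end(old_key)
            if len(self.slow) > self.slow_capacity:
                self.slow.popitem(last=False)  # drop the oldest slow entry


def kv_for_documents(
    doc_ids: Sequence[str],
    cache: TwoLevelKVCache,
    compute_kv: Callable[[Key], Any],
) -> Tuple[List[Any], int]:
    """Return one KV entry per document and the number of cache misses."""
    results: List[Any] = []
    misses = 0
    for i in range(len(doc_ids)):
        key = tuple(doc_ids[: i + 1])  # KV state depends on the whole prefix
        kv = cache.get(key)
        if kv is None:
            kv = compute_kv(key)  # stand-in for the expensive prefill step
            cache.put(key, kv)
            misses += 1
        results.append(kv)
    return results, misses
```

In this sketch, a second request that retrieves the same documents in the same order incurs no misses, and a request that shares only a leading document recomputes only the entries after the shared prefix. A real system would store tensors rather than arbitrary Python objects and would likely use a more selective replacement policy than plain LRU.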