The paper proposes the Bicameral Cache, a cache organization for vector architectures that segregates data according to access type, distinguishing scalar from vector references. The goal is to prevent each type of reference from disrupting the other's data locality, with special emphasis on preserving the performance of vector references.
The Bicameral Cache consists of two partitions: the Scalar Cache and the Vector Cache. The Scalar Cache stores data referenced by scalar memory instructions and uses a set-associative mapping with a write buffer to handle evictions. The Vector Cache stores data referenced by vector memory instructions and uses a fully associative organization with longer cache lines to exploit spatial locality. The two caches are exclusive, ensuring that a sector cannot be present in both at the same time.
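The routing and exclusivity policy described above can be sketched as follows. This is a minimal illustrative model, not the paper's implementation: the class and method names are invented, and dictionaries stand in for the set-associative and fully associative structures (replacement, write buffering, and line sizes are omitted).

```python
class BicameralCache:
    """Illustrative sketch: route references by access type and keep
    the two partitions mutually exclusive at sector granularity."""

    def __init__(self):
        # Placeholders for the real structures: the Scalar Cache is
        # set-associative, the Vector Cache fully associative.
        self.scalar = {}
        self.vector = {}

    def access(self, sector, is_vector):
        target, other = ((self.vector, self.scalar) if is_vector
                         else (self.scalar, self.vector))
        if sector in target:
            return "hit"
        # Exclusivity: a sector migrates between partitions rather
        # than being duplicated in both.
        if sector in other:
            del other[sector]
        target[sector] = True  # fill on miss (eviction policy omitted)
        return "miss"


cache = BicameralCache()
cache.access(0x40, is_vector=False)   # scalar miss fills the Scalar Cache
cache.access(0x40, is_vector=True)    # vector access migrates the sector
assert 0x40 in cache.vector and 0x40 not in cache.scalar
```

The key property the sketch captures is the migration step: a sector touched by the "other" kind of reference moves to the matching partition instead of coexisting in both, which is what keeps scalar and vector working sets from polluting one another.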
The proposal also includes a memory-side prefetching mechanism that opportunistically fills vector cache lines that belong to rows that are open in the memory controller, further exploiting the spatial locality of vector data.
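The opportunistic nature of this prefetcher can be illustrated with a short sketch. Everything here is assumed for illustration: the row and sector sizes are made-up constants, and the real mechanism operates inside the memory controller rather than as a standalone function. The point is the gating condition: sectors are fetched only from a row that is already open, so no extra row activations are incurred.

```python
ROW_SIZE = 2048     # bytes per DRAM row (assumed value)
SECTOR_SIZE = 64    # bytes per sector (assumed value)

def opportunistic_prefetch(demand_addr, open_rows, vector_cache):
    """Fill vector cache sectors from the demand address's DRAM row,
    but only when that row is already open in the memory controller."""
    row = demand_addr // ROW_SIZE
    if row not in open_rows:
        return []  # row closed: skip, keeping the prefetch opportunistic
    filled = []
    row_base = row * ROW_SIZE
    for offset in range(0, ROW_SIZE, SECTOR_SIZE):
        sector = row_base + offset
        if sector not in vector_cache:
            vector_cache.add(sector)   # free-riding on the open row
            filled.append(sector)
    return filled
```

Because stride-1 vector accesses walk through consecutive sectors of the same row, filling the remainder of an open row converts many future vector misses into hits, which is consistent with the larger speedup the paper reports on stride-1 workloads.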
The evaluation using the Cavatools RISC-V simulator shows that the Bicameral Cache with prefetching achieves an average best-case speedup of 1.31x on stride-1 vector benchmarks and 1.11x on non-stride-1 workloads, compared to a conventional cache. The improvements are attributed to a significant reduction in the average memory access time, enabled by the segregation of scalar and vector data and the prefetching mechanism.