Traditional memory allocators are a poor fit for Processing-Using-Memory (PUM) architectures, and in particular for Processing-Using-DRAM (PUD) operations: standard allocation routines do not satisfy the data layout and alignment requirements of PUD substrates. To address this, the work proposes PUMA, a memory allocation routine that provides aligned data allocation for PUD instructions without requiring hardware modifications.
PUMA uses internal DRAM mapping information together with huge pages to place and align data for PUD operations. It has three main components: DRAM organization information, the DRAM interleaving scheme, and a pool of huge pages reserved for PUD memory objects. PUMA splits huge pages into finer-grained units aligned with DRAM subarrays, which increases the likelihood that an operation can be executed in DRAM and thereby improves performance.
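The idea of splitting huge pages into subarray-aligned units can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not PUMA's actual implementation: the row size, bank count, rows per subarray, the `subarray_of` mapping, and the `SubarrayPool` class are all hypothetical, and a real allocator would take its DRAM organization and interleaving scheme from the target system.

```python
from collections import defaultdict

# All constants below are illustrative assumptions, not values from the paper.
HUGE_PAGE_SIZE = 2 * 1024 * 1024  # one 2 MiB huge page
ROW_SIZE = 8 * 1024               # assumed DRAM row size in bytes
NUM_BANKS = 8                     # assumed number of banks
ROWS_PER_SUBARRAY = 64            # assumed rows per subarray


def subarray_of(phys_addr):
    """Toy interleaving scheme (hypothetical): consecutive rows are
    striped across banks, and rows within a bank fill subarrays in order."""
    row_index = phys_addr // ROW_SIZE
    bank = row_index % NUM_BANKS
    row_in_bank = row_index // NUM_BANKS
    return (bank, row_in_bank // ROWS_PER_SUBARRAY)


class SubarrayPool:
    """Hypothetical pool that splits huge pages into row-sized units
    and groups the units by the subarray they map to."""

    def __init__(self, huge_page_bases):
        self.free = defaultdict(list)
        for base in huge_page_bases:
            for offset in range(0, HUGE_PAGE_SIZE, ROW_SIZE):
                addr = base + offset
                self.free[subarray_of(addr)].append(addr)

    def alloc_aligned(self, n_operands):
        """Return n_operands row-sized units from a single subarray,
        or None if no subarray has enough free units."""
        for rows in self.free.values():
            if len(rows) >= n_operands:
                return [rows.pop() for _ in range(n_operands)]
        return None


pool = SubarrayPool([0])               # one huge page at base address 0
operands = pool.alloc_aligned(3)       # e.g. two sources and one destination
same_subarray = len({subarray_of(a) for a in operands}) == 1
```

Here all operands of one operation land in the same subarray, which is the condition under which the operation can stay in DRAM in this toy model. When no subarray has enough free units, `alloc_aligned` returns `None`, and a caller would have to fall back to ordinary allocation and CPU execution.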
In the evaluation, PUMA significantly outperforms the baseline memory allocators across the tested micro-benchmarks and allocation sizes. The gains grow with allocation size, because larger allocations avoid more data movement between DRAM and the CPU. Overall, PUMA is presented as an efficient and practical memory allocation solution for PUD substrates.
Source: arxiv.org