Key Concepts
The authors propose new persistent multi-word compare-and-swap (PMwCAS) algorithms that remove redundant compare-and-swap and cache flush instructions, improving performance on many-core CPUs compared to the original algorithm.
Summary
The paper introduces new PMwCAS algorithms for many-core CPUs. The original PMwCAS algorithm by Wang et al. contains redundant CAS and cache flush instructions, which degrade performance on many-core CPUs.
The authors propose two new PMwCAS algorithms:
- PMwCAS with dirty flags: This algorithm uses dirty flags to mark target words that are visible in CPU caches but not yet persisted. Worker threads must wait for such words to be flushed before proceeding.
- PMwCAS without dirty flags: This algorithm removes the dirty flags, using the PMwCAS descriptors as write-ahead logs instead to ensure consistency.
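The idea of using a descriptor as a write-ahead log can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not the authors' algorithm: it has no real atomic CAS or cache flush instructions, a Python dict stands in for persistent memory, and the names `Descriptor`, `Target`, `pmwcas`, `finish`, and the `crash_at` parameter are all hypothetical. It shows only the recovery principle: once the outcome is recorded in the descriptor, an interrupted operation can be rolled forward, and otherwise it is rolled back.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNDECIDED = 0
    SUCCEEDED = 1
    FAILED = 2


@dataclass(eq=False)
class Target:
    addr: str      # hypothetical word name standing in for an address
    expected: int
    desired: int


@dataclass(eq=False)
class Descriptor:
    targets: list
    status: Status = Status.UNDECIDED


def finish(memory, desc):
    """Replace embedded descriptor references with final values.

    Serves as both the normal second phase and the crash-recovery pass.
    """
    for t in desc.targets:
        if memory[t.addr] is desc:
            if desc.status is Status.SUCCEEDED:
                memory[t.addr] = t.desired    # roll forward
            else:
                memory[t.addr] = t.expected   # roll back


def pmwcas(memory, desc, crash_at=None):
    """Simplified model; crash_at simulates a failure at a chosen point."""
    # Phase 1: embed the descriptor in each word that holds its expected value.
    ok = True
    for t in desc.targets:
        if memory[t.addr] == t.expected:
            memory[t.addr] = desc
        else:
            ok = False
            break
    if crash_at == "before_status":
        return None    # simulated crash: status stays UNDECIDED
    desc.status = Status.SUCCEEDED if ok else Status.FAILED
    if crash_at == "after_status":
        return None    # simulated crash: the outcome is already logged
    finish(memory, desc)
    return ok


memory = {"x": 1, "y": 2}
desc = Descriptor([Target("x", 1, 10), Target("y", 2, 20)])
pmwcas(memory, desc, crash_at="after_status")
finish(memory, desc)   # recovery pass replays the descriptor
print(memory)          # {'x': 10, 'y': 20}
```

In this model the descriptor alone carries enough information (expected values, desired values, and status) to restore a consistent state, so the target words need no per-word durability marker.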
The key improvements in the proposed algorithms are:
- Removal of redundant CAS and flush instructions
- Exclusion of dirty flags, which helps avoid frequent small writes and double flushes
Experimental results show that the proposed methods are up to 10 times faster than the original PMwCAS algorithm. The authors also provide suggestions for using PMwCAS operations effectively, such as keeping the number of target words small, avoiding false sharing in CPU cache lines, and swapping high-contention words first.
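Two of these suggestions can be sketched as small helpers. The code below is illustrative only: the 64-byte cache-line size is an assumption (typical for x86 CPUs), the contention scores are hypothetical inputs that an application would have to estimate itself, and the function names are not from the paper. Ties are broken by address so that every thread would derive the same order from the same scores.

```python
CACHE_LINE_BYTES = 64  # assumption: typical x86 cache-line size


def cache_line(addr, line_bytes=CACHE_LINE_BYTES):
    """Index of the cache line that contains the given byte address."""
    return addr // line_bytes


def shares_cache_line(addrs, line_bytes=CACHE_LINE_BYTES):
    """True if any two of the addresses fall in the same cache line."""
    lines = [cache_line(a, line_bytes) for a in addrs]
    return len(set(lines)) < len(lines)


def order_targets(addrs, contention):
    """Sort target addresses so the most contended word comes first.

    `contention` maps address -> hypothetical contention score;
    addresses without a score are treated as score 0.
    """
    return sorted(addrs, key=lambda a: (-contention.get(a, 0), a))


targets = [0x1000, 0x1040, 0x1080]
scores = {0x1040: 9, 0x1000: 3}
print([hex(a) for a in order_targets(targets, scores)])
# ['0x1040', '0x1000', '0x1080']
print(shares_cache_line(targets))            # False: one word per line
print(shares_cache_line([0x1000, 0x1008]))   # True: both in the same line
```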
Statistics
The original PMwCAS algorithm has a throughput of only around 2 MOps/s on 56 threads, while the proposed methods achieve up to 20 MOps/s.
The 99th percentile latency of the original PMwCAS is around 1000 µs, while the proposed methods achieve around 100 µs.
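As a quick consistency check, the approximate figures above can be turned into ratios. The inputs are the rounded values quoted here (throughput in MOps/s, latency in microseconds), so the results are rough; the variable names are illustrative.

```python
# Rounded figures quoted above.
original_mops, proposed_mops = 2.0, 20.0
original_p99_us, proposed_p99_us = 1000.0, 100.0

throughput_gain = proposed_mops / original_mops
latency_gain = original_p99_us / proposed_p99_us

print(f"throughput: {throughput_gain:.0f}x, p99 latency: {latency_gain:.0f}x")
# throughput: 10x, p99 latency: 10x
```

Both ratios agree with the reported speedup of up to ten times.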
Quotes
"Wang et al. proposed persistent multi-word compare-and-swap (PMwCAS) operations to support programming in persistent memory."
"We remove redundant CAS and flush instructions in our PMwCAS algorithms. In addition to the improvements applied to our MwCAS operations [23], we exclude dirty flags that manage data durability in the original algorithm."
"Experimental results demonstrate the effectiveness of the proposed methods. Our PMwCAS algorithms are up to ten times faster than the original algorithm."