核心概念
Distributed protocols like 2PC and Paxos can be optimized for scalability through rule-driven query rewrites that preserve correctness.
要約
The paper presents an approach for scaling distributed protocols by applying rule-driven rewrites, borrowing from query optimization techniques. The key ideas are:
Decoupling:
- Partitions code by breaking a single-node component into multiple components that can run in parallel across nodes.
- Leverages order-insensitivity and data dependency analysis to identify correct coordination-free decoupling opportunities.
Partitioning:
- Distributes data across nodes to parallelize compute.
- Utilizes relational techniques like functional dependency analysis to find data partitioning schemes that allow sub-programs to work on local partitions without reshuffling.
The authors demonstrate the generality of their optimizations by applying them to three seminal distributed protocols: voting, 2PC, and Paxos. The optimized protocols achieve 2-5x throughput improvement compared to the unoptimized versions, matching the performance of state-of-the-art ad hoc rewrites.
統計
Throughput of voting protocol scales from 50K to 200K commands/sec with 5 partitions.
Throughput of 2PC scales from 75K to 175K commands/sec with 5 partitions.
Throughput of Paxos scales from 50K to 150K commands/sec with 5 partitions.
引用
"Distributed protocols such as 2PC and Paxos lie at the core of many systems in the cloud, but standard implementations do not scale."
"Our local-first approach naturally has a potential cost: the space of protocol optimization is limited by design as it treats the initial implementation as 'law'."