Core Concepts
Microservice-based architectures incur significant communication overhead from the proliferation of remote procedure calls (RPCs), and that overhead can dominate application runtime. The authors propose Notnets, a network-bypass strategy that uses emerging disaggregated and shared memory technologies to implement message-passing semantics transparently, avoiding the dominant bottlenecks in the current RPC stack.
Abstract
The paper examines the performance overhead of RPCs in microservice-based architectures and proposes an approach called Notnets to reduce it.
Key highlights:
Profiling experiments reveal that RPC overheads are highly sensitive to workload characteristics, making point solutions targeting specific bottlenecks ineffective.
Emerging disaggregated and shared memory technologies, such as Compute Express Link (CXL), make shared, remote memory access across multiple nodes possible.
The authors argue that the semantics of RPC, which involve the transfer of immutable data and control between independently failing agents, can be efficiently implemented using shared memory, sidestepping the traditional challenges of distributed shared memory (DSM).
The Notnets approach aims to emulate message-passing RPCs by sharing message payloads and metadata on CXL-backed far memory, avoiding the dominant bottlenecks in the current RPC stack.
The authors present an initial prototype that bypasses communication-related overheads, including the TCP/IP and HTTP stacks, and demonstrate significant performance improvements in end-to-end RPC latency.
The paper discusses the remaining challenges and open questions for a shared-memory RPC implementation, such as transport-level security, load balancing, and memory allocation.
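The idea of emulating message-passing RPC over shared memory can be illustrated with a minimal sketch. This is not the authors' prototype: it uses Python's `multiprocessing.shared_memory` on a single host as a stand-in for CXL-backed far memory, and the slot layout (`HEADER`, the `EMPTY`/`REQUEST`/`RESPONSE` states, `SLOT_SIZE`) and the helper names are hypothetical choices made for illustration only.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical slot layout (an assumption, not taken from the paper):
#   1 byte state | 4 bytes payload length | payload bytes
HEADER = struct.Struct("<BI")
EMPTY, REQUEST, RESPONSE = 0, 1, 2
SLOT_SIZE = 4096


def write_message(buf, state, payload: bytes) -> None:
    """Copy the payload into the slot, then publish its state and length."""
    if HEADER.size + len(payload) > len(buf):
        raise ValueError("payload does not fit in slot")
    buf[HEADER.size:HEADER.size + len(payload)] = payload
    # A real implementation would need an atomic store or memory fence here
    # so a reader on another node never sees the state before the payload.
    HEADER.pack_into(buf, 0, state, len(payload))


def read_message(buf, expected_state):
    """Return the payload if the slot is in the expected state, else None."""
    state, length = HEADER.unpack_from(buf, 0)
    if state != expected_state:
        return None
    return bytes(buf[HEADER.size:HEADER.size + length])


def serve_once(buf, handler) -> bool:
    """Server side: handle one pending request, writing the reply in place."""
    request = read_message(buf, REQUEST)
    if request is None:
        return False
    write_message(buf, RESPONSE, handler(request))
    return True


def demo() -> bytes:
    """Round-trip one call through a shared segment within a single process."""
    shm = shared_memory.SharedMemory(create=True, size=SLOT_SIZE)
    try:
        # A separate server process would attach to the same segment with
        # shared_memory.SharedMemory(name=shm.name).
        write_message(shm.buf, REQUEST, b"ping")
        serve_once(shm.buf, lambda req: req.upper())
        return read_message(shm.buf, RESPONSE)
    finally:
        shm.close()
        shm.unlink()
```

In this sketch the request and response never pass through a socket, the TCP/IP stack, or an HTTP layer; the caller and the server exchange data and control by writing to and polling a shared slot. Security, load balancing, and allocation of slots are deliberately left out, matching the open questions listed above.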
Stats
Only 40% of compute cycles go to processing business logic; the rest are spent on communication.
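To put that figure in perspective, a rough Amdahl's-law estimate shows how much end-to-end speedup is available if only the communication share is accelerated. The sketch below assumes the remaining 60% of cycles is entirely communication and can be sped up uniformly, which is a simplification; the function name `max_speedup` is hypothetical.

```python
def max_speedup(useful_fraction: float, comm_speedup: float) -> float:
    """Amdahl's law: overall speedup when only communication gets faster."""
    comm_fraction = 1.0 - useful_fraction
    return 1.0 / (useful_fraction + comm_fraction / comm_speedup)


# Business logic is 40% of cycles; communication is the assumed remaining 60%.
halved = max_speedup(0.4, 2.0)            # communication made 2x faster
ceiling = max_speedup(0.4, float("inf"))  # communication removed entirely
```

Under these assumptions, halving communication cost yields roughly a 1.43x overall speedup, and eliminating it entirely caps the gain at about 2.5x.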
Quotes
"Even looking beyond the more shocking recent headlines, there is increasing evidence that the fundamental costs of microservices may not justify their flexibility."
"Time will not be well-spent on point solutions that target and accelerate a single perceived bottleneck of the communication stack. The problem is the communication itself."