Accelerating GPU Serverless Computing with Fast Setup and High Throughput
SAGE is a GPU serverless framework that achieves fast function setup and high function throughput by running data preparation in parallel with context creation and by using sharing-based memory management.
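
The benefit of overlapping the two setup stages can be illustrated with a minimal sketch. This is not SAGE's implementation: `prepare_data`, `create_context`, and the sleep-based delays are hypothetical stand-ins for loading function data and creating a GPU context, and threads are assumed here only to show how the setup time shrinks from the sum of the two stages to roughly the longer of the two.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def prepare_data(delay_s):
    # Hypothetical stand-in for data preparation (e.g. loading inputs or a model).
    time.sleep(delay_s)
    return "data-ready"


def create_context(delay_s):
    # Hypothetical stand-in for GPU context creation.
    time.sleep(delay_s)
    return "context-ready"


def setup_sequential(data_s, ctx_s):
    # Baseline: the two stages run back to back, so their latencies add up.
    start = time.perf_counter()
    data = prepare_data(data_s)
    ctx = create_context(ctx_s)
    return data, ctx, time.perf_counter() - start


def setup_parallel(data_s, ctx_s):
    # Overlapped: both stages start together, so the total is about max(data_s, ctx_s).
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=2) as pool:
        data_future = pool.submit(prepare_data, data_s)
        ctx_future = pool.submit(create_context, ctx_s)
        data = data_future.result()
        ctx = ctx_future.result()
    return data, ctx, time.perf_counter() - start


if __name__ == "__main__":
    _, _, seq_s = setup_sequential(0.3, 0.3)
    _, _, par_s = setup_parallel(0.3, 0.3)
    print(f"sequential setup: {seq_s:.2f}s, parallel setup: {par_s:.2f}s")
```

The overlap only pays off when the two stages do not depend on each other's results, which is the assumption made in this sketch.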