Core Concepts
Improving resource utilization in GNN training through a Unified CPU-GPU protocol.
Summary
The paper introduces a novel Unified CPU-GPU protocol for Graph Neural Network (GNN) training that improves resource utilization. It addresses inefficiencies in existing GNN frameworks by dynamically balancing the workload between CPUs and GPUs: the protocol instantiates multiple GNN training processes on both the CPU and the GPU, improving memory bandwidth utilization and reducing data transfer overhead. Key contributions include the proposed protocol, a Dynamic Load Balancer, and a performance evaluation across multiple platforms showing speedups of up to 1.41×. The system design comprises a GNN Process Manager, a Dynamic Load Balancer, and GPU Feature Caching.
Abstract:
- Proposes a Unified CPU-GPU protocol for GNN training.
- Addresses inefficiencies in existing GNN frameworks.
- Improves resource utilization by balancing workload dynamically.
- Key contributions include the proposed protocol, Dynamic Load Balancer, and performance evaluation.
Introduction:
- Graph Neural Networks (GNNs) are used in various applications.
- Existing protocols cannot efficiently utilize platform resources.
- Proposed Unified CPU-GPU protocol aims to improve resource utilization.
System Design:
- Introduces the GNN Process Manager for workload assignment.
- Describes the Dynamic Load Balancer for workload balancing.
- Explains GPU Feature Caching to reduce memory access overhead.
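The Dynamic Load Balancer described above can be pictured with a minimal sketch: each epoch's mini-batches are split between CPU and GPU trainers in proportion to their measured throughput, and the split is rebalanced from the previous epoch's timings. All names and the rebalancing policy here are illustrative assumptions, not the paper's actual API.

```python
class DynamicLoadBalancer:
    """Sketch of dynamic CPU/GPU workload balancing (hypothetical API)."""

    def __init__(self, total_batches: int, init_gpu_share: float = 0.5):
        self.total_batches = total_batches
        self.gpu_share = init_gpu_share  # fraction of batches assigned to the GPU

    def split(self) -> tuple[int, int]:
        """Return (gpu_batches, cpu_batches) for the next epoch."""
        gpu = round(self.total_batches * self.gpu_share)
        return gpu, self.total_batches - gpu

    def update(self, gpu_time: float, cpu_time: float) -> None:
        """Rebalance from last epoch's per-device wall-clock times.

        Throughput is batches / time, so the share of work shifts
        toward whichever side processed its batches faster.
        """
        gpu_batches, cpu_batches = self.split()
        gpu_tput = gpu_batches / gpu_time
        cpu_tput = cpu_batches / cpu_time
        self.gpu_share = gpu_tput / (gpu_tput + cpu_tput)
```

For example, if the GPU finishes its 500 batches in 5 s while the CPU takes 20 s for its 500, the balancer shifts the GPU's share from 0.5 to 0.8 for the next epoch, which is how the protocol can adapt to platforms where the GPU only moderately outperforms the CPU.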
Experiments:
- Evaluates performance on different platforms, achieving speedups of up to 1.41×.
- Demonstrates impact of optimizations including Dynamic Load Balancer and GPU Feature Caching.
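The GPU Feature Caching optimization evaluated above can be sketched as follows, assuming (as is common in GNN systems) that features of frequently accessed, high-degree nodes are pinned in GPU memory so mini-batch gathers avoid repeated CPU-to-GPU transfers. The degree-based policy and all names are assumptions for illustration, not the paper's implementation.

```python
class GPUFeatureCache:
    """Sketch of GPU feature caching with a degree-based admission policy."""

    def __init__(self, features: dict, degrees: dict, capacity: int):
        # Pre-cache the `capacity` highest-degree nodes in (simulated) GPU memory;
        # high-degree nodes are sampled most often in neighborhood sampling.
        hot = sorted(degrees, key=degrees.get, reverse=True)[:capacity]
        self.gpu_cache = {n: features[n] for n in hot}
        self.cpu_features = features  # full feature store in host memory
        self.hits = 0
        self.misses = 0

    def gather(self, node_ids):
        """Fetch features for a mini-batch, counting avoided transfers."""
        out = []
        for n in node_ids:
            if n in self.gpu_cache:
                self.hits += 1           # served from GPU memory, no transfer
                out.append(self.gpu_cache[n])
            else:
                self.misses += 1         # would require a CPU-to-GPU copy
                out.append(self.cpu_features[n])
        return out
```

Every hit is a host-to-device copy avoided, which is why caching reduces the memory access overhead that the System Design section targets.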
Statistics
Our protocol speeds up GNN training by up to 1.41× on platforms where the GPU moderately outperforms the CPU. On platforms where the GPU significantly outperforms the CPU, our protocol speeds up GNN training by up to 1.26×.
Quotes
"Our key contributions are: conducting detailed analysis of state-of-the-art GNN frameworks, proposing a novel Unified CPU-GPU protocol, developing a Dynamic Load Balancer, evaluating work using various platforms."
"Our system consists of several building blocks to execute the Unified CPU-GPU protocol without altering model accuracy or convergence rate."