Core Concepts
The authors propose a Deep Reinforcement Learning approach for managing tenant-specific QoS levels in multi-tenant, multi-accelerator cloud environments. The focus is on guaranteeing model-specific QoS levels while respecting real-time constraints.
Abstract
This paper addresses the challenge of managing Quality of Service (QoS) in cloud services by introducing a novel approach using Deep Reinforcement Learning. The goal is to ensure tenant-specific QoS levels for Deep Neural Networks (DNNs) while considering real-time constraints. The study emphasizes the importance of individual tenant expectations and varying Service Level Indicators (SLIs) in achieving fair and firm real-time scheduling.
The authors highlight the significance of SLIs, Service Level Objectives (SLOs), and Service Level Agreements (SLAs) in evaluating and maintaining QoS. They stress the need for a balanced approach to honor SLAs for all users while providing tailored QoS based on individual service requests.
The research focuses on an online scheduling algorithm for DNNs in multi-accelerator systems, with deadline hit rate as the chosen SLI. Because clients can specify an SLO achievement rate for each service request, the proposed method aims to prevent unfair prioritization and deliver consistent QoS across diverse user demands.
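To make the SLI concrete, the following is a minimal sketch of how a deadline hit rate could be computed per tenant and compared against a client-specified SLO achievement rate. It is an illustration only, not the authors' scheduler: the `CompletedRequest` record, its field names, and the grouping by tenant (rather than by individual service request) are all assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CompletedRequest:
    """Hypothetical record of one finished inference request."""
    tenant: str          # assumed tenant identifier
    deadline_ms: float   # deadline, measured from request arrival
    latency_ms: float    # observed end-to-end latency
    target_rate: float   # client-specified SLO achievement rate in [0, 1]


def deadline_hit_rates(requests):
    """Return {tenant: (hit_rate, target_rate, slo_met)}.

    A request counts as a hit when it finishes no later than its deadline.
    Assumption: each tenant uses a single target rate; if several appear,
    the last one seen is used.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    targets = {}
    for r in requests:
        totals[r.tenant] += 1
        if r.latency_ms <= r.deadline_ms:
            hits[r.tenant] += 1
        targets[r.tenant] = r.target_rate

    report = {}
    for tenant, total in totals.items():
        rate = hits[tenant] / total
        report[tenant] = (rate, targets[tenant], rate >= targets[tenant])
    return report


if __name__ == "__main__":
    log = [
        CompletedRequest("a", deadline_ms=10, latency_ms=8, target_rate=0.9),
        CompletedRequest("a", deadline_ms=10, latency_ms=12, target_rate=0.9),
        CompletedRequest("b", deadline_ms=20, latency_ms=15, target_rate=0.5),
        CompletedRequest("b", deadline_ms=20, latency_ms=25, target_rate=0.5),
    ]
    for tenant, (rate, target, met) in deadline_hit_rates(log).items():
        print(f"{tenant}: hit rate {rate:.2f}, target {target:.2f}, SLO met: {met}")
```

In this sketch both tenants hit half of their deadlines, yet only the tenant with the lower target satisfies its SLO. This is the situation the summary describes: the same raw hit rate can be acceptable for one tenant and a violation for another, so a scheduler that ignores per-tenant targets may prioritize unfairly.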
The study introduces a unique perspective on managing tenant-specific QoS within cloud services through Deep Reinforcement Learning. By addressing challenges related to individual variations in QoS expectations, the research contributes to more efficient and reliable scheduling practices in multi-tenant environments.
Quotes
"Each user commonly has unique quality expectations aligned with their expenditure on the service."
"The proposed method contributes to fairer scheduling within multi-accelerator systems."
"The study introduces a unique perspective on managing tenant-specific QoS through Deep Reinforcement Learning."