Core Concepts
This work proposes an intervention-assisted framework that combines the learning power of neural networks with the guaranteed stability of classical control policies, enabling online deep reinforcement learning for stochastic queuing network optimization.
Abstract
The paper addresses the challenges of applying deep reinforcement learning (DRL) to stochastic queuing network (SQN) control in an online setting, where an intelligent agent interacts directly with the real-world environment and learns an optimal control policy from these online interactions.
Key highlights:
Traditional DRL methods rely on offline simulations or static datasets, limiting their real-world application in SQN control.
SQNs are particularly challenging for online DRL because the queues within the network are unbounded, yielding an unbounded state space; neural networks extrapolate poorly to unseen states in such spaces.
To address this challenge, the authors propose an intervention-assisted framework that leverages strategic interventions from known stable policies to keep queue sizes bounded, combining the learning power of neural networks with the guaranteed stability of classical SQN control policies.
The authors introduce a method to design these intervention-assisted policies to ensure strong stability of the network.
They extend foundational DRL theorems for intervention-assisted policies and develop two practical algorithms specifically for online DRL of SQNs.
Experiments show that the proposed algorithms outperform both classical control approaches and prior online DRL algorithms.
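To make the intervention idea concrete, here is a minimal sketch of one plausible intervention rule: act with the learned neural policy while total backlog is small, and fall back to a classical stable policy (MaxWeight is used here as the illustrative choice) once the backlog crosses a threshold. The threshold rule, function names, and the use of MaxWeight are assumptions for illustration, not the paper's exact construction.

```python
import random

def maxweight_action(queues, service_rates):
    # Classical MaxWeight: serve the queue with the largest
    # backlog-weighted service rate (a known throughput-stable rule).
    return max(range(len(queues)), key=lambda i: queues[i] * service_rates[i])

def intervention_assisted_action(queues, service_rates, learned_policy, threshold):
    # Hypothetical intervention rule: override the learned policy with
    # the stable policy whenever total backlog exceeds the threshold,
    # so the agent only explores within a bounded region of state space.
    if sum(queues) > threshold:
        return maxweight_action(queues, service_rates)
    return learned_policy(queues)

# Stand-in for a trained neural policy: pick a random queue to serve.
learned_policy = lambda queues: random.randrange(len(queues))

queues = [12, 3, 7]          # current backlogs
rates = [1.0, 2.0, 1.5]      # per-queue service rates
action = intervention_assisted_action(queues, rates, learned_policy, threshold=20)
```

With total backlog 22 exceeding the threshold of 20, the sketch intervenes and serves queue 0 (the largest backlog-rate product); below the threshold, the learned policy would act unconstrained.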