Key Concepts
The authors propose mmDPT-TS, a structured reinforcement learning solution to the delay-optimal data packet transmission problem in dense mmWave networks, which they formulate as a restless multi-armed bandit problem with fairness constraints (RMAB-F).
Summary
The authors study the data packet transmission problem (mmDPT) in dense cell-free millimeter wave (mmWave) networks, where users send data packet requests to access points (APs) via uplinks and APs transmit the requested data packets to users via downlinks. The objective is to minimize the average delay in the system, which arises from the APs' limited service capacity and the unreliable wireless channels between APs and users.
The authors first formulate the mmDPT problem as a restless multi-armed bandit problem with fairness constraints (RMAB-F). Since finding the optimal policy for RMAB-F is intractable, they propose a structured reinforcement learning (RL) solution called mmDPT-TS.
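For orientation, a generic RMAB problem with a capacity constraint and per-arm fairness constraints can be written roughly as follows. This is a textbook-style sketch, not the formulation from the paper: the symbols N (number of users or arms), s_n(t) (state of arm n, such as its backlog), a_n(t) (whether arm n is served in frame t), c_n (per-frame delay cost), B (service capacity), and \eta_n (minimum long-run service fraction) are all assumptions introduced here for illustration.

```latex
% Illustrative generic RMAB-F form; all symbols are assumptions, not the paper's notation.
\begin{aligned}
\min_{\pi}\quad
  & \limsup_{T\to\infty} \frac{1}{T}\,
    \mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T}\sum_{n=1}^{N} c_n\big(s_n(t)\big)\right]
    && \text{(average delay cost)} \\
\text{s.t.}\quad
  & \sum_{n=1}^{N} a_n(t) \le B \quad \forall t,
    && \text{(limited service capacity)} \\
  & \liminf_{T\to\infty} \frac{1}{T}\,
    \mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T} a_n(t)\right] \ge \eta_n \quad \forall n,
    && \text{(fairness)} \\
  & a_n(t) \in \{0,1\} \quad \forall n,\, t.
\end{aligned}
```

The per-frame capacity constraint couples the arms to one another, which is what makes exact solutions to problems of this kind intractable in general and motivates index-style policies.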
The key contributions are:
The authors design a low-complexity and provably asymptotically optimal index policy for RMAB-F, called the mmDPT Index Policy.
The authors leverage the structure of the mmDPT Index Policy to develop a structured RL algorithm called mmDPT-TS, which provably achieves an optimal sub-linear Bayesian regret with low computational complexity.
The authors build a 60 GHz mmWave testbed and conduct extensive evaluations, demonstrating significant performance gains of mmDPT-TS over existing approaches.
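The general idea of pairing an index rule with posterior sampling can be illustrated with a small toy simulation. The sketch below is not the mmDPT Index Policy or mmDPT-TS: it assumes the "TS" suffix refers to Thompson sampling, uses a made-up index (backlog times sampled success probability), models each channel as a Bernoulli success with a Beta posterior, and omits the fairness constraints entirely. All function and parameter names are hypothetical.

```python
import random


def thompson_index_schedule(successes, failures, backlog, capacity, rng):
    """Pick up to `capacity` backlogged users to serve in one frame.

    Hypothetical index: backlog weighted by a success probability sampled
    from a Beta(successes + 1, failures + 1) posterior (Thompson sampling).
    """
    sampled = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    index = [q * p for q, p in zip(backlog, sampled)]
    # Only users with pending packets are eligible for service.
    candidates = [n for n in range(len(backlog)) if backlog[n] > 0]
    candidates.sort(key=lambda n: index[n], reverse=True)
    return candidates[:capacity]


def simulate(true_p, arrival_p, capacity, frames, seed=0):
    """Toy loop: Bernoulli arrivals, unreliable Bernoulli channels.

    Returns the time-averaged total backlog (a rough proxy for delay)
    together with the per-user success and failure counts.
    """
    rng = random.Random(seed)
    n = len(true_p)
    successes, failures, backlog = [0] * n, [0] * n, [0] * n
    total_backlog = 0
    for _ in range(frames):
        # New packet requests arrive independently for each user.
        for u in range(n):
            if rng.random() < arrival_p[u]:
                backlog[u] += 1
        # Serve the highest-index users and update the posteriors.
        for u in thompson_index_schedule(successes, failures, backlog, capacity, rng):
            if rng.random() < true_p[u]:
                successes[u] += 1
                backlog[u] -= 1
            else:
                failures[u] += 1
        total_backlog += sum(backlog)
    return total_backlog / frames, successes, failures


if __name__ == "__main__":
    avg, succ, fail = simulate([0.9, 0.5, 0.2], [0.3, 0.3, 0.3], capacity=1, frames=1000)
    print("average backlog:", avg)
    print("successes:", succ, "failures:", fail)
```

Sampling from the posterior instead of using its mean is what drives exploration here: a user with few observations occasionally draws a high success probability and gets scheduled, which refines the estimate of that user's channel over time.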
Statistics
The paper provides a table of the probabilities of successfully delivering 1, 2, 3, or 4 packets over frames for selected users in the synthetic traces.