The study compares greedy algorithms with UCB in multi-armed bandit problems with many arms. It identifies subsampling and free exploration as the two key factors, and shows that SS-Greedy, a greedy algorithm run on a subsample of the arms, performs very well. A large number of arms gives Greedy free exploration, which leads to low regret. Simulations with real data support these findings: SS-Greedy outperforms the other algorithms tested. The study also extends the analysis to contextual settings, where the same insights hold.
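To make the subsampling idea concrete, here is a minimal sketch of Greedy and a subsampled variant on a Bernoulli bandit. It is an illustration under stated assumptions, not the authors' implementation: the function names `greedy` and `ss_greedy`, the uniform prior on arm means, and the subsample size `m = int(sqrt(horizon))` are all choices made for this example.

```python
import numpy as np

def greedy(means, horizon, rng, best=None):
    """Pull each arm once, then always pull the arm with the highest empirical mean.

    Returns cumulative (pseudo-)regret against `best`, which defaults to the
    largest mean among the arms passed in.
    """
    k = len(means)
    best = means.max() if best is None else best
    pulls = np.zeros(k)
    sums = np.zeros(k)
    regret = 0.0
    for t in range(horizon):
        # Initial round-robin pass, then purely greedy choices.
        arm = t if t < k else int(np.argmax(sums / pulls))
        reward = float(rng.random() < means[arm])  # Bernoulli reward
        pulls[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret

def ss_greedy(means, horizon, m, rng):
    """Run Greedy on m arms sampled uniformly at random (hypothetical helper).

    Regret is still measured against the best arm in the full set.
    """
    subset = rng.choice(len(means), size=min(m, len(means)), replace=False)
    return greedy(means[subset], horizon, rng, best=means.max())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    horizon = 2000
    means = rng.random(1000)        # assumed uniform prior on arm means
    m = int(np.sqrt(horizon))       # assumed subsample size, for illustration
    print("Greedy regret:   ", greedy(means, horizon, rng))
    print("SS-Greedy regret:", ss_greedy(means, horizon, m, rng))
```

With many arms relative to the horizon, plain Greedy spends a large share of its budget on the initial pass over every arm, while the subsampled variant commits to a smaller set early. The exact numbers depend on the seed and on the assumed prior.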
Introduction
Lower Bound and an Optimal Algorithm
A Greedy Algorithm
Simulations
Generalizations
by Mohsen Bayat... at arxiv.org, 03-21-2024
https://arxiv.org/pdf/2002.10121.pdf