Key Concepts
Adapting the nearest neighbour rule to the contextual bandit problem yields an efficient algorithm that makes no assumptions about the data-generation process.
Abstract
Introduction:
The nearest neighbour rule is adapted to contextual bandits to obtain an efficient algorithm.
The algorithm makes no assumptions about the data-generation process.
Results:
Generic regret bounds for the algorithm.
Application to the stochastic bandit problem in Euclidean space.
Bandits in a Metric Space:
Utilizing a data structure for adaptive nearest neighbour search.
Application of the algorithm to the metric bandit problem.
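To make the nearest neighbour query concrete, here is a minimal sketch of a lookup in a metric space. It is a naive linear scan, not the adaptive search data structure mentioned above, and the names `nearest_neighbour` and `euclidean` are hypothetical helpers introduced only for illustration.

```python
import math
from typing import Callable, Optional, Sequence, TypeVar

P = TypeVar("P")


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, used here as one example of a metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbour(
    query: P,
    history: Sequence[P],
    metric: Callable[[P, P], float],
) -> Optional[int]:
    """Return the index of the stored point closest to `query`.

    Naive O(n) scan (an assumption for illustration); returns None
    when no points have been stored yet. Ties go to the earliest point.
    """
    best_index: Optional[int] = None
    best_distance = float("inf")
    for index, point in enumerate(history):
        distance = metric(query, point)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


past_contexts = [(0.0, 0.0), (1.0, 1.0), (0.2, 0.1)]
closest = nearest_neighbour((0.25, 0.1), past_contexts, euclidean)
```

Any function satisfying the metric axioms can be passed as `metric`. The linear scan costs time proportional to the number of stored points per query, which is the cost a dedicated search structure is meant to avoid.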
Stochastic Bandits in Euclidean Space:
Utilizing algorithms for stochastic bandits in the space [0, 1]^d.
Regret scaling in well-separated clusters.
Notation:
Definitions of sets, functions, and metrics used in the paper.
The Algorithm:
Description of the CBNN algorithm for solving the similarity bandit problem.
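The interaction pattern behind a nearest neighbour rule with bandit feedback can be illustrated with a toy heuristic. This sketch is not the CBNN algorithm and carries no regret guarantee. It copies the action of the nearest previously rewarded context and explores uniformly at random with probability `epsilon`. The class name, the `epsilon` parameter, the Euclidean distance, and the convention that a reward above zero counts as a success are all assumptions made for illustration.

```python
import math
import random
from typing import List, Sequence, Tuple


class NearestNeighbourBandit:
    """Toy epsilon-greedy nearest neighbour heuristic (illustrative only)."""

    def __init__(self, n_actions: int, epsilon: float = 0.1, seed: int = 0) -> None:
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Each entry is (context, action taken, reward observed).
        self.memory: List[Tuple[Tuple[float, ...], int, float]] = []

    def select(self, context: Sequence[float]) -> int:
        """Choose an action for the current context."""
        rewarded = [(c, a) for c, a, r in self.memory if r > 0]
        # Explore when nothing useful is stored or with probability epsilon.
        if not rewarded or self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        # Otherwise copy the action of the nearest rewarded context.
        _, action = min(rewarded, key=lambda ca: math.dist(context, ca[0]))
        return action

    def update(self, context: Sequence[float], action: int, reward: float) -> None:
        """Record bandit feedback: only the chosen action's reward is seen."""
        self.memory.append((tuple(context), action, reward))


bandit = NearestNeighbourBandit(n_actions=3, epsilon=0.0)
bandit.update((0.0, 0.0), 1, 1.0)
bandit.update((1.0, 1.0), 2, 1.0)
bandit.update((0.5, 0.5), 0, 0.0)
choice_near_origin = bandit.select((0.1, 0.1))
choice_near_corner = bandit.select((0.9, 0.8))
```

With `epsilon=0.0` the policy is purely greedy once a rewarded context exists, so the two queries copy the actions stored at their nearest rewarded neighbours. Only the reward of the chosen action is ever recorded, which is what distinguishes bandit feedback from full supervision.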
Online Belief Propagation:
Efficient computation of the function θ_t in the algorithm.
Acknowledgments:
Funding and support for the research.
Statistics
The algorithm performs the following.
The algorithm computes the following.
The algorithm maintains the following.
Quotes
"Our algorithm handles the fully adversarial setting with no assumptions about the data-generation process."
"Our algorithm is extremely efficient with per-trial running time polylogarithmic in both the number of trials and actions."