Core Concept
The work studies stochastic bandits with noisy contexts, using Thompson Sampling and information-theoretic tools.
Summary
The article applies Thompson Sampling (TS) to stochastic linear contextual bandits with noisy contexts. It introduces a modified TS algorithm and analyzes its Bayesian cumulative regret. Topics covered include decision-making under uncertainty, the challenges posed by noisy contexts, related work, motivation, the problem setting, and the proposed approach. The article also compares the algorithm with existing ones and demonstrates it empirically.
Statistics
Bayesian cumulative regret scales as O(d√T) for d-dimensional Gaussian bandits with Gaussian context noise.
Information-theoretic regret bounds are derived for the proposed TS algorithm.
Comparison of regret bounds with state-of-the-art algorithms is provided in Table I.
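To make the setting concrete, the following is a minimal simulation sketch of Thompson Sampling on a linear Gaussian bandit where the agent only sees noisy contexts. It is not the article's modified algorithm: it is a naive baseline that plugs the noisy contexts directly into standard linear-Gaussian posterior updates. The function name `run_linear_ts`, all parameter values, the N(0, I) prior, the context scaling, and the regret benchmark (an oracle that knows the true parameter and the true contexts) are assumptions made for illustration only.

```python
import numpy as np


def run_linear_ts(d=5, n_arms=10, T=500, reward_noise=0.5,
                  context_noise=0.3, seed=0):
    """Naive linear Thompson Sampling with noisy contexts (illustrative only).

    Returns the cumulative regret after each round, measured against an
    oracle that knows the true parameter and the true contexts (assumption).
    """
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(d)   # true parameter, drawn from the N(0, I) prior
    precision = np.eye(d)            # posterior precision, starts at the prior
    b = np.zeros(d)                  # precision-weighted mean accumulator
    cum_regret = np.zeros(T)
    total = 0.0

    for t in range(T):
        # True contexts are hidden; the agent only observes a noisy copy.
        true_ctx = rng.standard_normal((n_arms, d)) / np.sqrt(d)
        noisy_ctx = true_ctx + context_noise * rng.standard_normal((n_arms, d))

        # Sample a parameter from the current Gaussian posterior.
        cov = np.linalg.inv(precision)
        cov = (cov + cov.T) / 2      # symmetrize against round-off
        mean = cov @ b
        chol = np.linalg.cholesky(cov)
        theta_sample = mean + chol @ rng.standard_normal(d)

        # Act greedily with respect to the sample, using the noisy contexts.
        arm = int(np.argmax(noisy_ctx @ theta_sample))

        # The reward is generated by the true (unobserved) context.
        expected = true_ctx @ theta
        reward = expected[arm] + reward_noise * rng.standard_normal()

        # Naive conjugate update that treats the noisy context as if exact.
        x = noisy_ctx[arm]
        precision += np.outer(x, x) / reward_noise**2
        b += x * reward / reward_noise**2

        total += expected.max() - expected[arm]
        cum_regret[t] = total

    return cum_regret


if __name__ == "__main__":
    regret = run_linear_ts()
    print(f"cumulative regret after {len(regret)} rounds: {regret[-1]:.2f}")
```

Because this baseline ignores the context noise in its update, its regret is not expected to match the bounds listed above; it is a starting point for experimenting with how context noise affects learning.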
Quotes
"Decision-making in the face of uncertainty is a widespread challenge found across various domains."
"Recent efforts have been made to develop CB algorithms tailored to noisy context settings."