Core Concepts
This research explores and optimizes best-arm identification algorithms for unimodal bandits, focusing on minimizing sample complexity while maintaining computational efficiency, particularly in the fixed-confidence setting.
Stats
For Gaussian distributions with unit variance, the characteristic time $T^\star(\mu)$ is approximately equal to the sum of the inverse squared gaps between the means of the best arm and its neighbors: $T^\star(\mu) \approx \sum_{i \in N(\star)} (\mu_\star - \mu_i)^{-2}$.
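The approximation above can be evaluated directly from a vector of arm means. The following is a minimal sketch rather than the authors' implementation. It assumes the arms are laid out on a line, so that the neighbors of the best arm are the arms immediately to its left and right. The function name `approx_characteristic_time` is hypothetical.

```python
from typing import Sequence


def approx_characteristic_time(mu: Sequence[float]) -> float:
    """Approximate the characteristic time for unit-variance Gaussian arms.

    Sums the inverse squared gaps between the best arm's mean and the means
    of its neighbors. Assumption (not stated in the source): arms are indexed
    along a line, so the neighbors of arm i are arms i - 1 and i + 1.
    """
    if len(mu) < 2:
        raise ValueError("need at least two arms")

    # Index of the best arm (highest mean).
    best = max(range(len(mu)), key=lambda i: mu[i])

    # Neighbors on the line, dropping indices that fall off either end.
    neighbors = [i for i in (best - 1, best + 1) if 0 <= i < len(mu)]

    total = 0.0
    for i in neighbors:
        gap = mu[best] - mu[i]
        if gap <= 0:
            # A zero gap means the best arm is not unique among its
            # neighbors, so the approximation is undefined.
            raise ValueError("best arm must be strictly better than its neighbors")
        total += gap ** -2
    return total


if __name__ == "__main__":
    # Unimodal means peaking at the third arm, with gaps 0.5 and 0.4.
    print(approx_characteristic_time([0.1, 0.5, 1.0, 0.6, 0.2]))
```

For the example means, the gaps to the two neighbors are 0.5 and 0.4, so the sum is 1/0.25 + 1/0.16, approximately 10.25. Only the neighbors of the best arm enter the sum, so the other arms do not affect the result.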
The ratio $T^\star_{1/2}(\mu)/T^\star(\mu)$, representing the efficiency of the UniTT algorithm, lies numerically in the range $(1, r_2]$ with $r_2 \approx 1.03$.