Exploration via Linearly Perturbed Loss Minimisation: A Study on Bandit Algorithms
The authors introduce EVILL (exploration via linearly perturbed loss minimisation), a method for structured stochastic bandit problems, and use it to give insight into why random reward perturbations are effective. EVILL explores by minimising a loss function to which a random linear term has been added.
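
To make the idea of minimising a linearly perturbed loss concrete, here is a minimal sketch for the simplest case: a linear bandit with a ridge-regularised squared loss. This illustrates the general principle only and is not the authors' algorithm. The function names, the Gaussian perturbation, and the parameters `lam` and `scale` are all assumptions chosen for the example.

```python
import numpy as np


def perturbed_ridge_minimiser(X, y, w, lam=1.0):
    """Minimise 0.5*||X @ theta - y||^2 + 0.5*lam*||theta||^2 + w @ theta.

    Setting the gradient to zero gives the linear system
    (X^T X + lam * I) theta = X^T y - w, which is solved directly.
    """
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    return np.linalg.solve(A, X.T @ y - w)


def choose_arm(arms, X, y, rng, lam=1.0, scale=1.0):
    """Pick the arm that looks best under one perturbed estimate.

    arms: (K, d) array of arm feature vectors.
    X, y: features and rewards observed so far.
    The perturbation w is drawn as scaled standard Gaussian noise; this
    distribution is an assumption made for the sketch.
    """
    w = scale * rng.standard_normal(X.shape[1])
    theta = perturbed_ridge_minimiser(X, y, w, lam)
    return int(np.argmax(arms @ theta)), theta
```

Each round draws a fresh `w`, so the estimate `theta` varies around the unperturbed ridge solution, and that variation is what drives exploration. With `w` set to zero the function reduces to ordinary ridge regression.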