The key insights are:
The MSR can often be significantly smaller than the K1 calibration error, even though the two are linearly related in the worst case. This allows us to bypass the Ω(T^0.528) lower bound for the K1 calibration error.
We establish a general lemma (Lemma 5.2) that attributes MSR to bucket-wise biases. This lemma plays a crucial role in our analysis.
We show that the guarantee |b̂_{q_i} - q_i| ≤ O(1/√n_i) can be approximately achieved in the online binary prediction setting, using a refinement of the result from Noarov et al. (2023).
Combining Lemma 5.2 with the bound on |b̂_{q_i} - q_i|, we obtain the final O(√T log T) expected MSR guarantee.
Our algorithm works in the standard online binary prediction setting. In each round t, the algorithm makes a prediction p_t ∈ [0, 1] and the adversary reveals the true state θ_t ∈ {0, 1}. Both p_t and θ_t can depend on the past history, but they cannot depend on each other. This allows our algorithm to leverage the power of randomization.
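The protocol above can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the paper: `grid_forecaster` is a placeholder that predicts a uniformly random grid point (it is not the paper's algorithm), `alternating_adversary` is a toy adversary, and `l1_calibration_error` assumes the common definition of the K1 error as the sum over buckets of n_i · |b̂_{q_i} - q_i|, where b̂_{q_i} is taken to be the empirical frequency of θ_t = 1 among rounds with prediction q_i. The sketch does not compute the MSR.

```python
import random
from collections import defaultdict


def run_protocol(forecaster, adversary, T, seed=0):
    """Simulate T rounds of online binary prediction.

    In each round both moves are computed from the past history only,
    so p_t and theta_t cannot depend on each other.
    """
    rng = random.Random(seed)
    history = []  # list of (p_t, theta_t) pairs
    for _ in range(T):
        p = forecaster(history, rng)      # prediction in [0, 1]
        theta = adversary(history, rng)   # true state in {0, 1}
        history.append((p, theta))
    return history


def bucket_stats(history):
    """Map each predicted value q to (n_q, empirical frequency of theta = 1)."""
    counts = defaultdict(int)
    ones = defaultdict(int)
    for p, theta in history:
        counts[p] += 1
        ones[p] += theta
    return {q: (counts[q], ones[q] / counts[q]) for q in counts}


def l1_calibration_error(history):
    """Assumed K1 definition: sum over buckets of n_q * |frequency - q|."""
    return sum(n * abs(freq - q) for q, (n, freq) in bucket_stats(history).items())


def grid_forecaster(history, rng, m=10):
    """Hypothetical placeholder: a uniformly random point of {0, 1/m, ..., 1}."""
    return rng.randrange(m + 1) / m


def alternating_adversary(history, rng):
    """Hypothetical toy adversary: outputs 0, 1, 0, 1, ... based on the round index."""
    return len(history) % 2
```

For instance, `l1_calibration_error(run_protocol(grid_forecaster, alternating_adversary, T=1000))` reports the assumed K1 error of the placeholder forecaster, and `bucket_stats` exposes the per-bucket counts and frequencies from which the bucket-wise biases can be read off.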