insight - Regularized Best-of-N sampling for language model alignment