Core Concepts
The author exposes the statistical invalidity of Post-Selections in machine learning, arguing that traditional cross-validation methods cannot rescue such practices from being misconduct.
Abstract
The paper presents a theoretical analysis of deep learning misconduct, focusing on the flaws of Post-Selections and the inadequacy of cross-validation to rectify them. The author highlights the statistical shortcomings and ethical implications of reporting only the luckiest models while hiding the errors of the rest, and argues that the errors of all trained networks must be reported for an accurate evaluation. The discussion extends to social issues, questioning the validity of post-selection practices in broader contexts such as national development. Overall, the paper challenges existing methodologies and calls for a more comprehensive approach to evaluating model performance.
Stats
The author contends that almost all machine learning methods are rooted in cheating and hiding bad-looking data.
Authors must report at least the average error of all trained networks on the validation set.
Cross-validation over data splits is insufficient to exonerate Post-Selections in machine learning.
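The optimistic bias behind these claims can be illustrated with a toy simulation (a hedged sketch, not from the paper; all names and parameters here are hypothetical): even among candidate models that learn nothing at all, the one that happens to score highest on a validation set looks far better than the honest average over all trained models, and its luck does not carry over to fresh test data.

```python
import random

random.seed(0)

# Hypothetical setup: N candidate "networks" that are all pure
# chance-level guessers on a binary classification task.
N_MODELS = 100  # number of trained candidates
N_VAL = 50      # validation-set size
N_TEST = 50     # test-set size

def accuracy(n_samples):
    """Accuracy of a chance-level guesser on n_samples binary labels."""
    return sum(random.random() < 0.5 for _ in range(n_samples)) / n_samples

val_scores = [accuracy(N_VAL) for _ in range(N_MODELS)]

avg_val = sum(val_scores) / N_MODELS  # honest report: near chance (0.5)
best_val = max(val_scores)            # "luckiest" network: inflated by selection
test_of_best = accuracy(N_TEST)       # same model on fresh data: luck resets

print(f"average validation accuracy:  {avg_val:.2f}")
print(f"luckiest validation accuracy: {best_val:.2f}")
print(f"luckiest network on new test: {test_of_best:.2f}")
```

The gap between `best_val` and `avg_val` is exactly the bias that reporting only the selected model hides, which is why the paper demands the average error of all trained networks.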
Quotes
"Post-Selection breaks the wall between data and models."
"The luckiest network on V does not likely translate to a future test T."
"NNWT and PGNN can give a zero validation error with input cross-validation."