Regression trees can be made pointwise consistent by controlling the minimum number of observations in each cell. This leads to a bias-variance trade-off governed by tree size: small trees have low variance but are biased, while large trees are unbiased but have high variance. Subagging can reduce the variance of consistent trees without affecting the bias.
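As a rough illustration, a minimal sketch of subagging regression trees under a minimum-leaf-size constraint follows; it is a sketch under assumed tooling (scikit-learn and NumPy), not the paper's implementation, and the function names and defaults are our own.

```python
# Minimal sketch of subagging with a minimum-leaf-size constraint
# (illustrative; names and defaults are assumptions, not the paper's code).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_subagged_trees(X, y, n_trees=100, subsample_frac=0.5, min_leaf=10, seed=0):
    """Fit trees on subsamples drawn WITHOUT replacement (subagging)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    m = int(subsample_frac * n)
    trees = []
    for _ in range(n_trees):
        idx = rng.choice(n, size=m, replace=False)  # subsample, not a bootstrap
        tree = DecisionTreeRegressor(min_samples_leaf=min_leaf)
        trees.append(tree.fit(X[idx], y[idx]))
    return trees

def predict_subagged(trees, X):
    # Averaging the trees lowers variance; the bias of each tree is
    # controlled by min_samples_leaf (the minimum cell size).
    return np.mean([t.predict(X) for t in trees], axis=0)
```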
Prediction intervals are crucial for quantifying uncertainty in regression problems, but ensuring their validity and calibration is challenging. This study reviews and compares four main classes of methods for constructing well-calibrated prediction intervals that are not overly conservative: Bayesian methods, ensemble methods, direct interval estimation, and conformal prediction.
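For concreteness, a minimal sketch of split conformal prediction, one of the four classes surveyed, is given below; the base model, function names, and finite-sample quantile correction are our own choices (and `np.quantile(..., method="higher")` assumes a recent NumPy), so this is illustrative rather than the study's implementation.

```python
# Minimal sketch of split conformal prediction intervals
# (illustrative; model choice and names are assumptions).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def split_conformal_interval(X_train, y_train, X_cal, y_cal, X_test, alpha=0.1):
    model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
    # Absolute-residual nonconformity scores on the held-out calibration set.
    scores = np.abs(y_cal - model.predict(X_cal))
    n = len(scores)
    # Finite-sample-corrected quantile gives marginal coverage >= 1 - alpha.
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(scores, level, method="higher")
    preds = model.predict(X_test)
    return preds - q, preds + q
```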
This paper presents the first in-depth study of H-consistency bounds for regression, establishing non-asymptotic guarantees for the squared loss with respect to various surrogate regression losses such as the Huber loss, ℓ_p losses, and the squared ε-insensitive loss. The analysis leverages new generalized theorems for establishing H-consistency bounds.
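For reference, the standard definitions of the surrogate losses named above are recalled below, in notation of our own choosing (the paper's exact parameterization may differ).

```latex
% Standard definitions of the surrogate regression losses mentioned above,
% for a hypothesis value h(x) and label y (our notation, not the paper's):
\begin{align*}
  \ell_{\mathrm{sq}}(h(x), y) &= (h(x) - y)^2, && \text{(squared loss)}\\
  \ell_{\mathrm{Hub},\delta}(h(x), y) &=
    \begin{cases}
      \tfrac{1}{2}(h(x) - y)^2, & |h(x) - y| \le \delta,\\
      \delta\,|h(x) - y| - \tfrac{1}{2}\delta^2, & \text{otherwise},
    \end{cases} && \text{(Huber loss)}\\
  \ell_p(h(x), y) &= |h(x) - y|^p, && \text{($\ell_p$ loss)}\\
  \ell_{\mathrm{sq}\text{-}\epsilon}(h(x), y) &= \bigl(\max\{0,\ |h(x) - y| - \epsilon\}\bigr)^2. && \text{(squared $\epsilon$-insensitive loss)}
\end{align*}
```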