The paper studies the effectiveness of regression trees, a popular non-parametric method in machine learning, and the subagging (subsample aggregating) technique for improving their performance.
Key highlights:
Pointwise consistency of regression trees: The authors establish sufficient conditions for pointwise consistency of regression trees, showing that the bias depends on the diameter of cells and the variance depends on the number of observations in cells. They provide an algorithm that satisfies the consistency assumptions by controlling the minimum and maximum number of observations in each cell.
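The sketch below shows one way such a cell-size rule could be implemented; it is a minimal illustration, not the authors' exact algorithm, and the names (grow_tree, best_split) and thresholds k_min/k_max are hypothetical choices. Every cell with more than k_max observations is split, and only splits that leave at least k_min observations in each child are admissible.

```python
# Minimal sketch of a regression tree grown under min/max cell-size constraints.
# k_min / k_max and all function names are illustrative assumptions.
import numpy as np

def best_split(X, y, k_min):
    """Return (feature, threshold) of the SSE-minimising split that leaves at
    least k_min observations in each child, or None if no such split exists."""
    n, d = X.shape
    best, best_sse = None, np.inf
    for j in range(d):
        order = np.argsort(X[:, j])
        xs, ys = X[order, j], y[order]
        for i in range(k_min, n - k_min + 1):
            if xs[i - 1] == xs[i]:
                continue                      # threshold would not separate the points
            left, right = ys[:i], ys[i:]
            sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if sse < best_sse:
                best_sse = sse
                best = (j, 0.5 * (xs[i - 1] + xs[i]))
    return best

def grow_tree(X, y, k_min=10, k_max=50):
    """Keep splitting any cell with more than k_max observations (shrinking cell
    diameters, hence the bias) while keeping at least k_min observations per
    cell (controlling the variance)."""
    if len(y) <= k_max:
        return {"leaf": True, "value": y.mean()}
    split = best_split(X, y, k_min)
    if split is None:                          # no admissible split, stop here
        return {"leaf": True, "value": y.mean()}
    j, thr = split
    mask = X[:, j] <= thr
    return {"leaf": False, "feature": j, "threshold": thr,
            "left": grow_tree(X[mask], y[mask], k_min, k_max),
            "right": grow_tree(X[~mask], y[~mask], k_min, k_max)}

def predict(tree, x):
    """Return the fitted cell average for a single query point x."""
    while not tree["leaf"]:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["value"]
```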
Bias-variance trade-off and tree size: The authors illustrate the bias-variance trade-off associated with tree size through simulations. Small trees tend to have high bias but low variance, while large trees have low bias but high variance. Trees grown under the consistency conditions strike a balance between the two.
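A small Monte Carlo sketch of this trade-off follows; the data-generating process, evaluation point, and tree sizes are illustrative assumptions rather than the simulation design used in the paper, and tree size is controlled here via scikit-learn's max_leaf_nodes.

```python
# Illustrative Monte Carlo: bias^2 and variance of a tree's prediction at a
# fixed point, as the number of leaves (cells) grows. DGP is assumed, not the paper's.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
f = lambda x: np.sin(2 * np.pi * x[:, 0])           # true regression function
x0 = np.array([[0.3]])                               # evaluation point
n, n_rep = 500, 200

for leaves in (2, 8, 32, 128):                       # tree size = number of cells
    preds = np.empty(n_rep)
    for r in range(n_rep):
        X = rng.uniform(0, 1, size=(n, 1))
        y = f(X) + rng.normal(0, 0.5, size=n)
        tree = DecisionTreeRegressor(max_leaf_nodes=leaves).fit(X, y)
        preds[r] = tree.predict(x0)[0]
    bias2 = (preds.mean() - f(x0)[0]) ** 2
    var = preds.var()
    print(f"{leaves:4d} leaves: bias^2={bias2:.4f}  var={var:.4f}  mse={bias2 + var:.4f}")
```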
Subagging consistent trees: The authors show that subagging consistent (and hence stable) trees does not affect the bias but can reduce the variance, because the subagged estimator averages over more observations than the original tree does.
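A minimal sketch of subagging is given below: each tree is fit on a subsample drawn without replacement and the predictions are averaged. The function name subag_predict, the subsample fraction, and the number of estimators are hypothetical defaults, not values prescribed by the paper.

```python
# Minimal subagging (subsample aggregating) sketch; all defaults are assumptions.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def subag_predict(X, y, X_test, n_estimators=100, subsample_frac=0.5,
                  rng=None, **tree_kwargs):
    """Average the predictions of trees fit on subsamples drawn without replacement."""
    rng = np.random.default_rng(rng)
    n = len(y)
    m = int(subsample_frac * n)
    preds = np.zeros((n_estimators, len(X_test)))
    for b in range(n_estimators):
        idx = rng.choice(n, size=m, replace=False)    # subsample, no replacement
        tree = DecisionTreeRegressor(**tree_kwargs).fit(X[idx], y[idx])
        preds[b] = tree.predict(X_test)
    return preds.mean(axis=0)                         # aggregate by averaging

# Example usage (hypothetical data): y_hat = subag_predict(X, y, X_test, min_samples_leaf=10)
```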
Subagging small trees: The authors analyze the effect of subagging on stumps (single-split trees) as a proxy for small trees. They show that subagging increases the number of distinct observations used to estimate the target and covers a wider part of the feature space than a single tree does. Subagging also reduces the variance around the split point, where a single tree has high variance.
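The sketch below illustrates this variance effect around the split point with a simple step-function example; the data-generating process and all parameter values are illustrative assumptions, not taken from the paper.

```python
# Compare the variance of a single stump with that of subagged stumps at
# points near the true split (0.5). DGP and parameters are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
n, n_rep, B, m = 200, 200, 50, 100                    # m = subsample size
x_eval = np.array([[0.48], [0.50], [0.52]])           # points around the split
f = lambda x: (x[:, 0] > 0.5).astype(float)           # true step function

single = np.empty((n_rep, len(x_eval)))
subag = np.empty((n_rep, len(x_eval)))
for r in range(n_rep):
    X = rng.uniform(0, 1, size=(n, 1))
    y = f(X) + rng.normal(0, 0.3, size=n)
    single[r] = DecisionTreeRegressor(max_depth=1).fit(X, y).predict(x_eval)
    preds = np.zeros((B, len(x_eval)))
    for b in range(B):
        idx = rng.choice(n, size=m, replace=False)    # subsample without replacement
        preds[b] = DecisionTreeRegressor(max_depth=1).fit(X[idx], y[idx]).predict(x_eval)
    subag[r] = preds.mean(axis=0)

print("variance of single stump:  ", single.var(axis=0).round(4))
print("variance of subagged stumps:", subag.var(axis=0).round(4))
```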
Optimal tree size: The authors find that a single tree grown at its optimal size can outperform subagging when the size of the ensemble's individual trees is not chosen optimally. This suggests that subagging large trees is not always a good idea, and that the tree size used in the ensemble method should be determined based on the optimal size for a single tree.