Key Concepts
CSForest optimizes set-valued predictions for inliers and outliers under distributional shifts, outperforming state-of-the-art methods.
Summary
The article introduces CSForest, a semi-supervised random forest classifier that enhances prediction accuracy and outlier detection by incorporating unlabeled test samples. It addresses distributional shifts between training and test data and provides a theoretical guarantee for true label coverage. Extensive experiments demonstrate CSForest's effectiveness in predicting inliers and detecting outliers across various datasets.
Introduction
The Random Forest classifier assumes that training and test samples come from the same distribution.
Discrepancies between training and test sets pose challenges in safety-critical scenarios.
Conformal Prediction
The Jackknife+aB conformalization technique is used to construct the set-valued prediction C(x).
CSForest employs unlabeled test samples for enhanced accuracy and outlier detection.
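To make the idea of a set-valued prediction C(x) concrete, here is a minimal sketch of a generic class-wise conformal prediction set. It is not CSForest and not Jackknife+aB: it assumes a held-out calibration set and some per-class conformity score (higher means "looks more like this class"), both of which are illustrative assumptions rather than details from the article. The function names `conformal_p_value` and `prediction_set` and the toy scores are hypothetical. A label enters C(x) when its rank-based p-value exceeds alpha, and an empty set can be read as flagging a possible outlier.

```python
from typing import Dict, List, Set


def conformal_p_value(cal_scores: List[float], test_score: float) -> float:
    """Rank-based p-value: share of scores (calibration plus the test point
    itself) that are less than or equal to the test score."""
    n_le = sum(1 for s in cal_scores if s <= test_score)
    return (n_le + 1) / (len(cal_scores) + 1)


def prediction_set(
    cal_by_class: Dict[str, List[float]],
    test_by_class: Dict[str, float],
    alpha: float,
) -> Set[str]:
    """Return every label whose p-value exceeds alpha.

    An empty set means no class finds the point typical, which a
    set-valued classifier can treat as an outlier flag.
    """
    return {
        label
        for label, cal in cal_by_class.items()
        if conformal_p_value(cal, test_by_class[label]) > alpha
    }


# Toy calibration scores per class (assumed values, for illustration only).
cal_by_class = {
    "a": [0.2, 0.4, 0.5, 0.6, 0.7, 0.75, 0.8, 0.85, 0.9],
    "b": [0.3, 0.45, 0.5, 0.55, 0.6, 0.7, 0.8, 0.9, 0.95],
}

# A point that looks typical for class "a" but not for class "b".
inlier_set = prediction_set(cal_by_class, {"a": 0.65, "b": 0.05}, alpha=0.2)

# A point that looks atypical for every class.
outlier_set = prediction_set(cal_by_class, {"a": 0.05, "b": 0.05}, alpha=0.2)

print(inlier_set)   # {'a'}
print(outlier_set)  # set()
```

In this sketch the conformity scores are supplied directly; in practice they would come from a fitted model, and the choice of score determines how well outliers are separated from inliers.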
Comparison with State-of-the-Art Methods
CSForest is compared with state-of-the-art methods on synthetic examples and real-world datasets.
The results highlight effective prediction of inliers and detection of outliers that appear only in the test data.
Related Work
The Distribution Shift and Generalized Label Shift models are discussed.
CSForest is compared with BCOPS, DC, CRF, ACRF, and ACRFshift.
Experiments
Synthetic data and real-world data evaluations conducted.
Performance compared under different distributional shift settings.
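As an illustration of how set-valued predictions might be scored in such comparisons, the sketch below computes two common summaries: the coverage rate on inliers (how often the true label falls inside the predicted set) and the outlier detection rate (how often an outlier receives an empty set). These metric definitions, the helper names, and the toy data are assumptions for illustration; the summary does not state which exact metrics the article reports.

```python
from typing import List, Optional, Set


def coverage_rate(pred_sets: List[Set[str]], true_labels: List[Optional[str]]) -> float:
    """Fraction of inliers (label is not None) whose true label is in the set."""
    pairs = [(c, y) for c, y in zip(pred_sets, true_labels) if y is not None]
    if not pairs:
        raise ValueError("no inliers to evaluate")
    return sum(y in c for c, y in pairs) / len(pairs)


def outlier_detection_rate(pred_sets: List[Set[str]], true_labels: List[Optional[str]]) -> float:
    """Fraction of outliers (label is None) that receive an empty prediction set."""
    outlier_sets = [c for c, y in zip(pred_sets, true_labels) if y is None]
    if not outlier_sets:
        raise ValueError("no outliers to evaluate")
    return sum(len(c) == 0 for c in outlier_sets) / len(outlier_sets)


# Toy example: None marks a test point from a class unseen in training.
pred_sets = [{"a"}, {"a", "b"}, {"b"}, set(), {"a"}]
true_labels = ["a", "b", "a", None, None]

cov = coverage_rate(pred_sets, true_labels)            # 2 of 3 inliers covered
det = outlier_detection_rate(pred_sets, true_labels)   # 1 of 2 outliers flagged
```

Reporting both numbers together matters because a method can trivially reach full coverage by always returning every label, at the cost of never flagging an outlier.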
Varying Sample Sizes
Comparison of methods under varying sample sizes.
CSForest and BCOPS outperform other methods for outlier detection.
Discussion
Future directions include exploring outlier detection with limited test samples.
Potential extension of CSForest under adversarial settings.
Statistics
CSForest optimizes set-valued predictions to handle inliers and outliers effectively.
CSForest delivers effective predictions under distribution shift and outperforms state-of-the-art methods.
Quotes
"CSForest couples conformalization technique with semi-supervised tree ensembles for set-valued predictions."
"CSForest demonstrates robustness in covering true labels under varying degrees of data drift."