Core Concept
An active learning strategy for regression problems, built on the Wasserstein distance and GroupSort neural networks.
Abstract
The paper introduces a new active learning strategy for regression using the Wasserstein distance and GroupSort neural networks. It focuses on distribution matching, uncertainty-based sampling, and diversity to improve estimation accuracy. The study compares this method with other classical and recent solutions, showing its effectiveness in achieving precise estimations faster.
- Introduction
- Challenges in data collection for machine learning.
- Importance of active learning in reducing labeling costs.
- Active Learning Framework
- Estimating unknown functions with labeled and unlabeled data subsets.
- Utilizing an estimator belonging to a class of neural networks.
- Wasserstein Distance
- Definition of the Wasserstein distance for probability measures on metric spaces.
- GroupSort Neural Networks
- Introduction to the GroupSort activation function and the corresponding neural network architecture.
- Theoretical Foundations
- Assumptions and mathematical background underlying the approach.
- Training the Estimator
- Training by minimizing a loss function over a class of Lipschitz functions.
- Minimizing Uncertainty and Query Procedure
- Construction of a score function based on the Wasserstein distance and an uncertainty-based method.
- Numerical Experiments
- Comparison of different models on various datasets using RMSE metrics.
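The two building blocks named in the outline above, the GroupSort activation and the Wasserstein distance, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `group_sort` and `wasserstein1_1d` are hypothetical, the default group size of 2 and the ascending sort order are assumptions, and the distance is restricted to the simplest case of two equal-size one-dimensional samples with uniform weights, where the 1-Wasserstein distance reduces to the mean absolute difference between sorted values.

```python
def group_sort(values, group_size=2):
    """GroupSort activation on a flat list of pre-activations.

    Splits `values` into consecutive groups of `group_size` and sorts each
    group in ascending order. The values are only permuted within groups,
    never rescaled. `group_size=2` is an assumed default.
    """
    if group_size < 1 or len(values) % group_size != 0:
        raise ValueError("len(values) must be a multiple of group_size")
    out = []
    for start in range(0, len(values), group_size):
        out.extend(sorted(values[start:start + group_size]))
    return out


def wasserstein1_1d(sample_a, sample_b):
    """1-Wasserstein distance between two equal-size 1-D empirical samples.

    With uniform weights and equal sizes, the optimal coupling matches the
    i-th smallest point of one sample to the i-th smallest of the other.
    """
    if len(sample_a) != len(sample_b) or not sample_a:
        raise ValueError("samples must be non-empty and of equal size")
    a, b = sorted(sample_a), sorted(sample_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Toy values, chosen only for illustration.
activated = group_sort([3.0, 1.0, 2.0, 5.0])
shift = wasserstein1_1d([0.0, 1.0], [1.0, 2.0])
```

In a distribution-matching setting, a distance of this kind would be evaluated between the labeled and unlabeled subsets. In higher dimensions it has no such closed form and is typically estimated through its dual formulation over 1-Lipschitz functions, which is where Lipschitz-constrained networks such as GroupSort networks come in.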
Statistics
When 25% of the data is labeled, the RMSE of WAR is 3.63 on Boston, 8.67 on Airfoil, 2.65 on Energy1, 2.71 on Yacht, and 6.15 on Concrete.
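For reference, RMSE is the root mean squared error between true and predicted targets. A minimal sketch of the metric follows, assuming plain Python lists as input. The helper name `rmse` and the toy values are illustrative only and are unrelated to the datasets listed here.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy example: the errors are 3 and 4, so RMSE = sqrt((9 + 16) / 2).
score = rmse([0.0, 0.0], [3.0, 4.0])
```

Because RMSE is expressed in the units of the target variable, values are comparable across methods on the same dataset but not across datasets.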