Core Concept
A novel framework for non-convex robust hypothesis testing that uses Sinkhorn discrepancy-based uncertainty sets to optimize detectors efficiently.
Abstract
The article introduces a new framework for non-convex robust hypothesis testing, focusing on optimizing detectors to minimize type-I and type-II errors. It discusses the challenges in existing methods and proposes an exact mixed-integer exponential conic reformulation for global optimization. The article also explores the connection between robust hypothesis testing and regularized risk functions, highlighting the computational efficiency and statistical performance of the proposed framework through a numerical study.
Introduction
- Hypothesis testing is fundamental in statistics.
- Distributionally robust hypothesis testing addresses challenges with unknown true distributions.
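As a minimal illustration of the two error types a detector trades off, here is a toy sketch (data and threshold are hypothetical, not from the paper): a scalar threshold test between two unit-variance Gaussians, with both errors estimated by Monte Carlo.

```python
import random

random.seed(0)

# Hypothetical toy setup: H0 ~ N(0, 1), H1 ~ N(2, 1).
# A threshold detector rejects H0 when x > t; for equal-variance
# Gaussians this coincides with the likelihood-ratio test.
MU0, MU1, T = 0.0, 2.0, 1.0  # threshold at the midpoint of the means
N = 100_000

h0_samples = [random.gauss(MU0, 1.0) for _ in range(N)]
h1_samples = [random.gauss(MU1, 1.0) for _ in range(N)]

type_i = sum(x > T for x in h0_samples) / N    # reject H0 when H0 is true
type_ii = sum(x <= T for x in h1_samples) / N  # accept H0 when H1 is true

print(f"type-I error ~ {type_i:.3f}, type-II error ~ {type_ii:.3f}")
```

Robust formulations then ask for a detector minimizing the worst case of these errors over an uncertainty set of distributions, rather than under the two fixed Gaussians above.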
Robust Detector Construction
- Uncertainty sets are crucial for computational tractability and performance.
- Existing approaches construct uncertainty sets using various statistical divergences.
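One classical divergence used for this purpose is the Kullback-Leibler divergence; the sketch below (numbers are illustrative, not from the paper) checks whether a candidate distribution lies inside a KL ball of radius rho around an empirical distribution.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A KL ball {Q : KL(Q || P_hat) <= rho} around the empirical distribution
# P_hat is one common choice of uncertainty set; rho is its radius.
p_hat = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
rho = 0.05
inside = kl_divergence(q, p_hat) <= rho  # is q in the uncertainty set?
```

Different divergences (KL, total variation, Wasserstein, Sinkhorn) yield uncertainty sets with different tractability and conservativeness properties.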
Sinkhorn Discrepancy
- Utilizing Sinkhorn discrepancy-based uncertainty sets for robust hypothesis testing.
- Proposed framework balances computational efficiency with statistical performance.
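The Sinkhorn discrepancy underlying these uncertainty sets is entropy-regularized optimal transport, computable by Sinkhorn-Knopp matrix scaling. A minimal sketch (toy 3-point distributions, not data from the paper):

```python
import math

def sinkhorn(a, b, cost, eps=1.0, iters=200):
    """Entropy-regularized optimal transport between discrete distributions
    a and b; returns the transport plan from Sinkhorn-Knopp iterations."""
    n, m = len(a), len(b)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):  # alternately rescale to match both marginals
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy example: two 3-point distributions on the line, squared-distance cost.
pts = [0.0, 1.0, 2.0]
a = [0.5, 0.3, 0.2]
b = [0.2, 0.3, 0.5]
cost = [[(x - y) ** 2 for y in pts] for x in pts]
plan = sinkhorn(a, b, cost)
row_sums = [sum(row) for row in plan]  # should approximate a
```

The entropic regularization parameter eps controls the trade-off: larger eps gives faster, smoother computations at the price of a blurrier transport plan.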
Optimization Methodology
- Exact mixed-integer exponential conic reformulation introduced for globally solving the non-convex problem.
- Convex approximation method discussed as an alternative approach.
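The paper's exact mixed-integer exponential conic reformulation requires a conic solver; as a generic illustration of the convex-approximation idea only (not the paper's method, and with hypothetical data), the sketch below replaces the non-convex 0-1 detector risk with a convex logistic surrogate and minimizes it by gradient descent.

```python
import math
import random

random.seed(1)

# Hypothetical samples: the detector is a scalar threshold t, and the
# non-convex 0-1 risk is replaced by a convex logistic surrogate.
xs0 = [random.gauss(0.0, 1.0) for _ in range(200)]  # H0 sample
xs1 = [random.gauss(2.0, 1.0) for _ in range(200)]  # H1 sample

def surrogate_risk(t):
    # log(1 + e^z) upper-bounds the indicators 1{x > t} and 1{x <= t}
    r0 = sum(math.log1p(math.exp(x - t)) for x in xs0) / len(xs0)
    r1 = sum(math.log1p(math.exp(t - x)) for x in xs1) / len(xs1)
    return r0 + r1

def grad(t):
    g0 = sum(-1.0 / (1.0 + math.exp(t - x)) for x in xs0) / len(xs0)
    g1 = sum(1.0 / (1.0 + math.exp(x - t)) for x in xs1) / len(xs1)
    return g0 + g1

t = 0.0
for _ in range(500):  # gradient descent on the convex surrogate
    t -= 0.5 * grad(t)
print(f"learned threshold ~ {t:.2f}")  # near the midpoint of the two means
```

Convex surrogates of this kind are fast but only approximate; the exact mixed-integer reformulation is what guarantees global optimality for the original 0-1 risk.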
Regularization Effects
- Regularized formulations of non-robust risk functions explored under different scaling regimes.
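The robustness-regularization connection can be seen in a much simpler setting than the paper's Sinkhorn analysis (this toy 1-D case is only an illustration of the general phenomenon): for a linear loss, the worst case over bounded per-sample perturbations equals the nominal empirical loss plus a norm penalty scaled by the perturbation budget.

```python
# Toy illustration (not the paper's Sinkhorn result): for a linear loss
# l(x) = w * x, the worst case over perturbations |delta| <= eps equals
# the nominal empirical loss plus the regularizer eps * |w|.
w, eps = 1.5, 0.3
xs = [0.2, -1.0, 0.7, 2.4]

# brute-force worst case over a fine grid of admissible perturbations
grid = [i * eps / 1000 for i in range(-1000, 1001)]
robust = sum(max(w * (x + d) for d in grid) for x in xs) / len(xs)

# closed form: empirical loss plus the norm penalty
regularized = sum(w * x for x in xs) / len(xs) + eps * abs(w)
print(abs(robust - regularized) < 1e-9)  # True
```

The scaling regimes mentioned above govern how the Sinkhorn radius translates into the strength of the induced regularizer.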
Numerical Experiments
- Performance evaluation on synthetic and real datasets showcasing superiority over baseline methods.
Statistics
"The maximum of type-I/type-II error is 0.165, with computational time 264.1s."
"Training Size: 50, Testing Size: 2115, Data Dimension: 784 (MNIST)"
"Training Size: 50, Testing Size: 2000, Data Dimension: 1024 (CIFAR-10)"
"Training Size: 12, Testing Size: 10, Data Dimension: 56 (Lung Cancer)"
"Training Size: 20000, Testing Size: 3662, Data Dimension: 39 (Sepsis)"
Quotes
"Our proposed framework balances the trade-off between computational efficiency and statistical testing performance."
"Sinkhorn discrepancy-based uncertainty set has received great attention due to its data-driven nature and flexibility."
"The contributions include obtaining finite-dimensional optimization reformulation under the random feature model."