Core Concepts
A novel methodology that combines SHAP-based feature importance analysis with an optimal epsilon technique to generate highly effective and precise adversarial samples that can evade machine learning models.
Summary
The paper introduces a comprehensive methodology for conducting evasion attacks on machine learning models. The key components are:
- Feature Importance Analysis using SHAP:
  - Applies SHAP (SHapley Additive exPlanations) to analyze the impact of individual features on model predictions in both binary and multiclass classification scenarios.
  - Generates global and local visualizations to understand feature importance and the directional influence of each feature on the model.
- Feature Impact Categorization and Conversion:
  - Categorizes feature impacts as 'Low', 'Medium', or 'High' based on predefined thresholds.
  - Categorizes SHAP values as 'positive', 'neutral', or 'negative' based on their sign.
  - Constructs a SHAP summary dictionary that captures the impact of each feature for each class.
  - Determines the feature modifications needed to convert a sample from one class to another (a code sketch of this step and the SHAP analysis follows the list).
- Optimal Epsilon Technique:
  - Introduces a novel binary search-based approach to systematically determine the minimum epsilon (perturbation magnitude) required for successful evasion.
  - Iteratively generates adversarial samples and evaluates their effectiveness on the target model to find the optimal epsilon (a second sketch below illustrates this search).
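The paper does not include source code, but the first two components can be sketched in a few lines. The snippet below is a minimal reconstruction using the `shap` library and scikit-learn on the Iris data; the impact thresholds (0.05/0.15), the helper names (`categorize_magnitude`, `categorize_sign`, `conversion_plan`), and the class-conversion heuristic are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np
import shap
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Fit a simple multiclass model on Iris (one of the paper's evaluation datasets).
data = load_iris()
X, y, feature_names = data.data, data.target, data.feature_names
model = LogisticRegression(max_iter=1000).fit(X, y)

# Model-agnostic SHAP values; shap.summary_plot over these gives the global view.
explainer = shap.KernelExplainer(model.predict_proba, shap.sample(X, 50))
shap_values = explainer.shap_values(X[:20])
# Older shap versions return a list of per-class arrays, newer ones a 3-D array.
if isinstance(shap_values, list):
    shap_values = np.stack(shap_values, axis=-1)  # (n_samples, n_features, n_classes)

def categorize_magnitude(value, low=0.05, high=0.15):
    """Label |SHAP value| as 'Low' / 'Medium' / 'High' (assumed thresholds)."""
    a = abs(value)
    return "Low" if a < low else ("Medium" if a < high else "High")

def categorize_sign(value, eps=1e-6):
    """Label a SHAP value as 'positive' / 'neutral' / 'negative' by its sign."""
    return "positive" if value > eps else ("negative" if value < -eps else "neutral")

# SHAP summary dictionary: mean impact, impact level, and direction per class and feature.
shap_summary = {
    cls: {
        name: {
            "mean_shap": float(shap_values[:, i, cls].mean()),
            "impact": categorize_magnitude(shap_values[:, i, cls].mean()),
            "direction": categorize_sign(shap_values[:, i, cls].mean()),
        }
        for i, name in enumerate(feature_names)
    }
    for cls in range(shap_values.shape[2])
}

def conversion_plan(summary, source_cls, target_cls):
    """Illustrative heuristic: lower features that support the source class,
    raise features that support the target class."""
    plan = {}
    for name in feature_names:
        if summary[source_cls][name]["direction"] == "positive":
            plan[name] = "decrease"
        elif summary[target_cls][name]["direction"] == "positive":
            plan[name] = "increase"
    return plan

print(conversion_plan(shap_summary, source_cls=0, target_cls=1))
```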
The methodology is evaluated on the Iris and Bank Marketing datasets across various machine learning models, demonstrating its effectiveness in generating precise adversarial samples and exposing model vulnerabilities. The paper also includes a comparative analysis against established adversarial attack methods.
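The optimal epsilon technique is, at its core, a binary search over the perturbation magnitude. A minimal sketch is shown below, assuming a fixed SHAP-derived `direction` vector (built, for example, from the conversion plan above); the search bounds, tolerance, and stopping criterion are illustrative assumptions and may differ from the authors' procedure.

```python
import numpy as np

def optimal_epsilon(model, x, true_label, direction,
                    eps_low=0.0, eps_high=5.0, tol=1e-3, max_iter=50):
    """Binary-search the smallest epsilon (within tol) whose perturbation
    along `direction` flips the model's prediction away from true_label."""
    x = np.asarray(x, dtype=float)
    direction = np.asarray(direction, dtype=float)

    def evades(eps):
        x_adv = x + eps * direction
        return model.predict(x_adv.reshape(1, -1))[0] != true_label

    if not evades(eps_high):
        return None, None                  # no evasion inside the search interval

    for _ in range(max_iter):
        if eps_high - eps_low <= tol:
            break
        mid = (eps_low + eps_high) / 2.0
        if evades(mid):
            eps_high = mid                 # evasion succeeded: try a smaller perturbation
        else:
            eps_low = mid                  # evasion failed: a larger perturbation is needed

    return eps_high, x + eps_high * direction

# Example call, reusing the model from the previous sketch and a hypothetical
# direction vector (+1 = increase the feature, -1 = decrease it, 0 = leave it):
# eps, x_adv = optimal_epsilon(model, X[0], y[0], direction=[0.0, -1.0, 1.0, 1.0])
```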
Stats
"The accuracy of the models decreases significantly with the application of the evasion attacks. For the Iris dataset, the accuracy drops from 0.92 to 0.00 for the SVM model with RBF kernel, from 0.96 to 0.00 for XGBoost, and from 0.94 to 0.00 for Logistic Regression."
"For the Bank Marketing dataset, the accuracy drops from 0.91 to 0.03 for the SVM model with RBF kernel, from 0.96 to 0.00 for XGBoost, and from 0.91 to 0.00 for Logistic Regression."
Quotes
"The novel approach of systematically integrating SHAP-based feature importance analysis into the evasion attack process, allowing for targeted manipulation of the most influential features, leading to more efficient and effective attacks."
"Introduction of a novel and systematic technique for determining the minimum epsilon needed for successful evasion through a binary search-based approach, enhancing the precision of adversarial sample generation and providing a nuanced understanding of model robustness."