Core Concepts
AssetHarvester is a static analysis tool that detects the assets protected by secrets in software artifacts (for example, the database server that a hard-coded credential unlocks), helping developers prioritize secret removal efforts.
Abstract
The paper presents AssetHarvester, a static analysis tool that can detect assets protected by secrets in software artifacts. The key highlights are:
The authors identified four secret-asset co-location patterns in the source code, which form the basis for AssetHarvester's approaches.
AssetHarvester employs three approaches to detect secret-asset pairs: pattern matching, data flow analysis, and fast-approximation heuristics. The data flow analysis approach achieved 100% precision in detecting secret-asset pairs.
The authors curated a benchmark dataset, AssetBench, containing 1,791 secret-asset pairs extracted from 188 public GitHub repositories. They evaluated AssetHarvester against AssetBench and achieved an overall precision of 97%, recall of 90%, and F1-score of 94%.
The authors discuss how AssetHarvester can be extended beyond the database assets covered in this study to detect the assets protected by other secret types (e.g., API keys, private keys).
The authors highlight that data flow analysis can improve the recall of existing secret detection tools by identifying secrets that are missed by regex-based approaches.
Stats
"GitGuardian monitored secrets exposure in public GitHub repositories and reported that developers leaked over 12 million secrets (database and other credentials) in 2023, indicating a 113% surge from 2021."
"Existing secret detection tools demonstrate a precision of less than 7% and a recall of only 3%, leading developers to ignore the reported warnings."
Quotes
"Each secret protects assets of different values accessible through asset identifiers (a DNS name and a public or private IP address). The asset information for a secret can aid developers in filtering false positives and prioritizing secret removal from the source code."
"Data flow analysis employed in AssetHarvester detects secret-asset pairs with 0% false positives and aids in improving the recall of secret detection tools."