The proposed Min-K%++ method normalizes each target token's log probability by the mean and standard deviation of the log probabilities under the model's categorical next-token distribution over the whole vocabulary. The normalized score reflects how likely the target token is relative to the other candidate tokens, which gives a more informative signal for detecting pre-training data than the prior state-of-the-art Min-K% method.
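The normalization described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' reference implementation: the function name `min_k_plus_plus`, the input layout (one list of raw logits per position), the variance floor, and the default `k=0.2` are all hypothetical choices made for the example.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def min_k_plus_plus(logits_per_step, token_ids, k=0.2):
    """Average the lowest k fraction of vocabulary-normalized token scores.

    logits_per_step: one list of next-token logits per position (assumed layout).
    token_ids: index of the observed token at each position.
    k: fraction of lowest-scoring tokens to average (illustrative default).
    """
    scores = []
    for logits, tok in zip(logits_per_step, token_ids):
        log_p = log_softmax(logits)
        p = [math.exp(lp) for lp in log_p]
        # Mean and variance of log p(z) with z drawn from the next-token distribution.
        mu = sum(pi * lpi for pi, lpi in zip(p, log_p))
        var = sum(pi * (lpi - mu) ** 2 for pi, lpi in zip(p, log_p))
        sigma = math.sqrt(max(var, 1e-12))  # floor avoids division by zero (assumption)
        scores.append((log_p[tok] - mu) / sigma)
    n = max(1, math.ceil(k * len(scores)))
    lowest = sorted(scores)[:n]
    return sum(lowest) / n

# Toy logits over a 3-token vocabulary; index 0 is strongly preferred at both positions.
peaked = [[5.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
score_mode = min_k_plus_plus(peaked, [0, 0])   # observed tokens are the mode
score_other = min_k_plus_plus(peaked, [1, 1])  # observed tokens are unlikely
```

In this sketch a higher score means the observed tokens sit nearer the top of the model's next-token distribution, which is the signal a detector would threshold to flag likely training data.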
Whether the detector has access to the prompt used to generate a text significantly affects the accuracy of likelihood-based zero-shot detectors: white-box detection (with the prompt) achieves a substantially higher AUC than black-box detection (without the prompt).
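To make the white-box versus black-box distinction concrete, the sketch below scores the same text with and without its generating prompt in the context. It is only an illustration: `mean_log_likelihood`, the callable interface `next_token_log_probs(context)`, and the two-word `toy_model` are hypothetical stand-ins for a real language model, and the numbers inside the toy model are made up purely to show the mechanism.

```python
import math

def mean_log_likelihood(next_token_log_probs, text_tokens, prompt_tokens=()):
    """Average log-likelihood of text_tokens, optionally conditioned on a prompt.

    next_token_log_probs: callable mapping a context tuple to a dict of
        token -> log probability (assumed interface, not a real library API).
    text_tokens: non-empty sequence of tokens to score.
    prompt_tokens: tokens placed before the text; empty means black-box scoring.
    """
    context = list(prompt_tokens)
    total = 0.0
    for tok in text_tokens:
        total += next_token_log_probs(tuple(context))[tok]
        context.append(tok)
    return total / len(text_tokens)

def toy_model(context):
    """Hypothetical two-word model: the prompt token shifts mass toward 'cat'."""
    p_cat = 0.9 if "about-cats" in context else 0.5
    return {"cat": math.log(p_cat), "dog": math.log(1.0 - p_cat)}

generated = ["cat", "cat", "cat"]
white_box = mean_log_likelihood(toy_model, generated, prompt_tokens=("about-cats",))
black_box = mean_log_likelihood(toy_model, generated)
```

In the toy setup the text receives a higher likelihood when the prompt is in the context, because the text was produced under that conditional distribution. A detector that scores without the prompt evaluates a different distribution from the one that generated the text, which is one plausible reading of why the black-box setting is harder.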