The author presents WIMBD (What's In My Big Data?), a platform for analyzing large text corpora, and uses it to surface findings on data quality, benchmark contamination, and personally identifiable information.