Leveraging Large Language Models for Semantic Table Profiling to Enhance Data Quality Analysis
Cocoon is a data profiling system that integrates Large Language Models (LLMs) to imbue statistical profiling with semantics. It extends traditional profiling methods with a three-step process (Semantic Context, Semantic Profile, and Semantic Review) that uses the semantics of real-world datasets to accurately discern whether data anomalies are genuine errors or acceptable variations.
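To make the three-step process concrete, the following is a minimal sketch of one plausible reading of it, not Cocoon's actual implementation. The text above does not specify what each step computes, so the function names, the prompts, the median-absolute-deviation outlier rule, and the `stub_llm` stand-in (which returns canned answers instead of calling a real model) are all assumptions made for illustration.

```python
from statistics import median


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns canned demo answers."""
    if prompt.startswith("Describe"):
        return "age of a person in years"
    return "0,120"  # assumed plausible min,max for the described column


def statistical_outliers(values, k=3.0):
    """Traditional profiling: flag values far from the median (MAD rule)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []
    return [v for v in values if abs(v - med) / mad > k]


def semantic_context(column, llm):
    """Step 1 (assumed): ask the LLM what the column represents."""
    return llm(f"Describe what the column '{column}' represents.")


def semantic_profile(context, llm):
    """Step 2 (assumed): derive a semantically valid range from the context."""
    low, high = llm(f"Give a plausible min,max for: {context}").split(",")
    return float(low), float(high)


def semantic_review(flagged, valid_range):
    """Step 3 (assumed): judge each statistical anomaly against the semantics."""
    low, high = valid_range
    return {
        v: "acceptable variation" if low <= v <= high else "genuine error"
        for v in flagged
    }


ages = [34, 29, 41, 38, 97, 33, 250]
flagged = statistical_outliers(ages)          # statistics alone flags 97 and 250
context = semantic_context("age", stub_llm)
verdicts = semantic_review(flagged, semantic_profile(context, stub_llm))
print(verdicts)  # {97: 'acceptable variation', 250: 'genuine error'}
```

In this toy example, statistics alone treat both 97 and 250 as anomalies; only the semantic range separates a rare but valid age from an impossible one. A real system would replace `stub_llm` with an actual model call and would need to validate the model's answer before parsing it.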