Core Concepts
Modern lossy compression methods can improve compression ratios by 50-100× while incurring a quality loss of 1% or less, guiding the future use and design of lossy compressors for ML/AI.
Abstract
This work examines the impact of lossy compression on machine learning and artificial intelligence training sets. It introduces a systematic methodology for evaluating data reduction techniques, showing that modern lossy compression methods can substantially improve compression ratios with minimal quality loss. The study covers a range of applications and error-bounded compression methods, and offers insights for both practitioners and compressor designers.
Structure:
Introduction to ML/AI in HPC applications requiring vast data volumes.
Importance of data reduction techniques like compression.
Methodology for evaluating 17+ data reduction methods on 7 ML/AI applications.
Results showing significant improvements in compression ratios with minimal quality loss.
Insights on the effectiveness of error-bounded compressors and of value-range-relative error bounds applied per column.
Performance evaluation and scalability considerations for parallel compression.
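The per-column, value-range-relative error bound mentioned above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the absolute bound for each column is derived as `rel_bound * (max - min)`, and a simple uniform quantizer stands in for a real error-bounded compressor such as SZ or ZFP.

```python
import numpy as np

def quantize_column(col, rel_bound):
    """Apply a value-range-relative error bound to one column.

    The absolute error bound is rel_bound * (max - min) of the column,
    so columns with very different value ranges are treated uniformly.
    Uniform quantization with step 2*abs_bound guarantees that every
    reconstructed value differs from the original by at most abs_bound.
    """
    lo, hi = col.min(), col.max()
    abs_bound = rel_bound * (hi - lo)
    if abs_bound == 0:          # constant column: nothing to quantize
        return col.copy()
    step = 2 * abs_bound
    return np.round(col / step) * step

# Hypothetical training-set columns with very different scales.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 3)) * np.array([1.0, 100.0, 0.01])

rel_bound = 1e-2  # 1% of each column's value range
recon = np.column_stack(
    [quantize_column(data[:, j], rel_bound) for j in range(data.shape[1])]
)

# Verify the per-column bound holds after "decompression".
for j in range(data.shape[1]):
    value_range = data[:, j].max() - data[:, j].min()
    assert np.all(np.abs(data[:, j] - recon[:, j]) <= rel_bound * value_range + 1e-12)
```

Bounding error relative to each column's own value range, rather than using one absolute bound for the whole dataset, is what lets heterogeneous features tolerate aggressive compression without distorting small-scale columns.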
Stats
Modern lossy compression methods can improve compression ratios by 50-100× with a quality loss of 1% or less.