The paper introduces a novel approach to first-shot unsupervised anomalous sound detection (ASD) that leverages metadata and audio generation. The proposed method, FS-TWFR-GMM, optimizes the hyperparameter r to distinguish effectively between normal and abnormal sounds. By synthesizing machine sounds and fine-tuning models on them, the approach shows promising results in detecting unseen anomalies in new machine types.
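To make the role of the hyperparameter r more concrete, here is a minimal sketch of one way a time-weighted pooling step could work. The summary does not define TWFR, so the formula is an assumption: each frequency bin is sorted over time in descending order, and the i-th largest value is weighted by r**i. The function name `gwrp_pool` and the toy spectrogram are hypothetical. Under this assumption, r = 1 reduces to mean pooling over time and r = 0 to max pooling, so tuning r trades off between stationary and transient sound characteristics.

```python
def gwrp_pool(spectrogram, r):
    """Pool a spectrogram over time into one value per frequency bin.

    spectrogram: list of rows, one per frequency bin, each a non-empty
    list of frame energies.
    Assumed weighting (not taken from the summary): sort each row in
    descending order and weight the i-th largest value by r**i,
    normalised by the sum of the weights.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    pooled = []
    for row in spectrogram:
        ranked = sorted(row, reverse=True)
        # Python defines 0.0 ** 0 as 1.0, so r = 0 keeps only the maximum.
        weights = [r ** i for i in range(len(ranked))]
        total = sum(weights)
        pooled.append(sum(w * v for w, v in zip(weights, ranked)) / total)
    return pooled


# Toy 2-bin, 4-frame spectrogram (made-up values for illustration).
spec = [[1.0, 4.0, 2.0, 3.0], [0.5, 0.5, 2.5, 0.5]]
mean_like = gwrp_pool(spec, 1.0)  # behaves like mean pooling
max_like = gwrp_pool(spec, 0.0)   # behaves like max pooling
mid = gwrp_pool(spec, 0.5)        # lies between the two extremes
```

In a full pipeline, the pooled vectors from normal clips would then be modelled with a density estimator such as a GMM, and r would be chosen by how well the resulting scores separate normal from (synthesized) abnormal clips. That selection loop is omitted here.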
The paper addresses challenges in adapting existing ASD methods to first-shot tasks due to the lack of anomaly data for target machines. By utilizing text-to-audio generation models and TWFR-GMM algorithms, the proposed framework estimates unknown anomalies efficiently. The experiments demonstrate competitive performance compared to top systems in the DCASE 2023 Challenge Task 2 while requiring significantly fewer resources.
The study highlights the importance of leveraging all available training data, including metadata and sound information, to improve anomaly detection accuracy. By fine-tuning models with synthetic data and optimizing hyperparameters, the proposed method achieves effective anomaly detection even without real abnormal sound data for target machines.
Key insights distilled from the paper by Hejing Zhang... (arxiv.org, 03-12-2024):
https://arxiv.org/pdf/2310.14173.pdf