The paper introduces an approach to first-shot unsupervised anomalous sound detection that uses metadata and audio generation. The proposed method, FS-TWFR-GMM, tunes its hyperparameter r so that normal and abnormal sounds are separated as well as possible. By synthesizing machine sounds and fine-tuning models, the approach shows promising results in detecting unseen anomalies in new machine types.
The paper addresses the difficulty of adapting existing anomalous sound detection (ASD) methods to first-shot tasks, where no anomaly data is available for the target machines. The proposed framework combines a text-to-audio generation model with the TWFR-GMM algorithm to estimate the unknown anomalies. In the experiments it performs competitively with the top systems in DCASE 2023 Challenge Task 2 while requiring significantly fewer resources.
The study highlights the value of using all available training data, both the sound recordings and their metadata, to improve anomaly detection accuracy. By fine-tuning models with synthetic data and optimizing hyperparameters, the proposed method detects anomalies effectively even without real abnormal sound data for the target machines.
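The idea of choosing r without real anomalies can be illustrated with a small sketch. This is not the paper's implementation: it assumes that r is a time-pooling weight in the style of global weighted ranking pooling (r = 0 gives max pooling, r = 1 gives mean pooling), replaces the GMM with a single diagonal Gaussian to stay dependency-free, and uses random toy "clips" in place of real and generated machine sounds. All function names (`gwrp`, `fit_gaussian`, `select_r`, `make_clip`) are hypothetical.

```python
import math
import random


def gwrp(frames, r):
    """Pool a clip (list of frames, each a list of per-bin values) over time.

    Assumed form: values in each bin are sorted in descending order and
    weighted by r**rank, so r=0 is max pooling and r=1 is mean pooling.
    """
    weights = [r ** i for i in range(len(frames))]  # 0**0 == 1 in Python
    total = sum(weights)
    pooled = []
    for b in range(len(frames[0])):
        ranked = sorted((f[b] for f in frames), reverse=True)
        pooled.append(sum(w * v for w, v in zip(weights, ranked)) / total)
    return pooled


def fit_gaussian(vectors):
    """Fit a diagonal Gaussian (a simplified stand-in for the GMM)."""
    n, dims = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dims)]
    var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n + 1e-6
           for d in range(dims)]
    return mean, var


def anomaly_score(vec, model):
    """Negative log-likelihood up to constants; higher means more anomalous."""
    mean, var = model
    return sum((x - m) ** 2 / s + math.log(s) for x, m, s in zip(vec, mean, var))


def auc(normal_scores, anomaly_scores):
    """Probability that an anomaly scores above a normal clip (ties count 0.5)."""
    wins = sum((a > n) + 0.5 * (a == n)
               for a in anomaly_scores for n in normal_scores)
    return wins / (len(normal_scores) * len(anomaly_scores))


def select_r(train_normal, val_normal, synthetic_anomalies, candidates):
    """Pick the r whose model best separates normal clips from synthetic anomalies."""
    best_r, best_auc = None, -1.0
    for r in candidates:
        model = fit_gaussian([gwrp(c, r) for c in train_normal])
        s_norm = [anomaly_score(gwrp(c, r), model) for c in val_normal]
        s_anom = [anomaly_score(gwrp(c, r), model) for c in synthetic_anomalies]
        score = auc(s_norm, s_anom)
        if score > best_auc:
            best_r, best_auc = r, score
    return best_r, best_auc


def make_clip(rng, n_frames=20, n_bins=4, burst=0.0):
    """Toy clip: Gaussian noise, optionally with one transient frame in bin 0."""
    clip = [[rng.gauss(0.0, 1.0) for _ in range(n_bins)] for _ in range(n_frames)]
    if burst:
        clip[rng.randrange(n_frames)][0] += burst
    return clip


rng = random.Random(0)
train = [make_clip(rng) for _ in range(40)]
val = [make_clip(rng) for _ in range(20)]
fake_anomalies = [make_clip(rng, burst=6.0) for _ in range(20)]  # stand-in for generated audio
best_r, best_auc = select_r(train, val, fake_anomalies, [0.0, 0.5, 0.9, 1.0])
print(best_r, best_auc)
```

In this sketch the synthetic anomalies are used only to pick r; the detector itself is still fitted on normal clips alone, which matches the unsupervised setting described above.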
Source: arxiv.org