Core Concepts
Encouraging innovation in audio-driven machine learning through the provision of specialized datasets and benchmarks.
Abstract
The NeurIPS 2023 Machine Learning for Audio Workshop aims to address the scarcity of specialized audio datasets by providing four of them: HUME-PROSODY, HUME-VOCALBURST, MODULATE-SONATA, and MODULATE-STREAM. These datasets let researchers explore tasks such as emotion recognition, vocal burst classification, speech generation, and unsupervised learning from audio. The workshop establishes baselines and encourages collaboration to foster innovation in audio-driven machine learning.
Directory:
Introduction:
Unique challenges of working with audio data in machine learning compared to other fields like computer vision.
Recent renaissance in audio research with a focus on synthesis.
Workshop Audio Datasets:
Overview of datasets provided: HUME-PROSODY, HUME-VOCALBURST, MODULATE-SONATA, and MODULATE-STREAM.
Description of each dataset's content and purpose.
Current Baselines and Machine Learning Tasks:
Tasks assigned to each dataset: Emotion Share Sub-Challenge, ExVo Multi-Task Learning, ExVo Emotion Generation, ExVo Few-Shot Emotion Recognition.
Initial baseline results for each task presented at the workshop.
Summary and Conclusions:
Efforts made by the workshop to encourage innovation in audio-driven machine learning through specialized datasets and benchmarks.
Stats
"The NeurIPS 2023 Machine Learning for Audio Workshop brings together machine learning (ML) experts from various audio domains."
"There are several valuable audio-driven ML tasks from speech emotion recognition to audio event detection."
"A major limitation with audio is the available data; high-quality data collection is time-consuming and costly."
"To encourage researchers with limited access to large-datasets, the organizers first outline several open-source datasets that are available."
Quotes
"The relative scarcity of prior research and this recent boom serves as the primary motivations behind organizing the 2023 NeurIPS Machine Learning for Audio (MLA) Workshop."
"Despite the availability of such datasets, there still exists a scarcity of openly accessible large-scale datasets particularly tailored for more specialized domains."