Proposing speech-dependent models for accurate in-ear own voice simulation in hearables.
Sound event localization and detection (SELD) for a self-moving human requires a specialized dataset and a multi-modal system for improved performance.
The authors aim to address the limitations of conventional SELD systems by introducing a 6DoF SELD dataset for wearable systems that accounts for self-motion. The proposed multi-modal system combines audio and motion-tracking sensor signals to improve SELD performance.