M-BEST-RQ: A Multi-Channel Speech Foundation Model for Wearable Devices
M-BEST-RQ is a multi-channel speech foundation model for wearable devices such as smart glasses. It uses large-scale self-supervised learning to learn array-geometry-agnostic representations, enabling strong performance across multiple downstream applications.