Core Concepts
The proposed DeepSC-SR system learns and extracts text-related semantic features from speech signals, enabling efficient transmission and recovery of text transcriptions at the receiver without performance degradation.
Abstract
The paper presents a novel deep learning-enabled semantic communication system, named DeepSC-SR, for speech recognition. The key highlights are:
DeepSC-SR is designed as an end-to-end system that jointly optimizes the semantic encoder and channel encoder/decoder. The semantic encoder uses CNN and BRNN modules to learn and extract text-related features from the input speech spectrum, which are then transmitted over the wireless channel.
At the receiver, the channel decoder recovers the text features, which are then decoded into the final text transcription using a greedy decoder. The system is trained end-to-end by minimizing the CTC loss.
To enable robust performance across different channel conditions, DeepSC-SR is trained under a fixed channel condition and then shown to adapt well to various testing channel environments without retraining.
Simulation results demonstrate that DeepSC-SR outperforms traditional communication systems in terms of character error rate (CER) and word error rate (WER), especially in the low SNR regime. It also exhibits better robustness to channel variations compared to the benchmark systems.
The proposed DeepSC-SR system provides an efficient semantic communication solution for speech recognition, transmitting only the necessary text-related features while maintaining high recognition accuracy, and adapting well to dynamic channel conditions.
Stats
The simulation results show that the proposed DeepSC-SR system achieves lower CER and WER scores compared to the traditional speech transceiver and text transceiver systems under both AWGN and Rayleigh fading channels.
Quotes
"DeepSC-SR obtains lower CER scores than the speech transceiver and text transceiver under all tested channel environments."
"DeepSC-SR performs steadily when coping with dynamic channels and SNRs while the performance of two benchmarks is quite poor under dynamic channel conditions."
"DeepSC-SR significantly outperforms the benchmarks in the low SNR regime."