Multilingual Turn-taking Prediction Using Voice Activity Projection
The author examines how well a multilingual voice activity projection (VAP) model predicts turn-taking in spoken dialogue, highlighting the importance of language-specific training and prosodic cues.