Generating vivid and emotional 3D co-speech gestures with emotion transitions is crucial for human-machine interaction applications.