Embedded Representation Learning Network for Generating Realistic and Controllable Talking Head Videos
ERLNet is a generative framework that produces realistic talking head videos with precise control over facial expressions and head movements, using FLAME coefficients as an intermediate representation.
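
To illustrate why an explicit intermediate representation allows control, the following is a minimal sketch of a two-stage pipeline in which per-frame coefficients are predicted, edited, and then rendered. It is not ERLNet's implementation. The names `FlameCoeffs`, `predict_coeffs`, `set_yaw`, and `render` are hypothetical, both stages are placeholders for learned models, the coefficient dimensions are arbitrary, and the three-angle pose is a simplification rather than FLAME's actual pose parameterization.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class FlameCoeffs:
    """Hypothetical per-frame intermediate representation (sizes are illustrative)."""
    expression: List[float]  # facial expression coefficients
    pose: List[float]        # simplified head pose: [pitch, yaw, roll] in radians


def predict_coeffs(num_frames: int) -> List[FlameCoeffs]:
    """Stage 1 placeholder: stands in for a learned model that maps a
    driving signal to one coefficient set per frame."""
    return [
        FlameCoeffs(expression=[0.0] * 4, pose=[0.0, 0.0, 0.0])
        for _ in range(num_frames)
    ]


def set_yaw(seq: List[FlameCoeffs], yaw: float) -> List[FlameCoeffs]:
    """Control step: override head yaw in every frame, leaving the
    expression coefficients untouched."""
    return [replace(c, pose=[c.pose[0], yaw, c.pose[2]]) for c in seq]


def render(seq: List[FlameCoeffs]) -> List[str]:
    """Stage 2 placeholder: stands in for a learned generator that maps
    coefficients to image frames. Here each frame is just a description."""
    return [f"frame(expr={c.expression}, pose={c.pose})" for c in seq]


coeffs = predict_coeffs(3)     # predicted intermediate representation
turned = set_yaw(coeffs, 0.3)  # edit head movement only
frames = render(turned)        # generate output from the edited coefficients
```

Because the edit happens on the coefficients rather than on pixels, head movement can be changed independently of expression in this sketch, which is the kind of control the summary attributes to the intermediate representation.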