The research aims to extend deepfake technology beyond facial manipulation to generate credible sign language videos that encompass the entire upper body, including hands and fingers. The key highlights and insights are:
Construction of a reliable deepfake dataset of over 1200 videos, featuring individuals both seen and unseen by the generation model. A sign language expert vetted the dataset.
Linguistic analysis reveals that the generated fake videos are comparable to real sign language videos, with the interpretation of each fake video agreeing at least 90% with that of the corresponding real video.
Visual analysis demonstrates that visually convincing deepfake videos can be produced, even with entirely new subjects, using a pose/style transfer model for video generation.
Machine learning algorithms were applied to establish a baseline performance on the dataset for deepfake detection, highlighting the challenges in accurately classifying real and fake sign language videos.
The sign language expert had difficulty distinguishing real videos from fake ones, further validating the credibility of the generated deepfakes.
The research makes a pioneering contribution intended to accelerate work in sign language production, creating videos that are visually believable and both technically and linguistically credible to human perception.
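The detection baseline mentioned above amounts to a binary real-versus-fake classification task, so its difficulty can be expressed with standard classification metrics. The following is a minimal sketch of how such a baseline might be scored, not the authors' evaluation code: the function name `score_detector`, the label strings, and the toy data are all hypothetical and do not come from the paper.

```python
from collections import Counter


def score_detector(labels, predictions):
    """Score a real-vs-fake detector, treating "fake" as the positive class.

    Both arguments are equal-length sequences of the strings "real" or "fake".
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    # Count each (true label, predicted label) pair.
    counts = Counter(zip(labels, predictions))
    tp = counts[("fake", "fake")]  # fakes correctly flagged
    fp = counts[("real", "fake")]  # real videos wrongly flagged
    fn = counts[("fake", "real")]  # fakes that slipped through
    tn = counts[("real", "real")]  # real videos correctly passed

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy, made-up labels purely to exercise the function.
labels = ["real", "fake", "fake", "real", "fake", "real"]
predictions = ["real", "fake", "real", "real", "fake", "fake"]
scores = score_detector(labels, predictions)
print(scores)
```

Reporting precision and recall alongside accuracy is a common choice for detection tasks, since a detector that misses convincing fakes can still post a high accuracy when real videos dominate the test set.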
Key insights distilled from
by Shahzeb Naee... : arxiv.org 04-03-2024
https://arxiv.org/pdf/2404.01438.pdf