The authors propose a methodology for generating a synthetic face image dataset that captures a broader spectrum of facial diversity compared to existing datasets. The key aspects of the methodology are:
Attribute Collection and Filtering: The authors compile a list of terms covering attributes beyond demographics and biometrics, such as hairstyle, accessories, and makeup, and filter the list to remove unsuitable words and phrases.
Combinations of Attributes: The authors create meaningful combinations of the collected attributes to generate diverse face images.
Prompt Formulation: The attribute combinations are used to formulate prompts that guide a state-of-the-art text-to-image model (Stable Diffusion) in generating the face images.
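The combination-and-prompting steps above can be sketched as a Cartesian product over attribute lists. The attribute terms and the prompt template below are illustrative assumptions, not the paper's actual vocabulary:

```python
from itertools import product

# Hypothetical attribute lists; the paper's actual curated term lists
# are not reproduced here.
ages = ["young", "middle-aged", "elderly"]
genders = ["woman", "man"]
hairstyles = ["curly hair", "a shaved head"]
accessories = ["glasses", "a headscarf"]

# Assumed prompt template for the text-to-image model.
template = "A photo of the face of a {age} {gender} with {hair}, wearing {acc}"

# Every combination of the four attribute groups yields one prompt.
prompts = [
    template.format(age=a, gender=g, hair=h, acc=acc)
    for a, g, h, acc in product(ages, genders, hairstyles, accessories)
]

print(len(prompts))   # 3 * 2 * 2 * 2 = 24 prompts
print(prompts[0])
```

In practice, filtering would prune combinations that are contradictory or that the model cannot render, rather than keeping the full product.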
Diffusion Process: The authors use a denoising diffusion model for text-to-image generation, specifically Stable Diffusion version 2.1, with appropriate noise-scheduling and guidance parameters.
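A minimal generation sketch using the Hugging Face diffusers library is shown below. The model identifier matches Stable Diffusion 2.1, but the guidance scale and step count are illustrative assumptions, not the paper's exact settings:

```python
# Generation settings; guidance_scale and num_inference_steps are
# illustrative assumptions, not the paper's reported values.
GEN_PARAMS = {
    "model_id": "stabilityai/stable-diffusion-2-1",
    "guidance_scale": 7.5,       # classifier-free guidance strength
    "num_inference_steps": 50,   # number of denoising steps
}

def generate_face(prompt, params=GEN_PARAMS):
    """Generate one image from a text prompt with Stable Diffusion 2.1."""
    # Imports kept local: the heavy dependencies (torch, diffusers)
    # are only needed when generation is actually invoked.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        params["model_id"], torch_dtype=torch.float16
    ).to("cuda")
    image = pipe(
        prompt,
        guidance_scale=params["guidance_scale"],
        num_inference_steps=params["num_inference_steps"],
    ).images[0]
    return image

# Example call (requires a GPU and downloads the model weights):
# generate_face("A photo of the face of an elderly woman with curly hair")
```

Repeating this call once per formulated prompt would produce the dataset; batching prompts through a single pipeline instance avoids reloading the weights each time.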
The resulting SDFD dataset contains 1,000 high-quality, realistic face images covering a wide range of diversity in race, gender, age, hairstyle, accessories, and other attributes. The authors compare SDFD with the existing FairFace and LFW datasets in terms of image-classification performance and the spatial distribution of the images. The results show that SDFD is as challenging as, or more challenging than, these datasets for classification while being far smaller, making it a suitable evaluation set for AI systems.
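The "more challenging" criterion can be illustrated with a toy comparison: run one fixed attribute classifier over each dataset and compare accuracies, where a lower score marks the harder evaluation set. All labels below are fabricated for illustration, not results from the paper:

```python
def accuracy(predicted, true):
    """Fraction of attribute predictions matching the ground truth."""
    correct = sum(p == t for p, t in zip(predicted, true))
    return correct / len(true)

# Hypothetical gender predictions from the same classifier on two sets.
true_sdfd = ["woman", "man", "woman", "man"]
pred_sdfd = ["man", "man", "woman", "man"]     # 3/4 correct

true_lfw = ["woman", "man", "woman", "man"]
pred_lfw = ["woman", "man", "woman", "man"]    # 4/4 correct

# A lower accuracy on SDFD would indicate it is the harder benchmark.
print(accuracy(pred_sdfd, true_sdfd))  # 0.75
print(accuracy(pred_lfw, true_lfw))    # 1.0
```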
The authors also discuss the challenges encountered during the generation process, such as the inability to apply certain attributes in the final images and the potential for perpetuating stereotypes. They outline plans for future work to address these issues and further expand the dataset.
Source: Georgia Balt..., arxiv.org, 04-29-2024, https://arxiv.org/pdf/2404.17255.pdf