EHR-D3PM, a novel method for generating tabular electronic health record (EHR) data, uses a discrete diffusion model to support both unconditional and conditional generation of realistic synthetic records, and the authors report that it significantly outperforms existing generative baselines on fidelity, utility, and privacy metrics.
The authors present CEHR-GPT, a framework for generating synthetic electronic health records (EHRs) that preserve temporal dependencies and chronological patient timelines, enabling applications such as disease progression analysis, population estimation, and counterfactual reasoning.