Key Concepts
Federated Learning systems can enhance data transparency and model trustworthiness through data provenance and cryptographic techniques.
Abstract:
Federated Learning (FL) offers decentralized model training while preserving data privacy.
Ensuring data integrity and transparency in distributed environments remains challenging.
Introduction:
Background on FL and its evolution to address data privacy concerns.
Advancements in FL frameworks and algorithms for decentralized learning.
Motivation:
Concerns about data provenance and model transparency in FL systems.
Attacks on FL systems highlight the need for enhanced security measures.
Proposed Approach:
Data provenance and model transparency through a data-decoupled FL architecture.
Use of cryptographic hashing for integrity and reproducibility.
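The chained-hashing idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: each record's digest incorporates the previous digest, so altering any step in the training history invalidates every later hash.

```python
import hashlib

def chain_hash(prev_hash: bytes, payload: bytes) -> bytes:
    """Link a new record to the previous one: SHA-256(prev_hash || payload)."""
    return hashlib.sha256(prev_hash + payload).digest()

# Build a hash chain over a sequence of (illustrative) training records.
genesis = b"\x00" * 32
records = [b"round-1 update", b"round-2 update", b"round-3 update"]

chain = [genesis]
for rec in records:
    chain.append(chain_hash(chain[-1], rec))

# Tampering with any intermediate record changes every subsequent hash,
# so the final digest acts as an integrity check on the whole history.
tampered = [genesis]
for rec in [b"round-1 update", b"round-2 TAMPERED", b"round-3 update"]:
    tampered.append(chain_hash(tampered[-1], rec))

assert chain[-1] != tampered[-1]
```

Because verification only requires replaying the hash computation, this also supports the reproducibility goal: a verifier who re-runs training and obtains the same records obtains the same final digest.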
Contributions:
Model provenance in databases and chained hashing for training verifiability.
Preliminaries and Related Work:
Overview of FL, data privacy, and the growing demand for model transparency.
Proposed Methodology:
Data-decoupled FL architecture, model snapshots, and chained hashing for integrity.
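Combining model snapshots with chained hashes might look like the sketch below. The record layout and function names are assumptions for illustration only; the point is that per-round client snapshots can be stored alongside a digest chained to the previous round, and the whole provenance log can later be re-verified.

```python
import hashlib
import json

def snapshot_digest(prev_hex: str, round_no: int, weights: dict) -> str:
    """Hash a client model snapshot, chained to the previous round's digest."""
    # Canonical JSON serialization keeps the digest deterministic.
    payload = json.dumps({"round": round_no, "weights": weights},
                         sort_keys=True).encode()
    return hashlib.sha256(bytes.fromhex(prev_hex) + payload).hexdigest()

def verify_chain(snapshots: list, genesis: str = "00" * 32) -> bool:
    """Recompute every digest from the stored snapshots; any mismatch fails."""
    prev = genesis
    for snap in snapshots:
        if snap["digest"] != snapshot_digest(prev, snap["round"], snap["weights"]):
            return False
        prev = snap["digest"]
    return True

# Record two (toy) training rounds in a provenance log.
log = []
prev = "00" * 32
for rnd, w in [(1, {"w": [0.1, 0.2]}), (2, {"w": [0.15, 0.18]})]:
    d = snapshot_digest(prev, rnd, w)
    log.append({"round": rnd, "weights": w, "digest": d})
    prev = d

assert verify_chain(log)
log[0]["weights"]["w"][0] = 0.9  # tamper with an early snapshot
assert not verify_chain(log)
```

In a data-decoupled design, the log entries would live in a database subsystem separate from the training pipeline, so auditing does not interfere with model updates.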
Evaluation:
Experimental setup with different datasets and model architectures.
Benchmarking FL with single-node simulation and distributed testing on CloudLab.
FL Transparency:
Baseline and multithreaded provenance analysis for ResNet-18 and Vision Transformer models.
FL Reproducibility:
Cryptographic hash feature analysis for reducing overhead in model training.
Statistics
Uses cryptographic hashing to enhance data transparency while keeping overhead low.
On the CIFAR10 and MNIST datasets, inserting cryptographic hashing reduces overhead to 3%.
On the CelebA dataset, inserting cryptographic hashing reduces overhead by about 44%.
Quotes
"Our findings show that our system can greatly enhance data transparency in various FL environments by storing chained cryptographic hashes and client model snapshots in our proposed design for data decoupled FL."
"Extensive experimental results suggest that integrating a database subsystem into federated learning systems can improve data provenance in an efficient manner, encouraging secure FL adoption in privacy-sensitive applications."