Secure and Efficient Private Inference of Deep Neural Network Models using Layer Partitioning and Trusted Execution Environments
A framework for secure and efficient private inference of deep neural network models that partitions model layers between a trusted execution environment (TEE) and a GPU accelerator, balancing privacy preservation with computational efficiency.