Deep Leakage from Gradients

Venue: NeurIPS 2019
Authors: Ligeng Zhu, Zhijian Liu, and Song Han

Introduction.
As deep learning is deployed across many fields and at scale, new problems are emerging. In distributed learning, where sharing training data is not an option, each participant trains a shared global model locally on its own data and communicates only the resulting gradients to the other participants. The gradients are then averaged and applied to the global model; the setup is very similar to federated learning. In such a setting, a malicious attacker can intercept the gradients and try to extract information about the data from which they were produced. This paper proposes an attack, called deep leakage from gradients (DLG), which can recover pixel-level information for image classification tasks and token-level information for language tasks from the gradients alone. The authors also evaluate defense techniques, two of which their attack cannot break.
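To make the setup concrete, here is a minimal sketch of this gradient-sharing protocol in PyTorch; the model, data, number of participants, and learning rate are illustrative placeholders, not the paper's setup:

```python
import torch

model = torch.nn.Linear(10, 2)                          # shared global model
datasets = [torch.randn(8, 10) for _ in range(4)]       # private local batches
labels = [torch.randint(0, 2, (8,)) for _ in range(4)]

shared_grads = []
for x, y in zip(datasets, labels):
    loss = torch.nn.functional.cross_entropy(model(x), y)
    grads = torch.autograd.grad(loss, model.parameters())
    shared_grads.append(grads)  # only gradients leave each participant

# The gradients are averaged and applied to the global model.
lr = 0.1
with torch.no_grad():
    for p, *gs in zip(model.parameters(), *shared_grads):
        p -= lr * torch.stack(gs).mean(dim=0)
```

It is exactly these shared gradients that the attacker observes.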

Deep Leakage from Gradients.
There have been a few prior works discussing possible ways to extract information about the training data from the gradients. But according to the authors, none of those attacks can recover the exact training data at pixel or token level. The key idea of this attack is that matching the gradients helps match the data. First, dummy inputs and labels are generated, and a forward/backward pass is performed on the same global model to produce 'dummy gradients'. These dummy gradients are compared with the original gradients that were shared: the distance between the two is computed, and the dummy inputs and labels are optimized with the objective of minimizing that distance. This optimization can be performed with standard gradient-based optimizers (the paper uses L-BFGS). After many iterations, the gradient difference shrinks significantly and the dummy inputs end up very close to the original training input. The technique requires no prior knowledge about the input data or labels.
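A minimal PyTorch sketch of this optimization loop follows; the model architecture, input shape, and iteration count are illustrative assumptions, not the authors' exact code. The soft-label formulation makes the dummy label itself differentiable:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# A tiny classifier standing in for the shared global model.
model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(32 * 32 * 3, 100),
    torch.nn.Sigmoid(),
    torch.nn.Linear(100, 10),
)

# The victim's private batch (unknown to the attacker) and its shared gradients.
x_real = torch.rand(1, 3, 32, 32)
y_real = torch.tensor([3])
loss_real = F.cross_entropy(model(x_real), y_real)
real_grads = torch.autograd.grad(loss_real, model.parameters())

# Attacker initializes dummy data and a dummy (soft) label, both trainable.
x_dummy = torch.randn(1, 3, 32, 32, requires_grad=True)
y_dummy = torch.randn(1, 10, requires_grad=True)
optimizer = torch.optim.LBFGS([x_dummy, y_dummy])

for it in range(300):
    def closure():
        optimizer.zero_grad()
        pred = model(x_dummy)
        # Cross entropy against the soft dummy label, so the label is differentiable.
        loss_dummy = torch.mean(
            torch.sum(-F.softmax(y_dummy, dim=-1) * F.log_softmax(pred, dim=-1), dim=-1)
        )
        dummy_grads = torch.autograd.grad(loss_dummy, model.parameters(), create_graph=True)
        # Objective: squared L2 distance between dummy and shared gradients.
        grad_diff = sum(((dg - rg) ** 2).sum() for dg, rg in zip(dummy_grads, real_grads))
        grad_diff.backward()
        return grad_diff
    optimizer.step(closure)

# After convergence, x_dummy approximates x_real pixel by pixel.
```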

Results.
They compare DLG against a baseline that uses GANs to extract information about the training data. The paper shows example images comparing the reconstructions from DLG and the baseline: the baseline seems to work only for MNIST, while for datasets like CIFAR and SVHN its recovered images are heavily distorted. Empirically, DLG achieves a mean squared error below 0.03, whereas the baseline's MSE is greater than 0.2. The authors then explore three techniques to defend against the attack. The first is adding Gaussian or Laplacian noise to the gradients, which works well: with noise variance above 10^-2, the image recovered by DLG is pure distortion and nowhere close to the original. The second technique is quantization; they try both IEEE FP16 and BFLOAT16, and neither seems to help conceal the original image. The third is gradient compression: as shown in prior work on deep gradient compression, small-magnitude gradients can be pruned to zero and the rest compressed. When the sparsity ratio exceeds 20%, DLG fails to recover the training data.
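The two effective defenses are simple transformations applied to the gradients before sharing. A minimal sketch, assuming per-tensor operations; the function names and default values are illustrative, not the paper's implementation:

```python
import torch

def add_gaussian_noise(grad: torch.Tensor, variance: float = 1e-2) -> torch.Tensor:
    """Defense 1: perturb the gradient with zero-mean Gaussian noise."""
    return grad + torch.randn_like(grad) * (variance ** 0.5)

def prune_gradient(grad: torch.Tensor, sparsity: float = 0.2) -> torch.Tensor:
    """Defense 3: zero out the smallest-magnitude entries (sparsity 0.2 drops 20%)."""
    k = int(grad.numel() * sparsity)
    if k == 0:
        return grad
    threshold = grad.abs().flatten().kthvalue(k).values
    return torch.where(grad.abs() > threshold, grad, torch.zeros_like(grad))

g = torch.randn(1000)
noisy = add_gaussian_noise(g)    # variance >= 1e-2 defeats DLG per the paper
sparse = prune_gradient(g, 0.2)  # sparsity > 20% defeats DLG per the paper
```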

Thoughts.
The problem the authors address also arises in federated learning, where in some deployments only encrypted gradients are communicated. There are also works that propose compressing gradients to reduce communication overhead. If compression can solve both the privacy problem and the communication problem, the idea could be applied directly to federated learning.
