My Paper Notes Channel
【Active Learning】
【Paper Notes 01】Learning Loss for Active Learning, CVPR 2019
【Paper Notes 02】Active Learning for Convolutional Neural Networks: A Core-Set Approach, ICLR 2018
【Paper Notes 03】Variational Adversarial Active Learning, ICCV 2019
【Paper Notes 04】Ranked Batch-Mode Active Learning, ICCV 2016
【Transfer Learning】
【Paper Notes 05】Active Transfer Learning, IEEE T CIRC SYST VID 2020
【Paper Notes 06】Domain-Adversarial Training of Neural Networks, JMLR 2016
【Differential Privacy】
【Paper Notes 07】A Survey on Differentially Private Machine Learning, IEEE CIM 2020
Link to the original paper
Machine learning models suffer from a potential risk of leaking private information contained in training data. Although these models perform well in many applications, they may expose details of the individual records they were trained on.
As one of the mainstream privacy-preserving techniques, differential privacy provides a promising way to prevent the leakage of individual-level privacy. [What does individual-level mean? It becomes clear from the definition later: intuitively, adding or removing a single individual's record in the dataset $D$ should not cause a noticeable change in the output $\mathcal{M}(D)$.]
This work provides a comprehensive survey of the existing works that incorporate differential privacy into machine learning and categorizes them into two broad categories.
In the former, a calibrated amount of noise is added to the non-private model; in the latter, the output or the objective function is perturbed by random noise.
Motivation
The datasets used for learning desirable models in ML methods may contain sensitive individual information.
Ideally, the sensitive individual information should not be leaked in the process of training ML models.
In other words, we allow the parameters of machine learning models to learn general patterns (people who smoke are more likely to suffer from lung cancer), rather than facts about specific training samples (he had lung cancer).
Unfortunately, shallow models like support vector machines and logistic regression are capable of memorizing secret information in the training dataset, and deep models like convolutional neural networks can exactly memorize arbitrary labels of the training data.
Attacks against machine learning
The survey lists many examples of attacks on machine learning models that aim to extract information about the training data (e.g., membership inference, model inversion, and model extraction; see the references below).
The Meaning of Differential Privacy
Why has differential privacy recently been considered a promising strategy for privacy preservation in machine learning? There are roughly three major reasons:
An illustration of Differentially Private Machine Learning:
The categories of Differentially Private Machine Learning methods:
Definition 1 (Differential Privacy)
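(The survey's exact wording is not reproduced here; the standard statement, following Dwork et al., is that a randomized mechanism $\mathcal{M}$ satisfies $\epsilon$-differential privacy if, for every pair of neighboring datasets $D, D'$ differing in at most one record and every set of outputs $S \subseteq \mathrm{Range}(\mathcal{M})$,
$$\Pr[\mathcal{M}(D) \in S] \le e^{\epsilon} \cdot \Pr[\mathcal{M}(D') \in S].$$
The relaxed $(\epsilon, \delta)$-version adds an additive slack $\delta$ to the right-hand side.)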
Theorem 1 (Sequential Composition Theorem)
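(Standard form, as a reminder: if mechanisms $\mathcal{M}_1, \dots, \mathcal{M}_k$ satisfy $\epsilon_1, \dots, \epsilon_k$-differential privacy respectively and are all run on the same dataset, then releasing all of their outputs satisfies $\big(\sum_{i=1}^{k} \epsilon_i\big)$-differential privacy, i.e., privacy budgets add up.)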
Theorem 2 (Parallel Composition Theorem)
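(Standard form: if each $\mathcal{M}_i$ satisfies $\epsilon_i$-differential privacy and is applied to a disjoint subset of the dataset, then the combined release satisfies $\big(\max_i \epsilon_i\big)$-differential privacy.)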
Definition 2 (Global Sensitivity)
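(Standard form, for a query $f$ mapping datasets to $\mathbb{R}^d$:
$$GS_f = \max_{D,\, D' \text{ neighboring}} \lVert f(D) - f(D') \rVert_1,$$
with the $\ell_2$ norm used instead when calibrating the Gaussian mechanism.)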
Definition 3 (Local Sensitivity)
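(Standard form:
$$LS_f(D) = \max_{D' \text{ neighboring } D} \lVert f(D) - f(D') \rVert_1,$$
i.e., the maximum is taken only over neighbors of the actual dataset $D$, so $LS_f(D) \le GS_f$.)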
Definition 4 (Laplace Mechanism)
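(Standard form: for a numeric query $f$ with $\ell_1$ global sensitivity $GS_f$,
$$\mathcal{M}_L(D) = f(D) + (Y_1, \dots, Y_d), \quad Y_i \sim \mathrm{Lap}(GS_f / \epsilon),$$
which satisfies $\epsilon$-differential privacy.)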
Definition 5 (Gaussian Mechanism)
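(Standard form: for a query with $\ell_2$ sensitivity $\Delta_2 f$,
$$\mathcal{M}_G(D) = f(D) + \mathcal{N}(0, \sigma^2 I), \quad \sigma \ge \frac{\sqrt{2 \ln(1.25/\delta)}\, \Delta_2 f}{\epsilon},$$
which satisfies $(\epsilon, \delta)$-differential privacy for $\epsilon \in (0, 1)$.)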
Definition 6 (Exponential Mechanism)
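(Standard form: given a utility function $u(D, r)$ over candidate outputs $r$ with sensitivity $\Delta u$, the mechanism outputs $r$ with probability
$$\Pr[\mathcal{M}_E(D) = r] \propto \exp\!\Big(\frac{\epsilon\, u(D, r)}{2 \Delta u}\Big),$$
which satisfies $\epsilon$-differential privacy and is suited to non-numeric outputs.)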
The Laplace mechanism, the Gaussian mechanism, and the exponential mechanism are three classical differential privacy mechanisms. The privacy of individual data can be preserved by combining any of these mechanisms with specific machine learning algorithms.
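As a toy sketch of how one of these mechanisms is wired into a concrete query (my own illustration, not code from the survey; the counting query, sensitivity 1, and $\epsilon = 0.5$ are assumptions):

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Return the query answer perturbed with Laplace noise of scale sensitivity/epsilon."""
    scale = sensitivity / epsilon
    return true_value + np.random.laplace(loc=0.0, scale=scale)

# Toy counting query: how many records satisfy a predicate.
# Adding or removing one record changes the count by at most 1, so the global sensitivity is 1.
data = np.array([0, 1, 1, 0, 1, 1, 1, 0])
true_count = float(data.sum())

private_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5)
print(true_count, private_count)
```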
The output and objective perturbation mechanisms are two generic differentially private methods for achieving privacy preservation.
The output perturbation mechanism adds an amount of noise to the model output, while the objective perturbation mechanism adds noise to the objective function and then optimizes the perturbed objective.
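A minimal numpy sketch of the two generic approaches, using ridge regression as a stand-in model (my own illustration, not from the survey; the noise scales are placeholders rather than privacy-calibrated values):

```python
import numpy as np

rng = np.random.default_rng(0)

def train_ridge(X, y, lam=1.0, b=None):
    """Ridge regression; if b is given, add the linear perturbation term b^T w to the objective."""
    d = X.shape[1]
    if b is None:
        b = np.zeros(d)
    # Closed-form minimizer of ||Xw - y||^2 + lam*||w||^2 + b^T w
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y - 0.5 * b)

X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

# Output perturbation: train as usual, then add noise to the released parameters.
w_plain = train_ridge(X, y)
w_output_pert = w_plain + rng.laplace(scale=0.1, size=w_plain.shape)  # placeholder scale

# Objective perturbation: perturb the objective with a random linear term, then optimize.
b = rng.laplace(scale=0.1, size=X.shape[1])  # placeholder scale
w_objective_pert = train_ridge(X, y, b=b)

print(w_plain, w_output_pert, w_objective_pert, sep="\n")
```

In a real differentially private ERM method, both noise scales would be calibrated from the sensitivity of the learning problem and the privacy budget $(\epsilon, \delta)$.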
ERM: Empirical Risk Minimization
This approach could potentially incorporate theoretical bound analysis in an integrated way!
Linear Regression
Distributed Optimization
[1] M. Gong et al., “A survey on differentially private machine learning,” IEEE Computational Intelligence Magazine, vol. 15, pp. 49–64, 2020.
[2] R. Shokri, M. Stronati, C. Song, and V. Shmatikov, “Membership inference attacks against machine learning models,” in Proc. IEEE Symp. Security and Privacy, San Jose, CA, May 2017. doi: 10.1109/SP.2017.41.
[3] M. Fredrikson, S. Jha, and T. Ristenpart, “Model inversion attacks that exploit confidence information and basic countermeasures,” in Proc. 22nd ACM SIGSAC Conf. Computer and Communications Security, Denver, CO, Oct. 2015, pp. 1322–1333. doi: 10.1145/2810103.2813677.
[4] F. Tramèr, F. Zhang, A. Juels, M. K. Reiter, and T. Ristenpart, “Stealing machine learning models via prediction APIs,” in Proc. 25th USENIX Security Symp., Austin, TX, Aug. 2016, pp. 601–618. doi: 10.5555/3241094.3241142.
[5] M. Fredrikson, E. Lantz, S. Jha, S. Lin, D. Page, and T. Ristenpart, “Privacy in pharmacogenetics: An end-to-end case study of personalized warfarin dosing,” in Proc. 23rd USENIX Security Symp., San Diego, CA, Aug. 2014, pp. 17–32.
[6] G. Ateniese, G. Felici, L. V. Mancini, A. Spognardi, A. Villani, and D. Vitali, “Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers,” Int. J. Secur. Netw., vol. 10, no. 3, pp. 137–150, 2015. doi: 10.1504/IJSN.2015.071829.