[Paper Reading] Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning

1. Paper Information

**Title:** Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning
**Year:** 2021
**Venue:** arXiv (preprint)
**Link:** https://arxiv.org/abs/2109.03975
**Authors:** Maziar Gomrokchi, Susan Amin, Hossein Aboutalebi

2. Paper Structure

(Figure 1: overview diagram of the paper's structure)

3. Paper Content

Abstract

While significant research advances have been made in the field of deep reinforcement learning, a major challenge to widespread industrial adoption of deep reinforcement learning that has recently surfaced but little explored is the potential vulnerability to privacy breaches. In particular, there have been no concrete adversarial attack strategies in literature tailored for studying the vulnerability of deep reinforcement learning algorithms to membership inference attacks.
To address this gap, we propose an adversarial attack framework tailored for testing the vulnerability of deep reinforcement learning algorithms to membership inference attacks. More specifically, we design a series of experiments to investigate the impact of temporal correlation, which naturally exists in reinforcement learning training data, on the probability of information leakage. Furthermore, we study the differences in the performance of collective and individual membership attacks against deep reinforcement learning algorithms. Experimental results show that the proposed adversarial attack framework is surprisingly effective at inferring the data used during deep reinforcement training with an accuracy exceeding 84% in individual and 97% in collective mode on two different control tasks in OpenAI Gym, which raises serious privacy concerns in the deployment of models resulting from deep reinforcement learning. Moreover, we show that the learning state of a reinforcement learning algorithm significantly influences the level of the privacy breach.
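To make the attack setting concrete, below is a minimal, hypothetical sketch of a trajectory-level membership inference attack. It is not the framework proposed in the paper: the trajectories are synthetic stand-ins and the attack model is a plain logistic-regression classifier. It only illustrates the basic idea of training a binary classifier to separate "member" trajectories (those used to train the target agent) from "non-member" ones, which corresponds roughly to the paper's individual mode where each trajectory gets its own membership decision.

```python
# Minimal sketch of a trajectory-level membership inference attack on an RL agent.
# NOT the paper's framework: trajectory data is synthetic and the attack model is a
# simple logistic regression, used only to show the member vs. non-member setup.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

TRAJ_LEN = 50      # maximum trajectory length fixed by the environment
STATE_DIM = 4      # e.g. a CartPole-like observation size
N_PER_CLASS = 500  # trajectories per class (member / non-member)

def synthetic_trajectories(n, shift):
    """Stand-in for rollouts; `shift` crudely mimics the distribution gap an
    attacker exploits between member and non-member trajectories."""
    states = rng.normal(loc=shift, scale=1.0, size=(n, TRAJ_LEN, STATE_DIM))
    return states.reshape(n, -1)  # flatten each trajectory into one feature vector

# Member trajectories stand in for the target agent's training rollouts,
# non-member trajectories for held-out rollouts from the same environment.
member = synthetic_trajectories(N_PER_CLASS, shift=0.3)
non_member = synthetic_trajectories(N_PER_CLASS, shift=0.0)

X = np.vstack([member, non_member])
y = np.concatenate([np.ones(N_PER_CLASS), np.zeros(N_PER_CLASS)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Individual-mode attack: classify one trajectory at a time.
attack = LogisticRegression(max_iter=1000)
attack.fit(X_train, y_train)
print("individual-mode attack accuracy:",
      accuracy_score(y_test, attack.predict(X_test)))
```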


Related Work

  • Membership inference attacks

Conclusions

1. Compared with individual membership inference attacks, reinforcement learning is significantly more vulnerable to membership inference attacks in the collective setting (see the toy sketch after this list).
2. The maximum trajectory length set by the environment plays an important role in how vulnerable the training data used by a deep reinforcement learning model is to membership inference attacks.
3. The work reveals the role of temporal correlation in training the attack, and the extent to which an attacker can exploit this information to design high-accuracy membership inference attacks against deep reinforcement learning.
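The gap between individual and collective attacks in finding 1 can be illustrated with a toy example (again, not the paper's algorithm): when a whole batch of trajectories shares a single membership label, averaging the attack model's noisy per-trajectory scores before thresholding suppresses per-trajectory noise, so the collective decision tends to be more accurate than the individual ones.

```python
# Toy illustration (not the paper's algorithm) of why collective-mode attacks can be
# stronger than individual-mode attacks: one decision over a batch of trajectories
# that share a membership label averages out per-trajectory noise.

import numpy as np

rng = np.random.default_rng(1)

def noisy_scores(true_label, n_traj, signal=0.3, noise=1.0):
    """Hypothetical per-trajectory attack scores: values > 0 suggest 'member'."""
    mean = signal if true_label == 1 else -signal
    return rng.normal(mean, noise, size=n_traj)

n_batches = 2000
batch_size = 20  # trajectories per collective query

individual_correct, collective_correct = 0, 0
for _ in range(n_batches):
    label = rng.integers(0, 2)          # shared membership label of the batch
    scores = noisy_scores(label, batch_size)

    # Individual mode: each trajectory is judged on its own score.
    individual_correct += np.sum((scores > 0).astype(int) == label)

    # Collective mode: one decision for the whole batch from the mean score.
    collective_correct += int((scores.mean() > 0) == label)

print("individual accuracy:", individual_correct / (n_batches * batch_size))
print("collective accuracy:", collective_correct / n_batches)
```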
