Article overview:
- Attack the class that is easiest to attack (best class attack):
      $$\underset{j}{\arg\min}\ \frac{z_t - z_j}{\|\nabla_x(z_t - z_j)\|}$$
- Attack the class that is hardest to attack (hardest class attack):
      $$\underset{j}{\arg\max}\ \frac{z_t - z_j}{\|\nabla_x(z_t - z_j)\|}$$
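As a rough illustration of how the two targets could be picked, here is a minimal sketch. It reads $z_t$ as the logit of the true class and $z_j$ as the logit of candidate class $j$, and it assumes the hardest class is the one that maximizes the same ratio the best class minimizes. The toy linear model `W`, `b`, the input `x`, and the helper names `class_scores` and `pick_targets` are all made up for the example and do not come from the paper's code.

```python
import math

def class_scores(z, grads, t):
    """Return {j: (z_t - z_j) / ||grad_x(z_t - z_j)||} for every class j != t.

    z     : list of logits, one per class
    grads : grads[j] is the gradient of logit z_j with respect to the input x
    t     : index of the true class
    """
    scores = {}
    for j in range(len(z)):
        if j == t:
            continue
        diff_grad = [gt - gj for gt, gj in zip(grads[t], grads[j])]
        norm = math.sqrt(sum(c * c for c in diff_grad))
        scores[j] = (z[t] - z[j]) / norm
    return scores

def pick_targets(z, grads, t):
    """Pick the easiest (arg min) and hardest (arg max) target classes."""
    scores = class_scores(z, grads, t)
    best = min(scores, key=scores.get)
    hardest = max(scores, key=scores.get)
    return best, hardest

# Hypothetical linear model z = W x + b, so the gradient of z_j w.r.t. x is row W[j].
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.5, 0.0, 0.0]
x = [1.0, 0.5]
z = [sum(w * xi for w, xi in zip(row, x)) + bj for row, bj in zip(W, b)]

best, hardest = pick_targets(z, W, t=0)
print("best class:", best, "hardest class:", hardest)
```

For a linear model the ratio is exactly the distance from `x` to the boundary between class `t` and class `j`, which is why the smallest ratio marks the easiest target.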
Contributions:
Limitations:
Trust Region Method:
First, let $f$ denote the neural network function and take its Taylor expansion in a neighborhood of $x_k$:
$$f(x_k+\Delta x)=f_k+g_k^T\Delta x+\frac{1}{2}\Delta x^T\nabla^2 f(x_k+t\Delta x)\Delta x$$
where $f_k=f(x_k)$, $g_k=\nabla f(x_k)$, and $t \in (0,1)$.
Next, a matrix $B_k$ is used to approximate the Hessian (my reading is that "approximate" here simply means an approximate representation, which cuts the computational cost a bit):
$$m_k(\Delta x) = f_k + g_k^T \Delta x + \frac{1}{2}\Delta x^T B_k \Delta x$$
At every iteration, the following subproblem is solved:
$$\underset{\Delta x \in \mathbb{R}^h}{\min}\ m_k(\Delta x) = f_k + g_k^T \Delta x + \frac{1}{2}\Delta x^T B_k \Delta x$$
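To make the subproblem concrete, here is a small sketch that evaluates the quadratic model and takes a Cauchy-point step, a standard cheap approximate solution from the trust-region literature. It assumes the usual trust-region constraint $\|\Delta x\| \le \epsilon$, which the formula above does not state explicitly. The numbers for `f_k`, `g` and `B` are invented, and `cauchy_point` is an illustrative helper, not the solver used in the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(B, v):
    return [dot(row, v) for row in B]

def model(f_k, g, B, dx):
    """Quadratic model m_k(dx) = f_k + g^T dx + 0.5 * dx^T B dx."""
    return f_k + dot(g, dx) + 0.5 * dot(dx, matvec(B, dx))

def cauchy_point(g, B, radius):
    """Minimize the model along -g, subject to ||dx|| <= radius."""
    g_norm = math.sqrt(dot(g, g))
    if g_norm == 0.0:
        return [0.0] * len(g)
    gBg = dot(g, matvec(B, g))
    if gBg <= 0.0:
        tau = 1.0  # model keeps decreasing along -g: go to the boundary
    else:
        tau = min(g_norm ** 3 / (radius * gBg), 1.0)
    return [-tau * radius / g_norm * gi for gi in g]

# Hypothetical values for illustration only.
f_k = 1.0
g = [3.0, 4.0]
B = [[2.0, 0.0], [0.0, 2.0]]

dx_small = cauchy_point(g, B, radius=0.5)   # radius is the binding constraint
dx_large = cauchy_point(g, B, radius=10.0)  # interior minimizer along -g
print(dx_small, model(f_k, g, B, dx_small))
print(dx_large, model(f_k, g, B, dx_large))
```

With a small radius the step stops at the boundary of the region; with a large one it stops at the minimizer of the model along $-g_k$, so the radius only matters when the model would otherwise step too far.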
With that groundwork laid, we can turn to the paper's method. In the figure below, <> denotes the dot product.
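DeepFool approaches the problem by approximating the decision boundary with an affine (linear) transformation. For such a boundary, the perturbation can be computed analytically from the gradient at the current point alone. For neural networks, however, this approximation can be very inaccurate: it can over- or under-estimate the perturbation and push along a sub-optimal direction. The reason is that the direction of the minimal perturbation is orthogonal to the decision boundary, and because the boundary is nonlinear, that direction cannot be obtained from a simple affine transformation.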
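The point about linearization can be checked on a toy binary classifier. The sketch below uses the closed-form step $r = -\frac{f(x)}{\|\nabla f(x)\|^2}\nabla f(x)$, which lands exactly on the boundary $f = 0$ when $f$ is affine. The two functions `affine` and `curved` are hypothetical examples chosen for illustration and are not taken from DeepFool or from the paper.

```python
def linearized_step(f_val, grad):
    """Closed-form step to the boundary of the linearized classifier."""
    sq_norm = sum(g * g for g in grad)
    return [-f_val / sq_norm * g for g in grad]

def affine(x):
    return 2.0 * x[0] + 1.0 * x[1] - 1.0

def affine_grad(x):
    return [2.0, 1.0]

def curved(x):
    # Decision boundary curved(x) = 0 is a parabola, so it is nonlinear in x.
    return x[0] ** 2 + x[1] - 1.0

def curved_grad(x):
    return [2.0 * x[0], 1.0]

x0 = [2.0, 1.0]

r_affine = linearized_step(affine(x0), affine_grad(x0))
x_affine = [xi + ri for xi, ri in zip(x0, r_affine)]

r_curved = linearized_step(curved(x0), curved_grad(x0))
x_curved = [xi + ri for xi, ri in zip(x0, r_curved)]

print("affine classifier after one step:", affine(x_affine))
print("curved classifier after one step:", curved(x_curved))
```

For the affine classifier a single step reaches the boundary up to rounding error, while for the curved one the same step stops clearly short of it, so the linear estimate of the required perturbation is off.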
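The main idea of the TR method is to iteratively choose a trust radius $\epsilon$ and find, within that region, the adversarial perturbation that maximizes the probability of an incorrect class: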
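To get a feel for the radius adaptation, here is a deliberately simplified sketch. It is not the paper's Algorithm 1. It assumes a first-order step inside an L2 ball of radius `eps`, minimizes a logit margin `margin(x)` (standing in for $z_t - z_j$) instead of maximizing a probability, and always accepts the step. The thresholds `sigma_hi` and `sigma_lo`, the growth factor `eta`, the radius bounds and the toy margin function are all hypothetical.

```python
import math

def tr_attack(x, margin, margin_grad, eps=0.1, eps_min=1e-3, eps_max=1.0,
              eta=2.0, sigma_hi=0.9, sigma_lo=0.5, max_iter=100):
    """Shrink margin(x) below 0 with steps of adaptive length eps."""
    x = list(x)
    for _ in range(max_iter):
        m = margin(x)
        if m < 0.0:  # margin is negative: the toy classifier is fooled
            break
        g = margin_grad(x)
        g_norm = math.sqrt(sum(gi * gi for gi in g))
        if g_norm == 0.0:
            break
        # Minimizer of the linear model g^T dx inside the ball ||dx|| <= eps.
        x_new = [xi - eps * gi / g_norm for xi, gi in zip(x, g)]
        predicted = eps * g_norm          # decrease promised by the linear model
        actual = m - margin(x_new)        # decrease actually obtained
        rho = actual / predicted
        if rho > sigma_hi:                # model is trustworthy: enlarge region
            eps = min(eta * eps, eps_max)
        elif rho < sigma_lo:              # model is poor: shrink region
            eps = max(eps / eta, eps_min)
        x = x_new
    return x, eps

# Hypothetical nonlinear margin, for illustration only.
def margin(x):
    return x[0] ** 2 + x[1] - 1.0

def margin_grad(x):
    return [2.0 * x[0], 1.0]

x_adv, final_eps = tr_attack([2.0, 1.0], margin, margin_grad)
print("margin after attack:", margin(x_adv), "final radius:", final_eps)
```

The ratio `rho` compares what the local model promised with what the function actually delivered, so the radius grows while the model is reliable and stops growing (or shrinks) once curvature makes it inaccurate.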
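Also, if you look closely, the part of Algorithm 1 boxed in red seems to be wrong? I am not entirely sure whether the mistake is mine, but if you drop the parentheses, the numerator turns out to be 0?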
Code:
In the code, the numerator of the $\rho$ expression is
$$\text{ori\_diff} - \text{adv\_diff}$$
where
$$\text{ori\_diff} = Z[\text{range}(n), \text{true\_ind}] - Z[\text{range}(n), \text{target\_ind}]$$
$$\text{adv\_diff} = Z_{adv}[\text{range}(n), \text{true\_ind}] - Z[\text{range}(n), \text{target\_ind}]$$
In the code, the denominator of the $\rho$ expression is
$$\epsilon \quad (\text{default} = 0.001)$$
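For readers unfamiliar with the `Z[range(n), true_ind]` indexing, the sketch below spells it out in plain Python: for each sample `i` it picks the single entry `Z[i][true_ind[i]]`. It transcribes the expressions exactly as written above rather than the repository's actual code, so treat it as an assumption-laden illustration. The helper names `gather` and `rho_as_written` and all the logit values are made up.

```python
def gather(Z, idx):
    """Plain-Python analogue of Z[range(n), idx]: entry Z[i][idx[i]] for each row i."""
    return [row[k] for row, k in zip(Z, idx)]

def rho_as_written(Z, Z_adv, true_ind, target_ind, eps=0.001):
    """Per-sample rho = (ori_diff - adv_diff) / eps, following the text literally."""
    ori_diff = [a - b for a, b in zip(gather(Z, true_ind), gather(Z, target_ind))]
    # As written in the text, the target-class term of adv_diff also uses Z.
    adv_diff = [a - b for a, b in zip(gather(Z_adv, true_ind), gather(Z, target_ind))]
    return [(o - a) / eps for o, a in zip(ori_diff, adv_diff)]

# Hypothetical logits for n = 2 samples and 3 classes.
Z = [[2.0, 1.0, 0.0],
     [0.5, 3.0, 1.0]]
Z_adv = [[1.5, 1.2, 0.0],
         [0.5, 2.0, 1.5]]
true_ind = [0, 1]
target_ind = [1, 2]

rho = rho_as_written(Z, Z_adv, true_ind, target_ind)
print(rho)
```

Each sample gets its own $\rho$, which is what allows a batched implementation to adapt the radius per image.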
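Comparison of perturbation size and attack speed across algorithms
The comparison shows that, to reach the same perturbation magnitude, CW takes considerably more time than TR, and that for a similar processing time DeepFool produces a larger perturbation.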
If anything here is poorly explained or wrong, feel free to leave me a comment. Thanks for reading (a like would make me very happy)~