[Reading notes] Differentiable plasticity: training plastic neural networks with backpropagation

Differentiable plasticity: training plastic neural networks with backpropagation

Authors:
Thomas Miconi/Jeff Clune/Kenneth O. Stanley
Uber AI Labs
{tmiconi,jeffclune,kstanley}@uber.com

Impressions

This is a paper about meta-learning, i.e., learning how to learn.
Its core idea is to add a plastic component on top of the usual fixed weights of a neural network; this plastic part depends on whether the pre- and post-synaptic neurons fire together (a Hebbian trace).
The experiments show that adding this mechanism makes training faster.

Adding synaptic plasticity feels somewhat similar in spirit to an attention mechanism: the network gains a kind of memory, at the cost of quite a few extra parameters.

Core idea of the paper

Differentiable plasticity:

Forward pass of a plastic connection (σ is the nonlinearity):

x_j(t) = \sigma\Big( \sum_{i \in \mathrm{inputs}} \big[\, w_{i,j}\, x_i(t-1) + \alpha_{i,j}\, \mathrm{Hebb}_{i,j}(t)\, x_i(t-1) \,\big] \Big)

Hebbian trace update, simple decay rule:

\mathrm{Hebb}_{i,j}(t+1) = \eta\, x_i(t-1)\, x_j(t) + (1 - \eta)\, \mathrm{Hebb}_{i,j}(t)

Alternative trace update, Oja's rule:

\mathrm{Hebb}_{i,j}(t+1) = \mathrm{Hebb}_{i,j}(t) + \eta\, x_j(t)\, \big[\, x_i(t-1) - x_j(t)\, \mathrm{Hebb}_{i,j}(t) \,\big]

In this way, depending on the values of $w_{i,j}$ and $\alpha_{i,j}$, a connection can be fully fixed (if $\alpha_{i,j} = 0$), or fully plastic with no fixed component (if $w_{i,j} = 0$), or have both a fixed and a plastic component.
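To make the forward pass and the trace update concrete, here is a minimal PyTorch-style sketch of a single plastic layer. The class name `PlasticLayer`, the use of tanh for σ, and the initialization constants are my own illustrative choices, not the paper's released code.

```python
import torch
import torch.nn as nn

class PlasticLayer(nn.Module):
    """One fully connected plastic layer (illustrative sketch, not the paper's code).

    Each connection has a fixed weight w, a plasticity coefficient alpha, and a
    Hebbian trace Hebb that is carried along as part of the network state.
    """
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.w = nn.Parameter(0.01 * torch.randn(in_dim, out_dim))      # fixed component w_{i,j}
        self.alpha = nn.Parameter(0.01 * torch.randn(in_dim, out_dim))  # plasticity coefficient alpha_{i,j}
        self.eta = nn.Parameter(torch.tensor(0.01))                     # shared plasticity "learning rate" eta

    def init_hebb(self, batch_size):
        # The Hebbian trace starts at zero for each new episode/lifetime.
        return torch.zeros(batch_size, self.w.shape[0], self.w.shape[1])

    def forward(self, x_prev, hebb):
        # x_j(t) = tanh( sum_i [ w_ij * x_i(t-1) + alpha_ij * Hebb_ij(t) * x_i(t-1) ] )
        effective_w = self.w + self.alpha * hebb                         # (batch, in, out)
        x = torch.tanh(torch.bmm(x_prev.unsqueeze(1), effective_w).squeeze(1))
        # Decay rule: Hebb(t+1) = eta * x_i(t-1) * x_j(t) + (1 - eta) * Hebb(t)
        hebb = self.eta * torch.bmm(x_prev.unsqueeze(2), x.unsqueeze(1)) + (1 - self.eta) * hebb
        return x, hebb
```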

The structural parameters $w_{i,j}$ and $\alpha_{i,j}$ are optimized by gradient descent between lifetimes (descending the gradient of the error computed during episodes), to maximize expected performance over a lifetime/episode. Note that $\eta$, the "learning rate" of plasticity, is also an optimized parameter of the algorithm (for simplicity, in this paper, all connections share the same value of $\eta$, which is thus a single scalar parameter for the entire network).
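A rough sketch of the outer (meta-)training loop under these assumptions: the optimizer only touches w, α and η, while the Hebbian trace is reset at the start of every episode and evolves only through the trace update inside the episode. `make_episode`, the MSE loss, and the Adam hyperparameters are hypothetical placeholders.

```python
import torch

# Sketch of the outer (meta-)training loop, assuming the PlasticLayer above.
# `make_episode` is a hypothetical helper returning a list of input tensors and a
# target tensor for one episode/lifetime (see the pattern-memorization sketch below).
layer = PlasticLayer(in_dim=50, out_dim=50)
optimizer = torch.optim.Adam(layer.parameters(), lr=3e-4)  # optimizes w, alpha and eta only

for episode in range(10000):
    inputs, target = make_episode(batch_size=32, dim=50)
    hebb = layer.init_hebb(batch_size=32)        # plastic trace is reset, not learned by the optimizer

    for inp in inputs:                           # within-lifetime loop: only Hebb changes here
        x, hebb = layer(inp, hebb)

    loss = ((x - target) ** 2).mean()            # error computed at the end of the episode
    optimizer.zero_grad()
    loss.backward()                              # backprop through the whole episode, incl. Hebb updates
    optimizer.step()                             # gradient step on w, alpha, eta "between lifetimes"
```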

Experiments

Pattern memorization: Binary patterns (see the episode sketch after this list)

Pattern memorization: Natural images

One-shot pattern classification: Omniglot task

Reinforcement learning: Maze exploration task
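As an illustration of what one episode of the binary pattern memorization task might look like, here is a hypothetical `make_episode` that could fill in for the placeholder used in the training-loop sketch above. The pattern size, number of patterns, presentation schedule and degradation scheme are assumptions, not the paper's exact settings.

```python
import torch

def make_episode(batch_size=32, dim=50, n_patterns=3, reps=2):
    """Hypothetical episode generator for binary pattern memorization (sketch).

    Shows a few random +/-1 patterns (each repeated a couple of times), then a
    partially zeroed-out cue; the target is the complete pattern the cue came from.
    """
    patterns = torch.randint(0, 2, (batch_size, n_patterns, dim)).float() * 2 - 1
    inputs = []
    for _ in range(reps):                                    # present each pattern a few times
        for k in range(n_patterns):
            inputs.append(patterns[:, k, :])
    which = torch.randint(0, n_patterns, (batch_size,))
    target = patterns[torch.arange(batch_size), which, :]    # full pattern to be recalled
    cue = target.clone()
    cue[:, : dim // 2] = 0                                   # degrade the cue: zero out half the bits
    inputs.append(cue)
    return inputs, target
```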

Auto-associative networks: the plastic networks are more expressive, and their training loss drops faster.
