RBM: Restricted Boltzmann Machine

Contents

  • Main steps of the CD-based fast learning algorithm for RBMs
  • MATLAB code
  • Afterthought
  • Three kinds of RBM
    • BB
    • GB
    • BG


Main steps of the CD-based fast learning algorithm for RBMs

CD: contrastive divergence

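  • Input: one training sample x0; the number of hidden units m; the learning rate ϵ; the maximum number of training epochs T.
  • Output: the weight matrix W, the visible-layer bias vector a, and the hidden-layer bias vector b.
  • Training phase:

    Initialization: set the initial state of the visible units to v1 = x0; set W, a, and b to small random values. Here n is the number of visible units, i.e. the dimension of x0.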
    
    For t = 1, 2, ..., T
        For j = 1, 2, ..., m (for all hidden units) STEP1
            Compute P(h1_j = 1 | v1) = σ(b_j + Σ_i v1_i W_ij);
            Sample h1_j ∈ {0, 1} from the conditional distribution P(h1_j | v1).
        EndFor
    
        For i = 1, 2, ..., n (for all visible units) STEP2
            Compute P(v2_i = 1 | h1) = σ(a_i + Σ_j W_ij h1_j);
            Sample v2_i ∈ {0, 1} from the conditional distribution P(v2_i | h1).
        EndFor
    
        For j = 1, 2, ..., m (for all hidden units) STEP3
            Compute P(h2_j = 1 | v2) = σ(b_j + Σ_i v2_i W_ij);
        EndFor
    
        Update the parameters as follows (W is n × m, with W_ij linking visible unit i and hidden unit j):
        – W ← W + ϵ(v1 P(h1. = 1 | v1)^T − v2 P(h2. = 1 | v2)^T);
        – a ← a + ϵ(v1 − v2);
        – b ← b + ϵ(P(h1. = 1 | v1) − P(h2. = 1 | v2));
    EndFor

    where σ(x) = 1 / (1 + exp(−x)) is the sigmoid activation function.

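    Although the CD-based learning algorithm above was proposed for RBMs whose visible and hidden units are both binary, it extends easily to other cases, such as Gaussian visible units, or Gaussian units in both the visible and hidden layers.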
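
The steps above can be sketched for a single binary training vector as follows. This is only a minimal illustration: the sizes, the sample x0, and the variable names (sigm, ph1, pv2, ph2) are assumptions made for the example, and W is stored as n-by-m so that W(i,j) matches W_ij.

```matlab
% Minimal CD-1 sketch for a single binary training vector.
% All sizes and values below are illustrative assumptions.
n = 6;                      % number of visible units
m = 3;                      % number of hidden units
epsilon = 0.1;              % learning rate
T = 10;                     % number of training iterations
x0 = [1 0 1 1 0 0]';        % one training sample (n-by-1, binary)

sigm = @(x) 1 ./ (1 + exp(-x));   % sigmoid activation

W = 0.01 * randn(n, m);     % W(i,j) links visible unit i and hidden unit j
a = zeros(n, 1);            % visible biases
b = zeros(m, 1);            % hidden biases

for t = 1:T
    v1 = x0;
    % STEP1: hidden probabilities given the data, then a binary sample
    ph1 = sigm(b + W' * v1);
    h1  = double(ph1 > rand(m, 1));
    % STEP2: visible probabilities given h1, then a binary sample
    pv2 = sigm(a + W * h1);
    v2  = double(pv2 > rand(n, 1));
    % STEP3: hidden probabilities given the reconstruction
    ph2 = sigm(b + W' * v2);
    % Parameter updates (positive statistics minus negative statistics)
    W = W + epsilon * (v1 * ph1' - v2 * ph2');
    a = a + epsilon * (v1 - v2);
    b = b + epsilon * (ph1 - ph2);
end
```

Unlike this sketch, the batch code in the next section averages the statistics over a mini-batch and adds momentum and weight decay.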

MATLAB code

(Binary units; this corresponds to the procedure above.)
Code provided by Geoff Hinton and Ruslan Salakhutdinov

epsilonw      = 0.1;   % Learning rate for weights 
epsilonvb     = 0.1;   % Learning rate for biases of visible units 
epsilonhb     = 0.1;   % Learning rate for biases of hidden units 
weightcost  = 0.0002;   
initialmomentum  = 0.5;
finalmomentum    = 0.9;

[numcases numdims numbatches]=size(batchdata);

if restart ==1,
  restart=0;
  epoch=1;

% Initializing symmetric weights and biases. 
  vishid     = 0.1*randn(numdims, numhid);
  hidbiases  = zeros(1,numhid);
  visbiases  = zeros(1,numdims);

  poshidprobs = zeros(numcases,numhid);
  neghidprobs = zeros(numcases,numhid);
  posprods    = zeros(numdims,numhid);
  negprods    = zeros(numdims,numhid);
  vishidinc  = zeros(numdims,numhid);
  hidbiasinc = zeros(1,numhid);
  visbiasinc = zeros(1,numdims);
  batchposhidprobs=zeros(numcases,numhid,numbatches);
end

for epoch = epoch:maxepoch,
 fprintf(1,'epoch %d\r',epoch); 
 errsum=0;
 for batch = 1:numbatches,
 fprintf(1,'epoch %d batch %d\r',epoch,batch); 

%%%%%%%%% START POSITIVE PHASE (forward pass: visible to hidden) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  data = batchdata(:,:,batch);
  poshidprobs = 1./(1 + exp(-data*vishid - repmat(hidbiases,numcases,1)));   % sigmoid activation sigma(x) = 1/(1+exp(-x)), STEP1
  batchposhidprobs(:,:,batch)=poshidprobs;
  posprods    = data' * poshidprobs;
  poshidact   = sum(poshidprobs);
  posvisact = sum(data);

%%%%%%%%% END OF POSITIVE PHASE  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  poshidstates = poshidprobs > rand(numcases,numhid);

%%%%%%%%% START NEGATIVE PHASE  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  negdata = 1./(1 + exp(-poshidstates*vishid' - repmat(visbiases,numcases,1)));  % STEP2 (the probabilities are used directly, not sampled states)
  neghidprobs = 1./(1 + exp(-negdata*vishid - repmat(hidbiases,numcases,1)));     % STEP3
  negprods  = negdata'*neghidprobs;
  neghidact = sum(neghidprobs);
  negvisact = sum(negdata); 

%%%%%%%%% END OF NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  err= sum(sum( (data-negdata).^2 ));
  errsum = err + errsum;

   if epoch>5,
     momentum=finalmomentum;
   else
     momentum=initialmomentum;
   end;

%%%%%%%%% UPDATE WEIGHTS AND BIASES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
    vishidinc = momentum*vishidinc + ...
                epsilonw*( (posprods-negprods)/numcases - weightcost*vishid);
    visbiasinc = momentum*visbiasinc + (epsilonvb/numcases)*(posvisact-negvisact);
    hidbiasinc = momentum*hidbiasinc + (epsilonhb/numcases)*(poshidact-neghidact);

    vishid = vishid + vishidinc;
    visbiases = visbiases + visbiasinc;
    hidbiases = hidbiases + hidbiasinc;

%%%%%%%%%%%%%%%% END OF UPDATES %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

  end
  fprintf(1, 'epoch %4i error %6.1f  \n', epoch, errsum); 
end;
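
The script above reads batchdata, numhid, maxepoch, and restart from the workspace, so these have to exist before it runs. A minimal setup sketch with synthetic binary data might look like the following; the sizes are arbitrary assumptions, and rbm.m is a hypothetical file name for the script above.

```matlab
% Illustrative workspace setup for the training script above (synthetic data).
% All sizes here are assumptions chosen only to keep the example small.
numcases   = 20;    % samples per mini-batch
numdims    = 16;    % number of visible units
numbatches = 5;     % number of mini-batches

% batchdata is numcases-by-numdims-by-numbatches, with entries in {0, 1}
batchdata = double(rand(numcases, numdims, numbatches) > 0.5);

numhid   = 8;       % number of hidden units
maxepoch = 10;      % number of training epochs
restart  = 1;       % 1 = initialize weights and biases from scratch

% rbm;              % uncomment if the script above is saved as rbm.m (hypothetical name)
```

Real inputs should also lie in [0, 1], since the visible units are treated as binary and the reconstruction error is computed against sigmoid outputs.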

Afterthought

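How should I put it... A while ago I wanted to use RBMs for speech prediction and recovery, so I read a few RBM papers.
My rough takeaway is that an RBM is simply a probabilistic model.
Depending on the data distributions of the visible and hidden layers, there are BB, GB, and BG variants.
B stands for Bernoulli (binary) and G for Gaussian.
Different data call for different formulas, and therefore for different code.
Recovering (reconstructing) the data differs as well.
So math really matters, and I cannot do the derivations myself.

Well, in the end the results were poor. Maybe my code was at fault, maybe the training set.
Either way, however I stacked the RBMs, the results stayed poor.
In short, math is the foundation of scientific research.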


Three kinds of RBM

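The three variants differ in how poshidprobs, negdata, and neghidprobs are computed: the only difference is whether the sigmoid activation is applied. The same holds when reconstructing data. In each name, the first letter refers to the visible layer and the second to the hidden layer.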
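
For reference, the conditional distributions commonly assumed for the three variants can be written as below. This is a sketch that assumes unit-variance Gaussian units (which is what the randn noise in the code corresponds to); a, b, W, and σ follow the notation of the algorithm section.

```latex
\begin{aligned}
&\text{BB:}\quad P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i v_i W_{ij}\Big), \qquad
  P(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_j W_{ij} h_j\Big) \\
&\text{GB:}\quad P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i v_i W_{ij}\Big), \qquad
  v_i \mid h \sim \mathcal{N}\Big(a_i + \sum_j W_{ij} h_j,\; 1\Big) \\
&\text{BG:}\quad h_j \mid v \sim \mathcal{N}\Big(b_j + \sum_i v_i W_{ij},\; 1\Big), \qquad
  P(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_j W_{ij} h_j\Big)
\end{aligned}
```

A binary layer therefore uses the sigmoid of its input as a probability, while a Gaussian layer uses the input itself as its mean.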

BB

%%%%%%%%% START POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  data = batchdata(:,:,batch);
  poshidprobs = 1./(1 + exp(-data*vishid - repmat(hidbiases,numcases,1)));    
  batchposhidprobs(:,:,batch)=poshidprobs;
  posprods    = data' * poshidprobs;
  poshidact   = sum(poshidprobs);
  posvisact = sum(data);

%%%%%%%%% END OF POSITIVE PHASE  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  poshidstates = poshidprobs > rand(numcases,numhid);

%%%%%%%%% START NEGATIVE PHASE  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  negdata = 1./(1 + exp(-poshidstates*vishid' - repmat(visbiases,numcases,1)));
  neghidprobs = 1./(1 + exp(-negdata*vishid - repmat(hidbiases,numcases,1)));    
  negprods  = negdata'*neghidprobs;
  neghidact = sum(neghidprobs);
  negvisact = sum(negdata); 

%%%%%%%%% END OF NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

GB

%%%%%%%%% START POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  data = batchdata(:,:,batch);

  %poshidprobs =  (data*vishid) + repmat(hidbiases,numcases,1);
  poshidprobs = 1./(1 + exp(-(data*vishid) - repmat(hidbiases,numcases,1)));
  batchposhidprobs(:,:,batch)=poshidprobs;
  posprods    = data' * poshidprobs;
  poshidact   = sum(poshidprobs);
  posvisact = sum(data);

%%%%%%%%% END OF POSITIVE PHASE  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  %poshidstates = poshidprobs+randn(numcases,numhid);
  poshidstates = poshidprobs > rand(numcases,numhid);

%%%%%%%%% START NEGATIVE PHASE  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  negdata = poshidstates*vishid' + repmat(visbiases,numcases,1);
  neghidprobs =1./(1+ exp( -(negdata*vishid) - repmat(hidbiases,numcases,1)));
  negprods  = negdata'*neghidprobs;
  neghidact = sum(neghidprobs);
  negvisact = sum(negdata); 

%%%%%%%%% END OF NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

BG

%%%%%%%%% START POSITIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  data = batchdata(:,:,batch);
  poshidprobs =  (data*vishid) + repmat(hidbiases,numcases,1);
  batchposhidprobs(:,:,batch)=poshidprobs;
  posprods    = data' * poshidprobs;
  poshidact   = sum(poshidprobs);
  posvisact = sum(data);

%%%%%%%%% END OF POSITIVE PHASE  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  poshidstates = poshidprobs+randn(numcases,numhid);

%%%%%%%%% START NEGATIVE PHASE  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  negdata = 1./(1 + exp(-poshidstates*vishid' - repmat(visbiases,numcases,1)));
  neghidprobs = (negdata*vishid) + repmat(hidbiases,numcases,1);
  negprods  = negdata'*neghidprobs;
  neghidact = sum(neghidprobs);
  negvisact = sum(negdata); 

%%%%%%%%% END OF NEGATIVE PHASE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
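
Since the three listings above differ only in which layer applies the sigmoid and how the hidden states are sampled, they can be folded into one pass that selects the functions by type. The following is a rough sketch, not the original authors' code: rbmtype, visact, hidact, and hidsample are names introduced here for illustration, and the data and sizes are made up.

```matlab
% Sketch: one positive/negative pass where the RBM type selects the activations.
% rbmtype, visact, hidact, hidsample and all sizes are illustrative assumptions.
rbmtype = 'GB';     % 'BB', 'GB' or 'BG' (first letter: visible, second: hidden)

sigm   = @(x) 1 ./ (1 + exp(-x));   % used for binary (Bernoulli) layers
linear = @(x) x;                    % used for Gaussian layers

switch rbmtype
    case 'BB'   % binary visible, binary hidden
        visact = sigm;   hidact = sigm;
        hidsample = @(p) double(p > rand(size(p)));
    case 'GB'   % Gaussian visible, binary hidden
        visact = linear; hidact = sigm;
        hidsample = @(p) double(p > rand(size(p)));
    case 'BG'   % binary visible, Gaussian hidden
        visact = sigm;   hidact = linear;
        hidsample = @(p) p + randn(size(p));
end

numcases = 4; numdims = 5; numhid = 3;
data      = rand(numcases, numdims);
vishid    = 0.1 * randn(numdims, numhid);
hidbiases = zeros(1, numhid);
visbiases = zeros(1, numdims);

% Positive phase
poshidprobs  = hidact(data * vishid + repmat(hidbiases, numcases, 1));
poshidstates = hidsample(poshidprobs);

% Negative phase; negdata is also what a reconstruction of the data looks like
negdata     = visact(poshidstates * vishid' + repmat(visbiases, numcases, 1));
neghidprobs = hidact(negdata * vishid + repmat(hidbiases, numcases, 1));
```

Changing rbmtype switches between the three variants without touching the rest of the pass.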
