Training a deep autoencoder or a classifier on MNIST digits: RBM training (Python)

1. RBM reading material
    http://en.wikipedia.org/wiki/Restricted_Boltzmann_machine
    http://deeplearning.net/tutorial/rbm.html
2. Basic principles of RBM training
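In brief, and following the material linked above: an RBM is an energy-based model over binary visible units v and hidden units h, connected by a weight matrix W with visible bias b and hidden bias c (notation assumed here for this summary; it matches vbias, hbias and W in the code below):

    E(v, h) = -b^{\top} v - c^{\top} h - v^{\top} W h
    p(v, h) = \frac{1}{Z} e^{-E(v, h)}

Because there are no visible-visible or hidden-hidden connections, the conditionals p(h|v) and p(v|h) factorize into independent sigmoid units. Training maximizes the likelihood of the data by stochastic gradient ascent; the intractable model expectation in the gradient is approximated by Gibbs sampling, which is what Contrastive Divergence (CD-k) and Persistent CD (PCD) do.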
3. RBM code analysis
       We start by building an RBM class. The network parameters can either be initialized by the constructor itself or passed in as arguments. Passing them in is useful when the RBM is used as a building block of a deep network: in that case, the weight matrix and the hidden-layer bias are shared with the corresponding sigmoid layer of a multilayer perceptron (a usage sketch is given right after the constructor code below).
import numpy

import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams


class RBM(object):
  """Restricted Boltzmann Machine (RBM) """
  def __init__(self, input=None, n_visible=784, n_hidden=500,
               W=None, hbias=None, vbias=None, numpy_rng=None,
               theano_rng=None):
      """
      RBM constructor. Defines the parameters of the model along with
      basic operations for inferring hidden from visible (and vice-versa),
      as well as for performing CD updates.

      :param input: None for standalone RBMs or symbolic variable if RBM is
      part of a larger graph.

      :param n_visible: number of visible units

      :param n_hidden: number of hidden units

      :param W: None for standalone RBMs or symbolic variable pointing to a
      shared weight matrix in case RBM is part of a DBN network; in a DBN,
      the weights are shared between RBMs and layers of a MLP

      :param hbias: None for standalone RBMs or symbolic variable pointing
      to a shared hidden units bias vector in case RBM is part of a
      different network

      :param vbias: None for standalone RBMs or a symbolic variable
      pointing to a shared visible units bias
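
      :param numpy_rng: numpy random number generator used to sample the
      initial weights; if None, a default RandomState is created

      :param theano_rng: Theano random stream used for sampling; if None,
      one is created and seeded from numpy_rng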
      """

      self.n_visible = n_visible
      self.n_hidden = n_hidden


      if numpy_rng is None:
          # create a number generator
          numpy_rng = numpy.random.RandomState(1234)

      if theano_rng is None:
          theano_rng = RandomStreams(numpy_rng.randint(2 ** 30))

      if W is None:
         # W is initialized with `initial_W`, which is uniformly sampled
         # from -4*sqrt(6./(n_visible+n_hidden)) to 4*sqrt(6./(n_hidden+n_visible));
         # the output of uniform is converted using asarray to dtype
         # theano.config.floatX so that the code is runnable on GPU
         initial_W = numpy.asarray(numpy_rng.uniform(
                   low=-4 * numpy.sqrt(6. / (n_hidden + n_visible)),
                   high=4 * numpy.sqrt(6. / (n_hidden + n_visible)),
                   size=(n_visible, n_hidden)),
                   dtype=theano.config.floatX)
         # theano shared variables for weights and biases
         W = theano.shared(value=initial_W, name='W')

      if hbias is None:
         # create shared variable for hidden units bias
         hbias = theano.shared(value=numpy.zeros(n_hidden,
                             dtype=theano.config.floatX), name='hbias')

      if vbias is None:
         # create shared variable for visible units bias
         vbias = theano.shared(value=numpy.zeros(n_visible,
                             dtype=theano.config.floatX), name='vbias')


      # initialize input layer for standalone RBM or layer0 of DBN
      self.input = input if input is not None else T.dmatrix('input')

      self.W = W
      self.hbias = hbias
      self.vbias = vbias
      self.theano_rng = theano_rng
      # **** WARNING: It is not a good idea to put things in this list
      # other than shared variables created in this function.
      self.params = [self.W, self.hbias, self.vbias]
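
The constructor supports both modes described above. The following is only a minimal usage sketch of the two initialization paths; the names x, numpy_rng and sigmoid_layer are hypothetical and not part of the tutorial code:

x = T.dmatrix('x')                          # symbolic minibatch of MNIST images
numpy_rng = numpy.random.RandomState(123)

# standalone RBM: W, hbias and vbias are created and owned by the RBM itself
rbm = RBM(input=x, n_visible=28 * 28, n_hidden=500, numpy_rng=numpy_rng)

# RBM as a DBN building block: reuse the shared weight matrix and bias of a
# (hypothetical) sigmoid layer of an MLP, so both views update the same parameters
# rbm_shared = RBM(input=x, n_visible=28 * 28, n_hidden=500,
#                  W=sigmoid_layer.W, hbias=sigmoid_layer.b,
#                  numpy_rng=numpy_rng)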


The next step is to define the functions which construct the symbolic graph associated with Eqs. (7) - (8) of the tutorial, i.e. the conditionals P(h|v) and P(v|h) (they are restated after the code below). The code is as follows:


def propup(self, vis):
    ''' This function propagates the visible units activation upwards to
    the hidden units

    Note that we return also the pre_sigmoid_activation of the layer. As
    it will turn out later, due to how Theano deals with optimization and
    stability this symbolic variable will be needed to write down a more
    stable graph (see details in the reconstruction cost function)
    '''
    pre_sigmoid_activation = T.dot(vis, self.W) + self.hbias
    return [pre_sigmoid_activation, T.nnet.sigmoid(pre_sigmoid_activation)]

def sample_h_given_v(self, v0_sample):
    ''' This function infers state of hidden units given visible units '''
    # compute the activation of the hidden units given a sample of the visibles
    pre_sigmoid_h1, h1_mean = self.propup(v0_sample)
    # get a sample of the hiddens given their activation
    # Note that theano_rng.binomial returns a symbolic sample of dtype
    # int64 by default. If we want to keep our computations in floatX
    # for the GPU we need to specify to return the dtype floatX
    h1_sample = self.theano_rng.binomial(size=h1_mean.shape, n=1, p=h1_mean,
                                         dtype=theano.config.floatX)
    return [pre_sigmoid_h1, h1_mean, h1_sample]

def propdown(self, hid):
    '''This function propagates the hidden units activation downwards to
    the visible units

    Note that we return also the pre_sigmoid_activation of the layer. As
    it will turn out later, due to how Theano deals with optimization and
    stability this symbolic variable will be needed to write down a more
    stable graph (see details in the reconstruction cost function)
    '''
    pre_sigmoid_activation = T.dot(hid, self.W.T) + self.vbias
    return [pre_sigmoid_activation, T.nnet.sigmoid(pre_sigmoid_activation)]

def sample_v_given_h(self, h0_sample):
    ''' This function infers state of visible units given hidden units '''
    # compute the activation of the visible given the hidden sample
    pre_sigmoid_v1, v1_mean = self.propdown(h0_sample)
    # get a sample of the visible given their activation
    # Note that theano_rng.binomial returns a symbolic sample of dtype
    # int64 by default. If we want to keep our computations in floatX
    # for the GPU we need to specify to return the dtype floatX
    v1_sample = self.theano_rng.binomial(size=v1_mean.shape, n=1, p=v1_mean,
                                         dtype=theano.config.floatX)
    return [pre_sigmoid_v1, v1_mean, v1_sample]
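
For reference, propup and propdown implement the two sigmoid conditionals that the tutorial labels Eqs. (7) and (8); written in terms of the shared variables of the class (W, hbias = c, vbias = b):

    P(h_i = 1 \mid v) = \mathrm{sigm}\big( (v W + c)_i \big)           (propup)
    P(v_j = 1 \mid h) = \mathrm{sigm}\big( (h W^{\top} + b)_j \big)    (propdown)

The pre-sigmoid activations are returned alongside the sigmoid outputs only so that Theano can later build a more numerically stable expression for the reconstruction cost, as the docstrings above note.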


We can then use these functions to define the symbolic graph for a Gibbs sampling step. We define two functions:
• gibbs_vhv, which performs a step of Gibbs sampling starting from the visible units. As we shall see, this will be useful for sampling from the RBM.
• gibbs_hvh, which performs a step of Gibbs sampling starting from the hidden units. This function will be useful for performing CD and PCD updates (a rough sketch of how it is chained into a CD-k update follows the code below).

The code is as follows:


def gibbs_hvh(self, h0_sample):
    ''' This function implements one step of Gibbs sampling,
        starting from the hidden state'''
    pre_sigmoid_v1, v1_mean, v1_sample = self.sample_v_given_h(h0_sample)
    pre_sigmoid_h1, h1_mean, h1_sample = self.sample_h_given_v(v1_sample)
    return [pre_sigmoid_v1, v1_mean, v1_sample, pre_sigmoid_h1, h1_mean, h1_sample]

def gibbs_vhv(self, v0_sample):
    ''' This function implements one step of Gibbs sampling,
        starting from the visible state'''
    pre_sigmoid_h1, h1_mean, h1_sample = self.sample_h_given_v(v0_sample)
    pre_sigmoid_v1, v1_mean, v1_sample = self.sample_v_given_h(h1_sample)
    return [pre_sigmoid_h1, h1_mean, h1_sample, pre_sigmoid_v1, v1_mean, v1_sample]
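
In the tutorial, gibbs_hvh is then chained with theano.scan to build the CD-k / PCD-k updates. The following is only a rough sketch of that idea, not the tutorial's full cost-and-updates function; chain_start and k are hypothetical names:

# start the chain from a hidden sample obtained from the data (CD-k);
# for PCD-k, chain_start would instead be a persistent shared variable
# carrying the state of the chain across updates
pre_sigmoid_ph, ph_mean, ph_sample = self.sample_h_given_v(self.input)
chain_start = ph_sample

# run k steps of Gibbs sampling; only the last output of gibbs_hvh (the new
# hidden sample) is fed back into the next step, hence the five None entries
([pre_sigmoid_nvs, nv_means, nv_samples,
  pre_sigmoid_nhs, nh_means, nh_samples], updates) = theano.scan(
         self.gibbs_hvh,
         outputs_info=[None, None, None, None, None, chain_start],
         n_steps=k)

# the visible sample at the end of the chain provides the negative phase
# of the gradient
chain_end = nv_samples[-1]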
