Learning Deep Learning with Andrew Ng, in Scala — Lesson 2: Implementing a Multi-Layer Neural Network in Scala

In the previous chapter we implemented LogisticRegression in Scala. In this chapter, still following Andrew Ng's course, we implement a basic deep neural network in Scala. As an aside, Andrew Ng's explanation of deep neural networks is the clearest I have heard so far; it really is true that the greater the expert, the more clearly the knowledge gets explained.

 

This article has four parts. Following a top-down approach to software development, the first part demonstrates using the finished neural network to classify the GasCensor dataset. It focuses on the top-level design of the interface: how to make it easy to use, concise, logically clear and beginner-friendly. This is a fairly difficult area of software engineering, so my interface design is not necessarily optimal; feel free to share your own ideas in the comments so we can discuss them together.

 

The second part gives the concrete implementation of the NeuralNetworkModel class. From the interface's point of view, NeuralNetworkModel implements a trait named Model (similar to an interface in Java), so all models share a common protocol: in this project every model provides the methods setLearningRate, setIterationTime, train, predict, accuracy and getCostHistory, while NeuralNetworkModel additionally has its own setHiddenLayerStructure and setOutputLayerStructure methods.

 

The third part covers the implementations of the network layers: the classes ReluLayer, SigmoidLayer and TanhLayer. All layers implement a trait named Layer, which provides three methods: setNumHiddenUnits, forward and backward.

 

The last part covers the various utility classes; their code and functional comments are attached at the end of the article for anyone interested. My GitHub repository is https://github.com/pan5431333/coursera-deeplearning-practice-in-scala-remote — feel free to clone the code, point out problems, and let's improve together!

 

Part 1: Introducing the demo

 

First, let's look at a demo that uses the neural network model.

package org.mengpan.deeplearning.demo

import breeze.stats.{mean, stddev}
import org.mengpan.deeplearning.data.{Cat, GasCensor}
import org.mengpan.deeplearning.helper.{CatDataHelper, DlCollection, GasCensorDataHelper}
import org.mengpan.deeplearning.model.{Model, NeuralNetworkModel, ShallowNeuralNetworkModel}
import org.mengpan.deeplearning.utils.{MyDict, NormalizeUtils, PlotUtils}

/**
  * Created by mengpan on 2017/8/15.
  */
object ClassThreeNeuralNetworkDemo extends App{
  
  // Dataset download: http://archive.ics.uci.edu/ml/machine-learning-databases/00224/
  // Load the GasCensor dataset
  val data: DlCollection[GasCensor] = GasCensorDataHelper.getAllData

  // Normalize the feature matrix (zero mean and unit variance per column)
  val normalizedCatData = NormalizeUtils.normalizeBy(data){col =>
    (col - mean(col)) / stddev(col)
  }

  // Split into a training set and a test set
  val (training, test) = normalizedCatData.split(0.8)

  // Extract the features and labels of the training and test sets
  val trainingFeature = training.getFeatureAsMatrix
  val trainingLabel = training.getLabelAsVector
  val testFeature = test.getFeatureAsMatrix
  val testLabel = test.getLabelAsVector

  // Initialize the model
  val nnModel: Model = new NeuralNetworkModel()
    .setHiddenLayerStructure(Map(
      (200, MyDict.ACTIVATION_RELU),
      (100, MyDict.ACTIVATION_RELU)
    ))
    .setOutputLayerStructure((1, MyDict.ACTIVATION_SIGMOID))
    .setLearningRate(0.01)
    .setIterationTime(5000)

  // Train the model on the training set
  val trainedModel: Model = nnModel.train(trainingFeature, trainingLabel)

  // Evaluate the model on the test and training sets
  val yPredicted = trainedModel.predict(testFeature)
  val trainYPredicted = trainedModel.predict(trainingFeature)

  val testAccuracy = trainedModel.accuracy(testLabel, yPredicted)
  val trainAccuracy = trainedModel.accuracy(trainingLabel, trainYPredicted)
  println("\n The train accuracy of this model is: " + trainAccuracy)
  println("\n The test accuracy of this model is: " + testAccuracy)

  // Plot the cost against the number of iterations during training
  val costHistory = trainedModel.getCostHistory
  PlotUtils.plotCostHistory(costHistory)
}

 

For the neural network model's interface we use the method-chaining (fluent) style that is typical in Scala:

// Initialize the model
val nnModel: Model = new NeuralNetworkModel()
  .setHiddenLayerStructure(Map(
    (200, MyDict.ACTIVATION_RELU),
    (100, MyDict.ACTIVATION_RELU)
  ))
  .setOutputLayerStructure((1, MyDict.ACTIVATION_SIGMOID))
  .setLearningRate(0.01)
  .setIterationTime(5000)

 

The setHiddenLayerStructure method sets the structure of the hidden layers. It accepts a collection of 2-tuples, where the first element of each tuple is the number of units in that hidden layer and the second is that layer's activation function type. The network built above has two hidden layers: the first has 200 units with ReLU activation, and the second has 100 units, also with ReLU. The setOutputLayerStructure method takes a single 2-tuple with the same meaning.
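As a quick illustration, the same builder reads naturally for other architectures as well. The following sketch is a hypothetical configuration (the layer sizes and hyperparameter values are made up; only the API is the one shown above):

import org.mengpan.deeplearning.model.{Model, NeuralNetworkModel}
import org.mengpan.deeplearning.utils.MyDict

// Hypothetical example: three hidden layers mixing tanh and ReLU activations
val deeperModel: Model = new NeuralNetworkModel()
  .setHiddenLayerStructure(Map(
    (300, MyDict.ACTIVATION_TANH),
    (200, MyDict.ACTIVATION_RELU),
    (50, MyDict.ACTIVATION_RELU)
  ))
  .setOutputLayerStructure((1, MyDict.ACTIVATION_SIGMOID))
  .setLearningRate(0.005)
  .setIterationTime(3000)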

 

Part 2: The NeuralNetworkModel class

 

Let me first explain the important methods of this class; the complete code is attached at the end of this part. We start with the train() method:

override def train(feature: DenseMatrix[Double], label: DenseVector[Double]): NeuralNetworkModel.this.type = {
  val numExamples = feature.rows
  val inputDim = feature.cols

  logger.debug("hidden layers: " + hiddenLayers)
  logger.debug("output layer: " + outputLayer)

  // Randomly initialize the model parameters
  var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] =
    initializeParams(numExamples, inputDim, hiddenLayers, outputLayer)

  (0 until this.iterationTime).foreach{i =>
    val forwardResList: List[ForwardRes] = forward(feature, paramsList,
      hiddenLayers, outputLayer)

    logger.debug(forwardResList)

    val cost = calCost(forwardResList.last, label)
    if (i % 100 == 0) {
      logger.info("Cost in " + i + "th time of iteration: " + cost)
    }
    costHistory.put(i, cost)

    val backwardResList: List[BackwardRes] = backward(feature, label, forwardResList,
      paramsList, hiddenLayers, outputLayer)

    logger.debug(backwardResList)

    paramsList = updateParams(paramsList, this.learningRate, backwardResList, i, cost)
  }

  this.paramsList = paramsList
  this
}

 

As you can see, the neural network's train() method is built from five main private helper functions. First, initializeParams(numExamples, inputDim, hiddenLayers, outputLayer) initializes the parameters; then, in every iteration:

- forward() computes the forward-propagation results;
- calCost() computes the cost for this iteration;
- backward() computes the back-propagation results;
- updateParams() updates the parameters.

 

Next, let's look at initializeParams(), the method that randomly initializes the parameters; the syntax points worth noting are in the comments.

private def initializeParams(numExamples: Int, inputDim: Int,
                             hiddenLayers: Seq[Layer], outputLayer: Layer):
List[(DenseMatrix[Double], DenseVector[Double])] = {

  /*
   * Combine the unit counts of the input layer, the hidden layers and the output layer into one Vector.
   * E.g. with inputDim = 3, outputDim = 1 and hidden layers of (3, 3, 2) units, layersDim = (3, 3, 3, 2, 1).
   * Two List operators: A.::(b) prepends element b to A, and A.:+(b) appends element b to A.
   * layersDim is stored as a Vector because Vector is an indexed sequence, so access at any position takes the same time.
  */
  val layersDim = hiddenLayers.map(_.numHiddenUnits)
    .toList
    .::(inputDim)
    .:+(outputLayer.numHiddenUnits)
    .toVector

  val numLayers = layersDim.length

  /*
   * W(l) has shape (layersDim(l-1), layersDim(l))
   * b(l) has shape (layersDim(l), )
   * The random values lie in (0, 1); multiply w and b by 0.01 to keep the model stable during training.
  */
  (1 until numLayers).map{i =>
    val w = DenseMatrix.rand[Double](layersDim(i-1), layersDim(i)) * 0.01
    val b = DenseVector.rand[Double](layersDim(i)) * 0.01
    (w, b)
  }.toList
}
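To make the comment above concrete, here is a small REPL-style sketch of how layersDim would be assembled with :: (prepend) and :+ (append), using the example dimensions mentioned in the comment:

val inputDim = 3
val hiddenDims = List(3, 3, 2)   // numHiddenUnits of the hidden layers
val outputDim = 1

// prepend the input dimension, append the output dimension
val layersDim = hiddenDims.::(inputDim).:+(outputDim).toVector
// layersDim == Vector(3, 3, 3, 2, 1)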

 

Then the implementation of the forward-propagation function forward(); again the points worth noting are in the comments. The implementation of layer.forward() will be explained in the next part, when we discuss the Layer classes:

private def forward(feature: DenseMatrix[Double],
                    params: List[(DenseMatrix[Double],
                      DenseVector[Double])],
                    hiddenLayers: Seq[Layer],
                    outputLayer: Layer): List[ForwardRes] = {
  var yi = feature

  /*
   * Note how zip works in Scala: if A = List(1, 2, 3) and B = List(3, 4),
   * then A.zip(B) is List((1, 3), (2, 4)).
   * Recap: A.:+(b) appends element b to A; since Lists are immutable, this actually creates a new object.
   */
  params.zip(hiddenLayers.:+(outputLayer))
    .map{f =>
      val w = f._1._1
      val b = f._1._2
      val layer = f._2

      // forward takes three arguments: yPrevious, w and b
      val forwardRes = layer.forward(yi, w, b)
      yi = forwardRes.yCurrent

      forwardRes
    }
}
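The zip here pairs each (w, b) tuple with its layer, and :+ appends the output layer to the hidden layers. A tiny stand-alone illustration of those two operations (with placeholder strings instead of real matrices and layers):

val params = List("p1", "p2", "p3")   // stand-ins for the (w, b) pairs
val hidden = List("h1", "h2")         // stand-ins for the hidden layers
val layers = hidden :+ "out"          // append the output layer
params.zip(layers)                    // List((p1,h1), (p2,h2), (p3,out))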

 

Next, the implementation of calCost(), which computes the value of the cost function:

private def calCost(res: ResultUtils.ForwardRes, label: DenseVector[Double]):
Double = {
  val yHat = res.yCurrent(::, 0)

  // Add pow(10.0, -9) inside the log to avoid log(0), which would produce NaN
  -(label.t * log(yHat + pow(10.0, -9)) + (1.0 - label).t * log(1.0 - yHat + pow(10.0, -9))) / label.length.toDouble
}
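Written out as a formula (just a restatement of the code above), calCost computes the averaged binary cross-entropy, with a small constant inside the logarithms to avoid log(0):

J = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y^{(i)}\log\!\left(\hat{y}^{(i)}+\varepsilon\right) + \left(1-y^{(i)}\right)\log\!\left(1-\hat{y}^{(i)}+\varepsilon\right)\right], \qquad \varepsilon = 10^{-9}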

 

The remaining backward() and updateParams() methods, together with the rest of the NeuralNetworkModel class, are listed in full below:

package org.mengpan.deeplearning.model
import java.util

import breeze.linalg.{DenseMatrix, DenseVector}
import breeze.numerics.{log, pow}
import org.apache.log4j.Logger
import org.mengpan.deeplearning.layers.Layer
import org.mengpan.deeplearning.utils.{DebugUtils, LayerUtils, ResultUtils}
import org.mengpan.deeplearning.utils.ResultUtils.{BackwardRes, ForwardRes}

import scala.collection.mutable

/**
  * Created by mengpan on 2017/8/26.
  */
class NeuralNetworkModel extends Model{
  // Logger
  val logger = Logger.getLogger("NeuralNetworkModel")

  // The four hyperparameters of the network
  override var learningRate: Double = _
  override var iterationTime: Int = _
  var hiddenLayerStructure: Map[Int, Byte] = _
  var outputLayerStructure: (Int, Byte) = _

  // Records the cost at each iteration
  override val costHistory: mutable.TreeMap[Int, Double] = new mutable.TreeMap[Int, Double]()

  // The model parameters
  var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] = _

  // The hidden layers and the output layer, built from the hiddenLayerStructure and outputLayerStructure hyperparameters
  private var hiddenLayers: Seq[Layer] = _
  private var outputLayer: Layer = _

  def setHiddenLayerStructure(hiddenLayerStructure: Map[Int, Byte]): this.type = {
    if (hiddenLayerStructure.isEmpty) {
      throw new Exception("hidden layer should be at least one layer!")
    }

    this.hiddenLayerStructure = hiddenLayerStructure
    this.hiddenLayers = getHiddenLayers(this.hiddenLayerStructure)
    this
  }

  def setOutputLayerStructure(outputLayerStructure: (Int, Byte)): this.type = {
    this.outputLayerStructure = outputLayerStructure
    this.outputLayer = getOutputLayer(this.outputLayerStructure)
    this
  }

  override def train(feature: DenseMatrix[Double], label: DenseVector[Double]): NeuralNetworkModel.this.type = {
    val numExamples = feature.rows
    val inputDim = feature.cols

    logger.debug("hidden layers: " + hiddenLayers)
    logger.debug("output layer: " + outputLayer)

    // Randomly initialize the model parameters
    var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] =
      initializeParams(numExamples, inputDim, hiddenLayers, outputLayer)

    (0 until this.iterationTime).foreach{i =>
      val forwardResList: List[ForwardRes] = forward(feature, paramsList,
        hiddenLayers, outputLayer)

      logger.debug(forwardResList)

      val cost = calCost(forwardResList.last, label)
      if (i % 100 == 0) {
        logger.info("Cost in " + i + "th time of iteration: " + cost)
      }
      costHistory.put(i, cost)

      val backwardResList: List[BackwardRes] = backward(feature, label, forwardResList,
        paramsList, hiddenLayers, outputLayer)

      logger.debug(backwardResList)

      paramsList = updateParams(paramsList, this.learningRate, backwardResList, i, cost)
    }

    this.paramsList = paramsList
    this
  }

  override def predict(feature: DenseMatrix[Double]): DenseVector[Double] = {
    val forwardResList: List[ForwardRes] = forward(feature, this.paramsList,
      this.hiddenLayers, this.outputLayer)
    forwardResList.last.yCurrent(::, 0).map{yHat =>
      if (yHat > 0.5) 1.0 else 0.0
    }
  }

  private def getHiddenLayers(hiddenLayerStructure: Map[Int, Byte]): Seq[Layer] = {
    hiddenLayerStructure.map{structure =>
      getLayerByStructure(structure)
    }.toList
  }

  private def getOutputLayer(structure: (Int, Byte)): Layer = {
    getLayerByStructure(structure)
  }

  private def getLayerByStructure(structure: (Int, Byte)): Layer = {
    val numHiddenUnits = structure._1
    val activationType = structure._2

    val layer: Layer = LayerUtils.getLayerByActivationType(activationType)
      .setNumHiddenUnits(numHiddenUnits)
    layer
  }

  private def initializeParams(numExamples: Int, inputDim: Int,
                               hiddenLayers: Seq[Layer], outputLayer: Layer):
  List[(DenseMatrix[Double], DenseVector[Double])] = {

    /*
     * Combine the unit counts of the input layer, the hidden layers and the output layer into one Vector.
     * E.g. with inputDim = 3, outputDim = 1 and hidden layers of (3, 3, 2) units, layersDim = (3, 3, 3, 2, 1).
     * Two List operators: A.::(b) prepends element b to A, and A.:+(b) appends element b to A.
     * layersDim is stored as a Vector because Vector is an indexed sequence, so access at any position takes the same time.
    */
    val layersDim = hiddenLayers.map(_.numHiddenUnits)
      .toList
      .::(inputDim)
      .:+(outputLayer.numHiddenUnits)
      .toVector

    val numLayers = layersDim.length

    /*
     * W(l) has shape (layersDim(l-1), layersDim(l))
     * b(l) has shape (layersDim(l), )
     * The random values lie in (0, 1); multiply w and b by 0.01 to keep the model stable during training.
    */
    (1 until numLayers).map{i =>
      val w = DenseMatrix.rand[Double](layersDim(i-1), layersDim(i)) * 0.01
      val b = DenseVector.rand[Double](layersDim(i)) * 0.01
      (w, b)
    }.toList
  }

  private def forward(feature: DenseMatrix[Double],
                      params: List[(DenseMatrix[Double],
                        DenseVector[Double])],
                      hiddenLayers: Seq[Layer],
                      outputLayer: Layer): List[ForwardRes] = {
    var yi = feature

    /*
     * Note how zip works in Scala: if A = List(1, 2, 3) and B = List(3, 4),
     * then A.zip(B) is List((1, 3), (2, 4)).
     * Recap: A.:+(b) appends element b to A; since Lists are immutable, this actually creates a new object.
     */
    params.zip(hiddenLayers.:+(outputLayer))
      .map{f =>
        val w = f._1._1
        val b = f._1._2
        val layer = f._2

        // forward takes three arguments: yPrevious, w and b
        val forwardRes = layer.forward(yi, w, b)
        yi = forwardRes.yCurrent

        forwardRes
      }
  }

  private def calCost(res: ResultUtils.ForwardRes, label: DenseVector[Double]):
  Double = {
    val yHat = res.yCurrent(::, 0)

    // Add pow(10.0, -9) inside the log to avoid log(0), which would produce NaN
    -(label.t * log(yHat + pow(10.0, -9)) + (1.0 - label).t * log(1.0 - yHat + pow(10.0, -9))) / label.length.toDouble
  }


  private def backward(feature: DenseMatrix[Double], label: DenseVector[Double],
                       forwardResList: List[ResultUtils.ForwardRes],
                       paramsList: List[(DenseMatrix[Double], DenseVector[Double])],
                       hiddenLayers: Seq[Layer], outputLayer: Layer):
  List[BackwardRes] = {
    val yHat = forwardResList.last.yCurrent(::, 0)

    // Add pow(10.0, -9) to the denominators to avoid division by zero producing NaN
    val dYL = -(label /:/ (yHat + pow(10.0, -9)) - (1.0 - label) /:/ (1.0 - yHat + pow(10.0, -9)))
    var dYCurrent = DenseMatrix.zeros[Double](feature.rows, 1)
    dYCurrent(::, 0) := dYL

    paramsList
      .zip(forwardResList)
      .zip(hiddenLayers.:+(outputLayer))
      .reverse
      .map{f =>
        val w = f._1._1._1
        val b = f._1._1._2
        val forwardRes = f._1._2
        val layer = f._2

        logger.debug(DebugUtils.matrixShape(w, "w"))
        logger.debug(layer)

        /*
         * backward takes four arguments: dYCurrent, forwardRes, w and b.
         * From forwardRes we use yPrevious (to compute dW) and zCurrent (to compute dZCurrent).
         */
        val backwardRes = layer.backward(dYCurrent, forwardRes, w, b)
        dYCurrent = backwardRes.dYPrevious
        backwardRes
      }
      .reverse
  }

  private def updateParams(paramsList: List[(DenseMatrix[Double], DenseVector[Double])],
                           learningrate: Double,
                           backwardResList: List[ResultUtils.BackwardRes],
                           iterationTime: Int, cost: Double): List[(DenseMatrix[Double], DenseVector[Double])] = {
    paramsList.zip(backwardResList)
      .map{f =>
        val w = f._1._1
        val b = f._1._2
        val backwardRes = f._2
        val dw = backwardRes.dWCurrent
        val db = backwardRes.dBCurrent

        logger.debug(DebugUtils.matrixShape(w, "w"))
        logger.debug(DebugUtils.matrixShape(dw, "dw"))

        // If the cost has become NaN, reduce the learning rate by a factor of 100
        val adjustedLearningRate = if (cost.isNaN) learningrate / 100 else learningrate

        w :-= dw * adjustedLearningRate
        b :-= db * adjustedLearningRate
        (w, b)
      }
  }

}
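For reference, the parameter update in updateParams is plain batch gradient descent (restated from the code above; α is the learning rate, reduced by a factor of 100 for any iteration whose cost has become NaN):

W^{[l]} := W^{[l]} - \alpha\, dW^{[l]}, \qquad b^{[l]} := b^{[l]} - \alpha\, db^{[l]}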

 

Part 3: Implementing the layers

 

Next we look at the layer implementations. Since all concrete layer classes (ReluLayer, SigmoidLayer and TanhLayer) implement the trait named Layer, let's look at the Layer trait first. It provides three methods: setNumHiddenUnits, which sets the number of units in this layer; forward, which performs this layer's forward-propagation computation; and backward, which performs this layer's back-propagation computation. The code is as follows:

package org.mengpan.deeplearning.layers

import breeze.linalg.{DenseMatrix, DenseVector}
import org.apache.log4j.Logger
import org.mengpan.deeplearning.utils.ResultUtils.{BackwardRes, ForwardRes}
import org.mengpan.deeplearning.utils.{ActivationUtils, DebugUtils, GradientUtils}

/**
  * Created by mengpan on 2017/8/26.
  */
trait Layer{
  private val logger = Logger.getLogger("Layer")

  var numHiddenUnits: Int
  var activationFunc: Byte


  def setNumHiddenUnits(numHiddenUnits: Int): this.type = {
    this.numHiddenUnits = numHiddenUnits
    this
  }
  def forward(yPrevious: DenseMatrix[Double], w: DenseMatrix[Double],
              b: DenseVector[Double]): ForwardRes = {
    val numExamples = yPrevious.rows
    logger.debug(DebugUtils.matrixShape(yPrevious, "yPrevious"))
    logger.debug(DebugUtils.matrixShape(w, "w"))
    logger.debug(DebugUtils.vectorShape(b, "b"))
    val zCurrent = yPrevious * w + DenseVector.ones[Double](numExamples) * b.t
    val yCurrent = ActivationUtils.getActivationFunc(this.activationFunc)(zCurrent)
    logger.debug("yCurrent: " + yCurrent)
    ForwardRes(yPrevious, zCurrent, yCurrent)
  }

  def backward(dYCurrent: DenseMatrix[Double], forwardRes: ForwardRes,
               w: DenseMatrix[Double], b: DenseVector[Double]): BackwardRes = {
    val numExamples = dYCurrent.rows

    val yPrevious = forwardRes.yPrevious
    val zCurrent = forwardRes.zCurrent
    val yCurrent = forwardRes.yCurrent

    val dZCurrent = dYCurrent *:*
      GradientUtils.getGradByFuncType(this.activationFunc)(zCurrent)

    val dWCurrent = yPrevious.t * dZCurrent / numExamples.toDouble
    val dBCurrent = (DenseVector.ones[Double](numExamples).t * dZCurrent).t /
      numExamples.toDouble
    val dYPrevious = dZCurrent * w.t

    BackwardRes(dYPrevious, dWCurrent, dBCurrent)
  }

  override def toString: String = super.toString
}
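Restating in matrix form what forward and backward above compute for one layer (note that in this code the rows are examples, so A has shape (m, units) and the transposes sit on the opposite side compared with the course's column-per-example convention; g is the layer's activation):

Z^{[l]} = A^{[l-1]} W^{[l]} + \mathbf{1}\, b^{[l]\top}, \qquad A^{[l]} = g\!\left(Z^{[l]}\right)

dZ^{[l]} = dA^{[l]} \odot g'\!\left(Z^{[l]}\right), \quad dW^{[l]} = \frac{1}{m}\, A^{[l-1]\top} dZ^{[l]}, \quad db^{[l]} = \frac{1}{m}\left(\mathbf{1}^{\top} dZ^{[l]}\right)^{\top}, \quad dA^{[l-1]} = dZ^{[l]}\, W^{[l]\top}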

 

The concrete layer classes differ only in their activationFunc. For ReluLayer:

class ReluLayer extends Layer{
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_RELU
}

 

For SigmoidLayer:

class SigmoidLayer extends Layer{
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_SIGMOID
}

 

For TanhLayer:

class TanhLayer extends Layer{
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_TANH
}
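To see a single layer in action, here is a minimal sketch (the shapes and values are made up for illustration; it only uses the forward and backward methods defined in the Layer trait above):

import breeze.linalg.{DenseMatrix, DenseVector}
import org.mengpan.deeplearning.layers.ReluLayer

// 5 examples with 3 input features, feeding a 4-unit ReLU layer
val layer = new ReluLayer().setNumHiddenUnits(4)
val yPrevious = DenseMatrix.rand[Double](5, 3)
val w = DenseMatrix.rand[Double](3, 4) * 0.01   // shape (inputDim, numHiddenUnits)
val b = DenseVector.zeros[Double](4)

val forwardRes = layer.forward(yPrevious, w, b)   // zCurrent and yCurrent have shape (5, 4)

// pretend gradient coming from the layer above
val dYCurrent = DenseMatrix.ones[Double](5, 4)
val backwardRes = layer.backward(dYCurrent, forwardRes, w, b)
// backwardRes.dWCurrent: (3, 4), backwardRes.dBCurrent: length 4, backwardRes.dYPrevious: (5, 3)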

 

Part 4: The utility classes

The code is listed below:

 

ActivationUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.DenseMatrix
import breeze.numerics.{relu, sigmoid, tanh}
import org.apache.log4j.Logger

/**
  * Created by mengpan on 2017/8/26.
  */
object ActivationUtils {
  val logger = Logger.getLogger("ActivationUtils")

  def getActivationFunc(activationFuncType: Byte): DenseMatrix[Double] => DenseMatrix[Double] = {
    activationFuncType match {
      case MyDict.ACTIVATION_SIGMOID => sigmoid(_: DenseMatrix[Double])
      case MyDict.ACTIVATION_TANH => tanh(_: DenseMatrix[Double])
      case MyDict.ACTIVATION_RELU => relu(_: DenseMatrix[Double])
      case _ => logger.fatal("Wrong hidden activation function param given, use tanh by default")
        tanh(_: DenseMatrix[Double])
    }
  }
}
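A quick usage sketch (the input matrix is made up): getActivationFunc returns a function that applies the chosen activation element-wise:

import breeze.linalg.DenseMatrix
import org.mengpan.deeplearning.utils.{ActivationUtils, MyDict}

val z = DenseMatrix((-1.0, 0.0), (2.0, -3.0))
val reluFunc = ActivationUtils.getActivationFunc(MyDict.ACTIVATION_RELU)
println(reluFunc(z))   // negative entries are mapped to 0.0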

 

DebugUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}

/**
  * Created by mengpan on 2017/8/26.
  */
object DebugUtils {
  def matrixShape(w: DenseMatrix[Double], objectName: String): String = {
    objectName + "'s shape: (" + w.rows + ", " + w.cols + ")"
  }

  def vectorShape(b: DenseVector[Double], objectName: String): String = {
    objectName + "'s shape: (" + b.length + ")"
  }


}

 

GradientUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.DenseMatrix
import breeze.numerics.{pow, sigmoid, tanh}

/**
  * Created by mengpan on 2017/8/25.
  */
object GradientUtils {
  def reluGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    val numRows = z.rows
    val numCols = z.cols

    val res = DenseMatrix.zeros[Double](numRows, numCols)

    (0 until numRows).foreach{i =>
      (0 until numCols).foreach{j =>
        res(i, j) = if (z(i, j) >= 0) 1.0 else 0.0
      }
    }

    res
  }

  def tanhGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    // derivative of tanh with respect to the pre-activation z is 1 - tanh(z)^2
    1.0 - pow(tanh(z), 2)
  }

  def sigmoidGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    val res = sigmoid(z) *:* (1.0 - sigmoid(z))
    res.map{d =>
      if(d < pow(10.0, -9)) pow(10.0, -9)
      else if (d > pow(10.0, 2)) pow(10.0, 2)
      else d
    }
  }

  def getGradByFuncType(activationFuncType: Byte): DenseMatrix[Double] => DenseMatrix[Double] = {
    activationFuncType match {
      case MyDict.ACTIVATION_TANH => tanhGrad
      case MyDict.ACTIVATION_RELU => reluGrad
      case MyDict.ACTIVATION_SIGMOID => sigmoidGrad
      case _ => throw new Exception("Unsupported type of activation function")
    }
  }
}
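For reference, the derivatives these helpers implement, taken with respect to the pre-activation z (sigmoidGrad additionally clips its output into the range [10^{-9}, 10^{2}]):

\operatorname{relu}'(z) = \begin{cases} 1, & z \ge 0 \\ 0, & z < 0 \end{cases}, \qquad \tanh'(z) = 1 - \tanh^{2}(z), \qquad \sigma'(z) = \sigma(z)\,\bigl(1 - \sigma(z)\bigr)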

 

LayerUtils:

package org.mengpan.deeplearning.utils

import org.mengpan.deeplearning.layers.{Layer, ReluLayer, SigmoidLayer, TanhLayer}

/**
  * Created by mengpan on 2017/8/26.
  */
object LayerUtils {
  def getLayerByActivationType(activationType: Byte): Layer = {
    activationType match {
      case MyDict.ACTIVATION_TANH => new TanhLayer()
      case MyDict.ACTIVATION_RELU => new ReluLayer()
      case MyDict.ACTIVATION_SIGMOID => new SigmoidLayer()
      case _ => throw new Exception("Unsupported type of activation function")
    }
  }
}

 

NormalizeUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}
import org.mengpan.deeplearning.helper.DlCollection
import org.mengpan.deeplearning.data.Data

/**
  * Created by mengpan on 2017/8/26.
  */
object NormalizeUtils {
  def normalizeBy[E <: Data](data: DlCollection[E])(normalizeFunc: DenseVector[Double]
                                    => DenseVector[Double]): DlCollection[E] = {
    val feature = data.getFeatureAsMatrix
    val numCols = feature.cols
    val numRows = feature.rows

    val normalizedFeature = DenseMatrix.zeros[Double](numRows, numCols)

    (0 until numCols).foreach{j =>
      val ithCol = feature(::, j)
      normalizedFeature(::, j) := normalizeFunc(ithCol)
    }


    var i = -1
    val res = data.map[E]{eachData =>
      i += 1
      eachData.updateFeature(normalizedFeature(i, ::).t)
    }

    res
  }
}

 

ResultUtils:

package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}

/**
  * Created by mengpan on 2017/8/26.
  */
object ResultUtils {
  case class ForwardRes(val yPrevious: DenseMatrix[Double],
                        val zCurrent: DenseMatrix[Double],
                        val yCurrent: DenseMatrix[Double]) {
    override def toString: String = "yPrevious:{" + yPrevious + "}\n" +
    "zCurrent: {" + zCurrent + "}\n" +
    "yCurrent: {" + yCurrent + "}\n"
  }
  case class BackwardRes(val dYPrevious: DenseMatrix[Double],
                         val dWCurrent: DenseMatrix[Double],
                         val dBCurrent: DenseVector[Double]) {
    override def toString: String = "dYPrevious:{" + dYPrevious + "}\n" +
      "dWCurrent: {" + dWCurrent + "}\n" +
      "dBCurrent: {" + dBCurrent + "}\n"
  }
}

 

That concludes the program implementing a multi-layer neural network from scratch in Scala. The theory behind neural networks certainly matters, but with a good course and the right guidance it is not that hard to understand: mathematically, the main difficulty of back-propagation is vector calculus, which is fairly basic material — as a math student I covered it in my third-year PDEs and Vector Calculus course. Perhaps it is a matter of specialization: I have never studied software engineering systematically, and what I know comes piecemeal from internships, so I feel my biggest weaknesses now are how to organize a good project structure, how to use development frameworks properly to improve productivity, and how to design good interfaces. My GitHub repository is https://github.com/pan5431333/coursera-deeplearning-practice-in-scala-remote — corrections to the code are welcome; let's learn and improve together!
