In the previous chapter we covered how to implement LogisticRegression in Scala. In this chapter, following Andrew Ng's course, we implement a basic deep neural network in Scala. As an aside, Andrew Ng's explanation of deep neural networks is the clearest I have heard so far; it really is true that the better someone knows a subject, the more clearly they can explain it.
This article has four parts. Following a top-down approach to software development, the first part shows a demo that uses the finished neural network to classify the Gas Censor data. This part focuses on the top-level interface design: how to make it easy to use, concise, logically clear, and beginner-friendly. This is one of the harder areas of software engineering, so my interface design is not necessarily optimal; feel free to share your own ideas in the comments so we can discuss them together.
The second part gives the concrete implementation of the NeuralNetworkModel class. From an interface point of view, NeuralNetworkModel implements a trait named Model (similar to an interface in Java). All models share a common protocol: in this project every model has the methods setLearningRate, setIterationTime, train, predict, accuracy and getCostHistory (a hedged sketch of what this trait could look like is given right after this overview). In addition, NeuralNetworkModel has its own setHiddenLayerStructure and setOutputLayerStructure methods.
The third part covers the implementation of the various neural network layers, namely the classes ReluLayer, SigmoidLayer and TanhLayer. All layers implement a trait named Layer, which provides three methods: setNumHiddenUnits, forward and backward.
The last part covers the various Utils classes; their code and comments are attached at the end of the article for anyone interested. My GitHub repository is https://github.com/pan5431333/coursera-deeplearning-practice-in-scala-remote. Feel free to clone the code, point out problems, and let's improve together!
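The Model trait itself is not listed in this article. As a point of reference only, here is a minimal sketch of what such a trait could look like, assuming the Breeze DenseMatrix/DenseVector types and the mutable.TreeMap cost history used throughout the project; the actual trait in the repository may differ in details:

import breeze.linalg.{DenseMatrix, DenseVector}
import scala.collection.mutable

trait Model {
  var learningRate: Double
  var iterationTime: Int
  val costHistory: mutable.TreeMap[Int, Double]

  def setLearningRate(learningRate: Double): this.type = {
    this.learningRate = learningRate
    this
  }

  def setIterationTime(iterationTime: Int): this.type = {
    this.iterationTime = iterationTime
    this
  }

  //To be implemented by each concrete model
  def train(feature: DenseMatrix[Double], label: DenseVector[Double]): this.type
  def predict(feature: DenseMatrix[Double]): DenseVector[Double]

  //Fraction of predictions that match the labels
  def accuracy(label: DenseVector[Double], labelPredicted: DenseVector[Double]): Double = {
    val numCorrect = (0 until label.length).count(i => label(i) == labelPredicted(i))
    numCorrect.toDouble / label.length.toDouble
  }

  def getCostHistory: mutable.TreeMap[Int, Double] = costHistory
}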
Part 1: Demo
Let's first look at a demo of using the neural network model.
package org.mengpan.deeplearning.demo

import breeze.stats.{mean, stddev}
import org.mengpan.deeplearning.data.{Cat, GasCensor}
import org.mengpan.deeplearning.helper.{CatDataHelper, DlCollection, GasCensorDataHelper}
import org.mengpan.deeplearning.model.{Model, NeuralNetworkModel, ShallowNeuralNetworkModel}
import org.mengpan.deeplearning.utils.{MyDict, NormalizeUtils, PlotUtils}

/**
  * Created by mengpan on 2017/8/15.
  */
object ClassThreeNeuralNetworkDemo extends App{
  // Dataset download site: http://archive.ics.uci.edu/ml/machine-learning-databases/00224/
  //Load the Gas Censor dataset
  val data: DlCollection[GasCensor] = GasCensorDataHelper.getAllData

  //Normalize the feature matrix
  val normalizedCatData = NormalizeUtils.normalizeBy(data){col =>
    (col - mean(col)) / stddev(col)
  }

  //Split into a training set and a test set
  val (training, test) = normalizedCatData.split(0.8)

  //Extract the features and labels of the training and test sets
  val trainingFeature = training.getFeatureAsMatrix
  val trainingLabel = training.getLabelAsVector
  val testFeature = test.getFeatureAsMatrix
  val testLabel = test.getLabelAsVector

  //Initialize the model
  val nnModel: Model = new NeuralNetworkModel()
    .setHiddenLayerStructure(Map(
      (200, MyDict.ACTIVATION_RELU),
      (100, MyDict.ACTIVATION_RELU)
    ))
    .setOutputLayerStructure((1, MyDict.ACTIVATION_SIGMOID))
    .setLearningRate(0.01)
    .setIterationTime(5000)

  //Train the model on the training set
  val trainedModel: Model = nnModel.train(trainingFeature, trainingLabel)

  //Evaluate the trained model
  val yPredicted = trainedModel.predict(testFeature)
  val trainYPredicted = trainedModel.predict(trainingFeature)
  val testAccuracy = trainedModel.accuracy(testLabel, yPredicted)
  val trainAccuracy = trainedModel.accuracy(trainingLabel, trainYPredicted)
  println("\n The train accuracy of this model is: " + trainAccuracy)
  println("\n The test accuracy of this model is: " + testAccuracy)

  //Plot the cost against the number of iterations during training
  val costHistory = trainedModel.getCostHistory
  PlotUtils.plotCostHistory(costHistory)
}
For the neural network's model interface we use the chained (fluent) call style that is typical in Scala:
//Initialize the model
val nnModel: Model = new NeuralNetworkModel()
  .setHiddenLayerStructure(Map(
    (200, MyDict.ACTIVATION_RELU),
    (100, MyDict.ACTIVATION_RELU)
  ))
  .setOutputLayerStructure((1, MyDict.ACTIVATION_SIGMOID))
  .setLearningRate(0.01)
  .setIterationTime(5000)
The setHiddenLayerStructure method configures the structure of the hidden layers. It takes a number of two-element tuples as arguments: the first element of each tuple is the number of units in that hidden layer, and the second is the activation function type of that layer. The network built in the code above therefore has two hidden layers: the first has 200 units with ReLU activation, and the second has 100 units, also with ReLU activation. The setOutputLayerStructure method takes a single tuple as argument, with the same meaning, describing the output layer.
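As a further illustration of the same interface (the layer sizes and hyperparameter values here are made up purely for the example), a deeper network mixing Tanh and ReLU hidden layers could be configured like this, with the rest of the demo unchanged:

//Illustrative only: three hidden layers, mixed activations
val deeperModel: Model = new NeuralNetworkModel()
  .setHiddenLayerStructure(Map(
    (300, MyDict.ACTIVATION_TANH),
    (200, MyDict.ACTIVATION_RELU),
    (100, MyDict.ACTIVATION_RELU)
  ))
  .setOutputLayerStructure((1, MyDict.ACTIVATION_SIGMOID))
  .setLearningRate(0.005)
  .setIterationTime(5000)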
Part 2: Implementation of the NeuralNetworkModel class
Let me first walk through the important methods of this class; the complete code is attached at the end of this part. We start with the train() method:
override def train(feature: DenseMatrix[Double], label: DenseVector[Double]):
  NeuralNetworkModel.this.type = {
  val numExamples = feature.rows
  val inputDim = feature.cols

  logger.debug("hidden layers: " + hiddenLayers)
  logger.debug("output layer: " + outputLayer)

  //Randomly initialize the model parameters
  var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] =
    initializeParams(numExamples, inputDim, hiddenLayers, outputLayer)

  (0 until this.iterationTime).foreach{i =>
    val forwardResList: List[ForwardRes] = forward(feature, paramsList, hiddenLayers, outputLayer)
    logger.debug(forwardResList)

    val cost = calCost(forwardResList.last, label)
    if (i % 100 == 0) {
      logger.info("Cost in " + i + "th time of iteration: " + cost)
    }
    costHistory.put(i, cost)

    val backwardResList: List[BackwardRes] = backward(feature, label, forwardResList,
      paramsList, hiddenLayers, outputLayer)
    logger.debug(backwardResList)

    paramsList = updateParams(paramsList, this.learningRate, backwardResList, i, cost)
  }

  this.paramsList = paramsList
  this
}
As you can see, the train() method of the neural network model relies on five main private helper functions. It first calls initializeParams(numExamples, inputDim, hiddenLayers, outputLayer) to initialize the parameters, and then, in each iteration:

- forward() computes the result of forward propagation;
- calCost() computes the value of the cost function for this iteration;
- backward() computes the result of backpropagation;
- updateParams() updates the parameters.
Next, let's look at the random parameter initialization method initializeParams(); the syntax points worth noting are explained in the comments.
private def initializeParams(numExamples: Int, inputDim: Int, hiddenLayers: Seq[Layer],
                             outputLayer: Layer): List[(DenseMatrix[Double], DenseVector[Double])] = {
  /*
  *Collect the numbers of units of the input layer, hidden layers and output layer into one Vector.
  *E.g. if inputDim=3, outputDim=1 and hiddenDim=(3, 3, 2), then layersDim=(3, 3, 3, 2, 1).
  *Two List operators: A.::(b) prepends element b to A, A.:+(b) appends element b to A.
  *A Vector is used for layersDim because Vector is an indexed sequence, so access to
  *any position takes the same time.
  */
  val layersDim = hiddenLayers.map(_.numHiddenUnits)
    .toList
    .::(inputDim)
    .:+(outputLayer.numHiddenUnits)
    .toVector

  val numLayers = layersDim.length

  /*
  *W(l) has shape (layersDim(l-1), layersDim(l))
  *b(l) has shape (layersDim(l), )
  *Note that the random initial values lie between 0 and 1; to keep the model stable,
  *w and b are multiplied by 0.01.
  */
  (1 until numLayers).map{i =>
    val w = DenseMatrix.rand[Double](layersDim(i-1), layersDim(i)) * 0.01
    val b = DenseVector.rand[Double](layersDim(i)) * 0.01
    (w, b)
  }.toList
}
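Since the :: and :+ operators mentioned in the comments are easy to confuse, here is a tiny illustration:

val a = List(2, 3)
a.::(1)      // List(1, 2, 3): prepend, more commonly written as 1 :: a
a.:+(4)      // List(2, 3, 4): append
//Both return a new List; a itself is immutable and unchanged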
Now let's look at the implementation of the forward propagation function forward(); again the relevant points are in the comments. The implementation of layer.forward() is explained in the next part, where the Layer classes are covered:
private def forward(feature: DenseMatrix[Double],
                    params: List[(DenseMatrix[Double], DenseVector[Double])],
                    hiddenLayers: Seq[Layer], outputLayer: Layer): List[ForwardRes] = {
  var yi = feature

  /*
  *Note the behaviour of zip in Scala. If A=List(1, 2, 3) and B=List(3, 4), then
  * A.zip(B) is List((1, 3), (2, 4)).
  * Recall: A.:+(b) appends element b to A; because A is immutable, a new object is created.
  */
  params.zip(hiddenLayers.:+(outputLayer))
    .map{f =>
      val w = f._1._1
      val b = f._1._2
      val layer = f._2

      //The forward method takes three arguments: yPrevious, w and b
      val forwardRes = layer.forward(yi, w, b)
      yi = forwardRes.yCurrent

      forwardRes
    }
}
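The zip behaviour used above in one line; note that the result is truncated to the length of the shorter list, and that zipping params with hiddenLayers :+ outputLayer pairs each layer with its own (w, b):

List(1, 2, 3).zip(List(3, 4))   // List((1,3), (2,4))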
Next, the implementation of the cost function calCost():
private def calCost(res: ResultUtils.ForwardRes, label: DenseVector[Double]): Double = {
  val yHat = res.yCurrent(::, 0)

  //Add pow(10.0, -9) inside the log to avoid log(0) and hence NaN
  -(label.t * log(yHat + pow(10.0, -9)) +
    (1.0 - label).t * log(1.0 - yHat + pow(10.0, -9))) / label.length.toDouble
}
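Written out, this is the standard binary cross-entropy cost over the m training examples, with a small epsilon added for numerical stability:

J = -\frac{1}{m}\sum_{i=1}^{m}\Big[\,y_i\log(\hat{y}_i+\varepsilon) + (1-y_i)\log(1-\hat{y}_i+\varepsilon)\Big], \qquad \varepsilon = 10^{-9}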
The remaining backward() and updateParams() methods, together with the complete code of the NeuralNetworkModel class, are as follows:
package org.mengpan.deeplearning.model

import java.util
import breeze.linalg.{DenseMatrix, DenseVector}
import breeze.numerics.{log, pow}
import org.apache.log4j.Logger
import org.mengpan.deeplearning.layers.Layer
import org.mengpan.deeplearning.utils.{DebugUtils, LayerUtils, ResultUtils}
import org.mengpan.deeplearning.utils.ResultUtils.{BackwardRes, ForwardRes}
import scala.collection.mutable

/**
  * Created by mengpan on 2017/8/26.
  */
class NeuralNetworkModel extends Model{
  //Logging
  val logger = Logger.getLogger("NeuralNetworkModel")

  //The four hyperparameters of the neural network
  override var learningRate: Double = _
  override var iterationTime: Int = _
  var hiddenLayerStructure: Map[Int, Byte] = _
  var outputLayerStructure: (Int, Byte) = _

  //History of the cost value in each iteration
  override val costHistory: mutable.TreeMap[Int, Double] = new mutable.TreeMap[Int, Double]()

  //The parameters of the neural network model
  var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] = _

  //The hidden layers and output layer, derived from the hiddenLayerStructure
  //and outputLayerStructure hyperparameters
  private var hiddenLayers: Seq[Layer] = _
  private var outputLayer: Layer = _

  def setHiddenLayerStructure(hiddenLayerStructure: Map[Int, Byte]): this.type = {
    if (hiddenLayerStructure.isEmpty) {
      throw new Exception("hidden layer should be at least one layer!")
    }

    this.hiddenLayerStructure = hiddenLayerStructure
    this.hiddenLayers = getHiddenLayers(this.hiddenLayerStructure)
    this
  }

  def setOutputLayerStructure(outputLayerStructure: (Int, Byte)): this.type = {
    this.outputLayerStructure = outputLayerStructure
    this.outputLayer = getOutputLayer(this.outputLayerStructure)
    this
  }

  override def train(feature: DenseMatrix[Double], label: DenseVector[Double]):
    NeuralNetworkModel.this.type = {
    val numExamples = feature.rows
    val inputDim = feature.cols

    logger.debug("hidden layers: " + hiddenLayers)
    logger.debug("output layer: " + outputLayer)

    //Randomly initialize the model parameters
    var paramsList: List[(DenseMatrix[Double], DenseVector[Double])] =
      initializeParams(numExamples, inputDim, hiddenLayers, outputLayer)

    (0 until this.iterationTime).foreach{i =>
      val forwardResList: List[ForwardRes] = forward(feature, paramsList, hiddenLayers, outputLayer)
      logger.debug(forwardResList)

      val cost = calCost(forwardResList.last, label)
      if (i % 100 == 0) {
        logger.info("Cost in " + i + "th time of iteration: " + cost)
      }
      costHistory.put(i, cost)

      val backwardResList: List[BackwardRes] = backward(feature, label, forwardResList,
        paramsList, hiddenLayers, outputLayer)
      logger.debug(backwardResList)

      paramsList = updateParams(paramsList, this.learningRate, backwardResList, i, cost)
    }

    this.paramsList = paramsList
    this
  }

  override def predict(feature: DenseMatrix[Double]): DenseVector[Double] = {
    val forwardResList: List[ForwardRes] = forward(feature, this.paramsList,
      this.hiddenLayers, this.outputLayer)

    forwardResList.last.yCurrent(::, 0).map{yHat =>
      if (yHat > 0.5) 1.0 else 0.0
    }
  }

  private def getHiddenLayers(hiddenLayerStructure: Map[Int, Byte]): Seq[Layer] = {
    hiddenLayerStructure.map{structure =>
      getLayerByStructure(structure)
    }.toList
  }

  private def getOutputLayer(structure: (Int, Byte)): Layer = {
    getLayerByStructure(structure)
  }

  private def getLayerByStructure(structure: (Int, Byte)): Layer = {
    val numHiddenUnits = structure._1
    val activationType = structure._2

    val layer: Layer = LayerUtils.getLayerByActivationType(activationType)
      .setNumHiddenUnits(numHiddenUnits)
    layer
  }

  private def initializeParams(numExamples: Int, inputDim: Int, hiddenLayers: Seq[Layer],
                               outputLayer: Layer): List[(DenseMatrix[Double], DenseVector[Double])] = {
    /*
    *Collect the numbers of units of the input layer, hidden layers and output layer into one Vector.
    *E.g. if inputDim=3, outputDim=1 and hiddenDim=(3, 3, 2), then layersDim=(3, 3, 3, 2, 1).
    *Two List operators: A.::(b) prepends element b to A, A.:+(b) appends element b to A.
    *A Vector is used for layersDim because Vector is an indexed sequence, so access to
    *any position takes the same time.
    */
    val layersDim = hiddenLayers.map(_.numHiddenUnits)
      .toList
      .::(inputDim)
      .:+(outputLayer.numHiddenUnits)
      .toVector

    val numLayers = layersDim.length

    /*
    *W(l) has shape (layersDim(l-1), layersDim(l))
    *b(l) has shape (layersDim(l), )
    *Note that the random initial values lie between 0 and 1; to keep the model stable,
    *w and b are multiplied by 0.01.
    */
    (1 until numLayers).map{i =>
      val w = DenseMatrix.rand[Double](layersDim(i-1), layersDim(i)) * 0.01
      val b = DenseVector.rand[Double](layersDim(i)) * 0.01
      (w, b)
    }.toList
  }

  private def forward(feature: DenseMatrix[Double],
                      params: List[(DenseMatrix[Double], DenseVector[Double])],
                      hiddenLayers: Seq[Layer], outputLayer: Layer): List[ForwardRes] = {
    var yi = feature

    /*
    *Note the behaviour of zip in Scala. If A=List(1, 2, 3) and B=List(3, 4), then
    * A.zip(B) is List((1, 3), (2, 4)).
    * Recall: A.:+(b) appends element b to A; because A is immutable, a new object is created.
    */
    params.zip(hiddenLayers.:+(outputLayer))
      .map{f =>
        val w = f._1._1
        val b = f._1._2
        val layer = f._2

        //The forward method takes three arguments: yPrevious, w and b
        val forwardRes = layer.forward(yi, w, b)
        yi = forwardRes.yCurrent

        forwardRes
      }
  }

  private def calCost(res: ResultUtils.ForwardRes, label: DenseVector[Double]): Double = {
    val yHat = res.yCurrent(::, 0)

    //Add pow(10.0, -9) inside the log to avoid log(0) and hence NaN
    -(label.t * log(yHat + pow(10.0, -9)) +
      (1.0 - label).t * log(1.0 - yHat + pow(10.0, -9))) / label.length.toDouble
  }

  private def backward(feature: DenseMatrix[Double], label: DenseVector[Double],
                       forwardResList: List[ResultUtils.ForwardRes],
                       paramsList: List[(DenseMatrix[Double], DenseVector[Double])],
                       hiddenLayers: Seq[Layer], outputLayer: Layer): List[BackwardRes] = {
    val yHat = forwardResList.last.yCurrent(::, 0)

    //Add pow(10.0, -9) to the denominators to avoid division by zero and hence NaN
    val dYL = -(label /:/ (yHat + pow(10.0, -9)) - (1.0 - label) /:/ (1.0 - yHat + pow(10.0, -9)))
    var dYCurrent = DenseMatrix.zeros[Double](feature.rows, 1)
    dYCurrent(::, 0) := dYL

    paramsList
      .zip(forwardResList)
      .zip(hiddenLayers.:+(outputLayer))
      .reverse
      .map{f =>
        val w = f._1._1._1
        val b = f._1._1._2
        val forwardRes = f._1._2
        val layer = f._2

        logger.debug(DebugUtils.matrixShape(w, "w"))
        logger.debug(layer)

        /*
        *The backward method takes four arguments: dYCurrent, forwardRes, w and b.
        * From forwardRes, yPrevious (to compute dW) and zCurrent (to compute dZCurrent) are used.
        */
        val backwardRes = layer.backward(dYCurrent, forwardRes, w, b)
        dYCurrent = backwardRes.dYPrevious
        backwardRes
      }
      .reverse
  }

  private def updateParams(paramsList: List[(DenseMatrix[Double], DenseVector[Double])],
                           learningrate: Double,
                           backwardResList: List[ResultUtils.BackwardRes],
                           iterationTime: Int,
                           cost: Double): List[(DenseMatrix[Double], DenseVector[Double])] = {
    paramsList.zip(backwardResList)
      .map{f =>
        val w = f._1._1
        val b = f._1._2
        val backwardRes = f._2
        val dw = backwardRes.dWCurrent
        val db = backwardRes.dBCurrent

        logger.debug(DebugUtils.matrixShape(w, "w"))
        logger.debug(DebugUtils.matrixShape(dw, "dw"))

        var adjustedLearningRate = this.learningRate
        //If the cost becomes NaN, reduce the learning rate by a factor of 100
        adjustedLearningRate = if (cost.isNaN) adjustedLearningRate/100 else adjustedLearningRate

        //Gradient descent step using the adjusted learning rate
        w :-= dw * adjustedLearningRate
        b :-= db * adjustedLearningRate
        (w, b)
      }
  }
}
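Two pieces of backward() and updateParams() are worth writing out explicitly. The dYL vector above is the derivative of the per-example loss with respect to the output activation (the 1/m factor is applied later, inside each layer's backward(), when dW and db are formed), and updateParams() then performs a plain gradient-descent step with learning rate alpha (reduced 100-fold if the cost has become NaN):

dYL_i = -\left(\frac{y_i}{\hat{y}_i+\varepsilon} - \frac{1-y_i}{1-\hat{y}_i+\varepsilon}\right), \qquad
W^{[l]} := W^{[l]} - \alpha\, dW^{[l]}, \quad b^{[l]} := b^{[l]} - \alpha\, db^{[l]}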
Part 3: Implementation of the layers
Next, let's look at the implementation of the layers. Since all concrete layer classes (ReluLayer, SigmoidLayer and TanhLayer) extend a trait named Layer, we first look at the Layer trait itself. It provides three methods: setNumHiddenUnits, which sets the number of units in this layer; forward, which performs this layer's forward-propagation computation; and backward, which performs this layer's backpropagation computation. The code is as follows:
package org.mengpan.deeplearning.layers

import breeze.linalg.{DenseMatrix, DenseVector}
import org.apache.log4j.Logger
import org.mengpan.deeplearning.utils.ResultUtils.{BackwardRes, ForwardRes}
import org.mengpan.deeplearning.utils.{ActivationUtils, DebugUtils, GradientUtils}

/**
  * Created by mengpan on 2017/8/26.
  */
trait Layer{
  private val logger = Logger.getLogger("Layer")

  var numHiddenUnits: Int
  var activationFunc: Byte

  def setNumHiddenUnits(numHiddenUnits: Int): this.type = {
    this.numHiddenUnits = numHiddenUnits
    this
  }

  def forward(yPrevious: DenseMatrix[Double], w: DenseMatrix[Double],
              b: DenseVector[Double]): ForwardRes = {
    val numExamples = yPrevious.rows

    logger.debug(DebugUtils.matrixShape(yPrevious, "yPrevious"))
    logger.debug(DebugUtils.matrixShape(w, "w"))
    logger.debug(DebugUtils.vectorShape(b, "b"))

    val zCurrent = yPrevious * w + DenseVector.ones[Double](numExamples) * b.t
    val yCurrent = ActivationUtils.getActivationFunc(this.activationFunc)(zCurrent)

    logger.debug("yCurrent: " + yCurrent)

    ForwardRes(yPrevious, zCurrent, yCurrent)
  }

  def backward(dYCurrent: DenseMatrix[Double], forwardRes: ForwardRes,
               w: DenseMatrix[Double], b: DenseVector[Double]): BackwardRes = {
    val numExamples = dYCurrent.rows

    val yPrevious = forwardRes.yPrevious
    val zCurrent = forwardRes.zCurrent
    val yCurrent = forwardRes.yCurrent

    val dZCurrent = dYCurrent *:* GradientUtils.getGradByFuncType(this.activationFunc)(zCurrent)
    val dWCurrent = yPrevious.t * dZCurrent / numExamples.toDouble
    val dBCurrent = (DenseVector.ones[Double](numExamples).t * dZCurrent).t / numExamples.toDouble
    val dYPrevious = dZCurrent * w.t

    BackwardRes(dYPrevious, dWCurrent, dBCurrent)
  }

  override def toString: String = super.toString
}
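In matrix form, with the m examples stacked as rows (which is the convention of this code, so the weight matrix multiplies from the right), forward and backward implement:

Z^{[l]} = Y^{[l-1]} W^{[l]} + \mathbf{1}\,(b^{[l]})^{T}, \qquad Y^{[l]} = g^{[l]}\big(Z^{[l]}\big)

dZ^{[l]} = dY^{[l]} \odot g'^{[l]}\big(Z^{[l]}\big), \quad dW^{[l]} = \tfrac{1}{m}\,(Y^{[l-1]})^{T} dZ^{[l]}, \quad db^{[l]} = \tfrac{1}{m}\,\mathbf{1}^{T} dZ^{[l]}, \quad dY^{[l-1]} = dZ^{[l]}\,(W^{[l]})^{T}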
As for the concrete layers, the only difference between them is the activationFunc field. For ReluLayer:
class ReluLayer extends Layer{
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_RELU
}
For SigmoidLayer:
class SigmoidLayer extends Layer{
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_SIGMOID
}
For TanhLayer:
class TanhLayer extends Layer{
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = MyDict.ACTIVATION_TANH
}
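This design makes adding a new activation fairly mechanical. As a purely hypothetical sketch (none of these names exist in the repository), supporting a leaky ReLU would mean one new constant in MyDict, one new layer class, the function and its gradient as new case branches in ActivationUtils and GradientUtils, and a new case in LayerUtils.getLayerByActivationType:

//Hypothetical new constant for MyDict; the value is chosen arbitrarily for illustration
val ACTIVATION_LEAKY_RELU: Byte = 4

//Hypothetical new layer class in the layers package
class LeakyReluLayer extends Layer{
  override var numHiddenUnits: Int = _
  override var activationFunc: Byte = ACTIVATION_LEAKY_RELU
}

//Hypothetical activation and gradient, to be dispatched on in
//ActivationUtils.getActivationFunc and GradientUtils.getGradByFuncType
def leakyRelu(z: DenseMatrix[Double]): DenseMatrix[Double] =
  z.map(d => if (d >= 0) d else 0.01 * d)

def leakyReluGrad(z: DenseMatrix[Double]): DenseMatrix[Double] =
  z.map(d => if (d >= 0) 1.0 else 0.01)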
Part 4: The various Utils classes
The code is as follows:
ActivationUtils:
package org.mengpan.deeplearning.utils

import breeze.linalg.DenseMatrix
import breeze.numerics.{relu, sigmoid, tanh}
import org.apache.log4j.Logger

/**
  * Created by mengpan on 2017/8/26.
  */
object ActivationUtils {
  val logger = Logger.getLogger("ActivationUtils")

  def getActivationFunc(activationFuncType: Byte): DenseMatrix[Double] => DenseMatrix[Double] = {
    activationFuncType match {
      case MyDict.ACTIVATION_SIGMOID => sigmoid(_: DenseMatrix[Double])
      case MyDict.ACTIVATION_TANH => tanh(_: DenseMatrix[Double])
      case MyDict.ACTIVATION_RELU => relu(_: DenseMatrix[Double])
      case _ =>
        logger.fatal("Wrong hidden activation function param given, use tanh by default")
        tanh(_: DenseMatrix[Double])
    }
  }
}
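The three activation functions dispatched here are the usual ones, applied element-wise:

\sigma(z) = \frac{1}{1+e^{-z}}, \qquad \tanh(z) = \frac{e^{z}-e^{-z}}{e^{z}+e^{-z}}, \qquad \mathrm{relu}(z) = \max(0, z)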
DebugUtils:
package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}

/**
  * Created by mengpan on 2017/8/26.
  */
object DebugUtils {
  def matrixShape(w: DenseMatrix[Double], objectName: String): String = {
    objectName + "'s shape: (" + w.rows + ", " + w.cols + ")"
  }

  def vectorShape(b: DenseVector[Double], objectName: String): String = {
    objectName + "'s shape: (" + b.length + ")"
  }
}
GradientUtils:
package org.mengpan.deeplearning.utils

import breeze.linalg.DenseMatrix
import breeze.numerics.{pow, sigmoid, tanh}

/**
  * Created by mengpan on 2017/8/25.
  */
object GradientUtils {
  def reluGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    val numRows = z.rows
    val numCols = z.cols

    val res = DenseMatrix.zeros[Double](numRows, numCols)
    (0 until numRows).foreach{i =>
      (0 until numCols).foreach{j =>
        res(i, j) = if (z(i, j) >= 0) 1.0 else 0.0
      }
    }
    res
  }

  def tanhGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    //The derivative of tanh is 1 - tanh(z)^2; z here is the pre-activation value
    1.0 - pow(tanh(z), 2)
  }

  def sigmoidGrad(z: DenseMatrix[Double]): DenseMatrix[Double] = {
    val res = sigmoid(z) *:* (1.0 - sigmoid(z))
    //Clip the values to avoid numerical problems
    res.map{d =>
      if (d < pow(10.0, -9)) pow(10.0, -9)
      else if (d > pow(10.0, 2)) pow(10.0, 2)
      else d
    }
  }

  def getGradByFuncType(activationFuncType: Byte): DenseMatrix[Double] => DenseMatrix[Double] = {
    activationFuncType match {
      case MyDict.ACTIVATION_TANH => tanhGrad
      case MyDict.ACTIVATION_RELU => reluGrad
      case MyDict.ACTIVATION_SIGMOID => sigmoidGrad
      case _ => throw new Exception("Unsupported type of activation function")
    }
  }
}
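For reference, the element-wise derivatives these three helpers compute are:

\mathrm{relu}'(z) = \mathbf{1}_{\{z \ge 0\}}, \qquad \tanh'(z) = 1 - \tanh^{2}(z), \qquad \sigma'(z) = \sigma(z)\,\big(1-\sigma(z)\big)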
LayerUtils:
package org.mengpan.deeplearning.utils

import org.mengpan.deeplearning.layers.{Layer, ReluLayer, SigmoidLayer, TanhLayer}

/**
  * Created by mengpan on 2017/8/26.
  */
object LayerUtils {
  def getLayerByActivationType(activationType: Byte): Layer = {
    activationType match {
      case MyDict.ACTIVATION_TANH => new TanhLayer()
      case MyDict.ACTIVATION_RELU => new ReluLayer()
      case MyDict.ACTIVATION_SIGMOID => new SigmoidLayer()
      case _ => throw new Exception("Unsupported type of activation function")
    }
  }
}
NormalizeUtils:
package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}
import org.mengpan.deeplearning.helper.DlCollection
import org.mengpan.deeplearning.data.Data

/**
  * Created by mengpan on 2017/8/26.
  */
object NormalizeUtils {
  def normalizeBy[E <: Data](data: DlCollection[E])
                            (normalizeFunc: DenseVector[Double] => DenseVector[Double]): DlCollection[E] = {
    val feature = data.getFeatureAsMatrix
    val numCols = feature.cols
    val numRows = feature.rows

    //Apply the normalization function to each column of the feature matrix
    val normalizedFeature = DenseMatrix.zeros[Double](numRows, numCols)
    (0 until numCols).foreach{j =>
      val ithCol = feature(::, j)
      normalizedFeature(::, j) := normalizeFunc(ithCol)
    }

    //Write each normalized row back into the corresponding data element
    var i = -1
    val res = data.map[E]{eachData =>
      i += 1
      eachData.updateFeature(normalizedFeature(i, ::).t)
    }

    res
  }
}
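Since normalizeBy takes the column-wise function as a parameter, other normalizations drop in without changing the helper. For instance, in the demo above one could apply min-max scaling instead of z-score standardization (a sketch, using Breeze's min and max on each column):

import breeze.linalg.{max, min}

val minMaxData = NormalizeUtils.normalizeBy(data){ col =>
  (col - min(col)) / (max(col) - min(col))
}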
ResultUtils:
package org.mengpan.deeplearning.utils

import breeze.linalg.{DenseMatrix, DenseVector}

/**
  * Created by mengpan on 2017/8/26.
  */
object ResultUtils {
  case class ForwardRes(yPrevious: DenseMatrix[Double],
                        zCurrent: DenseMatrix[Double],
                        yCurrent: DenseMatrix[Double]) {
    override def toString: String =
      "yPrevious:{" + yPrevious + "}\n" +
      "zCurrent: {" + zCurrent + "}\n" +
      "yCurrent: {" + yCurrent + "}\n"
  }

  case class BackwardRes(dYPrevious: DenseMatrix[Double],
                         dWCurrent: DenseMatrix[Double],
                         dBCurrent: DenseVector[Double]) {
    override def toString: String =
      "dYPrevious:{" + dYPrevious + "}\n" +
      "dWCurrent: {" + dWCurrent + "}\n" +
      "dBCurrent: {" + dBCurrent + "}\n"
  }
}
That is the complete program for implementing a multi-layer neural network from scratch in Scala. The theory behind neural networks certainly matters, but with a good course and the right guidance it is not that hard to grasp; mathematically, the hardest part of backpropagation is vector calculus, which is fairly basic material. As a mathematics student I covered it in my third-year PDEs and Vector Calculus course. Perhaps it is a matter of specialization: I have never studied software engineering systematically, and what I know comes from bits and pieces picked up during internships, so my biggest weaknesses right now are how to organize a good project structure, how to use development frameworks properly to improve productivity, and how to design good interfaces. The project is on GitHub at https://github.com/pan5431333/coursera-deeplearning-practice-in-scala-remote; corrections to the code are very welcome, and let's learn and improve together!