2018-01-06

keywords: CNTK, C#, neural network

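CNTK is Microsoft's open-source neural network toolkit. Because it provides a C# API and claims to be the fastest, I decided to install it and give it a try.

1. Installing CNTK

See the official documentation, which is very detailed: Setup CNTK on your machine

There are two main steps:

1) Download the CNTK binary package. It comes in a CPU-only build and a GPU build. My laptop has no supported GPU, so I downloaded the CPU-only build, CNTK for Windows v.2.3 CPU only. Just unzip it after downloading.

2) Run install.bat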

cd cntk\Scripts\install\windows

install.bat

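The script automatically installs the related dependencies, including python35, OpenCV, Qt and a lot of other stuff. Installation takes just two steps, download and install, and everything seems wonderful. But don't celebrate too early: here comes the pitfall...

I don't know whether it was my network or someone overseas up to no good, but install kept terminating abnormally. What to do? One word: persist! Run install again, then a third time, then an eighth time... and finally, OK.

install is fairly smart: it does not start from scratch each time but resumes from where it was last interrupted. Did the whole thing take half a day, a day, or a year? I don't remember; it all feels so long ago.

One more thing: if you use CNTK from Python, you have to run the cntkpy35.bat script in the scripts directory every time, so you may as well add the scripts directory to the PATH environment variable.

Also add the directory containing the CNTK executables to PATH, for convenience later.

2. Using the C# API

The HelloWorld program for the CNTK C# API is something called TrainingCSharp. I had no idea what it does. On my machine it lives in C:\cntk23\Examples\TrainingCSharp, and the goal this time is to get it running.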

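Building

Without further ado: start Visual Studio, load the solution and build. You will get a lot of errors, mostly broken references. No problem, just rebuild the project. If there are still problems, your VS may not have NuGet installed.

Preparing the training file

Press F5 to run and, bam, an error pops up: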

Input file '../../Tests/EndToEndTests/Image/Data\Train_cntk_text.txt' is not open.

There is no training file yet, so let's get one. Where from? Microsoft says from here: CNTK C#/.NET API training examples. And how? It boils down to running this command in the directory Examples\Image\DataSets\CIFAR-10:

python install_cifar10.py

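Another pitfall. After I ran this, nothing seemed to happen. A download ought to show a progress bar; the interface is really unfriendly. One word: wait. I waited almost a day with no sign of life, finally lost patience and pressed Ctrl+Z, and a miracle happened: the download was complete and the script was generating the txt files. Good grief, the download script has a bug... or maybe it's a backdoor.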
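For readers who would rather see progress while the archive downloads, here is a minimal sketch using only the Python standard library. It is not what install_cifar10.py does internally; the URL is assumed to be the python-version archive linked from the CIFAR-10 page, and `format_progress` and `download` are hypothetical helper names.

```python
import sys
import urllib.request

# Assumption: the python-version archive linked from the CIFAR-10 dataset page.
CIFAR10_URL = "http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"


def format_progress(block_num, block_size, total_size):
    """Turn urlretrieve's reporthook arguments into a short status string."""
    done = block_num * block_size
    if total_size <= 0:  # the server did not report a size
        return f"{done} bytes"
    done = min(done, total_size)  # the last block may overshoot
    return f"{done * 100 // total_size}% ({done}/{total_size} bytes)"


def download(url=CIFAR10_URL, target="cifar-10-python.tar.gz"):
    """Download url to target, rewriting one status line as blocks arrive."""
    def hook(block_num, block_size, total_size):
        sys.stdout.write("\r" + format_progress(block_num, block_size, total_size))
        sys.stdout.flush()

    urllib.request.urlretrieve(url, target, reporthook=hook)
    print()
```

Calling `download()` fetches the archive into the current directory. Converting it into Train_cntk_text.txt is still the job of the CNTK script.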

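Under the CNTK directory, create the directory Tests\EndToEndTests\Image\Data and put Train_cntk_text.txt in it. Why under the CNTK directory? Because the executable that the project builds also lives under the CNTK directory, in the x64 directory.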
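To see where the relative path from the error message ends up, the following sketch resolves it with Windows path rules. The working directory C:\cntk23\x64\Debug is an assumption for illustration (any directory two levels below the CNTK root behaves the same way), and `resolve` is a hypothetical helper.

```python
import ntpath  # Windows path rules, usable on any OS

# Assumption: the example runs from a directory two levels below the CNTK
# root, for instance C:\cntk23\x64\Debug (hypothetical).
WORKING_DIR = r"C:\cntk23\x64\Debug"
DATA_DIR = "../../Tests/EndToEndTests/Image/Data"  # taken from the error message


def resolve(working_dir, relative_dir, file_name):
    """Return the absolute Windows path that the relative path points to."""
    return ntpath.normpath(ntpath.join(working_dir, relative_dir, file_name))


print(resolve(WORKING_DIR, DATA_DIR, "Train_cntk_text.txt"))
# C:\cntk23\Tests\EndToEndTests\Image\Data\Train_cntk_text.txt
```

Under that assumption the two `..` segments climb back to the CNTK root, which is why the data directory has to be created there.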

Fixing the array dimensions

F5 again and, bam, another exception:

System.ApplicationException:“Reached the maximum number of allowed errors while reading the input file (../../Tests/EndToEndTests/Image/Data\Train_cntk_text.txt).

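I thought about it for a long time without figuring it out. Since the error involves Train_cntk_text.txt, let's investigate that file. It is generated from CIFAR-10, whose download page is The CIFAR-10 dataset. Note this sentence: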

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

In other words, the images are 32*32 color images. Now look at the program: MNISTClassifier.TrainAndEvaluate contains these two lines:

int[] imageDim = useConvolution ? new int[] { 28, 28, 1 } : new int[] { 784 };

int imageSize = 28 * 28;

Change them to (32*32, 3 channels):

int[] imageDim = useConvolution ? new int[] { 32, 32, 3 } : new int[] { 32 * 32 * 3 };

int imageSize = 32 * 32 * 3;
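A quick way to confirm the mismatch is to count the values on one line of the data file. The sketch below assumes each line has the layout `|labels <10 values> |features <pixel values>`; that layout and the `field_sizes` helper are assumptions for illustration, and the sample line is synthetic rather than taken from the real file.

```python
def field_sizes(ctf_line):
    """Count the values in each |name field of one CTF text line."""
    sizes = {}
    for chunk in ctf_line.strip().split("|")[1:]:
        parts = chunk.split()
        if not parts:  # skip empty fields such as "||"
            continue
        sizes[parts[0]] = len(parts) - 1
    return sizes


# Synthetic line in the assumed layout: 10 label values, 32*32*3 pixel values.
sample = ("|labels " + " ".join(["0"] * 9 + ["1"])
          + " |features " + " ".join(["128"] * 3072))

sizes = field_sizes(sample)
print(sizes)                                            # {'labels': 10, 'features': 3072}
print("fits 28*28:", sizes["features"] == 28 * 28)          # fits 28*28: False
print("fits 32*32*3:", sizes["features"] == 32 * 32 * 3)    # fits 32*32*3: True
```

A reader configured for 784 features cannot parse lines carrying 3072 values, which would explain the "maximum number of allowed errors" exception.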

F5 again and, bam, yet another exception:

System.ApplicationException:“GEMM convolution engine does not support this convolution configuration. It is possible to make GEMM engine work with this configuration by defining input/output/kernel using tensors of higher(+1) dimension. Geometry: Input: 16 x 16 x 12, Output: 16 x 16 x 72, Kernel: 3 x 3 x 4, Map: 1 x 1 x 24, Stride: 1 x 1 x 4, Sharing: (1, 1, 1), AutoPad: (1, 1, 1), LowerPad: 0 x 0 x 0, UpperPad: 0 x 0 x 0[CALL STACK] >

The earlier change felt incomplete, and sure enough, more has to change further down, in MNISTClassifier.CreateConvolutionalNeuralNetwork:

// 28x28x1 -> 14x14x4

int kernelWidth1 = 3, kernelHeight1 = 3, numInputChannels1 = 1, outFeatureMapCount1 = 4;

Change it to (because our images have three color channels):

// 32x32x3 -> 16x16x12

int kernelWidth1 = 3, kernelHeight1 = 3, numInputChannels1 = 3, outFeatureMapCount1 = 4 * 3;

This line needs to change as well:

//// 14x14x4 -> 7x7x8

//int kernelWidth2 = 3, kernelHeight2 = 3, numInputChannels2 = outFeatureMapCount1, outFeatureMapCount2 = 8;

// 16x16x12 -> 8x8x24

int kernelWidth2 = 3, kernelHeight2 = 3, numInputChannels2 = outFeatureMapCount1, outFeatureMapCount2 = 8 * 3;
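The shape bookkeeping for the two layers can be sketched as follows. It assumes each layer is a padded, stride-1 convolution followed by a padded pooling step with stride 2, which matches the 16 x 16 x 12 input reported in the exception above. `conv_then_pool` is a hypothetical helper, not part of the CNTK API.

```python
import math


def conv_then_pool(shape, out_maps, pool_stride=2):
    """Shape after a padded stride-1 convolution and a padded, strided pooling.

    Assumes padding keeps width and height through the convolution and that
    pooling rounds up, so only the pooling stride shrinks the image.
    """
    width, height, _channels = shape
    return (math.ceil(width / pool_stride),
            math.ceil(height / pool_stride),
            out_maps)


shape = (32, 32, 3)                # CIFAR-10 image: width, height, channels
shape = conv_then_pool(shape, 12)  # first layer, outFeatureMapCount1 = 4 * 3
print(shape)                       # (16, 16, 12)
shape = conv_then_pool(shape, 24)  # second layer, outFeatureMapCount2 = 8 * 3
print(shape)                       # (8, 8, 24)
```

With the original 28x28x1 input the same function gives 14x14x4 and then 7x7x8, which agrees with the comments in the unmodified example.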

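Giving up on LSTM

F5 again. Where's the bam? It hadn't shown up for a while, and my programmer's instinct told me that was impossible. Sure enough, here it comes: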

Input file '../../Tests/EndToEndTests/Text/SequenceClassification/Data\Train.ctf' is not open.,

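What is a ctf file? I looked it up, and it turns out to be the same kind of file as Train_cntk_text.txt. Fine: create the directory Tests\EndToEndTests\Text\SequenceClassification\Data, add a Train.ctf file, and run again.

Bam, sure enough: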

WARNING: Empty input row at offset 10759 in the input file (../../Tests/EndToEndTests/Text/SequenceClassification/Data\Train.ctf).

WARNING: Could not read a row (# 1) while loading sequence (id = 0) at offset 10759 in the input file (../../Tests/EndToEndTests/Text/SequenceClassification/Data\Train.ctf).

Message=Reached the maximum number of allowed errors while reading the input file (../../Tests/EndToEndTests/Text/SequenceClassification/Data\Train.ctf).
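Since the warnings report a byte offset, one way to see what the reader is choking on is to dump the bytes around that offset. This is a minimal debugging sketch; `peek` is a hypothetical helper and not part of CNTK.

```python
def peek(path, offset, context=40):
    """Return the raw bytes around a byte offset in a file."""
    start = max(0, offset - context)
    with open(path, "rb") as handle:
        handle.seek(start)
        # Read `context` bytes on each side of the offset (fewer near the ends).
        return handle.read(offset - start + context)
```

Calling `print(repr(peek(path_to_train_ctf, 10759)))`, where `path_to_train_ctf` is wherever Train.ctf was placed, shows the raw line endings and any blank row near the position named in the warning.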

It still looked like a problem with the dimensions of the training data, so I made the following changes in LSTMSequenceClassifier:

//const int inputDim = 2000;

//const int cellDim = 25;

//const int hiddenDim = 25;

//const int embeddingDim = 50;

//const int numOutputClasses = 5;

const int inputDim = 32 * 32;

const int cellDim = 3;

const int hiddenDim = 25;

const int embeddingDim = 50;

const int numOutputClasses = 10;

Still no good. In the end I played my trump card and simply commented out this line in Program.cs:

LSTMSequenceClassifier.Train(device);

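That did it. Nothing left to annoy me, haha.

Recap

I installed CNTK, installed the CIFAR-10 sample data, and got logistic regression (LogisticRegression), the multilayer perceptron (MLP) and the convolutional neural network (CNN) running from C#. I did not get the LSTM (long short-term memory) example running. And although the others run, I still don't know what is actually going on inside them. They need to be analyzed and tried out one by one, which I'll leave for the next post.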

CNTK 2.3 installation and the C# API HelloWorld (CNTK C# getting started, part 1)
