Advanced Deep Learning (6): Debugging a CNN (Convolutional Neural Network), a Summary of the Errors Along the Way

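A summary of what I learned today.

(Note: my mood right now is completely different from what it was three minutes ago.)

(Yesterday I was puzzling over some errors. Today I regained my confidence and set up the GPU environment again, which went badly. It is fine now, though: while thinking over what I had done today, it occurred to me that I might be calling the code through the wrong entry point. Luckily that guess turned out to be right, haha, and my mood is good again.)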

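The biggest lesson: when code will not run, try reading the code you are learning from first.

At the very least, start from the entry point and how it is called: read it, read it clearly, read it accurately. Do not manufacture an error that appears almost nowhere on the internet and that, worse still, you cannot understand yourself.

Sigh. An exhausting day.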

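Here is why yesterday's attempt failed.

1. Encoding problem

Use TextEncoding.exe to convert the files in E:\Python\CRDA\cuda8\include to Unicode encoding.

2. Some error about macro definitions appeared. I have forgotten the fix; I may be reinstalling this for a classmate tomorrow and will fill it in then.

3. CUDA 9 and CUDA 8 conflicted in VS2015, and VS2015 was using CUDA 9. Uninstall both CUDA 9 and CUDA 8, then reinstall CUDA 8.

4. Build and run the cudaruntime program in VS2015 to generate some supposedly necessary files. I did not really understand why, but many errors went away afterwards. It is best to do a Debug build for both 64-bit and 32-bit.

5. When building the cudaruntime program, files with names like FIB cannot be opened or found. Answers to these can be found on Baidu. Advice: do not worry about messages saying that some nvxxxx.dll file cannot be found or loaded, and do not go to a DOS window and run regsvr32 C:\Windows\SysWOW64\nvcuda.dll and the like. That is a very foolish thing to do.

You will get a message that the module was loaded but the entry point DLLRegisterServer was not found.

You can search Baidu and Google for this as much as you like, and it will drive you to despair. Unless you know DLLs or C very well, you will burn all your time here.

Anything that is not an nvxxx.dll can be ignored as well.

6. Other miscellaneous code problems can be solved with a Baidu search.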

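One of them concerns downsample and pool_2d. The difference comes from the library version rather than from the language: in the newer Theano I had installed under Python 3, downsample has been replaced by pool_2d.

pooled_out = pool_2d(input=conv_out, ws=self.poolsize, ignore_border=True)

Note that the parameter here is ws, not ds; with ds you get this warning: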

 UserWarning: DEPRECATION: the 'ds' parameter is not going to exist anymore as it is going to be replaced by the parameter 'ws'.
  pooled_out = pool_2d(input=conv_out, ds=self.poolsize, ignore_border=True)
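
If the same code has to run against both an older and a newer pooling function, one option is to pick the keyword by inspecting the function's signature. This is only a sketch: pool_size_keyword is a hypothetical helper of my own, and old_style_pool and new_style_pool are stand-ins defined here so that the example runs without Theano installed. They are not the real Theano functions.

```python
import inspect


def pool_size_keyword(pool_fn):
    """Return the keyword pool_fn uses for the pooling window: 'ws' or 'ds'."""
    params = inspect.signature(pool_fn).parameters
    if "ws" in params:
        return "ws"   # newer name, preferred when both are accepted
    if "ds" in params:
        return "ds"   # older, deprecated name
    raise TypeError("pool function accepts neither 'ws' nor 'ds'")


# Hypothetical stand-ins for an old-style and a new-style pooling function.
def old_style_pool(input, ds, ignore_border=False):
    return ("ds", ds)


def new_style_pool(input, ws=None, ignore_border=None, ds=None):
    return ("ws", ws) if ws is not None else ("ds", ds)


for pool_fn in (old_style_pool, new_style_pool):
    # Build the keyword arguments with whichever name this function expects.
    kwargs = {pool_size_keyword(pool_fn): (2, 2), "ignore_border": True}
    print(pool_fn.__name__, pool_fn(input=None, **kwargs))
```

With the real library the call would then look like pool_2d(input=conv_out, ignore_border=True, **{pool_size_keyword(pool_2d): self.poolsize}), assuming pool_2d is whatever pooling function was imported.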


Take care with the conv2d import as well. Sometimes it has to be imported and called like this:

from theano.tensor.nnet import conv2d

conv_out = conv2d(
    input=self.inpt, filters=self.w, filter_shape=self.filter_shape,
    input_shape=self.image_shape)


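7. Another serious error concerned the cuDNN version.

With cudnn-8.0-windows7-x64-v7 I got errors of the form XXX_DV4, such as cudnnGetPoolingNdDescriptor_v4, reported as not found. Replacing cudnn-8.0-windows7-x64-v7 with cudnn-8.0-windows7-x64-v5.1 fixed both the warnings and the errors.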
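
Before swapping packages, it can help to confirm which cuDNN version is actually in the include directory. The sketch below reads the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros from the text of cudnn.h. The function name and the sample header text are made up for illustration, and the location of cudnn.h on any given machine is an assumption you have to check yourself.

```python
import re


def cudnn_version_from_header(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None."""
    parts = []
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(
            r"^\s*#define\s+CUDNN_%s\s+(\d+)" % name, header_text, re.MULTILINE
        )
        if match is None:
            return None  # macro missing: not a header this sketch understands
        parts.append(int(match.group(1)))
    return tuple(parts)


# Illustrative header fragment; a real cudnn.h contains much more.
sample_header = (
    "#define CUDNN_MAJOR      5\n"
    "#define CUDNN_MINOR      1\n"
    "#define CUDNN_PATCHLEVEL 10\n"
)
print(cudnn_version_from_header(sample_header))  # (5, 1, 10)
```

To use it on a real installation, read the file first, for example open(path_to_cudnn_h).read(), where path_to_cudnn_h points at cudnn.h inside the CUDA include directory.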

I cannot recall the other errors right now; there were simply too many.


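Finally, today's biggest and most foolish blunder.

At the very least, start from the entry point and how it is called: read it, read it clearly, read it accurately. Do not manufacture an error that appears almost nowhere on the internet and that, worse still, you cannot understand yourself.

The error it caused: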

Trying to run under a GPU.  If this is not desired, then modify network3.py
to set the GPU flag to False.
(, Elemwise{Cast{int32}}.0)
cost: Elemwise{add,no_inplace}.0
grads: [Elemwise{add,no_inplace}.0, GpuFromHost.0, Elemwise{add,no_inplace}.0, GpuFromHost.0, Elemwise{add,no_inplace}.0, GpuFromHost.0]
updates: [(, Elemwise{sub,no_inplace}.0), (, Elemwise{sub,no_inplace}.0), (w, Elemwise{sub,no_inplace}.0), (b, Elemwise{sub,no_inplace}.0), (w, Elemwise{sub,no_inplace}.0), (b, Elemwise{sub,no_inplace}.0)]
train_mb: 
Training mini-batch number 0
Traceback (most recent call last):
  File "E:\Python\Anaconda3\lib\site-packages\theano\compile\function_module.py", line 884, in __call__
    self.fn() if output_subset is None else\
ValueError: GpuReshape: cannot reshape input of shape (10, 20, 12, 12) to shape (10, 784).

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\Python\NewPythonData\neural-networks-and-deep-learning\src\demo3.py", line 38, in 
    net.SGD(training_data,10,mini_batch_size,0.1,validation_data,test_data)
  File "E:\Python\NewPythonData\neural-networks-and-deep-learning\src\network3.py", line 179, in SGD
    train_mb(minibatch_index)
  File "E:\Python\Anaconda3\lib\site-packages\theano\compile\function_module.py", line 898, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "E:\Python\Anaconda3\lib\site-packages\theano\gof\link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "E:\Python\Anaconda3\lib\site-packages\six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "E:\Python\Anaconda3\lib\site-packages\theano\compile\function_module.py", line 884, in __call__
    self.fn() if output_subset is None else\
ValueError: GpuReshape: cannot reshape input of shape (10, 20, 12, 12) to shape (10, 784).
Apply node that caused the error: GpuReshape{2}(GpuElemwise{add,no_inplace}.0, TensorConstant{[ 10 784]})
Toposort index: 51
Inputs types: [CudaNdarrayType(float32, 4D), TensorType(int32, vector)]
Inputs shapes: [(10, 20, 12, 12), (2,)]
Inputs strides: [(2880, 144, 12, 1), (4,)]
Inputs values: ['not shown', array([ 10, 784])]
Outputs clients: [[GpuElemwise{Composite{(scalar_sigmoid(i0) * i1)},no_inplace}(GpuReshape{2}.0, GpuFromHost.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
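I will not go into the specifics of the error here.

If you hit a similar error, go back to the start and check whether the parameters you passed to the convolutional, pooling (subsampling), and fully connected layers are wrong.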
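
The traceback can be read as a shape mismatch: the layer feeding the reshape produces (10, 20, 12, 12), that is 20 * 12 * 12 = 2880 values per image, while the next layer expected 784. Below is a rough sketch of checking such shapes by hand before training. The sizes used (28x28 single-channel images, twenty 5x5 filters, 2x2 pooling) are my assumptions; they are merely consistent with the shapes in the traceback. The helper names are hypothetical and not part of network3.py.

```python
def conv_pool_output_shape(image_shape, filter_shape, poolsize):
    """Shape after a 'valid' convolution followed by non-overlapping pooling.

    image_shape:  (batch, input feature maps, height, width)
    filter_shape: (filters, input feature maps, filter height, filter width)
    poolsize:     (pool height, pool width)
    """
    batch, in_maps, height, width = image_shape
    out_maps, filter_in_maps, fh, fw = filter_shape
    if in_maps != filter_in_maps:
        raise ValueError("image and filter disagree on input feature maps")
    conv_h, conv_w = height - fh + 1, width - fw + 1  # 'valid' convolution
    ph, pw = poolsize
    return (batch, out_maps, conv_h // ph, conv_w // pw)  # border ignored


def flattened_size(shape):
    """Number of values per example once everything but the batch is flattened."""
    size = 1
    for dim in shape[1:]:
        size *= dim
    return size


# Assumed sizes: mini-batch of 10, 28x28 grayscale input, 20 filters of 5x5.
shape = conv_pool_output_shape((10, 1, 28, 28), (20, 1, 5, 5), (2, 2))
n_in = flattened_size(shape)
print(shape, n_in)  # (10, 20, 12, 12) 2880
```

Under these assumptions the fully connected layer that follows would need n_in = 2880; declaring it with 784 (the size of a raw 28x28 image) would lead to exactly the kind of reshape failure shown in the log.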

After the fix, the result was very exciting:

Trying to run under a GPU.  If this is not desired, then modify network3.py
to set the GPU flag to False.
(, Elemwise{Cast{int32}}.0)
cost: Elemwise{add,no_inplace}.0
grads: [Elemwise{add,no_inplace}.0, GpuFromHost.0, Elemwise{add,no_inplace}.0, GpuFromHost.0, Elemwise{add,no_inplace}.0, GpuFromHost.0]
updates: [(, Elemwise{sub,no_inplace}.0), (, Elemwise{sub,no_inplace}.0), (w, Elemwise{sub,no_inplace}.0), (b, Elemwise{sub,no_inplace}.0), (w, Elemwise{sub,no_inplace}.0), (b, Elemwise{sub,no_inplace}.0)]
train_mb: 
Training mini-batch number 0
Training mini-batch number 1000
Training mini-batch number 2000
Training mini-batch number 3000
Training mini-batch number 4000
Epoch 0: validation accuracy 93.73%
This is the best validation accuracy to date.
The corresponding test accuracy is 93.20%



examples/plot_mlp.py in scikit-neuralnetwork

Tested, and the test passed.

cd into the scikit-neuralnetwork source directory (get the scikit-neuralnetwork source from GitHub: https://github.com/aigamedev/scikit-neuralnetwork)

python examples/plot_mlp.py --params activation

Result:

[Figure 1: output plot of examples/plot_mlp.py]


