Notes on problems using Caffe with GPU on Windows 10

  • Environment:

GTX 1080 Ti

i7

32 GB RAM

Samsung SM951, 256 GB NVMe SSD

VS2013 + Win10 + CUDA 9.0 + cuDNN 5.1

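  • On this machine the NVMe SSD only worked with Windows 10 and I could not install Linux on it, so every problem below was encountered on Windows.

  • x64 builds cannot be debugged; the error "The debug monitor (MSVSMON.EXE) failed to start" appears.

Workaround: after starting VS2013, reopen the caffe solution through File --> Open --> Project. This only fixes the problem temporarily.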


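  • I use the Community edition of VS2013. With the Professional edition, NuGet would not work for reasons I could not determine, and during the build the protobuf step failed with error code 1083 together with a DLL read error. I do not know the cause.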

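  • opencv, glog, and gflags produced the following errors when configuring the GPU build:

      error: NuGet Error: Unknown command: "overlay"

      error: MSB4062 The "NuGetPackageOverlay" task could not be loaded

      No matter how I configured NuGet, it kept failing.

      Solution: configure the libraries by hand. The packages.config file in the project lists the version of each dependency; download the matching libraries, then set them up in the property file CommonSettings.props: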


        
        <OpencvPath>D:\wxf\opencv\build</OpencvPath>
        <GlogPath>F:\caffe\NugetPackages\glog.0.3.3.0\build\native</GlogPath>
        <GflagsPath>F:\caffe\NugetPackages\gflags.2.1.2.1\build\native</GflagsPath>

      OpenCV:

        <LibraryPath>$(OpencvPath)\x64\vc12\lib;$(LibraryPath)</LibraryPath>
        <IncludePath>$(OpencvPath)\include;$(OpencvPath)\include\opencv;$(OpencvPath)\include\opencv2;$(IncludePath)</IncludePath>

      gflags:

        <LibraryPath>$(GflagsPath)\x64\v120\dynamic\Lib;$(LibraryPath)</LibraryPath>
        <IncludePath>$(GflagsPath)\include;$(IncludePath)</IncludePath>

      glog and the library list, Release:

        <LibraryPath>$(GlogPath)\lib\x64\v120\Release\dynamic;$(LibraryPath)</LibraryPath>
        <IncludePath>$(GlogPath)\include;$(IncludePath)</IncludePath>
        <OpencvDependencies>
          opencv_objdetect2410.lib;
          opencv_ts2410.lib;
          opencv_video2410.lib;
          opencv_nonfree2410.lib;
          opencv_ocl2410.lib;
          opencv_photo2410.lib;
          opencv_stitching2410.lib;
          opencv_superres2410.lib;
          opencv_videostab2410.lib;
          opencv_calib3d2410.lib;
          opencv_contrib2410.lib;
          opencv_core2410.lib;
          opencv_features2d2410.lib;
          opencv_flann2410.lib;
          opencv_gpu2410.lib;
          opencv_highgui2410.lib;
          opencv_imgproc2410.lib;
          opencv_legacy2410.lib;
          opencv_ml2410.lib;
          libglog.lib;
          gflags.lib;
          gflags_nothreads.lib;
          $(CudaDependencies)
        </OpencvDependencies>

      glog and the library list, Debug:

        <LibraryPath>$(GlogPath)\lib\x64\v120\Debug\dynamic;$(LibraryPath)</LibraryPath>
        <IncludePath>$(GlogPath)\include;$(IncludePath)</IncludePath>
        <OpencvDependencies>
          opencv_ml2410d.lib;
          opencv_calib3d2410d.lib;
          opencv_contrib2410d.lib;
          opencv_core2410d.lib;
          opencv_features2d2410d.lib;
          opencv_flann2410d.lib;
          opencv_gpu2410d.lib;
          opencv_highgui2410d.lib;
          opencv_imgproc2410d.lib;
          opencv_legacy2410d.lib;
          opencv_objdetect2410d.lib;
          opencv_ts2410d.lib;
          opencv_video2410d.lib;
          opencv_nonfree2410d.lib;
          opencv_ocl2410d.lib;
          opencv_photo2410d.lib;
          opencv_stitching2410d.lib;
          opencv_superres2410d.lib;
          opencv_videostab2410d.lib;
          libglog.lib;
          gflagsd.lib;
          gflags_nothreadsd.lib;
          $(CudaDependencies)
        </OpencvDependencies>
      
 

Then add $(OpencvDependencies) to the project's Additional Dependencies. After that the build mostly goes through.
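For the manual download step, it helps to print the exact package versions that packages.config asks for. The following is a minimal sketch, assuming the usual NuGet layout of a `<packages>` root with `<package id=... version=...>` children. The helper name `list_packages` and the sample text are illustrative only; the two versions in the sample are the ones visible in the paths above.

```python
import xml.etree.ElementTree as ET


def list_packages(xml_text):
    """Return (id, version) pairs for every <package> element in the given XML text."""
    root = ET.fromstring(xml_text)
    return [(pkg.get("id"), pkg.get("version")) for pkg in root.iter("package")]


# Illustrative sample, not the real file from the Caffe project.
SAMPLE = """<packages>
  <package id="glog" version="0.3.3.0" targetFramework="native" />
  <package id="gflags" version="2.1.2.1" targetFramework="native" />
</packages>"""

for package_id, version in list_packages(SAMPLE):
    print(f"{package_id} {version}")
```

To use it on the real file, read packages.config into a string and pass that to `list_packages`.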

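  • Data type mismatch: I converted the data to LMDB and still got the error. When everything else checks out and it still fails, the paths are the likely cause: '/' and '\' are not treated the same, so it is best to use '/' in every path.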
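A small sketch of how a Windows path could be rewritten with forward slashes before it is pasted into a prototxt or a script. The helper name `to_forward_slashes` and the example path are hypothetical; `PureWindowsPath.as_posix()` is part of the standard library.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(path):
    """Rewrite a Windows-style path so that it uses '/' as the only separator."""
    return PureWindowsPath(path).as_posix()


# Hypothetical LMDB location, used only for illustration.
print(to_forward_slashes(r"F:\caffe\examples\mnist\mnist_train_lmdb"))
```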
 
  • With the data layer set to:
              name: "LeNet"
              layer {
                name: "data"
                type: "Input"
                top: "data"
                input_param { shape: { dim: 1 dim: 1 dim: 28 dim: 28 } }
              }

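       Error: Check failed: labels_.size() == output_layer->channels() (1 vs. 10) Number of labels is different from the output layer dimension.

       The label file used for classification was wrong: label.txt contained only 0, but it needs one entry per class, one per line: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.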
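The label file can be generated rather than typed by hand. This is a minimal sketch that writes one label per line; the helper name `write_labels` is hypothetical, and the default of 10 classes matches the 10 output channels in the error message.

```python
from pathlib import Path


def write_labels(path, num_classes=10):
    """Write the labels 0 .. num_classes - 1 to path, one per line, and return them."""
    labels = [str(i) for i in range(num_classes)]
    Path(path).write_text("\n".join(labels) + "\n", encoding="ascii")
    return labels


# Usage: write_labels("label.txt")
```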


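  • Setting input_param { shape: { dim: 1 dim: 3 dim: 28 dim: 28 } } produces:
    Cannot copy param 0 weights from layer 'conv1'; shape mismatch.  Source param shape is 20 1 5 5 (500); target param shape is 20 3 5 5 (1500). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.

     The message shows that the saved conv1 weights have shape 20 1 5 5, so the model was trained on single-channel input, while the net here asks for 3 channels. I have not yet found where the channel count is set when training the model; I will come back and add that later.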
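The two numbers in parentheses in the error are just the element counts of the weight blobs. The sketch below reproduces that arithmetic for a convolution layer with 20 outputs and a 5x5 kernel, as in the message. The helper name `conv_weight_shape` is hypothetical.

```python
from math import prod


def conv_weight_shape(num_output, in_channels, kernel_size):
    """Shape of a convolution weight blob: (outputs, input channels, kernel h, kernel w)."""
    return (num_output, in_channels, kernel_size, kernel_size)


source = conv_weight_shape(20, 1, 5)  # weights saved in the trained model
target = conv_weight_shape(20, 3, 5)  # weights the 3-channel net expects

print(source, prod(source))  # (20, 1, 5, 5) 500
print(target, prod(target))  # (20, 3, 5, 5) 1500
```

The shapes differ only in the input-channel dimension, which is why loading succeeds again once the input is set back to 1 channel.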


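  • Inaccurate test results on the MNIST dataset

     At first I suspected the mean file, because LeNet was trained without mean subtraction, but that alone should not cause a very high error rate. Then I found the answer on Zhihu (https://www.zhihu.com/question/52047327): the training set is white digits on a black background! After converting my test samples to white on black, the results were fine.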
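Converting a black-on-white sample to white-on-black is a per-pixel inversion, 255 - p for 8-bit grayscale. The sketch below works on plain nested lists so that it runs without an image library. The function names and the mean-brightness threshold of 127 are assumptions made for illustration, not part of Caffe or of the original workflow.

```python
def invert_gray(pixels):
    """Invert an 8-bit grayscale image given as a list of rows of ints in 0..255."""
    return [[255 - p for p in row] for row in pixels]


def to_white_on_black(pixels):
    """Invert only when the image is mostly bright, i.e. probably dark ink on a light background.

    The threshold of 127 on the mean brightness is an assumed heuristic.
    """
    flat = [p for row in pixels for p in row]
    if sum(flat) / len(flat) > 127:
        return invert_gray(pixels)
    return pixels


# A 3x3 toy image: one dark pixel on a white background.
sample = [[255, 255, 255],
          [255, 0, 255],
          [255, 255, 255]]
print(to_white_on_black(sample))  # [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
```

With a real image, the same operation would be applied to the pixel array after loading the file as grayscale.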


