Clément Godard, Oisin Mac Aodha, Michael Firman, Gabriel Brostow
Paper: https://arxiv.org/abs/1806.01260
Source code: https://github.com/nianticlabs/monodepth2
My environment:
Anaconda 4.4.10, Python 3.6, CUDA 9.0, Ubuntu 16.04; PyTorch was installed as the build for CUDA 9.2.
GPU: Tesla P40
First, create a separate conda environment:
conda create -n monodepth2 python=3.6
Then choose the parameters you need and run the training command. (For the available parameters, see options.py and the scripts in the experiments folder.)
python train.py --model_name M_odom --split odom --dataset kitti_odom --data_path PATH --png --height 128 --width 416
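--model_name is the name of the folder the run is saved under
--split selects which training split to use
--dataset selects the dataset type
--data_path is the path to the raw (not preprocessed) dataset
--png means the dataset images are in PNG format. According to the documentation, PNG runs more slowly; the authors provide a command for converting the images to JPEG
--height 128 --width 416 set the resolution the images are resized to for training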
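To make the flags above concrete, here is a minimal sketch of how options like these could be declared with argparse. It is not the project's options.py: the function name build_parser, the default values, and the help strings are assumptions for illustration only.

```python
import argparse


def build_parser():
    """Hypothetical, simplified parser covering only the flags discussed above."""
    parser = argparse.ArgumentParser(description="sketch of monodepth2-style options")
    # Defaults below are illustrative assumptions, not the project's real defaults.
    parser.add_argument("--model_name", type=str, default="my_model",
                        help="name of the folder the run is saved under")
    parser.add_argument("--split", type=str, default="eigen_zhou",
                        help="which training split to use")
    parser.add_argument("--dataset", type=str, default="kitti",
                        help="dataset type")
    parser.add_argument("--data_path", type=str, required=True,
                        help="path to the raw dataset")
    parser.add_argument("--png", action="store_true",
                        help="set when the dataset images are PNG files")
    parser.add_argument("--height", type=int, default=192,
                        help="training image height")
    parser.add_argument("--width", type=int, default=640,
                        help="training image width")
    return parser


if __name__ == "__main__":
    command = ("--model_name M_odom --split odom --dataset kitti_odom "
               "--data_path PATH --png --height 128 --width 416")
    print(build_parser().parse_args(command.split()))
```

A flag such as --png is declared with action="store_true", so it takes no value on the command line and is False when omitted.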
Running train.py fails at first because some packages are missing.
Error:
ModuleNotFoundError: No module named 'tensorboardX'
Fix:
pip install tensorboardX
Error:
ModuleNotFoundError: No module named 'skimage'
Fix:
pip install scikit-image
Error:
ModuleNotFoundError: No module named 'IPython'
Fix:
pip install ipython
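Rather than hitting these errors one at a time, you can check for all three packages before starting a run. The following is a small sketch using only the standard library; the REQUIRED mapping and the function name missing_packages are my own, and the list covers only the packages that were missing on my machine.

```python
import importlib.util

# Import name -> pip package name. The two differ for skimage and IPython.
REQUIRED = {
    "tensorboardX": "tensorboardX",
    "skimage": "scikit-image",
    "IPython": "ipython",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [pip_name for module, pip_name in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("pip install " + " ".join(missing))
    else:
        print("all listed packages are importable")
```

Run it inside the monodepth2 conda environment so that it inspects the same interpreter train.py will use.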
When training starts normally, it prints the following:
Training model named:
M_odom
Models and tensorboard events files are saved to:
/home/lindian/tmp
Training is using:
cuda
Using split:
odom
There are 36671 training items and 4075 validation items
followed by the training progress:
epoch 0 | batch 0 | examples/s: 5.5 | loss: 0.13954 | time elapsed: 00h00m08s | time left: 00h00m00s
.
.
.
Training
epoch 19 | batch 1955 | examples/s: 28.6 | loss: 0.05622 | time elapsed: 11h35m57s | time left: 00h12m45s
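If you save the console output to a file, progress lines like these are easy to parse, for example to plot the loss or compare throughput between runs. This is a sketch that assumes the line layout shown above; the regular expression and the names parse_log_line and hms_to_seconds are my own, not part of the repository.

```python
import re

# Matches progress lines in the layout shown above; spacing is kept flexible.
LOG_PATTERN = re.compile(
    r"epoch\s+(?P<epoch>\d+)\s*\|\s*batch\s+(?P<batch>\d+)\s*\|\s*"
    r"examples/s:\s*(?P<rate>[\d.]+)\s*\|\s*loss:\s*(?P<loss>[\d.]+)\s*\|\s*"
    r"time elapsed:\s*(?P<elapsed>\d+h\d+m\d+s)\s*\|\s*"
    r"time left:\s*(?P<left>\d+h\d+m\d+s)"
)


def hms_to_seconds(text):
    """Convert a string such as '11h35m57s' to a number of seconds."""
    hours, minutes, seconds = re.fullmatch(r"(\d+)h(\d+)m(\d+)s", text).groups()
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)


def parse_log_line(line):
    """Return the fields of one progress line as a dict, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "batch": int(match["batch"]),
        "examples_per_s": float(match["rate"]),
        "loss": float(match["loss"]),
        "elapsed_s": hms_to_seconds(match["elapsed"]),
        "left_s": hms_to_seconds(match["left"]),
    }


if __name__ == "__main__":
    sample = ("epoch 19 | batch 1955 | examples/s: 28.6 | loss: 0.05622 | "
              "time elapsed: 11h35m57s | time left: 00h12m45s")
    print(parse_log_line(sample))
```

Lines that are not progress lines, such as "Training", return None and can simply be skipped.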
To train the depth network instead,
just point the data path at the downloaded KITTI raw data.
Loading encoder weights...
Loading depth weights...
Loading pose_encoder weights...
Loading pose weights...
Cannot find Adam weights so Adam is randomly initialized
Training model named:
M_640x192
Models and tensorboard events files are saved to:
/home/lindian/tmp
Training is using:
cuda
Using split:
eigen_zhou
There are 39810 training items and 4424 validation items
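The depth run took less wall-clock time than the pose run (roughly 7h43m versus 11h49m in total, judging by the logs), even though its throughput in examples/s was lower. Training on JPEG is also somewhat faster than training on PNG.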
epoch 0 | batch 0 | examples/s: 4.6 | loss: 0.06565 | time elapsed: 00h00m29s | time left: 00h00m00s
epoch 0 | batch 250 | examples/s: 12.0 | loss: 0.08074 | time elapsed: 00h07m09s | time left: 07h47m29s
.
.
.
epoch 4 | batch 2732 | examples/s: 13.5 | loss: 0.07128 | time elapsed: 07h26m34s | time left: 00h16m19s
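Evaluation
The authors say PNG files can be used for training, and training does succeed, but evaluating with the PNG dataset raised an error, so I converted the whole dataset from PNG to JPEG.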
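For reference, the conversion can also be scripted from Python. The sketch below is not the command from the repository's documentation: it assumes ImageMagick's convert is installed and on the PATH, the quality value of 92 is my own choice, and the function names are hypothetical. Deleting the original PNGs is off by default.

```python
import subprocess
from pathlib import Path


def plan_conversions(root):
    """List (png_path, jpg_path) pairs for every PNG file found under root."""
    return [(png, png.with_suffix(".jpg"))
            for png in sorted(Path(root).rglob("*.png"))]


def convert_all(root, quality=92, delete_original=False):
    """Convert each PNG under root to a JPEG next to it using ImageMagick.

    Assumes the `convert` executable is available; the quality is an assumed value.
    """
    for png, jpg in plan_conversions(root):
        subprocess.run(["convert", str(png), "-quality", str(quality), str(jpg)],
                       check=True)
        if delete_original:
            png.unlink()


if __name__ == "__main__":
    # DATA_PATH is a placeholder; replace it with your KITTI directory.
    for png, jpg in plan_conversions("DATA_PATH")[:5]:
        print(png, "->", jpg)
```

Printing the planned pairs first, as in the usage above, is a cheap way to confirm the directory is right before converting tens of thousands of images.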
Then run the evaluation.
Depth
python export_gt_depth.py --data_path DATA_PATH --split eigen
python evaluate_depth.py --load_weights_folder PATH_TO_YOUR_WEIGHTS_FOLDER --eval_mono --data_path DATA_PATH
This gives:
-> Computing predictions with size 640x192
-> Evaluating
Mono evaluation - using median scaling
Scaling ratios | med: 5.708 | std: 0.064
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.110 & 0.814 & 4.707 & 0.190 & 0.882 & 0.960 & 0.981 \\
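For readers unfamiliar with the columns, the sketch below computes the same seven quantities, plus median scaling, on plain Python lists. It uses the standard definitions of these depth metrics and leaves out any masking, cropping, or depth clamping that the real evaluation script may apply, so treat it as an illustration of the formulas rather than a reimplementation of evaluate_depth.py. The function names are my own.

```python
import math
from statistics import median


def median_scale(pred, gt):
    """Rescale predictions so their median matches the ground-truth median."""
    ratio = median(gt) / median(pred)
    return [p * ratio for p in pred], ratio


def depth_errors(gt, pred):
    """Standard depth metrics for paired lists of positive depths."""
    n = len(gt)
    pairs = list(zip(gt, pred))
    # Accuracy under threshold: fraction of pixels with max(gt/pred, pred/gt) < 1.25^k.
    thresh = [max(g / p, p / g) for g, p in pairs]
    a1 = sum(t < 1.25 for t in thresh) / n
    a2 = sum(t < 1.25 ** 2 for t in thresh) / n
    a3 = sum(t < 1.25 ** 3 for t in thresh) / n
    abs_rel = sum(abs(g - p) / g for g, p in pairs) / n
    sq_rel = sum((g - p) ** 2 / g for g, p in pairs) / n
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n)
    rmse_log = math.sqrt(sum((math.log(g) - math.log(p)) ** 2 for g, p in pairs) / n)
    return {"abs_rel": abs_rel, "sq_rel": sq_rel, "rmse": rmse,
            "rmse_log": rmse_log, "a1": a1, "a2": a2, "a3": a3}


if __name__ == "__main__":
    gt = [2.0, 4.0, 8.0, 16.0]      # made-up ground-truth depths in metres
    pred = [0.4, 1.0, 2.1, 3.5]     # made-up unscaled monocular predictions
    scaled, ratio = median_scale(pred, gt)
    print("scaling ratio:", ratio)
    print(depth_errors(gt, scaled))
```

Median scaling is needed because a network trained on monocular video predicts depth only up to an unknown scale; the "Scaling ratios | med" value in the output is the median of these per-image ratios.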
The network trained on PNG data actually scored marginally better on abs_rel and sq_rel (a negligible difference):
-> Computing predictions with size 640x192
-> Evaluating
Mono evaluation - using median scaling
Scaling ratios | med: 5.749 | std: 0.064
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.109 & 0.806 & 4.711 & 0.190 & 0.882 & 0.960 & 0.981 \\
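I also evaluated the weights released with the paper. The results are about the same; the small differences are probably due to different random initialization.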
-> Computing predictions with size 640x192
-> Evaluating
Mono evaluation - using median scaling
Scaling ratios | med: 5.601 | std: 0.064
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.108 & 0.820 & 4.693 & 0.188 & 0.884 & 0.961 & 0.981 \\
Pose
Trained on PNG
Sequence 9
-> Computing pose predictions
Trajectory error: 0.021, std: 0.009
-> Predictions saved to /home/lindian/.../M_odom/models/weights_19/poses.npy
Sequence 10
Trajectory error: 0.016, std: 0.010
Trained on JPEG
Sequence 9
Trajectory error: 0.017, std: 0.009
Sequence 10
Trajectory error: 0.015, std: 0.010
Weights released with the paper
Sequence 9
Trajectory error: 0.020, std: 0.009
Sequence 10
Trajectory error: 0.014, std: 0.010
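To give an idea of what a trajectory error of this kind measures, here is a sketch of a scale-aligned absolute trajectory error on a short snippet of camera positions. I am assuming the usual recipe for monocular odometry (align the first position, fit a single scale factor by least squares, then take the error); the exact normalization used by the evaluation script may differ, and the function name trajectory_error is my own.

```python
import math


def trajectory_error(gt_xyz, pred_xyz):
    """Scale-aligned trajectory error for two equal-length lists of [x, y, z] positions.

    Assumes the predicted snippet is not degenerate (not every point at the origin
    after alignment), otherwise the scale is undefined.
    """
    # Translate the prediction so both trajectories start at the same point.
    offset = [g - p for g, p in zip(gt_xyz[0], pred_xyz[0])]
    shifted = [[p + o for p, o in zip(point, offset)] for point in pred_xyz]

    # Least-squares scale s minimising sum |s * pred - gt|^2.
    numerator = sum(g * p for gp, pp in zip(gt_xyz, shifted) for g, p in zip(gp, pp))
    denominator = sum(p * p for pp in shifted for p in pp)
    scale = numerator / denominator

    squared = sum((p * scale - g) ** 2
                  for gp, pp in zip(gt_xyz, shifted) for g, p in zip(gp, pp))
    return math.sqrt(squared) / len(gt_xyz)


if __name__ == "__main__":
    # Made-up 3-frame snippet: the prediction has the right shape but half the scale.
    gt = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 2.0]]
    pred = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.5], [0.1, 0.0, 1.0]]
    print(trajectory_error(gt, pred))
```

The scale fit matters for the same reason as median scaling in the depth evaluation: monocular training recovers translation only up to scale, so the comparison has to factor the scale out first.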