Paper reading - MVSNet: Depth Inference for Unstructured Multi-view Stereo (ECCV 2018)

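Related work

Categories of MVS algorithms

  1. Direct point cloud reconstruction [1-2]; drawback: hard to parallelize fully
  2. Volumetric reconstruction [3-4]; drawbacks: errors from discretizing space and high memory consumption
  3. Depth map reconstruction [5-9]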

Learning-based stereo

Matching two patches [10-12], cost regularization [13-15]

Learning-based MVS

SurfaceNet [3], LSM [4] (only applicable to small-scale scenes)

Network architecture

[Figure 1: network architecture]

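Key points to note:

1. The cost volume is computed with a variance-based metric rather than the mean operation [16], which better captures the differences among the multi-view feature maps.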

[Figure 2: variance-based cost metric]
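The variance-based metric from point 1 can be illustrated with a minimal sketch. It assumes the features of the N views at a single voxel are given as plain lists (a hypothetical layout chosen for readability, not the tensor layout of the official code), and the function name `variance_cost` is made up for this example.

```python
def variance_cost(features):
    """Element-wise variance across N views.

    `features` is a list of N equally long lists, one per view, holding the
    feature values that the views contribute to a single voxel
    (hypothetical layout, chosen for illustration).
    """
    n = len(features)
    channels = len(features[0])
    # Per-channel mean over the views.
    mean = [sum(view[c] for view in features) / n for c in range(channels)]
    # Per-channel mean squared deviation from that mean.
    return [
        sum((view[c] - mean[c]) ** 2 for view in features) / n
        for c in range(channels)
    ]


# Three views agree on channel 0 and disagree on channel 1.
cost = variance_cost([[1.0, 0.0], [1.0, 3.0], [1.0, 6.0]])
print(cost)  # [0.0, 6.0]
```

A channel on which all views agree gets zero cost, while disagreement raises the cost, and the result has the same size for any number of input views.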

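2. A probability volume is introduced (obtained by applying softmax to the cost volume along the depth direction). It is used not only for per-pixel depth estimation but also to measure the confidence of that estimate.

3. Initial depth map. The most direct approach is argmax, but it cannot produce sub-pixel estimates and is not differentiable during backpropagation, so the expectation along the depth direction is computed instead.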

[Figure 3: depth expectation formula]
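Points 2 and 3 can be sketched for a single pixel as follows. The depth hypotheses and scores are invented values, and whether the network output is negated before the softmax is left aside here; the sketch only shows how the expectation differs from argmax.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_depth(logits, depth_values):
    """Probability-weighted mean of the depth hypotheses."""
    probs = softmax(logits)
    return sum(p * d for p, d in zip(probs, depth_values))


depth_values = [1.0, 2.0, 3.0]  # hypothetical depth hypotheses
logits = [0.0, 1.0, 0.9]        # hypothetical per-pixel scores along depth

probs = softmax(logits)
hard = depth_values[probs.index(max(probs))]  # argmax picks one hypothesis
soft = expected_depth(logits, depth_values)   # expectation can fall between hypotheses
print(hard, soft)
```

The argmax result is restricted to the sampled hypotheses, whereas the expectation yields a continuous value and is built only from differentiable operations.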

4. Probability map. For falsely matched pixels, the probability distribution along depth tends to be scattered.

[Figure 4: example probability distributions]

The probability map is therefore simply computed as the sum of the probabilities of the four nearest depth hypotheses.
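A minimal per-pixel sketch of this confidence measure is shown below. How the four hypotheses are selected (two on each side of the expected depth index, clipped at the borders) is an assumption made for illustration, as are the function name and the example distributions.

```python
import math


def confidence(probs):
    """Sum the probabilities of the four hypotheses nearest to the estimate.

    `probs` is the per-pixel distribution over the depth hypotheses. The
    estimate is taken as the expected (fractional) hypothesis index, and the
    window of two hypotheses on each side is an assumed selection rule.
    """
    expected_index = sum(i * p for i, p in enumerate(probs))
    lower = math.floor(expected_index)
    window = range(lower - 1, lower + 3)
    return sum(probs[i] for i in window if 0 <= i < len(probs))


peaked = [0.0, 0.05, 0.45, 0.45, 0.05, 0.0]   # concentrated around one depth
scattered = [0.3, 0.05, 0.1, 0.1, 0.05, 0.4]  # mass spread over far-apart depths

conf_peaked = confidence(peaked)
conf_scattered = confidence(scattered)
print(conf_peaked, conf_scattered)
```

The concentrated distribution gives a confidence near 1, while the scattered one gives a much lower value, which is what makes the map usable for outlier filtering.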

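5. Depth refinement takes the reference image as an additional input and uses residual learning. In addition, to avoid being biased toward a particular depth scale, the initial depth map is normalized to [0, 1] and de-normalized after refinement.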
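The normalize, refine, de-normalize flow of point 5 can be sketched as below. Min-max scaling by a known depth range is an assumption, and `residual_fn` is a hypothetical stand-in for the refinement network that receives the normalized depth and the reference image.

```python
def refine_depth(initial_depth, reference_image, depth_min, depth_max, residual_fn):
    """Residual refinement on a depth map scaled to [0, 1].

    `residual_fn` is a hypothetical placeholder for the learned refinement
    network; it returns one residual per pixel.
    """
    scale = depth_max - depth_min
    # Normalize so the refinement does not depend on the absolute depth scale.
    normalized = [(d - depth_min) / scale for d in initial_depth]
    residual = residual_fn(normalized, reference_image)
    # Residual learning: the network only predicts a correction.
    refined = [n + r for n, r in zip(normalized, residual)]
    # Convert back to the original depth range.
    return [depth_min + r * scale for r in refined]


initial = [425.0, 600.0, 935.0]  # hypothetical initial depths
image = None                     # the reference image is unused by the dummy below
unchanged = refine_depth(initial, image, 425.0, 935.0,
                         lambda depth, img: [0.0] * len(depth))
shifted = refine_depth(initial, image, 425.0, 935.0,
                       lambda depth, img: [0.1] * len(depth))
print(unchanged, shifted)
```

With a zero residual the initial depth map is returned unchanged, so the network only has to learn corrections to it.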

6. Loss computation

[Figure 5: loss formula]
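As a rough sketch, and assuming the loss is the L1 difference between the ground truth depth and both the initial and the refined depth maps, evaluated only on pixels with valid ground truth and weighted by a factor `lam`, the computation could look like this. Averaging over the valid pixels and the default `lam=1.0` are assumptions of this example.

```python
def depth_loss(gt, initial, refined, valid, lam=1.0):
    """Mean L1 loss over valid pixels for the initial and refined depth maps."""
    total = 0.0
    count = 0
    for d, d_init, d_ref, ok in zip(gt, initial, refined, valid):
        if not ok:
            continue  # skip pixels without ground truth depth
        total += abs(d - d_init) + lam * abs(d - d_ref)
        count += 1
    return total / count if count else 0.0


gt = [1.0, 2.0, 0.0]
initial = [1.5, 2.0, 9.0]
refined = [1.25, 2.5, 9.0]
valid = [True, True, False]  # the third pixel has no ground truth

loss = depth_loss(gt, initial, refined, valid)
print(loss)  # 0.625
```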

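Training and post-processing

1. Training

Dataset: DTU [17]

2. Post-processing

Filtering: remove points whose probability is below 0.8; remove points whose depth is inconsistent across views.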
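The two filtering criteria can be combined into a simple mask, sketched below. The 0.8 threshold comes from the notes above; the per-pixel count of views with consistent depth is assumed to be computed elsewhere, and requiring at least `min_views=3` consistent views is an assumption of this example.

```python
def filter_depth(depths, probs, consistent_views, prob_threshold=0.8, min_views=3):
    """Keep a depth only if it is confident and consistent across views.

    `consistent_views[i]` is the number of views in which pixel i passed a
    depth consistency check (assumed to be precomputed). Rejected pixels are
    returned as None.
    """
    return [
        d if p >= prob_threshold and c >= min_views else None
        for d, p, c in zip(depths, probs, consistent_views)
    ]


depths = [500.0, 510.0, 620.0]
probs = [0.95, 0.60, 0.90]   # second pixel fails the probability test
consistent = [4, 5, 1]       # third pixel fails the consistency test

filtered = filter_depth(depths, probs, consistent)
print(filtered)  # [500.0, None, None]
```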

Fusion [18]

Experiments

1. DTU

[Figures 6 and 7: DTU results]

The method performs well in reflective and weakly textured regions.

2. Tanks and Temples

[Figures 8 and 9: Tanks and Temples results]

3. Ablations

[Figure 10: ablation results]

Limitations

The ground truth depth maps lack complete occlusion and background information.

References

[1] Furukawa, Y., Ponce, J.: Accurate, dense, and robust multiview stereopsis. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (2010)

[2] Lhuillier, M., Quan, L.: A quasi-dense approach to surface reconstruction from uncalibrated images. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (2005)

[3] Ji, M., Gall, J., Zheng, H., Liu, Y., Fang, L.: Surfacenet: An end-to-end 3d neural network for multiview stereopsis. International Conference on Computer Vision (ICCV) (2017)

[4] Kar, A., Häne, C., Malik, J.: Learning a multi-view stereo machine. Advances in Neural Information Processing Systems (NIPS) (2017)

[5] Campbell, N.D., Vogiatzis, G., Hernández, C., Cipolla, R.: Using multiple hypotheses to improve depth-maps for multi-view stereo. European Conference on Computer Vision (ECCV) (2008)

[6] Galliani, S., Lasinger, K., Schindler, K.: Massively parallel multiview stereopsis by surface normal diffusion. International Conference on Computer Vision (ICCV) (2015)

[7] Schönberger, J.L., Zheng, E., Frahm, J.M., Pollefeys, M.: Pixelwise view selection for unstructured multi-view stereo. European Conference on Computer Vision (ECCV) (2016)

[8] Tola, E., Strecha, C., Fua, P.: Efficient large-scale multi-view stereo for ultra high-resolution image sets. Machine Vision and Applications (MVA) (2012)

[9] Yao, Y., Li, S., Zhu, S., Deng, H., Fang, T., Quan, L.: Relative camera refinement for accurate dense reconstruction. 3D Vision (3DV) (2017)

[10] Han, X., Leung, T., Jia, Y., Sukthankar, R., Berg, A.C.: Matchnet: Unifying feature and metric learning for patch-based matching. Computer Vision and Pattern Recognition (CVPR) (2015)

[11] Luo, W., Schwing, A.G., Urtasun, R.: Efficient deep learning for stereo matching. Computer Vision and Pattern Recognition (CVPR) (2016)

[12] Zbontar, J., LeCun, Y.: Stereo matching by training a convolutional neural network to compare image patches. Journal of Machine Learning Research (JMLR) (2016)

[13] Seki, A., Pollefeys, M.: Sgm-nets: Semi-global matching with neural networks. Computer Vision and Pattern Recognition Workshops (CVPRW) (2017)

[14] Knöbelreiter, P., Reinbacher, C., Shekhovtsov, A., Pock, T.: End-to-end training of hybrid cnn-crf models for stereo. Computer Vision and Pattern Recognition (CVPR) (2017)

[15] Kendall, A., Martirosyan, H., Dasgupta, S., Henry, P.: End-to-end learning of geometry and context for deep stereo regression. Computer Vision and Pattern Recognition (CVPR) (2017)

[16] Hartmann, W., Galliani, S., Havlena, M., Van Gool, L., Schindler, K.: Learned multi-patch similarity. International Conference on Computer Vision (ICCV) (2017)

[17] Aanæs, H., Jensen, R.R., Vogiatzis, G., Tola, E., Dahl, A.B.: Large-scale data for multiple-view stereopsis. International Journal of Computer Vision (IJCV) (2016)

[18] Merrell, P., Akbarzadeh, A., Wang, L., Mordohai, P., Frahm, J.M., Yang, R., Nistér, D., Pollefeys, M.: Real-time visibility-based fusion of depth maps. International Conference on Computer Vision (ICCV) (2007)

Paper: https://arxiv.org/abs/1804.02505

Code: https://github.com/YoYo000/MVSNet
