Notes on the Structure of TCSVT Papers

These notes record the writing patterns I noticed while preparing a TCSVT paper. The observations are based mainly on five papers:

  1. Paper 1: "Accurate Image-Guided Stereo Matching With Efficient Matching Cost and Disparity Refinement" (2016)
  2. Paper 2: "PMSC: PatchMatch-Based Superpixel Cut for Accurate Stereo Matching" (2018)
  3. Paper 3: "NIPM-sWMF: Towards Efficient FPGA Design for High-Definition Large-Disparity Stereo Matching" (2018)
  4. Paper 4: "Robust Adaptive Normalized Cross-Correlation for Stereo Matching Cost Computation" (2017)
  5. Paper 5: "Color Image-Guided Boundary-Inconsistent Region Refinement for Stereo Matching"

Updated on 2019.01.17.

See the official website for the submission requirements.

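Contents

  • Paper Structure
  • Abstract
    • Problem Overview
    • Algorithm Summary (Algorithm Details)
    • Experimental Results
  • Index Terms
  • Introduction
    • Opening + Application Scenarios
    • Overview of the State of the Art
    • Examples of Paper-Organization Paragraphs
  • Experiments and Discussion
    • Opening Paragraph (Overview)
    • First Subsection (Experimental Setup)
    • Second Subsection (Results and Comparison)
  • Conclusion
    • Summary of the Method
    • Summary of Components (or Restated Advantages)
    • Experimental Results
    • Possible Extensions
    • Future Work
  • Other Notes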

Paper Structure

The papers share essentially the same structure, in this order:

  • Abstract --> Index Terms --> Introduction --> (Related Works/Background and Motivation) --> Proposed Method --> More Info --> Experimental Results/Experiments and Discussion/Experiments --> Conclusion

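The structure varies slightly in the middle. Most papers place Related Works inside the Introduction as a subsection, and some also list Contributions as a separate subsection. The method is usually presented across two sections, whose titles can be changed as needed. The names and contents of the other sections are largely the same. The experiments section is usually titled Experimental Results, though exceptions exist. Papers are typically about 14 pages long.

The sections below follow the order of the paper and analyze how each part is written.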

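Abstract

In the surveyed papers the abstract usually runs 24-30 lines (198-276 words). The official website asks for 150-200 words, which usually comes to about 25 lines (24-26). The content mainly follows this structure:

  • Problem overview --> algorithm summary (component-by-component description) --> experimental results and evaluation (closing summary).

Note that the algorithm summary should use the simple present tense.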
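
The 150-200 word guideline above is easy to check mechanically while drafting. The following is a minimal sketch, not an official tool: the function names `count_words` and `check_abstract` are made up for illustration, and the tokenization rule (a hyphenated term such as "state-of-the-art" counts as one word) is an assumption, since the journal's own counting method is not given in these notes.

```python
import re

# Assumption: a "word" is a run of letters/digits, optionally joined by
# hyphens or apostrophes, so "state-of-the-art" and "3-D" each count once.
WORD_PATTERN = re.compile(r"[A-Za-z0-9]+(?:[-'][A-Za-z0-9]+)*")


def count_words(text):
    """Return the number of words in an English abstract."""
    return len(WORD_PATTERN.findall(text))


def check_abstract(text, low=150, high=200):
    """Return (word_count, within_limits) for the given abstract text."""
    n = count_words(text)
    return n, low <= n <= high


if __name__ == "__main__":
    draft = "Stereo matching is a challenging problem."
    words, ok = check_abstract(draft)
    print(f"{words} words, within 150-200: {ok}")
```

The limits are parameters, so the same check can be reused if the target range differs from the 150-200 words noted above.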

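Problem Overview

The problem overview usually summarizes, in a sentence or so, the significance of the work (typically its application scenarios and challenges) and the problems of existing methods, thereby leading into the work the paper presents. This part therefore needs to be tied to the specific work, and ideally it sets that work off by contrast.

Example problem overviews (the highlighted parts are examples of application scenarios):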

  1. Large disparity stereo matching is critical to the application of stereo vision system especially for outdoor scenes. Nevertheless, how to efficiently design high accuracy large-disparity stereo matching on FPGA is still a grand challenge. // The computational complexity of previously proposed stereo matching is inevitably proportional to disparity range, hence their hardware designs become very inefficient when the disparity range is large.
  2. Stereo matching is a challenging task because stereo images are affected by many factors, such as radiometric distortion, sun and rain flare, flying snow, occlusion, textureless and noisy image regions, and object boundaries. // However, most of the existing methods for stereo matching aim to solve only one specific problem. As a result, their performance is degraded significantly when operating with stereo images captured under a variety of scenes and conditions. (Comment: compared with the previous example, this overview is somewhat less readable, perhaps because its phrasing makes it feel long-winded.)
  3. Stereo matching is a challenging problem, and high-accuracy stereo matching is still required in various computer vision applications, e.g., 3-D scanning, autonomous navigation, and 3-D reconstruction.
  4. Estimating the disparity and normal direction of one pixel simultaneously, instead of only disparity, also known as 3D label methods, can achieve much higher subpixel accuracy in the stereo matching problem. However, it is extremely difficult to assign an appropriate 3D label to each pixel from the continuous label space $\mathbb{R}^3$ while maintaining global consistency because of the infinite parameter space.

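Algorithm Summary (Algorithm Details)

After the problem overview, the abstract usually ties back to the difficulties and problems just mentioned and gives a brief introduction to the proposed method. Most papers first give a one-sentence summary of the method that states what the algorithm does, and then describe its main components in detail. Some papers, however, go straight to a summary of each component. Note that the description usually conveys the algorithm's advantages in a few words.

Example algorithm summaries (the highlighted part is the one-sentence method summary):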

  1. In this paper, we propose a novel algorithm called PatchMatch-based superpixel cut to assign 3D labels of an image more accurately. In order to achieve robust and precise stereo matching between local windows, we develop a bilayer matching cost, where a bottom-up scheme is exploited to design the two layers. The bottom layer is employed to measure the similarity between small square patches locally by exploiting a pretrained convolutional neural network, and then, the top layer is developed to assemble the local matching costs in large irregular windows induced by the tangent planes of object surfaces. To optimize the spatial smoothness of local assignments, we propose a novel strategy to update 3D labels. In the procedure of optimization, both segmentation information and random refinement of PatchMatch are exploited to update candidate 3D label set for each pixel with high probability of achieving lower loss. Since pairwise energy of general candidate label sets violates the submodular property of graph cut, we propose a novel multilayer superpixel structure to group candidate label sets into candidate assignments, which thereby can be efficiently fused by $\alpha$-expansion graph cut.
  2. Motivated by the original PatchMatch and weighted median filtering algorithms, this paper proposes a NIPM-sWMF algorithm to significantly reduce the computational complexity of stereo matching and make it independent of disparity range. Moreover, we also propose a fully pipelined architecture design on FPGA that employs several hardware techniques to efficiently implement the proposed NIPM-sWMF.
  3. In this paper, we propose a novel matching cost function based on adaptive normalized cross-correlation (ANCC). We demonstrate several weaknesses of ANCC and propose techniques to resolve them. In addition, we employ available information, such as intensity mean, intensity variance, and support window radius, to estimate the parameters of the proposed matching cost function.
  4. Therefore, we present a novel image-guided stereo matching algorithm, which employs the efficient combined matching cost and multistep disparity refinement, to improve the accuracy of existing local stereo matching algorithms. Different from all the other methods, we introduce a guidance image for the whole algorithm. This filter-based guidance image is generated by extracting the enhanced information from the raw stereo image. The combined matching cost consists of the novel double-RGB gradient, the improved lightweight census transform, and the image color. This cost measurement is robust against image noise and textureless regions in computing the matching cost. Furthermore, a new systemic multistep refinement process, which includes outlier classification, four-direction propagation, leftmost propagation, and an exponential step filter, is proposed to remove the outliers in the raw disparity map.

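Experimental Results

This part usually gives representative experimental results. If a benchmark exists, this is usually the ranking on that benchmark; if not, give the actual metrics. End with a one-sentence summary, ideally one that echoes the problem described earlier. If the method ranks first on the benchmark, say so directly. Otherwise, just state that it outperforms state-of-the-art methods, but back the claim with concrete numbers.

Example experimental results (the highlighted parts summarize the algorithm's performance):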

  1. Extensive experiments demonstrate that our method can achieve higher subpixel accuracy in different data sets, and currently ranks first on the new challenging Middlebury 3.0 benchmark among all the existing methods.
  2. The disparity quality of the proposed NIPM-sWMF algorithm is evaluated on both KITTI2015 and Middlebury V3 stereo datasets, and the proposed architecture design is implemented and synthesized on Xilinx FPGA. Evaluation results demonstrate that the proposed NIPM-sWMF design on FPGA reaches the real-time performance of 1920x1080@60Hz at the disparity range of 128, and can achieve almost the same disparity estimation accuracy, 4.5x processing throughput, while reducing the hardware cost of LUT, Register, DSP and BRAM by 40%, 47%, 100% and 68% respectively, compared with the reference stereo matching design. Therefore, the proposed NIPM-sWMF design is an efficient way to address the challenge of large-disparity stereo matching.
  3. Compared with ANCC, the proposed matching cost function reduces the error rates from 24.1% to 17.8% in the Middlebury data set and from 64.1% to 26.4% in the KITTI data set. In addition, for noisy stereo pairs, the proposed function reduces the error rate from 73.6% to 37.3%. The qualitative and quantitative experimental results based on stereo images in different data sets under various conditions show that our proposed matching cost function outperforms state-of-the-art matching cost functions in indoor and outdoor stereo images having various radiometric distortions.
  4. Experiments on the Middlebury benchmark demonstrate our algorithm’s superior performance that it ranks first among the 158 submitted algorithms. Moreover, the proposed method is also robust on the 30 Middlebury data sets and the real-world Karlsruhe Institute of Technology and Toyota Technological Institute benchmark.

Index Terms

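There are 3-5 keywords (index terms), usually single words or fixed phrases. The keywords should not overlap in what they describe; together they summarize the techniques or functions of the algorithm.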

Introduction

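The Introduction runs 1-2 pages. Its content varies and depends heavily on the paper. In TCSVT papers it may contain Related Work as a subsection, Related Work may be a separate section, or there may be no dedicated part at all. It must, however, state the advantages of the algorithm (preferably very explicitly, for example as a subsection or as an itemized list of advantages), and it should preferably also outline the structure of the whole paper.

The first paragraph of the Introduction introduces the topic. Its structure can be expressed as:

  • Opening --> application scenarios --> state of the art or problem description --> existing problems.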

Opening + Application Scenarios

The application scenarios are usually followed by references, though this is not required.

  1. Stereo matching is one of the most active research topics in computer vision society. With the accuracy and processing speed improvement of stereo matching, stereo vision systems have been widely used in a wide variety of application areas, including mobile robots, intelligent surveillance and autonomous vehicles.
  2. It is the goal of stereo correspondence to reconstruct the depth information from left and right images of a scene. It is still researched actively as it is important for applications such as interactive robot navigation, self-driving cars, view interpolation, and 3D reconstruction.
  3. Generation of dense disparity map from a pair of stereo images is a popular topic in computer vision, because it plays a crucial role in many applications, including autonomous navigation, 3-D scanning, 3-D tracking, and 3-D reconstruction.
  4. Stereo matching has been one of the most core problems in computer vision for a long time. Recently, the wide use of 3D labels increases the accuracy of stereo matching algorithms to a large extent.

Overview of the State of the Art

In the state-of-the-art overview, many stereo matching papers mention the four steps of stereo matching:

  • the matching cost computation
  • cost aggregation
  • disparity computation/ optimization
  • disparity refinement processes
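
To make the four steps concrete, here is a minimal dependency-free sketch of a local stereo matcher. It is only an illustration, not the method of any of the five papers: the absolute-difference cost, the box-window aggregation, the winner-takes-all selection, and the 3x3 median refinement are simple stand-ins chosen for brevity, and all function names and the `invalid` cost value are hypothetical.

```python
def matching_cost(left, right, max_disp, invalid=255.0):
    """Step 1: absolute-difference cost for every disparity d and pixel (y, x).

    A left pixel at column x is compared with the right pixel at column x - d.
    Pixels with no counterpart (x < d) get a fixed high cost.
    """
    h, w = len(left), len(left[0])
    return [[[abs(left[y][x] - right[y][x - d]) if x >= d else invalid
              for x in range(w)] for y in range(h)] for d in range(max_disp)]


def aggregate(cost_slice, radius):
    """Step 2: average the cost over a square window clipped at the borders."""
    h, w = len(cost_slice), len(cost_slice[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            total = sum(cost_slice[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def winner_takes_all(cost):
    """Step 3: per pixel, pick the disparity with the lowest aggregated cost."""
    h, w = len(cost[0]), len(cost[0][0])
    return [[min(range(len(cost)), key=lambda d: cost[d][y][x])
             for x in range(w)] for y in range(h)]


def median_refine(disp):
    """Step 4: 3x3 median-style filter to suppress isolated outliers."""
    h, w = len(disp), len(disp[0])
    out = [row[:] for row in disp]
    for y in range(h):
        for x in range(w):
            window = sorted(disp[j][i]
                            for j in range(max(0, y - 1), min(h, y + 2))
                            for i in range(max(0, x - 1), min(w, x + 2)))
            out[y][x] = window[len(window) // 2]  # middle element of the window
    return out


def stereo_match(left, right, max_disp, radius=1):
    """Run the four steps in order and return the refined disparity map."""
    cost = matching_cost(left, right, max_disp)
    cost = [aggregate(c, radius) for c in cost]
    return median_refine(winner_takes_all(cost))
```

Calling `stereo_match(left, right, max_disp=6)` on two equally sized grayscale images stored as nested lists returns one integer disparity per pixel. Published methods replace each stand-in with something far more elaborate, but the four-stage shape of the pipeline stays the same.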

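Examples of Paper-Organization Paragraphs

Here are several examples of the closing paragraph of the Introduction, which describes how the paper is organized: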

  1. The rest of this paper is organized as follows. Section II introduces the … Section III proposes the … algorithm. Section IV presents … architecture design. Section V discusses the experimental results. Finally, the conclusion is drawn in Section VI.
  2. The remainder of this paper is organized as follows. The … and … model are introduced in Section II. In Section III, the … is proposed. An improved … function is presented in Section IV. In Section V, a … is adopted to achieve higher accuracy. The experimental process and results are elaborated in Section VI. A brief summary is given in Section VII. (Comment: personally I find this structure less clear and less readable than the one above.)

Experiments and Discussion

Different papers give this section different names. Some examples:
Experiments and Discussion
Experiments
Results
Experimental Results

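The most common of these (the one I have seen most often) is the last.

This section mainly presents and analyzes the experimental data. Papers differ in the angle and manner of presentation, but some common patterns can still be summarized.

First, the content (structure) this section should usually include:

  • Opening overview paragraph --> experimental environment, parameters, and other settings --> experimental results and analysis (these can sometimes be split into two subsections or merged into one)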

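Opening Paragraph (Overview)

Most papers I have seen include this paragraph, though some get straight to the point and omit it, so judge case by case. Where it exists, it mainly states which data sets and which algorithms are used for comparison. It stays very general; one or two sentences are enough.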

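First Subsection (Experimental Setup)

This part describes the experimental settings: hardware, software, the data sets used (which may be briefly described), the platform, the parameter values, and so on. Simply describe the actual experimental conditions objectively. It is relatively easy to write; it only needs to be detailed and clear.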

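Second Subsection (Results and Comparison)

This is the core of the section and the hardest part to write. I try to analyze the patterns here as far as I can; additions are welcome in the comments.

First, this section should include all (useful) experimental data, both numerical results and visual results such as figures (if any), and its purpose is to demonstrate that the algorithm is state of the art. The data should be accompanied by some explanation.

Second, the hardest part is the comparison with other algorithms. Be careful here: careless wording can offend other authors. Most papers simply present the numbers for each algorithm, state that their own algorithm performs better, and then analyze the data objectively.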

Conclusion

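The length of the Conclusion varies somewhat more than that of the abstract, but it usually falls between 15 and 25 lines. The main structure is:

  • Summary of the method --> summary of components (or restated advantages) --> experimental results --> possible extensions --> future work.

Summary of the Method

Note that the Conclusion usually uses the simple past tense, although some papers use the present tense.

Examples: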

  1. In this paper, we introduced a new matching cost function that operates robustly and accurately with stereo images in different indoor and outdoor scenes and situations.
  2. A high-accuracy local stereo matching system has been proposed in this paper. The image-guided matching structure is valid and can be extensively adopted.
  3. This work concerns the efficient FPGA design for high-definition large-disparity stereo matching.
  4. In this paper, we present a novel 3D label-based accurate stereo matching approach with the second-order smoothness regularization.

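Summary of Components (or Restated Advantages)

This part varies widely, and each paper has its own approach. Some summarize the details of each component, some summarize the advantages of the algorithm or architecture, some restate the functions of the algorithm, and so on.

Examples: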

  1. We propose a bilayer matching cost to combine the pixel similarity values generated from CNN and the slanted patch matching of PatchMatch stereo, which keeps the advantages of both measurements. To generate candidate 3D label sets for each pixel, we propose a novel strategy by propagation on multilayer superpixels and random refinement. Global regularization is defined as the pixel grid MRF energy with 3D labels, which can be directly optimized by $\alpha$-expansion graph cut due to our novelly designed proposal structure.
  2. The proposed matching cost function exploited available information to estimate the parameters of the proposed matching cost function.
  3. Our system is efficient based on these key factors: the filter-based guidance image, the double-RGB gradient, the combined cost measurement, the exponential step aggregation structure, and the systematic efficient multistep refinement, including outlier classification, four-direction propagation, leftmost propagation, and exponential step filtering.
  4. In particular, we propose an efficient NIPM-sWMF stereo matching algorithm, which includes a non-iterative PatchMatch for disparity computation and a separable weighted median filtering for disparity refinement respectively. The proposed NIPM-sWMF algorithm is independent of the disparity range $d$ and can significantly reduce the computational complexity of disparity computation and refinement as well. Then we further design a fully pipelined architecture for the proposed NIPM-sWMF and exploit several techniques to improve processing capability and reduce hardware overhead.

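Experimental Results

This part summarizes the experimental results in a few sentences, mainly to highlight the algorithm's advantages and application scenarios (data sets).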

  1. Experiments demonstrate that our algorithm offers excellent high accuracy performance in both indoor and outdoor environments. This algorithm can be regarded as one of the state-of-the-art stereo methods.
  2. We evaluated and compared the proposed matching cost function and several state-of-the-art matching cost functions using various stereo matching data sets, and the experimental results show that the proposed matching cost method was superior to the state-of-the-art matching cost methods across the data sets.
  3. Evaluation shows that the proposed method obtains highly accurate disparity maps, and currently ranks first on the official Middlebury 3.0 benchmark among all existing methods. Compared with the other global methods that are also based on PatchMatch, the proposed method achieves better results on Middlebury 2006 data set, and is more efficient for high-resolution images.
  4. By extensive evaluation on KITTI2015 and Middlebury V3 stereo datasets, the proposed NIPM-sWMF can achieve comparable disparity accuracy with those more sophisticated stereo matching designs. Implementation results on FPGA further show that the proposed NIPM-sWMF substantially outperforms previous stereo matching designs in terms of real-time performance and hardware efficiency especially when the disparity range is large. The experimental results clearly demonstrate that the proposed NIPM-sWMF is an efficient solution for large-disparity stereo matching system.

Possible Extensions

This part describes possible extensions of the algorithm that are not covered in the paper (if any).

  1. In addition, although the proposed NIPM-sWMF design is implemented and evaluated on FPGA in this paper, it can be employed to efficient integrated circuit design as well.
  2. Furthermore, this study can be extended in terms of optimizing the computational complexity and reducing the number of parameters.

Future Work

This part briefly describes the next direction of work building on the paper.

  1. Future research will be applied to additional indoor and outdoor scenes.
  2. High computation time is a main limitation of our proposed matching cost function. Since the proposed matching cost is based on windows, it can be processed in parallel. Therefore, implementing it in a field-programmable gate array is one solution.
  3. In the future, we would like to add semantic information to our PMSC model, and extend our algorithm to multiview scenario.

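Other Notes

  1. Unlike earlier work, papers in recent years increasingly use the first person as the subject, but the forms commonly used are "we" and "our" rather than other first-person forms.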
