[Reposted from] http://www.cnblogs.com/youth0826/archive/2012/12/04/2801481.html
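The 20 Most Influential Computer Vision Journal Papers of the Early 21st Century
Selection criteria:
(1) Journal papers, drawn mainly from the following journals: TPAMI, IJCV, TIP, CVIU, IVC, MVA, PR, JMIV, IJPRAI…
(2) Published in 2000 or later
(3) Cited more than 1,000 times in SCI, according to the Web of Science database as queried in early December 2012
The Top 20 list: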
[1] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, Nov, 2004. (Cited=5663)
[2] J. B. Shi, and J. Malik, “Normalized cuts and image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888-905, Aug, 2000. (Cited=2165)
[3] T. F. Chan, and L. A. Vese, “Active contours without edges,” IEEE Transactions on Image Processing, vol. 10, no. 2, pp. 266-277, Feb, 2001. (Cited=2153)
[4] D. Comaniciu, and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603-619, May, 2002. (Cited=1910)
[5] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr, 2004. (Cited=1879)
[6] A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, “Content-based image retrieval at the end of the early years,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 12, pp. 1349-1380, Dec, 2000. (Cited=1697)
[7] P. Viola, and M. J. Jones, “Robust real-time face detection,” International Journal of Computer Vision, vol. 57, no. 2, pp. 137-154, May, 2004. (Cited=1634)
[8] A. K. Jain, R. P. W. Duin, and J. C. Mao, “Statistical pattern recognition: A review,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 1, pp. 4-37, Jan, 2000. (Cited=1546)
[9] Z. Y. Zhang, “A flexible new technique for camera calibration,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1330-1334, Nov, 2000. (Cited=1516)
[10] B. Zitova, and J. Flusser, “Image registration methods: a survey,” Image and Vision Computing, vol. 21, no. 11, pp. 977-1000, Oct, 2003. (Cited=1422)
[11] S. Belongie, J. Malik, and J. Puzicha, “Shape matching and object recognition using shape contexts,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 4, pp. 509-522, Apr, 2002. (Cited=1321)
[12] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, “The FERET evaluation methodology for face-recognition algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090-1104, Oct, 2000. (Cited=1298)
[13] Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 11, pp. 1222-1239, Nov, 2001. (Cited=1197)
[14] D. Scharstein, and R. Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms,” International Journal of Computer Vision, vol. 47, no. 1-3, pp. 7-42, Apr-Jun, 2002. (Cited=1174)
[15] K. Mikolajczyk, and C. Schmid, “A performance evaluation of local descriptors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, Oct, 2005. (Cited=1166)
[16] D. Comaniciu, V. Ramesh, and P. Meer, “Kernel-based object tracking,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 5, pp. 564-577, May, 2003. (Cited=1109)
[17] T. Ojala, M. Pietikainen, and T. Maenpaa, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971-987, Jul, 2002. (Cited=1076)
[18] C. Stauffer, and W. E. L. Grimson, “Learning patterns of activity using real-time tracking,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 747-757, Aug, 2000. (Cited=1070)
[19] M. H. Yang, D. J. Kriegman, and N. Ahuja, “Detecting faces in images: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34-58, Jan, 2002. (Cited=1032)
[20] T. F. Cootes, G. J. Edwards, and C. J. Taylor, “Active appearance models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 681-685, Jun, 2001. (Cited=989)
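Two additional TPAMI papers, recommended by Prof. Shan: 2DPCA by Prof. Jian Yang of Nanjing University of Science and Technology, and LPP by Prof. Xiaofei He of Zhejiang University.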
[1] J. Yang, D. Zhang, A. Frangi, and J. Yang, “Two-dimensional PCA: a new approach to appearance-based face representation and recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 1, pp. 131-137, Jan, 2004. (Cited=625)
[2] X. He, S. Yan, Y. Hu, P. Niyogi, and H. Zhang, “Face recognition using Laplacianfaces,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 3, pp. 328-340, Mar, 2005. (Cited=724)
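Brief summary:
3 IJCV papers, 14 TPAMI, 2 TIP, and 1 IVC. The IVC paper is a bit of a surprise, but it is a survey, so it makes sense.
Comments on any of the papers are welcome.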
posted on 2012-12-04 15:39 by Shicai Yang
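#1 2012-12-05 19:40 Paul84
Citation counts really do reflect the quality of a paper. SIFT is indeed a masterpiece.
#2 [original poster] 2012-12-05 20:30 Shicai Yang
Quoting Deng Yafeng's comments from his Sina blog, http://blog.sina.com.cn/s/blog_6ae183910101cdq8.html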
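Looking through these 20 papers, most of them feel very familiar, and many are indeed important work in vision since 2000, so I wanted to share my own views and discuss them with everyone.
I grouped the papers roughly by area and counted them: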
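Face detection and recognition: 3 directly related, 1 indirectly related. Directly related: 7 (AdaBoost-based face detection), 19 (survey of face detection), 12 (description of FERET, the classic face recognition evaluation set). Indirectly related: 20 (AAM). AAM is the classic paper on face alignment, but it was not restricted to faces when it was proposed, so I count it as indirectly related;
Local descriptors: 3 directly related: 1 (SIFT), 15 (survey-style evaluation of local descriptors), 17 (LBP);
Stereo vision: 3 papers: 9 (camera calibration method), 10 (survey of image registration), 14 (I am not familiar with this paper, so I am not sure this grouping is appropriate);
Image segmentation: 2 directly related, 1 indirectly related. Directly related: 2 (normalized cut), 13 (an application of graph cuts). Indirectly related: 4 (mean shift), which can be applied to image segmentation as well as to object tracking;
Object detection and tracking (for simplicity I group detection and tracking together; the two kinds of methods differ a lot in approach and would more properly be separated, but functionally they are quite similar): 2 directly related, 1 indirectly related. 3 (a contour-based object detection method) and 16 (a kernel-based tracking method) are one detection paper and one tracking paper. Mean-shift-based tracking was the most classic method before particle filters, so 4 counts as indirectly related;
Image retrieval and matching: 2 papers: 6 (survey), 11 (shape context);
In addition, one survey of statistical pattern recognition (8);
One image quality assessment method (5);
Activity recognition: 18 (a systems paper on activity recognition that uses tracking, camera calibration, activity recognition, and so on);
(Mean shift is counted twice, so the total comes to 21.)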
Going through the areas from most papers to fewest, here are my views:
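Face recognition appears most often (which may have to do with my own focus on that area). This surprised me somewhat, but it shows that face recognition has received a great deal of attention since 2000 and has been a hot topic in vision. Among these papers, Viola's AdaBoost + Haar face detection method is a classic among classics: its ideas have been widely applied to object detection, it inspired a great deal of work on feature selection, and it helped AdaBoost become, alongside SVM, one of the two major machine learning tools. I have always regarded this paper and Lowe's SIFT as the classic engineering works of the vision field. AAM is also a very important model, especially for facial landmark localization, where AAM-based improvements greatly advanced localization accuracy. This combination of a global shape constraint with a local appearance model also has many counterparts in other areas such as object detection.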
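Local descriptors are arguably the biggest highlight and trend in visual representation over the past decade. Pattern recognition has two parts: feature extraction, which provides a more discriminative representation, and machine learning, which learns a classification model from the data once it has been represented by features. Local descriptors have become the consensus choice for feature representation: whether for object matching and retrieval or for object detection and recognition, representing images with several kinds of local descriptors is now the default. Among them SIFT is undoubtedly one of the most influential, with many applications in object matching, retrieval, detection, and recognition. LBP, as a local descriptor, is the most important texture feature since Gabor (HOG, a variant of SIFT, is of course another). It has many variants and is also widely used in face recognition, object detection, and object classification.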
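To make the LBP idea concrete, here is a minimal sketch of the basic 3x3 operator: each pixel is compared with its 8 neighbours and the comparison bits are packed into one byte. This is only the simplest form, not the multiresolution, rotation-invariant operator from paper 17; the neighbour ordering and the function names `lbp_code` and `lbp_histogram` are illustrative choices.

```python
def lbp_code(image, row, col):
    """Basic 8-neighbour LBP code of the interior pixel at (row, col)."""
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left pixel
    # (the bit order is an arbitrary convention chosen for this sketch).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
# Adding a constant to every pixel leaves all comparisons unchanged.
brighter = [[value + 10 for value in row] for row in patch]
print(lbp_code(patch, 1, 1), lbp_code(brighter, 1, 1))
```

Because only the sign of each neighbour-minus-centre difference is kept, the code does not change under a uniform brightness shift. The histogram of codes over a region is what would typically serve as the texture descriptor.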
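I am not familiar with stereo vision, so I will not embarrass myself. Can anyone explain the background behind the high citation counts of 9 and 14?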
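Image segmentation can be treated as a grouping problem: define an optimization objective, then optimize it to obtain the best model parameters. Graph cut treats each pixel as a node in a graph, turning image segmentation into a graph partitioning problem, and finds an optimal solution via min-cut/max-flow. 2 (normalized cut) is an improvement that overcomes a drawback of graph cut. Its second-highest citation count suggests that the method applies to many areas, including image segmentation and clustering.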
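One commonly cited drawback of a plain minimum cut is that it tends to split off small, weakly connected sets. The sketch below compares the two criteria on a toy 7-node graph, using the normalized cut definition Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V). The graph, its weights, and the helper names are made up for illustration, and the code only evaluates two candidate partitions; it does not perform the eigenvector-based optimization used in practice.

```python
def build_weights(n, edges):
    """Symmetric weight matrix from (i, j, weight) triples."""
    weights = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        weights[i][j] = w
        weights[j][i] = w
    return weights


def cut_value(weights, group_a, group_b):
    """Total weight of the edges running between the two groups."""
    return sum(weights[i][j] for i in group_a for j in group_b)


def assoc(weights, group):
    """Total weight from the nodes in `group` to all nodes in the graph."""
    return sum(sum(weights[i]) for i in group)


def normalized_cut(weights, group_a):
    """Ncut value of the partition (group_a, remaining nodes)."""
    members = set(group_a)
    group_b = [v for v in range(len(weights)) if v not in members]
    cut = cut_value(weights, group_a, group_b)
    return cut / assoc(weights, group_a) + cut / assoc(weights, group_b)


# Two tight triangles {0,1,2} and {3,4,5} joined by weak edges,
# plus node 6 hanging off node 5 (hypothetical toy data).
EDGES = [(0, 1, 4), (0, 2, 4), (1, 2, 4),
         (3, 4, 4), (3, 5, 4), (4, 5, 4),
         (2, 3, 2), (1, 4, 1), (5, 6, 2)]
W = build_weights(7, EDGES)

isolated = [6]        # splits off the single weakly attached node
balanced = [0, 1, 2]  # splits between the two triangles

cut_isolated = cut_value(W, isolated, [0, 1, 2, 3, 4, 5])
cut_balanced = cut_value(W, balanced, [3, 4, 5, 6])
print(cut_isolated, cut_balanced)
print(normalized_cut(W, isolated), normalized_cut(W, balanced))
```

On this graph the raw cut is smaller for the partition that isolates node 6, while the normalized cut is much smaller for the split between the two triangles, since dividing by assoc penalizes tiny groups.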
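Object detection and tracking is a very active area, and the high citation counts of 3 and 16 surprised me, because neither approach has been mainstream recently. Perhaps they were popular for a while back then. There are two classic approaches to object detection and recognition. One is sliding-window search, as used in Viola's face detection paper; the later classic paper on HOG + SVM pedestrian detection uses a similar framework. The other is based on a global image representation, the classic example being bag-of-words methods, which have important applications in image detection, image classification, and image retrieval. For tracking, the classic methods include mean-shift-based methods, particle-filter-based methods, and online-learning-based methods. Later work has increasingly combined detection and tracking in a single framework. The essential idea is to treat tracking as a classification problem that separates the foreground target from the background, with detection providing the offline model of the foreground target and tracking providing the online model. There are many papers in this area, and there is one survey that is quite well written.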
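As a rough illustration of the mean-shift idea mentioned above, the sketch below performs mode seeking on 1-D points with a flat (uniform) kernel: the estimate is repeatedly moved to the mean of the points inside its window until it stops moving. This is a simplification. Kernel-based tracking as in paper 16 applies the same kind of iteration to pixel weights derived from a target appearance model, which is not shown here. The data and the function name `mean_shift_mode` are illustrative.

```python
def mean_shift_mode(points, start, bandwidth, tol=1e-6, max_iter=100):
    """Climb from `start` to a local density mode using a flat kernel."""
    x = start
    for _ in range(max_iter):
        # Points that fall inside the current window.
        window = [p for p in points if abs(p - x) <= bandwidth]
        if not window:
            break  # no support nearby; keep the current estimate
        new_x = sum(window) / len(window)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Two clusters of 1-D samples (hypothetical data).
points = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.9, 5.1]

low_mode = mean_shift_mode(points, start=0.5, bandwidth=1.0)
high_mode = mean_shift_mode(points, start=4.0, bandwidth=1.0)
print(low_mode, high_mode)
```

Each starting point converges to the centre of the nearest cluster. In a tracker, the analogous step moves the search window from the target's previous position toward the nearby location that best matches the target model.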
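Image retrieval is currently a very hot direction in vision and attracts much attention in search engines, shopping, and other areas, but 6 and 11 fall far short of covering the classics of this area. 11 is a classic early method for object matching, but it is rarely used now. The most classic work here should be that based on bag of words. It borrows from text search: visual words turn an image into something like a text document, visual features are represented by visual word frequencies, and an inverted index then makes large-scale image retrieval possible.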
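The inverted-index step described above can be sketched as follows. The sketch assumes each image has already been reduced to a list of visual word IDs; in a real system these would typically come from quantizing local descriptors against a learned vocabulary, which is not shown. The scoring by histogram intersection, the sample data, and the function names are simplifications chosen for illustration.

```python
from collections import Counter, defaultdict


def build_inverted_index(images):
    """Map each visual word to {image_id: count of that word in the image}."""
    index = defaultdict(dict)
    for image_id, words in images.items():
        for word, count in Counter(words).items():
            index[word][image_id] = count
    return index


def search(index, query_words):
    """Rank images by histogram intersection with the query's word counts."""
    scores = Counter()
    for word, query_count in Counter(query_words).items():
        # Only images that contain this word are touched at all.
        for image_id, count in index.get(word, {}).items():
            scores[image_id] += min(query_count, count)
    return scores.most_common()


# Hypothetical database: image id -> visual word ids found in the image.
images = {
    "a": [1, 1, 2, 7],
    "b": [3, 4, 4, 9],
    "c": [1, 2, 2, 5],
}
index = build_inverted_index(images)
results = search(index, [1, 2, 2])
print(results)
```

The point of the inverted index is that a query only visits the posting lists of its own visual words, so images sharing no words with the query (image "b" here) are never scored. That is what keeps retrieval feasible as the database grows.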
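As for the other three papers, 8 is a survey, so there is no question there. The high citation count of 5 puzzles me a little: image quality assessment matters a great deal in practice, but it is not a very active area, so such a high citation count is somewhat unexpected. 18 is relatively early work. Proposing a framework based on tracking, camera calibration, activity recognition, and event detection at that time was remarkably forward-looking. Its citations may be due to the growth of work in this direction after intelligent video surveillance applications took off in recent years.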
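A few suggestions:
1. I suggest excluding surveys. Five of the 20 papers are surveys. Surveys are important and five is not that many, but surveys do have a big advantage in citation counts. Ranking only the papers that propose new methods would give everyone a better picture of which new methods have had the widest influence.
2. I do not know whether this ranking method is flawed or whether the data are entirely correct, because judging from the current result I do not think these 20 papers can yet be called the 20 most influential.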
If any of the above is wrong, please point it out.
#3 [original poster] 2012-12-05 20:31 Shicai Yang
Comment from Prof. Shan Shiguang:
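Yafeng's commentary is excellent, and I recommend everyone read it. One addition regarding [5]: strictly speaking the author, Zhou Wang, is not from the CV/ML community, but the paper has indeed been hugely influential in video coding. Also, regarding LBP: although it is very different from SIFT, it directly inspired many later descriptors such as BRIEF, and it has become a baseline algorithm for face recognition and texture classification, hence the many citations!
#4 [original poster] 2012-12-05 20:55 Shicai Yang
An additional 15 nominated papers, with ISI citation counts between 580 and 900, ranked after the list above, for reference.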
[1] K. Mikolajczyk, and C. Schmid, “Scale & affine invariant interest point detectors,” International Journal of Computer Vision, vol. 60, no. 1, pp. 63-86, Oct, 2004. (Cited=879)
[2] I. Haritaoglu, D. Harwood, and L. S. Davis, “W4: Real-time surveillance of people and their activities,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 809-830, Aug, 2000. (Cited=877)
[3] A. S. Georghiades, P. N. Belhumeur, and D. J. Kriegman, “From few to many: Illumination cone models for face recognition under variable lighting and pose,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 643-660, Jun, 2001. (Cited=845)
[4] Y. Boykov, and V. Kolmogorov, “An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 9, pp. 1124-1137, Sep, 2004. (Cited=805)
[5] X. F. He, S. C. Yan, Y. X. Hu, P. Niyogi, and H. J. Zhang, “Face recognition using Laplacianfaces,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 3, pp. 328-340, Mar, 2005. (Cited=724)
[6] M. N. Do, and M. Vetterli, “The contourlet transform: An efficient directional multiresolution image representation,” IEEE Transactions on Image Processing, vol. 14, no. 12, pp. 2091-2106, Dec, 2005. (Cited=690)
[7] L. A. Vese, and T. F. Chan, “A multiphase level set framework for image segmentation using the Mumford and Shah model,” International Journal of Computer Vision, vol. 50, no. 3, pp. 271-293, Dec, 2002. (Cited=669)
[8] A. M. Martinez, and A. C. Kak, “PCA versus LDA,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 228-233, Feb, 2001. (Cited=658)
[9] J. Yang, D. Zhang, A. F. Frangi, and J. Y. Yang, “Two-dimensional PCA: A new approach to appearance-based face representation and recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 1, pp. 131-137, Jan, 2004. (Cited=625)
[10] V. Kolmogorov, and R. Zabih, “What energy functions can be minimized via graph cuts?,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 2, pp. 147-159, Feb, 2004. (Cited=625)
[11] S. G. Chang, B. Yu, and M. Vetterli, “Adaptive wavelet thresholding for image denoising and compression,” IEEE Transactions on Image Processing, vol. 9, no. 9, pp. 1532-1546, Sep, 2000. (Cited=609)
[12] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, and L. van Gool, “A comparison of affine region detectors,” International Journal of Computer Vision, vol. 65, no. 1-2, pp. 43-72, Nov, 2005. (Cited=607)
[13] J. Z. Wang, J. Li, and G. Wiederhold, “SIMPLIcity: Semantics-sensitive integrated matching for picture libraries,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 947-963, Sep, 2001. (Cited=599)
[14] H. C. Peng, F. H. Long, and C. Ding, “Feature selection based on mutual information: Criteria of max-dependency, max-relevance, and min-redundancy,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1226-1238, Aug, 2005. (Cited=597)
[15] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, “Speeded-Up Robust Features (SURF),” Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346-359, Jun, 2008. (Cited=584)