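This chapter covers computer vision, focusing on low-level feature extraction, video analysis, tracking, and object detection and recognition. For areas I am less familiar with, such as camera calibration and stereo vision, I only list the papers with the most citations on Google. I have also included a few recently published papers that I personally like very much.
Download link for this chapter: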
http://iask.sina.com.cn/u/2252291285/ish?folderid=868772
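Active appearance models and active contour models take their basic idea from Snake and have been applied very successfully to 3D face modeling. Listed here are the three earliest and most classic papers; anyone interested in this area can start with them.
[1998 ECCV] Active Appearance Models
[2001 PAMI] Active Appearance Models
[1995 CVIU] Active Shape Models-Their Training and Application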
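Background modeling has long been a key technique in video analysis, especially in object detection. New techniques keep appearing, some with very good demos, such as the methods based on dynamic texture. The most classic approach, however, is still the GMM method proposed by Stauffer et al. in 1999 and 2000. Their biggest contribution was to fit the Gaussians with an iterative update instead of EM, so there is no need to keep many frames of data, which saves buffer memory. Zivkovic, in ICPR 2004 and PAMI 2004, proposed a way to determine the number of Gaussians dynamically, pushing the Gaussian mixture model to its limit. This method works well and is easy to implement, and OpenCV provides a ready-made function for it. Within the background modeling family, the non-parametric method (2000 ECCV) and the ViBe method also deserve attention.
[1997 PAMI] Pfinder: Real-Time Tracking of the Human Body
[1999 CVPR] Adaptive background mixture models for real-time tracking
[1999 ICCV] Wallflower: Principles and Practice of Background Maintenance
[2000 ECCV] Non-parametric Model for Background Subtraction
[2000 PAMI] Learning Patterns of Activity Using Real-Time Tracking
[2002 PIEEE] Background and foreground modeling using nonparametric kernel density estimation for visual surveillance
[2004 ICPR] Improved adaptive Gaussian mixture model for background subtraction
[2004 PAMI] Recursive unsupervised learning of finite mixture models
[2006 PRL] Efficient adaptive density estimation per image pixel for the task of background subtraction
[2011 TIP] ViBe: A Universal Background Subtraction Algorithm for Video Sequences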
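To make the "iterative update instead of EM" point concrete, here is a minimal sketch of a recursive per-pixel background model. It is deliberately simplified to a single Gaussian per pixel rather than a full mixture, and the function name `update_pixel_model` and the parameters `alpha`, `k`, and `min_var` are assumptions chosen for illustration, not values taken from the papers above.

```python
def update_pixel_model(mean, var, value, alpha=0.05, k=2.5, min_var=4.0):
    """Update one pixel's running Gaussian and classify the new value.

    Simplified single-Gaussian sketch (not the full mixture model).
    Returns (new_mean, new_var, is_foreground).
    """
    diff = value - mean
    # A value farther than k standard deviations from the mean is foreground.
    is_foreground = diff * diff > (k * k) * var
    if not is_foreground:
        # Recursive update: only the current frame is needed, no frame buffer.
        mean = mean + alpha * diff
        var = max(min_var, (1.0 - alpha) * var + alpha * diff * diff)
    return mean, var, is_foreground


# Usage: a pixel that stays near 100, then is briefly covered by a bright object.
mean, var = 100.0, 25.0
flags = []
for value in [101, 99, 102, 100, 180, 100]:
    mean, var, fg = update_pixel_model(mean, var, value)
    flags.append(fg)
print(flags)
```

Because the model is updated from the current value alone, memory use per pixel is constant, which is the buffer saving described above.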
Bag of words. I have not studied this area much yet. Here are three highly cited papers, to be dissected step by step later.
[2003 ICCV] Video Google: A Text Retrieval Approach to Object Matching in Videos
[2004 ECCV] Visual Categorization with Bags of Keypoints
[2006 CVPR] Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories
BRIEF stands for Binary Robust Independent Elementary Features, a feature description method that has attracted considerable attention in recent years. ORB is also based on BRIEF.
[2010 ECCV] BRIEF: Binary Robust Independent Elementary Features
[2011 ICCV] ORB: an efficient alternative to SIFT or SURF
[2012 PAMI] BRIEF: Computing a Local Binary Descriptor Very Fast
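The core idea of a BRIEF-style descriptor can be sketched in a few lines: compare the intensities of randomly chosen pixel pairs inside a patch, pack the results into a bit string, and match descriptors by Hamming distance. This sketch assumes a uniform sampling pattern and skips the patch smoothing used in the paper; the helper names are hypothetical.

```python
import random


def make_test_pairs(n_bits, patch_size, seed=0):
    """Draw a fixed set of pixel pairs (y1, x1, y2, x2) inside the patch."""
    rng = random.Random(seed)
    return [
        (rng.randrange(patch_size), rng.randrange(patch_size),
         rng.randrange(patch_size), rng.randrange(patch_size))
        for _ in range(n_bits)
    ]


def brief_descriptor(patch, pairs):
    """One bit per pair: set when the first pixel is darker than the second."""
    bits = 0
    for i, (y1, x1, y2, x2) in enumerate(pairs):
        if patch[y1][x1] < patch[y2][x2]:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits between two descriptors."""
    return bin(a ^ b).count("1")


# Usage: a patch and a uniformly brighter copy give the same descriptor,
# because every pairwise comparison is unchanged by a constant offset.
pairs = make_test_pairs(n_bits=32, patch_size=8)
patch = [[(3 * y + 5 * x) % 17 for x in range(8)] for y in range(8)]
brighter = [[v + 10 for v in row] for row in patch]
desc_a = brief_descriptor(patch, pairs)
desc_b = brief_descriptor(brighter, pairs)
print(hamming(desc_a, desc_b))
```

Matching with XOR and a bit count is what makes binary descriptors cheap to compare relative to floating-point descriptors.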
A field I am very unfamiliar with. I only list a dozen or so important papers for future study.
[1979 Marr] A Computational Theory of Human Stereo Vision
[1985] Computational vision and regularization theory
[1987 IEEE] A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses
[1987] Probabilistic Solution of Ill-Posed Problems in Computational Vision
[1988 PIEEE] Ill-Posed Problems in Early Vision
[1989 IJCV] Kalman Filter-based Algorithms for Estimating Depth from Image Sequences
[1990 IJCV] Relative Orientation
[1990 IJCV] Using vanishing points for camera calibration
[1992 ECCV] Camera self-calibration: Theory and experiments
[1992 IJCV] A theory of self-calibration of a moving camera
[1992 PAMI] Camera calibration with distortion models and accuracy evaluation
[1994 IJCV] The Fundamental Matrix: Theory, Algorithms, and Stability Analysis
[1994 PAMI] A stereo matching algorithm with an adaptive window: theory and experiment
[1999 ICCV] Flexible camera calibration by viewing a plane from unknown orientations
[1999 IWAR] Marker tracking and HMD calibration for a video-based augmented reality conferencing system
[2000 PAMI] A flexible new technique for camera calibration
This material comes mainly from image retrieval. Early image retrieval relied largely on global features, the most prominent being color features. This part could be grouped with the earlier material on Color.
[1995 SPIE] Similarity of color images
[1996 PR] IMAGE RETRIEVAL USING COLOR AND SHAPE
[1996] Comparing images using color coherence vectors
[1997] Image Indexing Using Color Correlograms
[2001 TIP] An Efficient Color Representation for Image Retrieval
[2009 CVIU] Performance evaluation of local colour invariants
The hugely popular DPM. OpenCV has a dedicated topic covering DPM and latent SVM.
[2008 CVPR] A Discriminatively Trained, Multiscale, Deformable Part Model
[2010 CVPR] Cascade Object Detection with Deformable Part Models
[2010 PAMI] Object Detection with Discriminatively Trained Part-Based Models
Distance transform, which is also implemented in OpenCV. It is very convenient for finding seed points in binary images.
[1986 CVGIP] Distance Transformations in Digital Images
[2008 ACM] 2D Euclidean Distance Transform Algorithms: A Comparative Survey
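As an illustration of how a distance transform yields seed points, the sketch below uses the classic two-pass scan with the city-block metric (a simpler choice than the Euclidean algorithms surveyed above). The function name is hypothetical, and the input is assumed to contain at least one zero pixel.

```python
def distance_transform_cityblock(binary):
    """City-block distance from every pixel to the nearest 0 pixel.

    `binary` is a list of rows of 0/1 values with at least one 0.
    """
    h, w = len(binary), len(binary[0])
    inf = h + w  # larger than any possible city-block distance in the image
    dist = [[0 if binary[y][x] == 0 else inf for x in range(w)]
            for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbors.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom and right neighbors.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


# Usage: the pixel deepest inside the blob is a natural seed point.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
dist = distance_transform_cityblock(image)
seed = max(((y, x) for y in range(5) for x in range(5)),
           key=lambda p: dist[p[0]][p[1]])
print(seed, dist[seed[0]][seed[1]])
```

The maximum of the distance map marks the point farthest from the background, which is why it serves well as a seed for region growing or watershed-style segmentation.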
The most mature and best-known approach is Haar+Adaboost.
[1998 PAMI] Neural Network-Based Face Detection
[2002 PAMI] Detecting faces in images: a survey
[2002 PAMI] Face Detection in Color Images
[2004 IJCV] Robust Real-Time Face Detection
I am not familiar with this area, so this is just a simple list.
[1991] Face Recognition Using Eigenfaces
[2000 PAMI] Automatic Analysis of Facial Expressions: The State of the Art
[2000] Face Recognition: A Literature Survey
[2006 PR] Face recognition from a single image per person: A survey
[2009 PAMI] Robust Face Recognition via Sparse Representation
Corner detection using machine learning, claimed to be both fast and accurate.
[2006 ECCV] Machine learning for high-speed corner detection
[2010 PAMI] Faster and Better: A Machine Learning Approach to Corner Detection
The features here are mostly invariant features of various kinds; SIFT, Harris, MSER, and others also belong to this class. They are listed separately because they are somewhat more popular. On invariant features, the book 《图像局部不变性特征与描述》 (local invariant image features and their description) by Wang Yongming and Wang Guijin is fairly good. Mikolajczyk's 2005 PAMI paper and the 2007 survey are good study material.
[1989 PAMI] On the detection of dominant points on digital curves
[1997 IJCV] SUSAN - A New Approach to Low Level Image Processing
[2004 IJCV] Matching Widely Separated Views Based on Affine Invariant Regions
[2004 IJCV] Scale & Affine Invariant Interest Point Detectors
[2005 PAMI] A performance evaluation of local descriptors
[2006 IJCV] A Comparison of Affine Region Detectors
[2007 FAT] Local Invariant Feature Detectors - A Survey
[2011 IJCV] Evaluation of Interest Point Detectors and Feature Descriptors
[2012 PAMI] LDAHash: Improved Matching with Smaller Descriptors
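Although many years have passed, Harris corner detection is still widely used, and many variants are built on it. If you study the method closely, you can sense intuitively that it is very robust.
[1988 Harris] A combined corner and edge detector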
HoG is also implemented in OpenCV: HOGDescriptor.
[2005 CVPR] Histograms of Oriented Gradients for Human Detection
NavneetDalalThesis.pdf
[1993 PAMI] Comparing Images Using the Hausdorff Distance
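Image stitching; another related term is Panoramic. The book Computer Vision: Algorithms and Applications devotes a whole chapter to this problem. Of the two papers here, one is a survey and the other is a classic in this area.
[2006 Fnd] Image Alignment and Stitching: A Tutorial
[2007 IJCV] Automatic Panoramic Image Stitching using Invariant Features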
The KLT tracking algorithm, based on the registration algorithm proposed by Lucas and Kanade. Besides three classic papers, the last entry gives the details of the OpenCV implementation of KLT.
[1981] An Iterative Image Registration Technique with an Application to Stereo Vision full version
[1994 CVPR] Good Features to Track
[2004 IJCV] Lucas-Kanade 20 Years On: A Unifying Framework
Pyramidal Implementation of the Lucas Kanade Feature Tracker OpenCV
LBP. OpenCV's Cascade classifier also supports LBP as a replacement for Haar features.
[2002 PAMI] Multiresolution gray-scale and rotation Invariant Texture Classification with Local Binary Patterns
[2004 ECCV] Face Recognition with Local Binary Patterns
[2006 PAMI] Face Description with Local Binary Patterns
[2011 TIP] Rotation-Invariant Image and Video Description With Local Binary Pattern Features
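For readers new to LBP, the basic 3x3 operator can be sketched as follows: each of the 8 neighbors is thresholded against the center pixel and the results are packed into one byte. The neighbor ordering (clockwise from the top-left) and the function name are assumptions for illustration; implementations differ in the ordering they use.

```python
def lbp_code(image, y, x):
    """Basic 8-neighbor LBP code of the pixel at (y, x).

    A neighbor contributes a 1 bit when it is >= the center value.
    """
    center = image[y][x]
    # Neighbors visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if image[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


# Usage: the code depends only on the ordering of gray values, so adding a
# constant to every pixel leaves it unchanged.
image = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
shifted = [[v + 50 for v in row] for row in image]
print(lbp_code(image, 1, 1), lbp_code(shifted, 1, 1))
```

A texture or face descriptor is then typically built from histograms of these codes over image regions.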
Two very good papers on low-level vision.
[1998 TIP] A general framework for low level vision
[2000 IJCV] Learning Low-Level Vision
Mean shift, a very popular method in tracking. Comaniciu made important contributions in this area. Of the last three entries, one is a top-download paper from CVIU, one is the latest PAMI paper on mean shift, and one is the paper behind the OpenCV implementation.
[1995 PAMI] Mean shift, mode seeking, and clustering
[2002 PAMI] Mean shift: a robust approach toward feature space analysis
[2003 CVPR] Mean-shift blob tracking through scale space
[2009 CVIU] Object tracking using SIFT features and mean shift
[2012 PAMI] Mean Shift Trackers with Cross-Bin Metrics
OpenCV Computer Vision Face Tracking For Use in a Perceptual User Interface
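The mode-seeking step at the heart of mean shift can be sketched in one dimension: repeatedly move the estimate to the mean of the samples inside a window until it stops moving. This sketch assumes a flat (uniform) kernel and a hypothetical function name; trackers apply the same iteration to weighted pixel positions rather than raw samples.

```python
def mean_shift_mode(points, start, bandwidth, tol=1e-6, max_iter=100):
    """Climb from `start` to a nearby density mode of 1-D `points`."""
    x = start
    for _ in range(max_iter):
        # Samples that fall inside the current window.
        window = [p for p in points if abs(p - x) <= bandwidth]
        if not window:
            break  # nothing nearby, so the estimate cannot move
        new_x = sum(window) / len(window)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


# Usage: two clusters, around 1.0 and 5.0; the result depends on the start.
points = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8]
mode_low = mean_shift_mode(points, start=2.0, bandwidth=1.5)
mode_high = mean_shift_mode(points, start=4.0, bandwidth=1.5)
print(mode_low, mode_high)
```

The bandwidth controls how far the window can see, so a value that is too large merges the two clusters into one mode.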
The MSER paper was published at BMVC 2002 and later accepted directly into IVC in 2004 with much the same content. MSER is also mentioned in Sonka's book.
[2002 BMVC] Robust Wide Baseline Stereo from Maximally Stable Extremal Regions
[2003] MSER Author Presentation
[2004 IVC] Robust wide-baseline stereo from maximally stable extremal regions
[2011 PAMI] Are MSER Features Really Interesting?
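First, a word about the author of the first paper, Kah-Kay Sung. He received his PhD from MIT and later taught at the National University of Singapore, where he was a teacher of great promise. Sadly, he and his wife both died in the Singapore Airlines crash of 2000.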
http://en.wikipedia.org/wiki/Singapore_Airlines_Flight_006
The last paper is also from Fua's group, and the demo provided by the authors is quite impressive.
[1998 PAMI] Example-based learning for view-based human face detection
[2000 CVPR] A Statistical Method for 3D Object Detection Applied to Faces and Cars
[2003 IJCV] Learning the Statistics of People in Images and Video
[2011 PAMI] Learning to Detect a Salient Object
[2012 PAMI] A Real-Time Deformable Detector
Tracking is another classic problem in computer vision. Particle filtering, Kalman filtering, KLT, mean shift, and optical flow are all related to it. Listed here is tracking in the traditional sense; the 2008 survey and the 2003 paper on kernel-based tracking are especially worth reading.
[2003 PAMI] Kernel-based object tracking
[2007 PAMI] Tracking People by Learning Their Appearance
[2008 ACM] Object Tracking: A Survey
[2008 PAMI] Segmentation and Tracking of Multiple Humans in Crowded Environments
[2011 PAMI] Hough Forests for Object Detection, Tracking, and Action Recognition
[2011 PAMI] Robust Object Tracking with Online Multiple Instance Learning
[2012 IJCV] PWP3D: Real-Time Segmentation and Tracking of 3D Objects
A very mature field that has already been well commercialized.
[1992 IEEE] Historical review of OCR research and development
Video OCR: A Survey and Practitioner's Guide
Optical flow, an algorithm that anyone doing video analysis must master.
[1981 AI] Determining Optical Flow
[1994 IJCV] Performance of optical flow techniques
[1995 ACM] The Computation of Optical Flow
[2004 TR] Tutorial: Computing 2D and 3D Optical Flow
[2005 BOOK] Optical Flow Estimation
[2008 ECCV] Learning Optical Flow
[2011 IJCV] A Database and Evaluation Methodology for Optical Flow
Particle filtering. Listed here are mainly surveys, plus the classic 1998 IJCV paper from the early development of particle filtering.
[1998 IJCV] CONDENSATION - Conditional Density Propagation for Visual Tracking
[2002 TSP] A tutorial on particle filters for online nonlinear non-Gaussian Bayesian tracking
[2002 TSP] Particle filters for positioning, navigation, and tracking
[2003 SPM] particle filter
Again mostly surveys, on motion detection and action recognition for pedestrians and the human body.
[1999 CVIU] Visual analysis of human movement: A survey
[2001 CVIU] A Survey of Computer Vision-Based Human Motion Capture
[2005 TIP] Image change detection algorithms: a systematic survey
[2006 CVIU] A survey of advances in vision based human motion capture
[2007 CVIU] Vision-based human motion analysis: An overview
[2007 IJCV] Pedestrian Detection via Periodic Motion Analysis
[2007 PR] A survey of skin-color modeling and detection methods
[2010 IVC] A survey on vision-based human action recognition
[2012 PAMI] Pedestrian Detection: An Evaluation of the State of the Art
As cameras become ever more point-and-shoot, automatic scene recognition becomes very important. This is where vendors compete on whose Auto mode works better.
[2001 IJCV] Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope
[2007 PAMI] A Thousand Words in a Scene
[2010 PAMI] Visual Word Ambiguity
[2010 PAMI] Evaluating Color Descriptors for Object and Scene Recognition
[2011 PAMI] CENTRIST: A Visual Descriptor for Scene Categorization
[2003 PAMI] Detecting moving shadows: algorithms and evaluation
On shape there are two main aspects: shape representation and shape recognition. Shape representation mainly extracts invariant features from edges or regions for use in retrieval or recognition. Sonka's book covers this fairly systematically, and the 2008 survey also explains it well. As for shape recognition, the standout is Shape Context, proposed by J. Malik et al.
[1993 PR] IMPROVED MOMENT INVARIANTS FOR SHAPE DISCRIMINATION
[1993 PR] Pattern Recognition by Affine Moment Invariants
[1996 PR] IMAGE RETRIEVAL USING COLOR AND SHAPE
[2001 SMI] Shape matching: similarity measures and algorithms
[2002 PAMI] Shape matching and object recognition using shape contexts
[2004 PR] Review of shape representation and description techniques
[2006 PAMI] Integral Invariants for Shape Matching
[2008] A Survey of Shape Feature Extraction Techniques
SIFT hardly needs an introduction; more than ten thousand citations speak for themselves. SURF and PCA-SIFT also belong to this family. A few SIFT-related papers are listed at the end.
[1999 ICCV] Object recognition from local scale-invariant features
[2000 IJCV] Evaluation of Interest Point Detectors
[2004 CVPR] PCA-SIFT: A More Distinctive Representation for Local Image Descriptors
[2004 IJCV] Distinctive Image Features from Scale-Invariant Keypoints
[2008 CVIU] Speeded-Up Robust Features (SURF)
[2010 IJCV] Improving Bag-of-Features for Large Scale Image Search
[2011 PAMI] SIFT Flow: Dense Correspondence across Scenes and its Applications
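Simultaneous Localization and Mapping (SLAM).
The SLAM problem can be described as follows: a robot starts moving from an unknown position in an unknown environment. While moving, it localizes itself from its position estimates and the map, and at the same time builds an incremental map based on that localization, thereby achieving autonomous localization and navigation.
[2002 PAMI] Simultaneous Localization and Map-Building Using Active Vision
[2007 PAMI] MonoSLAM: Real-Time Single Camera SLAM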
Texture features are another important feature set for object recognition and retrieval.
[1973] Textural features for image classification
[1979] Statistical and structural approaches to texture
[1996 PAMI] Texture features for browsing and retrieval of image data
[2002 PR] Brief review of invariant texture analysis methods
[2012 TIP] Color Local Texture Features for Color Face Recognition
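Kalal created TLD, in which tracking, learning, and detection run simultaneously to achieve robust tracking. His two advisors are also well known: one is Matas, the inventor of MSER, and the other is Mikolajczyk. He also founded a company, TLD Vision s.r.o. Listed here is his series of papers; the last one is the recently published PAMI paper.
[2009] Online learning of robust object detectors during unstable tracking
[2010 CVPR] P-N Learning: Bootstrapping Binary Classifiers by Structural Constraints
[2010 ICIP] FACE-TLD: TRACKING-LEARNING-DETECTION APPLIED TO FACES
[2012 PAMI] Tracking-Learning-Detection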
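The first two are well-known video surveillance systems and contain a wealth of information; for example, the background modeling algorithm in the CMU system is quite simple and effective. The last one is a relatively recent survey.
[2000 CMU TR] A System for Video Surveillance and Monitoring
[2000 PAMI] W4: real-time surveillance of people and their activities
[2008 MVA] The evolution of video surveillance: an overview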
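Haar features and AdaBoost, two individually weak ingredients, join forces to form a very powerful tool. OpenCV has an implementation, and LBP can be used in place of Haar features.
[2001 CVPR] Rapid object detection using a boosted cascade of simple features
[2004 IJCV] Robust Real-time Face Detection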
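Much of the speed of Haar-feature detectors comes from the integral image, which makes the sum over any rectangle cost four lookups. The sketch below shows that idea together with one two-rectangle feature; the function names and the left/right feature layout are illustrative assumptions, and the boosting and cascade stages are not shown.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column at the top-left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Right-half sum minus left-half sum: responds to vertical edges."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return right_sum - left_sum


# Usage: a 4x4 image that is dark on the left and bright on the right.
img = [[1, 1, 9, 9] for _ in range(4)]
ii = integral_image(img)
total = rect_sum(ii, 0, 0, 4, 4)
feature = haar_two_rect_horizontal(ii, 0, 0, 4, 4)
print(total, feature)
```

Because every rectangle sum has constant cost regardless of its size, thousands of such features can be evaluated per detection window.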