Original: http://blog.csdn.net/lansatiankongxxc/article/details/12978207
1. Sogou Labs dataset:
http://www.sogou.com/labs/dl/p.html
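The Internet image library is drawn from part of the data indexed by Sogou image search. It covers categories such as people, animals, architecture, machinery, landscapes, and sports, with 2,836,535 images in total. For each image, the dataset provides the original image, a thumbnail, the web page it appeared on, and the related text from that page. The total size is over 200 GB.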
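The per-image contents described above can be pictured as a simple record. This is a minimal sketch only: the class name, field names, and example values are assumptions made for illustration and do not reflect the actual file layout of the Sogou download.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SogouImageRecord:
    """One entry of the image library (hypothetical field names)."""
    original_path: str    # full-size image
    thumbnail_path: str   # thumbnail of the same image
    page_url: str         # web page the image appeared on
    page_text: str        # related text taken from that page


# Example values are made up for illustration only.
record = SogouImageRecord(
    original_path="images/000001.jpg",
    thumbnail_path="thumbs/000001.jpg",
    page_url="http://example.com/page.html",
    page_text="caption text from the page",
)
print(record.page_url)
```

Keeping the page URL and page text next to the image paths is what makes such a record usable for text-to-image retrieval experiments as well as for plain image classification.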
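2.
http://www.imageclef.org/
ImageCLEF aims to provide benchmarks for image-related tasks (retrieval, classification, annotation, and so on) as part of the Cross Language Evaluation Forum (CLEF). It has run as an annual competition since 2003.
3.
http://staff.science.uva.nl/~xirong/index.PHP?n=Main.Dataset
Datasets maintained by Xirong Li (PhD, Intelligent Systems Lab Amsterdam), who does research on video and image retrieval.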
- Flickr-3.5M: A collection of 3.5 million social-tagged images.
- Social20: A ground-truth set for tag-based social image retrieval.
- Biconcepts2012test: A ground-truth set for retrieving bi-concepts (concept pairs) in unlabeled images.
- neg4free: A set of negative examples automatically harvested from social-tagged images for 20 PASCAL VOC concepts.
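4.
Images (and their features) from Wikipedia featured articles, together with the corresponding wiki text. See the paper "A New Approach to Cross-Modal Multimedia Retrieval". There is also a follow-up paper, "On the Role of Correlation and Abstraction in Cross-Modal Multimedia Retrieval", though it has no download link yet.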
http://www.svcl.ucsd.edu/projects/crossmodal/
5.
http://lms.comp.nus.edu.sg/research/NUS-WIDE.htm
To our knowledge, this is the largest real-world web image dataset comprising over 269,000 images with over 5,000 user-provided tags, and ground-truth of 81 concepts for the entire dataset. The dataset is much larger than the popularly available Corel and Caltech 101 datasets. Though some datasets comprise over 3 million images, they only have ground-truth for a small fraction of images. Our proposed NUS-WIDE dataset has the ground-truth for the entire dataset.
6.
http://www.cs.washington.edu/research/imagedatabase/
7.
http://lear.inrialpes.fr/~jegou/data.php
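Jegou's datasets. Jegou works specifically on CBIR (content-based image retrieval), so the images come with retrieval ground truth but no annotations.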
8.
http://www.robots.ox.ac.uk/~vgg/data/oxbuildings/
VGG's Oxford Buildings dataset, also built specifically for CBIR.
9.
http://acmmm13.org/submissions/call-for-multimedia-grand-challenge-solutions/msr-bing-grand-challenge-on-image-retrieval-scientific-track/
The dataset for the Microsoft Image Grand Challenge on Image Retrieval
In addition, here are the datasets compiled on CVPapers:
http://www.cvpapers.com/index.html
Detection
-
PASCAL VOC 2009 dataset
-
Classification/Detection Competitions, Segmentation Competition, Person Layout Taster Competition datasets
-
LabelMe dataset
-
LabelMe is a web-based image annotation tool that allows researchers to label images and share the annotations with the rest of the community. If you use the database, we only ask that you contribute to it, from time to time, by using the labeling tool.
-
BioID Face Detection Database
-
1521 images with human faces, recorded under natural conditions, i.e. varying illumination and complex background. The eye positions have been set manually.
-
CMU/VASC & PIE Face dataset
-
Yale Face dataset
-
Caltech
-
Cars, Motorcycles, Airplanes, Faces, Leaves, Backgrounds
-
Caltech 101
-
Pictures of objects belonging to 101 categories
-
Caltech 256
-
Pictures of objects belonging to 256 categories
-
Daimler Pedestrian Detection Benchmark
-
15,560 pedestrian and non-pedestrian samples (image cut-outs) and 6,744 additional full images not containing pedestrians for bootstrapping. The test set contains more than 21,790 images with 56,492 pedestrian labels (fully visible or partially occluded), captured from a vehicle in urban traffic.
-
MIT Pedestrian dataset
-
CVC Pedestrian Datasets
-
CVC Pedestrian Datasets
-
CBCL Pedestrian Database
-
MIT Face dataset
-
CBCL Face Database
-
MIT Car dataset
-
CBCL Car Database
-
MIT Street dataset
-
CBCL Street Database
-
INRIA Person Data Set
-
A large set of marked up images of standing or walking people
-
INRIA car dataset
-
A set of car and non-car images taken in a parking lot nearby INRIA
-
INRIA horse dataset
-
A set of horse and non-horse images
-
H3D Dataset
-
3D skeletons and segmented regions for 1000 people in images
-
HRI RoadTraffic dataset
-
A large-scale vehicle detection dataset
-
BelgaLogos
-
10000 images of natural scenes, with 37 different logos and 2695 logo instances, each annotated with a bounding box.
-
FlickrBelgaLogos
-
10000 images of natural scenes grabbed from Flickr, with 2695 logo instances cut and pasted from the BelgaLogos dataset.
-
FlickrLogos-32
-
The dataset FlickrLogos-32 contains photos depicting logos and is meant for the evaluation of multi-class logo detection/recognition as well as logo retrieval methods on real-world images. It consists of 8240 images downloaded from Flickr.
-
TME Motorway Dataset
-
30000+ frames with vehicle rear annotation and classification (cars and trucks) on motorway/highway sequences. Annotation semi-automatically generated using laser-scanner data. Distance estimation and consistent target ID over time available.
-
PHOS (Color Image Database for illumination invariant feature selection)
-
Phos is a color image database of 15 scenes captured under different illumination conditions. More particularly, every scene of the database contains 15 different images: 9 images captured under various strengths of uniform illumination, and 6 images under different degrees of non-uniform illumination. The images contain objects of different shape, color and texture and can be used for illumination invariant feature detection and selection.
-
CaliforniaND: An Annotated Dataset For Near-Duplicate Detection In Personal Photo Collections
-
California-ND contains 701 photos taken directly from a real user's personal photo collection, including many challenging non-identical near-duplicate cases, without the use of artificial image transformations. The dataset is annotated by 10 different subjects, including the photographer, regarding near duplicates.
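The PHOS entry above gives a per-scene breakdown from which the total image count follows. The short check below is only a sketch: it uses the figures quoted in that entry, and the variable names are made up for illustration.

```python
# Figures quoted in the PHOS entry above.
scenes = 15
uniform_images = 9       # various strengths of uniform illumination
non_uniform_images = 6   # different degrees of non-uniform illumination

# Each scene should contain 15 images according to the description.
images_per_scene = uniform_images + non_uniform_images

# Total size implied by the description (not a number stated by the source).
total_images = scenes * images_per_scene
print(images_per_scene, total_images)
```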
Classification
-
PASCAL VOC 2009 dataset
-
Classification/Detection Competitions, Segmentation Competition, Person Layout Taster Competition datasets
-
Caltech
-
Cars, Motorcycles, Airplanes, Faces, Leaves, Backgrounds
-
Caltech 101
-
Pictures of objects belonging to 101 categories
-
Caltech 256
-
Pictures of objects belonging to 256 categories
-
ETHZ Shape Classes
-
A dataset for testing object class detection algorithms. It contains 255 test images and features five diverse shape-based classes (apple logos, bottles, giraffes, mugs, and swans).
-
Flower classification data sets
-
17 Flower Category Dataset
-
Animals with attributes
-
A dataset for Attribute Based Classification. It consists of 30475 images of 50 animal classes with six pre-extracted feature representations for each image.
-
Stanford Dogs Dataset
-
Dataset of 20,580 images of 120 dog breeds with bounding-box annotation, for fine-grained image categorization.
Recognition
-
Face and Gesture Recognition Working Group FGnet
-
Face and Gesture Recognition Working Group FGnet
-
Feret
-
Face and Gesture Recognition Working Group FGnet
-
PUT face
-
9971 images of 100 people
-
Labeled Faces in the Wild
-
A database of face photographs designed for studying the problem of unconstrained face recognition
-
Urban scene recognition
-
Traffic Lights Recognition, Lara's public benchmarks.
-
PubFig: Public Figures Face Database
-
The PubFig database is a large, real-world face dataset consisting of 58,797 images of 200 people collected from the internet. Unlike most other existing face datasets, these images are taken in completely uncontrolled situations with non-cooperative subjects.
-
YouTube Faces
-
The data set contains 3,425 videos of 1,595 different people. The shortest clip duration is 48 frames, the longest clip is 6,070 frames, and the average length of a video clip is 181.3 frames.
-
MSRC-12: Kinect gesture data set
-
The Microsoft Research Cambridge-12 Kinect gesture data set consists of sequences of human movements, represented as body-part locations, and the associated gesture to be recognized by the system.
-
QMUL underGround Re-IDentification (GRID) Dataset
-
This dataset contains 250 pedestrian image pairs + 775 additional images captured in a busy underground station for research on person re-identification.
-
Person identification in TV series
-
Face tracks, features and shot boundaries from our latest CVPR 2013 paper. It is obtained from 6 episodes of Buffy the Vampire Slayer and 6 episodes of Big Bang Theory.
-
ChokePoint Dataset
-
ChokePoint is a video dataset designed for experiments in person identification/verification under real-world surveillance conditions. The dataset consists of 25 subjects (19 male and 6 female) in portal 1 and 29 subjects (23 male and 6 female) in portal 2.
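Some of the figures quoted in this Recognition list can be combined to get a feel for dataset size. The sketch below uses only the numbers given in the YouTube Faces and ChokePoint entries above. The variable names are made up for illustration, and the frame total is an estimate derived from a rounded mean rather than a figure published by the dataset authors.

```python
# Figures quoted in the YouTube Faces entry above.
num_videos = 3425
mean_frames_per_clip = 181.3

# Total frames implied by the mean clip length (an estimate only).
approx_total_frames = num_videos * mean_frames_per_clip
print(f"YouTube Faces: roughly {approx_total_frames:,.0f} frames")

# Figures quoted in the ChokePoint entry above: (male, female) per portal.
portal_subjects = {"portal 1": (19, 6), "portal 2": (23, 6)}
totals = {name: male + female for name, (male, female) in portal_subjects.items()}
print(totals)
```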
Tracking
-
BIWI Walking Pedestrians dataset
-
Walking pedestrians in busy scenarios from a bird's-eye view
-
"Central" Pedestrian Crossing Sequences
-
Three pedestrian crossing sequences
-
Pedestrian Mobile Scene Analysis
-
The set was recorded in Zurich, using a pair of cameras mounted on a mobile platform. It contains 12,298 annotated pedestrians in roughly 2,000 frames.
-
Head tracking
-
BMP image sequences.
-
KIT AIS Dataset
-
Data sets for tracking vehicles and people in aerial image sequences.
-
MIT Traffic Data Set
-
The MIT traffic data set is for research on activity analysis and crowded scenes. It includes a 90-minute traffic video sequence recorded by a stationary camera.
Segmentation
-
Image Segmentation with A Bounding Box Prior dataset
-
Ground truth database of 50 images with: Data, Segmentation, Labelling - Lasso, Labelling - Rectangle
-
PASCAL VOC 2009 dataset
-
Classification/Detection Competitions, Segmentation Competition, Person Layout Taster Competition datasets
-
Motion Segmentation and OBJCUT data
-
Cows for object segmentation, Five video sequences for motion segmentation
-
Geometric Context Dataset
-
Geometric Context Dataset: pixel labels for seven geometric classes for 300 images
-
Crowd Segmentation Dataset
-
This dataset contains videos of crowds and other high density moving objects. The videos are collected mainly from the BBC Motion Gallery and Getty Images website. The videos are shared for research purposes only. Please consult the terms and conditions of use of these videos from the respective websites.
-
CMU-Cornell iCoseg Dataset
-
Contains hand-labelled pixel annotations for 38 groups of images, each group containing a common foreground. Approximately 17 images per group, 643 images total.
-
Segmentation evaluation database
-
200 gray level images along with ground truth segmentations
-
The Berkeley Segmentation Dataset and Benchmark
-
Image segmentation and boundary detection. Grayscale and color segmentations for 300 images. The images are divided into a training set of 200 images and a test set of 100 images.
-
Weizmann horses
-
328 side-view color images of horses that were manually segmented. The images were randomly collected from the WWW.
-
Saliency-based video segmentation with sequentially updated priors
-
10 videos as inputs, and segmented image sequences as ground-truth
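Two of the Segmentation entries above state counts that can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted in the iCoseg and Berkeley entries. The variable names are made up for illustration, and nothing here reads the actual datasets.

```python
# Figures quoted in the iCoseg entry above.
icoseg_groups = 38
icoseg_images = 643
images_per_group = icoseg_images / icoseg_groups  # description says about 17

# Figures quoted in the Berkeley entry above.
bsds_total, bsds_train, bsds_test = 300, 200, 100
split_is_consistent = (bsds_train + bsds_test == bsds_total)

print(f"iCoseg: {images_per_group:.1f} images per group on average")
print(f"Berkeley split adds up: {split_is_consistent}")
```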
Foreground/Background
-
Wallflower Dataset
-
For evaluating background modelling algorithms
-
Foreground/Background Microsoft Cambridge Dataset
-
Foreground/Background segmentation and Stereo dataset from Microsoft Cambridge
-
Stuttgart Artificial Background Subtraction Dataset
-
The SABS (Stuttgart Artificial Background Subtraction) dataset is an artificial dataset for pixel-wise evaluation of background models.
Saliency Detection
-
AIM
-
120 Images / 20 Observers (Neil D. B. Bruce and John K. Tsotsos 2005).
-
LeMeur
-
27 Images / 40 Observers (O. Le Meur, P. Le Callet, D. Barba and D. Thoreau 2006).
-
Kootstra
-
100 Images / 31 Observers (Kootstra, G., Nederveen, A. and de Boer, B. 2008).
-
DOVES
-
101 Images / 29 Observers (van der Linde, I., Rajashekar, U., Bovik, A.C., Cormack, L.K. 2009).
-
Ehinger
-
912 Images / 14 Observers (Krista A. Ehinger, Barbara Hidalgo-Sotelo, Antonio Torralba and Aude Oliva 2009).
-
NUSEF
-
758 Images / 75 Observers (R. Subramanian, H. Katti, N. Sebe, M. Kankanhalli and T-S. Chua 2010).
-
JianLi
-
235 Images / 19 Observers (Jian Li, Martin D. Levine, Xiangjing An and Hangen He 2011).
-
Extended Complex Scene Saliency Dataset (ECSSD)
-
ECSSD contains 1000 natural images with complex foreground or background. For each image, the ground truth mask of salient object(s) is provided.
Video Surveillance
-
CAVIAR
-
For the CAVIAR project a number of video clips were recorded acting out the different scenarios of interest. These include people walking alone, meeting with others, window shopping, entering and exiting shops, fighting and passing out and, last but not least, leaving a package in a public place.
-
ViSOR
-
ViSOR contains a large set of multimedia data and the corresponding annotations.
Multiview
-
3D Photography Dataset
-
Multiview stereo data sets: a set of images
-
Multi-view Visual Geometry group's data set
-
Dinosaur, Model House, Corridor, Aerial views, Valbonne Church, Raglan Castle, Kapel sequence
-
Oxford reconstruction data set (building reconstruction)
-
Oxford colleges
-
Multi-View Stereo dataset (Vision Middlebury)
-
Temple, Dino
-
Multi-View Stereo for Community Photo Collections
-
Venus de Milo, Duomo in Pisa, Notre Dame de Paris
-
IS-3D Data
-
Dataset provided by Center for Machine Perception
-
CVLab dataset
-
CVLab dense multi-view stereo image database
-
3D Objects on Turntable
-
Objects viewed from 144 calibrated viewpoints under 3 different lighting conditions
-
Object Recognition in Probabilistic 3D Scenes
-
Images from 19 sites collected from a helicopter flying around Providence, RI, USA. The imagery contains approximately a full circle around each site.
-
Multiple cameras fall dataset
-
24 scenarios recorded with 8 IP video cameras. The first 22 scenarios contain a fall and confounding events; the last 2 contain only confounding events.
Action
-
UCF Sports Action Dataset
-
This dataset consists of a set of actions collected from various sports which are typically featured on broadcast television channels such as the BBC and ESPN. The video sequences were obtained from a wide range of stock footage websites including BBC Motion gallery and GettyImages.
-
UCF Aerial Action Dataset
-
This dataset features video sequences that were obtained using an R/C-controlled blimp equipped with an HD camera mounted on a gimbal. The collection represents a diverse pool of actions featured at different heights and aerial viewpoints. Multiple instances of each action were recorded at different flying altitudes, which ranged from 400-450 feet, and were performed by different actors.
-
UCF YouTube Action Dataset
-
It contains 11 action categories collected from YouTube.
-
Weizmann action recognition
-
Walk, Run, Jump, Gallop sideways, Bend, One-hand wave, Two-hands wave, Jump in place, Jumping Jack, Skip.
-
UCF50
-
UCF50 is an action recognition dataset with 50 action categories, consisting of realistic videos taken from YouTube.
-
ASLAN
-
The Action Similarity Labeling (ASLAN) Challenge.
-
MSR Action Recognition Datasets
-
The dataset was captured by a Kinect device. There are 12 dynamic American Sign Language (ASL) gestures, and 10 people. Each person performs each gesture 2-3 times.
-
KTH Recognition of human actions
-
Contains six types of human actions (walking, jogging, running, boxing, hand waving and hand clapping) performed several times by 25 subjects in four different scenarios: outdoors, outdoors with scale variation, outdoors with different clothes and indoors.
-
Hollywood-2 Human Actions and Scenes dataset
-
The Hollywood-2 dataset contains 12 classes of human actions and 10 classes of scenes distributed over 3669 video clips and approximately 20.1 hours of video in total.
-
Collective Activity Dataset
-
This dataset contains 5 different collective activities (crossing, walking, waiting, talking, and queueing) across 44 short video sequences, some of which were recorded by a consumer hand-held digital camera with varying viewpoint.
-
Olympic Sports Dataset
-
The Olympic Sports Dataset contains YouTube videos of athletes practicing different sports.
-
SDHA 2010
-
Surveillance-type videos
-
VIRAT Video Dataset
-
The dataset is designed to be more realistic, natural and challenging for video surveillance domains than existing action recognition datasets in terms of its resolution, background clutter, diversity in scenes, and human activity/event categories.
-
HMDB: A Large Video Database for Human Motion Recognition
-
Collected from various sources, mostly from movies, and a small proportion from public databases, YouTube and Google videos. The dataset contains 6849 clips divided into 51 action categories, each containing a minimum of 101 clips.
-
Stanford 40 Actions Dataset
-
Dataset of 9,532 images of humans performing 40 different actions, annotated with bounding-boxes.
-
50Salads dataset
-
Fully annotated dataset of RGB-D video data and data from accelerometers attached to kitchen objects capturing 25 people preparing two mixed salads each (4.5h of annotated data). Annotated activities correspond to steps in the recipe and include phase (pre-/ core-/ post) and the ingredient acted upon.
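Several of the Action entries above quote totals from which rough per-clip figures can be derived. The sketch below uses only numbers stated in the Hollywood-2, HMDB and 50Salads entries. The variable names are made up for illustration, and the derived averages are estimates rather than statistics published by the dataset authors.

```python
# Hollywood-2: 3669 clips, approximately 20.1 hours of video.
hollywood2_clips = 3669
hollywood2_hours = 20.1
avg_clip_seconds = hollywood2_hours * 3600 / hollywood2_clips

# HMDB: 6849 clips in 51 categories, each with at least 101 clips.
hmdb_clips, hmdb_categories, hmdb_min_per_category = 6849, 51, 101
hmdb_lower_bound = hmdb_categories * hmdb_min_per_category
hmdb_counts_consistent = hmdb_lower_bound <= hmdb_clips

# 50Salads: 25 people preparing two salads each, 4.5 hours annotated.
salad_recordings = 25 * 2
minutes_per_recording = 4.5 * 60 / salad_recordings

print(f"Hollywood-2: about {avg_clip_seconds:.1f} s per clip")
print(f"HMDB: at least {hmdb_lower_bound} clips implied, consistent: {hmdb_counts_consistent}")
print(f"50Salads: {salad_recordings} recordings, about {minutes_per_recording:.1f} min each")
```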
Human pose/Expression
-
AFEW (Acted Facial Expressions In The Wild)/SFEW (Static Facial Expressions In The Wild)
-
A dynamic temporal facial expression data corpus of close-to-real-world scenes extracted from movies.
-
ETHZ CALVIN Dataset
Image stitching
-
IPM Vision Group Image Stitching datasets
-
Images and parameters for registration
Medical
-
VIP Laparoscopic / Endoscopic Dataset
-
Collection of endoscopic and laparoscopic (mono/stereo) videos and images
Misc
-
Zurich Buildings Database
-
The ZuBuD Image Database contains over 1005 images of buildings in the city of Zurich.
-
Color Name Data Sets
-
Mall dataset
-
The mall dataset was collected from a publicly accessible webcam for crowd counting and activity profiling research.
-
QMUL Junction Dataset
-
A busy traffic dataset for research on activity analysis and behaviour understanding.
Datasets from CVonline
http://homepages.inf.ed.ac.uk/rbf/CVonline/CVentry.htm
Index by Topic
- Action Databases
- Biological/Medical
- Face Databases
- Fingerprints
- General Images
- Gesture Databases
- Image, Video and Shape Database Retrieval
- Object Databases
- People, Pedestrian, Eye/Iris, Template Detection/Tracking Databases
- Segmentation
- Surveillance
- Textures
- General Videos
- Other Collection Pages
- Miscellaneous Topics
Action Databases
- 50 Salads - fully annotated 4.5 hour dataset of RGB-D video + accelerometer data, capturing 25 people preparing two mixed salads each (Dundee University, Sebastian Stein)
- ASLAN Action similarity labeling challenge database (Orit Kliper-Gross)
- Berkeley MHAD: A Comprehensive Multimodal Human Action Database (Ferda Ofli)
- BEHAVE Interacting Person Video Data with markup (Scott Blunsden, Bob Fisher, Aroosha Laghaee)
- CVBASE06: annotated sports videos (Janez Pers)
- G3D - synchronised video, depth and skeleton data for 20 gaming actions captured with Microsoft Kinect (Victoria Bloom)
- Hollywood 3D - 650 3D action recognition in the wild videos, 14 action classes (Simon Hadfield)
- Human Actions and Scenes Dataset (Marcin Marszalek, Ivan Laptev, Cordelia Schmid)
- HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion (Brown University)
- i3DPost Multi-View Human Action Datasets (Hansung Kim)
- i-LIDS video event image dataset (Imagery library for intelligent detection systems) (Paul Hosner)
- INRIA Xmas Motion Acquisition Sequences (IXMAS) (INRIA)
- JPL First-Person Interaction dataset - 7 types of human activity videos taken from a first-person viewpoint (Michael S. Ryoo, JPL)
- KTH human action recognition database (KTH CVAP lab)
- LIRIS human activities dataset - 2 cameras, annotated, depth images (Christian Wolf, et al)
- MuHAVi - Multicamera Human Action Video Data (Hossein Ragheb)
- Oxford TV based human interactions (Oxford Visual Geometry Group)
- Rochester Activities of Daily Living Dataset (Ross Messing)
- SDHA Semantic Description of Human Activities 2010 contest - aerial views (Michael S. Ryoo, J. K. Aggarwal, Amit K. Roy-Chowdhury)
- SDHA Semantic Description of Human Activities 2010 contest - Human Interactions (Michael S. Ryoo, J. K. Aggarwal, Amit K. Roy-Chowdhury)
- TUM Kitchen Data Set of Everyday Manipulation Activities (Moritz Tenorth, Jan Bandouch)
- TV Human Interaction Dataset (Alonso Patron-Perez)
- Univ of Central Florida - Feature Films Action Dataset (Univ of Central Florida)
- Univ of Central Florida - YouTube Action Dataset (sports) (Univ of Central Florida)
- Univ of Central Florida - 50 Action Category Recognition in Realistic Videos (3 GB) (Kishore Reddy)
- UCF 101 action dataset 101 action classes, over 13k clips and 27 hours of video data (Univ of Central Florida)
- Univ of Central Florida - Sports Action Dataset (Univ of Central Florida)
- Univ of Central Florida - ARG Aerial camera, Rooftop camera and Ground camera (UCF Computer Vision Lab)
- UCR Videoweb Multi-camera Wide-Area Activities Dataset (Amit K. Roy-Chowdhury)
- Verona Social interaction dataset (Marco Cristani)
- Videoweb (multicamera) Activities Dataset (B. Bhanu, G. Denina, C. Ding, A. Ivers, A. Kamal, C. Ravishankar, A. Roy-Chowdhury, B. Varda)
- ViHASi: Virtual Human Action Silhouette Data (userID: VIHASI password: virtual$virtual) (Hossein Ragheb, Kingston University)
- WorkoutSU-10 Kinect dataset for exercise actions (Ceyhun Akgul)
- YouCook - 88 open-source YouTube cooking videos with annotations (Jason Corso)
- WVU Multi-view action recognition dataset (Univ. of West Virginia)
Biological/Medical
- Computed Tomography Emphysema Database (Lauge Sorensen)
- Dermoscopy images (Eric Ehrsam)
- DIADEM: Digital Reconstruction of Axonal and Dendritic Morphology Competition (Allen Institute for Brain Science et al)
- DIARETDB1 - Standard Diabetic Retinopathy Database (Lappeenranta Univ of Technology)
- DRIVE: Digital Retinal Images for Vessel Extraction (Univ of Utrecht)
- MiniMammographic Database (Mammographic Image Analysis Society)
- MIT CBCL Automated Mouse Behavior Recognition datasets (Nicholas Edelman)
- Retinal fundus images - Ground truth of vascular bifurcations and crossovers (Univ of Groningen)
- Spine and Cardiac data (Digital Imaging Group of London Ontario, Shuo Li)
- Univ of Central Florida - DDSM: Digital Database for Screening Mammography (Univ of Central Florida)
- VascuSynth - 120 3D vascular tree like structures with ground truth (Mengliu Zhao, Ghassan Hamarneh)
- York Cardiac MRI dataset (Alexander Andreopoulos)
Face Databases
- 3D Mask Attack Database (3DMAD) - 76500 frames of 17 persons using Kinect RGBD with eye positions (Sebastien Marcel)
- Audio-visual database for face and speaker recognition (Mobile Biometry MOBIO http://www.mobioproject.org/)
- BANCA face and voice database (Univ of Surrey)
- Binghamton Univ 3D static and dynamic facial expression database (Lijun Yin, Peter Gerhardstein and teammates)
- BioID face database (BioID group)
- Biwi 3D Audiovisual Corpus of Affective Communication - 1000 high quality, dynamic 3D scans of faces, recorded while pronouncing a set of English sentences.
- CMU Facial Expression Database (CMU/MIT)
- CMU/MIT Frontal Faces (CMU/MIT)
- CMU Pose, Illumination, and Expression (PIE) Database (Simon Baker)
- CSSE Frontal intensity and range images of faces (Ajmal Mian)
- Face Recognition Grand Challenge datasets (FRVT - Face Recognition Vendor Test)
- FaceTracer Database - 15,000 faces (Neeraj Kumar, P. N. Belhumeur, and S. K. Nayar)
- FDDB: Face Detection Data set and Benchmark - studying unconstrained face detection (University of Massachusetts Computer Vision Laboratory)
- FG-Net Aging Database of faces at different ages (Face and Gesture Recognition Research Network)
- Facial Recognition Technology (FERET) Database (USA National Institute of Standards and Technology)
- Hong Kong Face Sketch Database
- Japanese Female Facial Expression (JAFFE) Database (Michael J. Lyons)
- LFW: Labeled Faces in the Wild - unconstrained face recognition. Re-labeled Faces in the Wild - original images, but aligned using "deep funneling" method. (University of Massachusetts, Amherst)
- Manchester Annotated Talking Face Video Dataset (Timothy Cootes)
- MIT Collation of Face Databases (Ethan Meyers)
- MORPH (Craniofacial Longitudinal Morphological Face Database) (University of North Carolina Wilmington)
- MIT CBCL Face Recognition Database (Center for Biological and Computational Learning)
- NIST mugshot identification database (USA National Institute of Standards and Technology)
- ORL face database: 40 people with 10 views (ATT Cambridge Labs)
- Oxford: faces, flowers, multi-view, buildings, object categories, motion segmentation, affine covariant regions, misc (Oxford Visual Geometry Group)
- PubFig: Public Figures Face Database (Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, and Shree K. Nayar)
- SCface - Surveillance Cameras Face Database (Mislav Grgic, Kresimir Delac, Sonja Grgic, Bozidar Klimpak)
- Trondheim Kinect RGB-D Person Re-identification Dataset (Igor Barros Barbosa)
- UB KinFace Database - University of Buffalo kinship verification and recognition database
- XM2VTS Face video sequences (295): The extended M2VTS Database (XM2VTS) - (Surrey University)
- Yale Face Database - 11 expressions of 10 people (A. Georghiades)
- Yale Face Database B - 576 viewing conditions of 10 people (A. Georghiades)
Fingerprints
- FVC fingerprint verification competition 2002 dataset (University of Bologna)
- FVC fingerprint verification competition 2004 dataset (University of Bologna)
- FVC - a subset of FVC (Fingerprint Verification Competition) 2002 and 2004 fingerprint image databases, manually extracted minutiae data & associated documents (Umut Uludag)
- NIST fingerprint databases (USA National Institute of Standards and Technology)
- SPD2010 Fingerprint Singular Points Detection Competition (SPD 2010 committee)
General Images
- Aerial color image dataset (Swiss Federal Institute of Technology)
- AMOS: Archive of Many Outdoor Scenes (20+m) (Nathan Jacobs)
- Brown Univ Large Binary Image Database (Ben Kimia)
- Columbia Multispectral Image Database (F. Yasuma, T. Mitsunaga, D. Iso, and S.K. Nayar)
- HIPR2 Image Catalogue of different types of images (Bob Fisher et al)
- Hyperspectral images of natural scenes - 2002 (David H. Foster)
- Hyperspectral images of natural scenes - 2004 (David H. Foster)
- ImageNet Linguistically organised (WordNet) Hierarchical Image Database - 10E7 images, 15K categories (Li Fei-Fei, Jia Deng, Hao Su, Kai Li)
- ImageNet Large Scale Visual Recognition Challenge (Alex Berg, Jia Deng, Fei-Fei Li)
- OTCBVS Thermal Imagery Benchmark Dataset Collection (Ohio State Team)
- McGill Calibrated Colour Image Database (Adriana Olmos and Fred Kingdom)
- Tiny Images Dataset 79 million 32x32 color images (Fergus, Torralba, Freeman)
Gesture Databases
- FG-Net Aging Database of faces at different ages (Face and Gesture Recognition Research Network)
- Hand gesture and marine silhouettes (Euripides G.M. Petrakis)
- IDIAP Hand pose/gesture datasets (Sebastien Marcel)
- Sheffield gesture database - 2160 RGBD hand gesture sequences, 6 subjects, 10 gestures, 3 postures, 3 backgrounds, 2 illuminations (Ling Shao)
Image, Video and Shape Database Retrieval
- Brown Univ 25/99/216 Shape Databases (Ben Kimia)
- IAPR TC-12 Image Benchmark (Michael Grubinger)
- IAPR-TC12 Segmented and annotated image benchmark (SAIAPR TC-12): (Hugo Jair Escalante)
- ImageCLEF 2010 Concept Detection and Annotation Task (Stefanie Nowak)
- ImageCLEF 2011 Concept Detection and Annotation Task - multi-label classification challenge in Flickr photos
- CLEF-IP 2011 evaluation on patent images
- McGill 3D Shape Benchmark (Siddiqi, Zhang, Macrini, Shokoufandeh, Bouix, Dickinson)
- NIST SHREC 2010 - Shape Retrieval Contest of Non-rigid 3D Models (USA National Institute of Standards and Technology)
- NIST SHREC - other NIST retrieval contest databases and links (USA National Institute of Standards and Technology)
- NIST TREC Video Retrieval Evaluation Database (USA National Institute of Standards and Technology)
- Princeton Shape Benchmark (Princeton Shape Retrieval and Analysis Group)
- Queensland cross media dataset - millions of images and text documents for "cross-media" retrieval (Yi Yang)
- TOSCA 3D shape database (Bronstein, Bronstein, Kimmel)
Object Databases
- 2.5D/3D Datasets of various objects and scenes (Ajmal Mian)
- Amsterdam Library of Object Images (ALOI): 100K views of 1K objects (University of Amsterdam/Intelligent Sensory Information Systems)
- Caltech 101 (now 256) category object recognition database (Li Fei-Fei, Marco Andreetto, Marc'Aurelio Ranzato)
- Columbia COIL-100 3D object multiple views (Columbia University)
- Densely sampled object views: 2500 views of 2 objects, eg for view-based recognition and modeling (Gabriele Peters, Universiteit Dortmund)
- German Traffic Sign Detection Benchmark (Ruhr-Universitat Bochum)
- GRAZ-02 Database (Bikes, cars, people) (A. Pinz)
- Linkoping 3D Object Pose Estimation Database (Fredrik Viksten and Per-Erik Forssen)
- Microsoft Object Class Recognition image databases (Antonio Criminisi, Pushmeet Kohli, Tom Minka, Carsten Rother, Toby Sharp, Jamie Shotton, John Winn)
- Microsoft salient object databases (labeled by bounding boxes) (Liu, Sun Zheng, Tang, Shum)
- MIT CBCL Car Data (Center for Biological and Computational Learning)
- MIT CBCL StreetScenes Challenge Framework: (Stan Bileschi)
- NEC Toy animal object recognition or categorization database (Hossein Mobahi)
- NORB 50 toy image database (NYU)
- PASCAL Image Database (motorbikes, cars, cows) (PASCAL Consortium)
- PASCAL 2007 Challenge Image Database (motorbikes, cars, cows) (PASCAL Consortium)
- PASCAL 2008 Challenge Image Database (PASCAL Consortium)
- PASCAL 2009 Challenge Image Database (PASCAL Consortium)
- PASCAL 2010 Challenge Image Database (PASCAL Consortium)
- PASCAL 2011 Challenge Image Database (PASCAL Consortium)
- PASCAL 2012 Challenge Image Database - category classification, detection, and segmentation, and still-image action classification (PASCAL Consortium)
- UIUC Car Image Database (UIUC)
- UIUC Dataset of 3D object categories (S. Savarese and L. Fei-Fei)
- Venezia 3D object-in-clutter recognition and segmentation (Emanuele Rodola)
People, Pedestrian, Eye/Iris, Template Detection/Tracking Databases
- 3D KINECT Gender Walking data base (L. Igual, A. Lapedriza, R. Borràs from UB, CVC and UOC, Spain)
- Caltech Pedestrian Dataset (P. Dollar, C. Wojek, B. Schiele and P. Perona)
- CASIA gait database (Chinese Academy of Sciences)
- CASIA-IrisV3 (Chinese Academy of Sciences, T. N. Tan, Z. Sun)
- CAVIAR project video sequences with tracking and behavior ground truth (CAVIAR team/Edinburgh University - EC project IST-2001-37540)
- Daimler Pedestrian Detection Benchmark 21790 images with 56492 pedestrians plus empty scenes (M. Enzweiler, D. M. Gavrila)
- Driver Monitoring Video Dataset (RobeSafe + Jesus Nuevo-Chiquero)
- Edinburgh overhead camera person tracking dataset (Bob Fisher, Bashia Majecka, Gurkirt Singh, Rowland Sillito)
- Eyetracking database summary (Stefan Winkler)
- HAT database of 27 human attributes (Gaurav Sharma, Frederic Jurie)
- INRIA Person Dataset (Navneet Dalal)
- ISMAR09 ground truth video dataset for template-based (i.e. planar) tracking algorithms (Sebastian Lieberknecht)
- MIT CBCL Pedestrian Data (Center for Biological and Computational Learning)
- MIT eye tracking database (1003 images) (Judd et al)
- Notre Dame Iris Image Dataset (Patrick J. Flynn)
- PETS 2009 Crowd Challenge dataset (Reading University & James Ferryman)
- PETS: Performance Evaluation of Tracking and Surveillance (Reading University & James Ferryman)
- PETS Winter 2009 workshop data (Reading University & James Ferryman)
- UBIRIS: Noisy Visible Wavelength Iris Image Databases (University of Beira)
- Univ of Central Florida - Crowd Dataset (Saad Ali)
- Univ of Central Florida - Crowd Flow Segmentation datasets (Saad Ali)
- York Univ Eye Tracking Dataset (120 images) (Neil Bruce)
Segmentation
- Alpert et al. Segmentation evaluation database (Sharon Alpert, Meirav Galun, Ronen Basri, Achi Brandt)
- Berkeley Segmentation Dataset and Benchmark (David Martin and Charless Fowlkes)
- GrabCut Image database (C. Rother, V. Kolmogorov, A. Blake, M. Brown)
- LabelMe images database and online annotation tool (Bryan Russell, Antonio Torralba, Kevin Murphy, William Freeman)
Surveillance
- AVSS07: Advanced Video and Signal based Surveillance 2007 datasets (Andrea Cavallaro)
- ETISEO Video Surveillance Download Datasets (INRIA Orion Team and others)
- Heriot Watt Summary of datasets for human tracking and surveillance (Zsolt Husz)
- SPEVI: Surveillance Performance EValuation Initiative (Queen Mary University London)
- Udine Trajectory-based anomalous event detection dataset - synthetic trajectory datasets with outliers (Univ of Udine Artificial Vision and Real Time Systems Laboratory)
Textures
- Color texture images by category (textures.forrest.cz)
- Columbia-Utrecht Reflectance and Texture Database (Columbia & Utrecht Universities)
- DynTex: Dynamic texture database (Renaud Péteri, Mark Huiskes and Sandor Fazekas)
- Oulu Texture Database (Oulu University)
- Prague Texture Segmentation Data Generator and Benchmark (Mikes, Haindl)
- Uppsala texture dataset of surfaces and materials - fabrics, grains, etc.
- Vision Texture (MIT Media Lab)
General Videos
- Large scale YouTube video dataset - 156,823 videos (2,907,447 keyframes) crawled from YouTube videos (Yi Yang)
Other Collections
- CANTATA Video and Image Database Index site (Multitel)
- Computer Vision Homepage list of test image databases (Carnegie Mellon Univ)
- ETHZ various, including 3D head pose, shape classes, pedestrians, buildings (ETH Zurich, Computer Vision Lab)
- Leibe's Collection of people/vehicle/object databases (Bastian Leibe)
- Lotus Hill Image Database Collection with Ground Truth (Sealeen Ren, Benjamin Yao, Michael Yang)
- Oxford Misc, including Buffy, Flowers, TV characters, Buildings, etc. (Oxford Visual Geometry Group)
- PEIPA Image Database Summary (Pilot European Image Processing Archive)
- Univ of Bern databases on handwriting, online documents, string edit and graph matching (Univ of Bern, Computer Vision and Artificial Intelligence)
- USC Annotated Computer Vision Bibliography database publication summary (Keith Price)
- USC-SIPI image databases: texture, aerial, favorites (eg. Lena) (USC Signal and Image Processing Institute)
Miscellaneous
- 3D mesh watermarking benchmark dataset (Guillaume Lavoue)
- Active Appearance Models datasets (Mikkel B. Stegmann)
- Aircraft tracking (Ajmal Mian)
- Cambridge Motion-based Segmentation and Recognition Dataset (Brostow, Shotton, Fauqueur, Cipolla)
- Catadioptric camera calibration images (Yalin Bastanlar)
- Chars74K dataset - 74 English and Kannada characters (Teo de Campos)
- COLD (COsy Localization Database) - place localization (Ullah, Pronobis, Caputo, Luo, and Jensfelt)
- Columbia Camera Response Functions: Database (DoRF) and Model (EMOR) (M.D. Grossberg and S.K. Nayar)
- Columbia Database of Contaminants' Patterns and Scattering Parameters (Jinwei Gu, Ravi Ramamoorthi, Peter Belhumeur, Shree Nayar)
- Dense outdoor correspondence ground truth datasets, for optical flow and local keypoint evaluation (Christoph Strecha)
- DTU controlled motion and lighting image dataset (135K images) (Henrik Aanaes)
- EISATS: .enpeda.. Image Sequence Analysis Test Site (Auckland University Multimedia Imaging Group)
- FlickrLogos-32 - 8240 images of 32 product logos (Stefan Romberg)
- Flowchart images (Allan Hanbury)
- Geometric Context - scene interpretation images (Derek Hoiem)
- Image/video quality assessment database summary (Stefan Winkler)
- INRIA feature detector evaluation sequences (Krystian Mikolajczyk)
- INRIA's PERCEPTION's database of images and videos gathered with several synchronized and calibrated cameras (INRIA Rhone-Alpes)
- INRIA's Synchronized and calibrated binocular/binaural data sets with head movements (INRIA Rhone-Alpes)
- KITTI dataset for stereo, optical flow and visual odometry (Geiger, Lenz, Urtasun)
- Large scale 3D point cloud data from terrestrial LiDAR scanning (Andreas Nuechter)
- Linkoping Rolling Shutter Rectification Dataset (Per-Erik Forssen and Erik Ringaby)
- Middlebury College stereo vision research datasets (Daniel Scharstein and Richard Szeliski)
- MPI-Sintel optical flow evaluation dataset (Michael Black)
- Multiview stereo images with laser based groundtruth (ESAT-PSI/VISICS,FGAN-FOM,EPFL/IC/ISIM/CVLab)
- The Cancer Imaging Archive (National Cancer Institute)
- NCI Cancer Image Archive - prostate images (National Cancer Institute)
- NIST 3D Interest Point Detection (Helin Dutagaci, Afzal Godil)
- NRCS natural resource/agricultural image database (USDA Natural Resources Conservation Service)
- Occlusion detection test data (Andrew Stein)
- The Open Video Project (Gary Marchionini, Barbara M. Wildemuth, Gary Geisler, Yaxiao Song)
- Pics 'n' Trails - Dataset of Continuously archived GPS and digital photos (Gamhewage Chaminda de Silva)
- PRINTART: Artistic images of prints of well known paintings, including detail annotations. A benchmark for automatic annotation and retrieval tasks with this database was published at ECCV. (Nuno Miguel Pinho da Silva)
- RAWSEEDS SLAM benchmark datasets (Rawseeds Project)
- Robotic 3D Scan Repository - 3D point clouds from robotic experiments of scenes (Osnabruck and Jacobs Universities)
- ROMA (ROad MArkings) : Image database for the evaluation of road markings extraction algorithms (Jean-Philippe Tarel, et al)
- Stuttgart Range Image Database - 66 views of 45 objects
- UCL Ground Truth Optical Flow Dataset (Oisin Mac Aodha)
- Univ of Genoa Datasets for disparity and optic flow evaluation (Manuela Chessa)
- Validation and Verification of Neural Network Systems (Francesco Vivarelli)
- VSD: Technicolor Violent Scenes Dataset - a collection of ground-truth files based on the extraction of violent events in movies
- WILD: Weather and Illumination Database (S. Narasimhan, C. Wang, S. Nayar, D. Stolyarov, K. Garg, Y. Schechner, H. Peri)