A Comprehensive List of Computer Vision Datasets - Part 1

Reposted from http://homepages.inf.ed.ac.uk/rbf/CVonline/Imagedbase.htm

Index by Topic

  1. Action Databases
  2. Agriculture
  3. Attribute recognition
  4. Autonomous Driving
  5. Biological/Medical
  6. Camera calibration
  7. Face and Eye/Iris Databases
  8. Fingerprints
  9. General Images
  10. General RGBD and depth datasets
  11. General Videos
  12. Hand, Hand Grasp, Hand Action and Gesture Databases
  13. Image, Video and Shape Database Retrieval
  14. Object Databases
  15. People (static and dynamic), human body pose
  16. People Detection and Tracking Databases (See also Surveillance)
  17. Remote Sensing
  18. Robotics
  19. Scenes or Places, Scene Segmentation or Classification
  20. Segmentation
  21. Simultaneous Localization and Mapping
  22. Surveillance and Tracking (See also People)
  23. Textures
  24. Urban Datasets
  25. Vision and Natural Language
  26. Other Collection Pages
  27. Miscellaneous Topics

Other helpful sites are:

  1. Academic Torrents - computer vision - a set of 30+ large datasets available in BitTorrent form
  2. Machine learning datasets - see CV tab
  3. YACVID - a tagged index to some computer vision datasets

Action Databases

See also: the action recognition dataset summary with league tables (Gall, Kuehne, Bhattarai).

  1. 20bn-Something-Something - densely-labeled video clips that show humans performing predefined basic actions with everyday objects (Twenty Billion Neurons GmbH) [Before 28/12/19]
  2. 3D online action dataset - There are seven action categories (Microsoft and Nanyang Technological University) [Before 28/12/19]
  3. 50 Salads - fully annotated 4.5 hour dataset of RGB-D video + accelerometer data, capturing 25 people preparing two mixed salads each (Dundee University, Sebastian Stein) [Before 28/12/19]
  4. A first-person vision dataset of office activities (FPVO) - FPVO contains first-person video segments of office activities collected using 12 participants. (G. Abebe, A. Catala, A. Cavallaro) [Before 28/12/19]
  5. ActivityNet - A Large-Scale Video Benchmark for Human Activity Understanding (200 classes, 100 videos per class, 648 video hours) (Heilbron, Escorcia, Ghanem and Niebles) [Before 28/12/19]
  6. Action Detection in Videos - MERL Shopping Dataset consists of 106 videos, each of which is a sequence about 2 minutes long (Michael Jones, Tim Marks) [Before 28/12/19]
  7. Actor and Action Dataset - 3782 videos, seven classes of actors performing eight different actions (Xu, Hsieh, Xiong, Corso) [Before 28/12/19]
  8. An analyzed collation of various labeled video datasets for action recognition (Kevin Murphy) [Before 28/12/19]
  9. AQA-7 - Dataset for assessing the quality of 7 different actions. It contains 1106 action samples and AQA scores. (Parmar, Morris) [29/12/19]
  10. ASLAN Action similarity labeling challenge database (Orit Kliper-Gross) [Before 28/12/19]
  11. Attribute Learning for Understanding Unstructured Social Activity - Database of videos containing 10 categories of unstructured social events to recognise, also annotated with 69 attributes. (Y. Fu Fudan/QMUL, T. Hospedales Edinburgh/QMUL) [Before 28/12/19]
  12. Audio-Visual Event (AVE) dataset - the AVE dataset contains 4143 YouTube videos covering 28 event categories; videos are temporally labeled with audio-visual event boundaries. (Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, and Chenliang Xu) [Before 28/12/19]
  13. AVA: A Video Dataset of Atomic Visual Action - 80 atomic visual actions in 430 15-minute movie clips. (Google Machine Perception Research Group) [Before 28/12/19]
  14. BBDB - Baseball Database (BBDB) is a large-scale baseball video dataset that contains 4200 hours of full baseball game videos with 400,000 temporally annotated activity segments. (Shim, Minho, Young Hwi, Kyungmin, Kim, Seon Joo) [Before 28/12/19]
  15. BEHAVE Interacting Person Video Data with markup (Scott Blunsden, Bob Fisher, Aroosha Laghaee) [Before 28/12/19]
  16. BU-action Datasets - Three image action datasets (BU101, BU101-unfiltered, BU203-unfiltered) that have 1:1 correspondence with classes of the video datasets UCF101 and ActivityNet. (S. Ma, S. A. Bargal, J. Zhang, L. Sigal, S. Sclaroff.) [Before 28/12/19]
  17. Berkeley MHAD: A Comprehensive Multimodal Human Action Database (Ferda Ofli) [Before 28/12/19]
  18. Berkeley Multimodal Human Action Database - five different modalities to expand the fields of application (University of California at Berkeley and Johns Hopkins University) [Before 28/12/19]
  19. Breakfast dataset - 1712 video clips showing 10 kitchen activities, hand-segmented into 48 atomic action classes. (H. Kuehne, A. B. Arslan and T. Serre) [Before 28/12/19]
  20. Bristol Egocentric Object Interactions Dataset - Contains videos shot from a first-person (egocentric) point of view of 3-5 users performing tasks in six different locations (Dima Damen, Teesid Leelaswassuk and Walterio Mayol-Cuevas, Bristol University) [Before 28/12/19]
  21. Brown Breakfast Actions Dataset - 70 hours, 4 million frames of 10 different breakfast preparation activities (Kuehne, Arslan and Serre) [Before 28/12/19]
  22. CAD-120 dataset - focuses on high level activities and object interactions (Cornell University) [Before 28/12/19]
  23. CAD-60 dataset - The CAD-60 and CAD-120 data sets comprise RGB-D video sequences of humans performing activities (Cornell University) [Before 28/12/19]
  24. CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning - A synthetic video understanding benchmark, with tasks that by-design require temporal reasoning to be solved (Girdhar, Ramanan) [29/12/19]
  25. CVBASE06: annotated sports videos (Janez Pers) [Before 28/12/19]
  26. Charades Dataset - 10,000 videos from 267 volunteers, each annotated with multiple activities, captions, objects, and temporal localizations. (Sigurdsson, Varol, Wang, Laptev, Farhadi, Gupta) [Before 28/12/19]
  27. Composable activities dataset - Different combinations of 26 atomic actions formed 16 activity classes which were performed by 14 subjects and annotations were provided (Pontificia Universidad Catolica de Chile and Universidad del Norte) [Before 28/12/19]
  28. Continuous Multimodal Multi-view Dataset of Human Fall - The dataset consists of both normal daily activities and simulated falls for evaluating human fall detection. (Thanh-Hai Tran) [Before 28/12/19]
  29. Cornell Activity Datasets CAD 60, CAD 120 (Cornell Robot Learning Lab) [Before 28/12/19]
  30. DMLSmartActions dataset - Sixteen subjects performed 12 different actions in a natural manner. (University of British Columbia) [Before 28/12/19]
  31. DemCare dataset - DemCare dataset consists of a diverse set of data collected from different sensors and is useful for human activity recognition from wearable/depth and static IP cameras, speech recognition for Alzheimer's disease detection, and physiological data for gait analysis and abnormality detection. (K. Avgerinakis, A. Karakostas, S. Vrochidis, I. Kompatsiaris) [Before 28/12/19]
  32. Depth-included Human Action video dataset - It contains 23 different actions (CITI in Academia Sinica) [Before 28/12/19]
  33. DogCentric Activity Dataset - first-person videos taken from a camera mounted on top of a *dog* (Michael Ryoo) [Before 28/12/19]
  34. Edinburgh ceilidh overhead video data - 16 ground-truthed dances viewed from overhead, where the 10 dancers follow a structured dance pattern (2 different dances). The dataset is useful for highly structured behavior understanding (Aizeboje, Fisher) [Before 28/12/19]
  35. EPIC-KITCHENS - egocentric video recorded by 32 participants in their native kitchen environments, non-scripted daily activities, 11.5M frames, 39.6K frame-level action segments and 454.2K object bounding boxes (Damen, Doughty, Fidler, et al) [Before 28/12/19]
  36. EPFL crepe cooking videos - 6 types of structured cooking activity (12) videos in 1920x1080 resolution (Lee, Ognibene, Chang, Kim and Demiris) [Before 28/12/19]
  37. ETS Hockey Game Event Data Set - This data set contains footage of two hockey games captured using fixed cameras. (M.-A. Carbonneau, A. J. Raymond, E. Granger, and G. Gagnon) [Before 28/12/19]
  38. The Falling Detection dataset - Six subjects in two settings performed a series of actions continuously (University of Texas) [Before 28/12/19]
  39. FCVID: Fudan-Columbia Video Dataset - 91,223 Web videos annotated manually according to 239 categories (Jiang, Wu, Wang, Xue, Chang) [Before 28/12/19]
  40. G3D - synchronised video, depth and skeleton data for 20 gaming actions captured with Microsoft Kinect (Victoria Bloom) [Before 28/12/19]
  41. G3Di - This dataset contains 12 subjects split into 6 pairs (Kingston University) [Before 28/12/19]
  42. Gaming 3D dataset - real-time action recognition in gaming scenario (Kingston University) [Before 28/12/19]
  43. Georgia Tech Egocentric Activities - Gaze(+) - videos showing what people look at, together with their gaze locations (Fathi, Li, Rehg) [Before 28/12/19]
  44. HMDB: A Large Human Motion Database (Serre Lab) [Before 28/12/19]
  45. Hollywood 3D dataset - 650 3D video clips, across 14 action classes (Hadfield and Bowden) [Before 28/12/19]
  46. Human Actions and Scenes Dataset (Marcin Marszalek, Ivan Laptev, Cordelia Schmid) [Before 28/12/19]
  47. Human Searches - Search sequences of human annotators who were tasked to spot actions in the AVA and THUMOS14 datasets. (Alwassel, H., Caba Heilbron, F., Ghanem, B.) [Before 28/12/19]
  48. Hollywood Extended - 937 video clips with a total of 787720 frames containing sequences of 16 different actions from 69 Hollywood movies. (Bojanowski, Lajugie, Bach, Laptev, Ponce, Schmid, and Sivic) [Before 28/12/19]
  49. HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion (Brown University) [Before 28/12/19]
  50. I-LIDS video event image dataset (Imagery library for intelligent detection systems) (Paul Hosner) [Before 28/12/19]
  51. I3DPost Multi-View Human Action Datasets (Hansung Kim) [Before 28/12/19]
  52. IAS-lab Action dataset - contains a sufficient variety of actions and of people performing the actions (IAS Lab at the University of Padua) [Before 28/12/19]
  53. ICS-FORTH MHAD101 Action Co-segmentation - 101 pairs of long-term action sequences that share one or multiple common actions to be co-segmented, contains both 3d skeletal and video related frame-based features (University of Crete and FORTH-ICS, K. Papoutsakis) [Before 28/12/19]
  54. IIIT Extreme Sports - 160 first person (egocentric) sport videos from YouTube with frame-level annotations of 18 action classes. (Suriya Singh, Chetan Arora, and C. V. Jawahar) [Before 28/12/19]
  55. INRIA Xmas Motion Acquisition Sequences (IXMAS) (INRIA) [Before 28/12/19]
  56. InfAR Dataset - Infrared Action Recognition at Different Times (Neurocomputing) (Chenqiang Gao, Yinhe Du, Jiang Liu, Jing Lv, Luyu Yang, Deyu Meng, Alexander G. Hauptmann) [Before 28/12/19]
  57. JHMDB: Joints for the HMDB dataset (J-HMDB) based on 928 clips from HMDB51 comprising 21 action categories (Jhuang, Gall, Zuffi, Schmid and Black) [Before 28/12/19]
  58. JPL First-Person Interaction dataset - 7 types of human activity videos taken from a first-person viewpoint (Michael S. Ryoo, JPL) [Before 28/12/19]
  59. Jena Action Recognition Dataset - Aibo dog actions (Korner and Denzler) [Before 28/12/19]
  60. K3Da - Kinect 3D Active dataset - K3Da (Kinect 3D active) is a realistic clinically relevant human action dataset containing skeleton, depth data and associated participant information (D. Leightley, M. H. Yap, J. Coulson, Y. Barnouin and J. S. McPhee) [Before 28/12/19]
  61. Kinetics Human Action Video Dataset - 300,000 video clips, 400 human action classes, 10 second clips, single action per clip (Kay, Carreira, et al) [Before 28/12/19]
  62. KIT Robo-Kitchen Activity Data Set - 540 clips of 17 people performing 12 complex kitchen activities. (L. Rybok, S. Friedberger, U. D. Hanebeck, R. Stiefelhagen) [Before 28/12/19]
  63. KTH human action recognition database (KTH CVAP lab) [Before 28/12/19]
  64. Karlsruhe Motion, Intention, and Activity Data set (MINTA) - 7 types of activities of daily living including fully motion primitive segments. (D. Gehrig, P. Krauthausen, L. Rybok, H. Kuehne, U. D. Hanebeck, T. Schultz, R. Stiefelhagen) [Before 28/12/19]
  65. Leeds Activity Dataset--Breakfast (LAD--Breakfast) - It is composed of 15 annotated videos, representing five different people having breakfast or another simple meal. (John Folkesson et al.) [Before 28/12/19]
  66. LIRIS Human Activities Dataset - contains (gray/rgb/depth) videos showing people performing various activities (Christian Wolf, et al, French National Center for Scientific Research) [Before 28/12/19]
  67. MEXaction2 action detection and localization dataset - To support the development and evaluation of methods for 'spotting' instances of short actions in a relatively large video database: 77 hours, 117 videos (Michel Crucianu and Jenny Benois-Pineau) [Before 28/12/19]
  68. MLB-YouTube - Dataset for activity recognition in baseball videos (AJ Piergiovanni, Michael Ryoo) [Before 28/12/19]
  69. Moments in Time Dataset - 1M 3-second videos annotated with action type, the largest dataset of its kind for action recognition and understanding in video. (Monfort, Oliva, et al.) [Before 28/12/19]
  70. MPII Cooking Activities Dataset for fine-grained cooking activity recognition, which also includes the continuous pose estimation challenge (Rohrbach, Amin, Andriluka and Schiele) [Before 28/12/19]
  71. MPII Cooking 2 Dataset - A large dataset of fine-grained cooking activities, an extension of the MPII Cooking Activities Dataset. (Rohrbach, Rohrbach, Regneri, Amin, Andriluka, Pinkal, Schiele) [Before 28/12/19]
  72. MSR-Action3D - benchmark RGB-D action dataset (Microsoft Research Redmond and University of Wollongong) [Before 28/12/19]
  73. MSRActionPair dataset - Histogram of Oriented 4D Normals for Activity Recognition from Depth Sequences (University of Central Florida and Microsoft) [Before 28/12/19]
  74. MSRC-12 Kinect gesture data set - 594 sequences and 719,359 frames from people performing 12 gestures (Microsoft Research Cambridge) [Before 28/12/19]
  75. MSRC-12 dataset - sequences of human movements, represented as body-part locations, and the associated gesture (Microsoft Research Cambridge and University of Cambridge) [Before 28/12/19]
  76. MSRDailyActivity3D Dataset - There are 16 activities (Microsoft and the Northwestern University) [Before 28/12/19]
  77. ManiAc RGB-D action dataset: different manipulation actions, 15 different versions, 30 different objects manipulated, 20 long and complex chained manipulation sequences (Eren Aksoy) [Before 28/12/19]
  78. Mivia dataset - It consists of 7 high-level actions performed by 14 subjects. (Mivia Lab at the University of Salerno) [Before 28/12/19]
  79. MTL-AQA - Multitask learning dataset for assessing quality of Olympic Diving. More than 1500 samples. It contains videos of action samples, fine-grained action class, expert commentary (AQA-oriented captions), AQA scores from judges. Videos from multiple views included wherever available. Can be used for captioning, and fine-grained action recognition, apart from AQA. (Parmar, Morris) [29/12/19]
  80. MuHAVi - Multicamera Human Action Video Data (Hossein Ragheb) [Before 28/12/19]
  81. Multi-modal action detection (MAD) Dataset - It contains 35 sequential actions performed by 20 subjects. (Carnegie Mellon University) [Before 28/12/19]
  82. Multiview 3D Event dataset - This dataset includes 8 categories of events performed by 8 subjects (University of California at Los Angeles) [Before 28/12/19]
  83. Nagoya University Extremely Low-resolution FIR Image Action Dataset - Action recognition dataset captured by a 16x16 low-resolution FIR sensor. (Nagoya University) [Before 28/12/19]
  84. NTU RGB+D Action Recognition Dataset - NTU RGB+D is a large scale dataset for human action recognition (Amir Shahroudy) [Before 28/12/19]
  85. Northwestern-UCLA Multiview Action 3D - There are 10 action categories (Northwestern University and University of California at Los Angeles) [Before 28/12/19]
  86. Office Activity Dataset - It consists of skeleton data acquired by Kinect 2.0 from different subjects performing common office activities. (A. Franco, A. Magnani, D. Maiop) [Before 28/12/19]
  87. Oxford TV based human interactions (Oxford Visual Geometry Group) [Before 28/12/19]
  88. PA-HMDB51 - human action video (592) dataset with potential privacy leak attributes annotated: skin color, gender, face, nudity, and relationship (Wang, Wu, Wang, Wang, Jin) [Before 28/12/19]
  89. Parliament - The Parliament dataset is a collection of 228 video sequences, depicting political speeches in the Greek parliament. (Michalis Vrigkas, Christophoros Nikou, Ioannis A. Kakadiaris) [Before 28/12/19]
  90. Procedural Human Action Videos - This dataset contains about 40,000 videos for human action recognition, generated using a 3D game engine. The dataset contains about 6 million frames which can be used to train and evaluate models not only for action recognition but also for depth map estimation, optical flow, instance segmentation, semantic segmentation, 3D and 2D pose estimation, and attribute learning. (Cesar Roberto de Souza) [Before 28/12/19]
  91. RGB-D activity dataset - Each video in the dataset contains 2-7 actions involving interaction with different objects. (Cornell University and Stanford University) [Before 28/12/19]
  92. RGBD-Action-Completion-2016 - This dataset includes 414 complete/incomplete object interaction sequences, spanning six actions and presenting RGB, depth and skeleton data. (Farnoosh Heidarivincheh, Majid Mirmehdi, Dima Damen) [Before 28/12/19]
  93. RGB-D-based Action Recognition Datasets - Paper that includes the list and links of different rgb-d action recognition datasets. (Jing Zhang, Wanqing Li, Philip O. Ogunbona, Pichao Wang, Chang Tang) [Before 28/12/19]
  94. RGBD-SAR Dataset - RGBD-SAR Dataset (University of Electronic Science and Technology of China and Microsoft) [Before 28/12/19]
  95. Rochester Activities of Daily Living Dataset (Ross Messing) [Before 28/12/19]
  96. SBU Kinect Interaction Dataset - It contains eight types of interactions (Stony Brook University) [Before 28/12/19]
  97. SBU-Kinect-Interaction dataset v2.0 - It comprises RGB-D video sequences of humans performing interaction activities (Kiwon Yun et al.) [Before 28/12/19]
  98. SDHA Semantic Description of Human Activities 2010 contest - Human Interactions (Michael S. Ryoo, J. K. Aggarwal, Amit K. Roy-Chowdhury) [Before 28/12/19]
  99. SDHA Semantic Description of Human Activities 2010 contest - aerial views (Michael S. Ryoo, J. K. Aggarwal, Amit K. Roy-Chowdhury) [Before 28/12/19]
  100. SFU Volleyball Group Activity Recognition - a dataset with two levels of annotation (9 player actions and 8 scene activities) for volleyball videos. (M. Ibrahim, S. Muralidharan, Z. Deng, A. Vahdat, and G. Mori / Simon Fraser University) [Before 28/12/19]
  101. SYSU 3D Human-Object Interaction Dataset - Forty subjects perform 12 distinct activities (Sun Yat-sen University) [Before 28/12/19]
  102. ShakeFive Dataset - contains only two actions, namely hand shake and high five. (Universiteit Utrecht) [Before 28/12/19]
  103. ShakeFive2 - A dyadic human interaction dataset with limb level annotations on 8 classes in 153 HD videos (Coert van Gemeren, Ronald Poppe, Remco Veltkamp) [Before 28/12/19]
  104. SoccerNet - Scalable dataset for action spotting in soccer videos: 500 soccer games fully annotated with main actions (goal, cards, subs) and more than 13K soccer games annotated with 500K commentaries for event captioning and game summarization. (Silvio Giancola, Mohieddine Amine, Tarek Dghaily, Bernard Ghanem) [Before 28/12/19]
  105. Sports Videos in the Wild (SVW) - SVW is comprised of 4200 videos captured solely with smartphones by users of Coach Eye smartphone app, a leading app for sports training developed by TechSmith corporation. (Seyed Morteza Safdarnejad, Xiaoming Liu) [Before 28/12/19]
  106. Stanford Sport Events dataset (Jia Li) [Before 28/12/19]
  107. THU-READ(Tsinghua University RGB-D Egocentric Action Dataset) - THU-READ is a large-scale dataset for action recognition in RGBD videos with pixel-layer hand annotation. (Yansong Tang, Yi Tian, Jiwen Lu, Jianjiang Feng, Jie Zhou) [Before 28/12/19]
  108. THUMOS - Action Recognition in Temporally Untrimmed Videos! - 430 hours of video data and 45 million frames (Gorban, Idrees, Jiang, Zamir, Laptev, Shah, Sukthankar) [Before 28/12/19]
  109. Toyota Smarthome dataset - Dataset for Real-world activities of Daily Living (Toyota Motors Europe & INRIA Sophia Antipolis) [30/12/19]
  110. TUM Kitchen Data Set of Everyday Manipulation Activities (Moritz Tenorth, Jan Bandouch) [Before 28/12/19]
  111. TV Human Interaction Dataset (Alonso Patron-Perez) [Before 28/12/19]
  112. The TJU dataset - contains 22 actions performed by 20 subjects in two different environments; a total of 1760 sequences. (Tianjin University) [Before 28/12/19]
  113. UCF-iPhone Data Set - 9 Aerobic actions were recorded from (6-9) subjects using the Inertial Measurement Unit (IMU) on an Apple iPhone 4 smartphone. (Corey McCall, Kishore Reddy and Mubarak Shah) [Before 28/12/19]
  114. UCI Human Activity Recognition Using Smartphones Data Set - recordings of 30 subjects performing activities of daily living (ADL) while carrying a waist-mounted smartphone with embedded inertial sensors (Anguita, Ghio, Oneto, Parra, Reyes-Ortiz) [Before 28/12/19]
  115. UNLV Dive & Gymvault - Dataset for assessing quality of Olympic Diving and Olympic Gymnastic Vault. It consists of videos of action samples and corresponding action quality scores. (Parmar, Morris) [29/12/19]
  116. The UPCV action dataset - The dataset consists of 10 actions performed by 20 subjects twice. (University of Patras) [Before 28/12/19]
  117. UC-3D Motion Database - Available data types encompass high resolution Motion Capture, acquired with MVN Suit from Xsens and Microsoft Kinect RGB and depth images. (Institute of Systems and Robotics, Coimbra, Portugal) [Before 28/12/19]
  118. UCF 101 action dataset - 101 action classes, over 13k clips and 27 hours of video data; see the loader sketch after this list (Univ of Central Florida) [Before 28/12/19]
  119. UCF-Crime Dataset: Real-world Anomaly Detection in Surveillance Videos - A large-scale dataset for real-world anomaly detection in surveillance videos. It consists of 1900 long and untrimmed real-world surveillance videos (of 128 hours), with 13 realistic anomalies such as fighting, road accident, burglary, robbery, etc. as well as normal activities. (Center for Research in Computer Vision, University of Central Florida) [Before 28/12/19]
  120. UCFKinect - The dataset is composed of 16 actions (University of Central Florida Orlando) [Before 28/12/19]
  121. UCLA Human-Human-Object Interaction (HHOI) Dataset Vn1 - Human interactions in RGB-D videos (Shu, Ryoo, and Zhu) [Before 28/12/19]
  122. UCLA Human-Human-Object Interaction (HHOI) Dataset Vn2 - Human interactions in RGB-D videos (version 2) (Shu, Gao, Ryoo, and Zhu) [Before 28/12/19]
  123. UCR Videoweb Multi-camera Wide-Area Activities Dataset (Amit K. Roy-Chowdhury) [Before 28/12/19]
  124. UTD-MHAD - Eight subjects performed 27 actions four times. (University of Texas at Dallas) [Before 28/12/19]
  125. UTKinect dataset - Ten types of human actions were performed twice by 10 subjects (University of Texas) [Before 28/12/19]
  126. UWA3D Multiview Activity Dataset - Thirty activities were performed by 10 individuals (University of Western Australia) [Before 28/12/19]
  127. Univ of Central Florida - 50 Action Category Recognition in Realistic Videos (3 GB) (Kishore Reddy) [Before 28/12/19]
  128. Univ of Central Florida - ARG Aerial camera, Rooftop camera and Ground camera (UCF Computer Vision Lab) [Before 28/12/19]
  129. Univ of Central Florida - Feature Films Action Dataset (Univ of Central Florida) [Before 28/12/19]
  130. Univ of Central Florida - Sports Action Dataset (Univ of Central Florida) [Before 28/12/19]
  131. Univ of Central Florida - YouTube Action Dataset (sports) (Univ of Central Florida) [Before 28/12/19]
  132. Unsegmented Sports News Videos - Database of 74 sports news videos tagged with 10 categories of sports. Designed to test multi-label video tagging. (T. Hospedales, Edinburgh/QMUL) [Before 28/12/19]
  133. Utrecht Multi-Person Motion Benchmark (UMPM) - a collection of video recordings of people together with a ground truth based on motion capture data. (N.P. van der Aa, X. Luo, G.J. Giezeman, R.T. Tan, R.C. Veltkamp.) [Before 28/12/19]
  134. VIRAT Video Dataset - event recognition from two broad categories of activities (single-object and two-objects) which involve both human and vehicles. (Sangmin Oh et al) [Before 28/12/19]
  135. Verona Social interaction dataset (Marco Cristani) [Before 28/12/19]
  136. ViHASi: Virtual Human Action Silhouette Data (userID: VIHASI password: virtual$virtual) (Hossein Ragheb, Kingston University) [Before 28/12/19]
  137. Videoweb (multicamera) Activities Dataset (B. Bhanu, G. Denina, C. Ding, A. Ivers, A. Kamal, C. Ravishankar, A. Roy-Chowdhury, B. Varda) [Before 28/12/19]
  138. WVU Multi-view action recognition dataset (Univ. of West Virginia) [Before 28/12/19]
  139. WorkoutSU-10 Kinect dataset for exercise actions (Ceyhun Akgul) [Before 28/12/19]
  140. WorkoutSU-10 dataset - contains exercise actions selected by professional trainers for therapeutic purposes. (Sabanci University) [Before 28/12/19]
  141. Wrist-mounted camera video dataset - object manipulation (Ohnishi, Kanehira, Kanezaki, Harada) [Before 28/12/19]
  142. YouCook - 88 open-source YouTube cooking videos with annotations (Jason Corso) [Before 28/12/19]
  143. YouTube-8M Dataset - A Large and Diverse Labeled Video Dataset for Video Understanding Research (Google Inc.) [Before 28/12/19]
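
Most of the clip-level datasets above (UCF101, HMDB, Kinetics and similar) are distributed as short video files grouped into one directory per action class. As a quick illustration of how such a layout is typically indexed into (clip path, label) pairs, here is a minimal Python sketch; the root directory name and the .avi extension are assumptions for illustration, not part of any dataset's official tooling.

    from pathlib import Path

    def index_action_clips(root, ext="*.avi"):
        """Index a UCF101-style layout: <root>/<class_name>/<clip>.avi.
        Returns (clip_path, class_index) pairs plus the ordered class list.
        Hypothetical layout -- check each dataset's own documentation."""
        classes = sorted(d.name for d in Path(root).iterdir() if d.is_dir())
        class_to_idx = {c: i for i, c in enumerate(classes)}
        samples = [(str(clip), class_to_idx[c])
                   for c in classes
                   for clip in sorted((Path(root) / c).glob(ext))]
        return samples, classes

    if __name__ == "__main__":
        samples, classes = index_action_clips("UCF-101")  # assumed local path
        print(len(classes), "classes,", len(samples), "clips")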

Agriculture

  1. Aberystwyth Leaf Evaluation Dataset - Timelapse plant images with hand marked up leaf-level segmentations for some time steps, and biological data from plant sacrifice. (Bell, Jonathan; Dee, Hannah M.) [Before 28/12/19]
  2. Fieldsafe - A multi-modal dataset for obstacle detection in agriculture. (Aarhus University) [Before 28/12/19]
  3. KOMATSUNA dataset - The dataset is designed for instance segmentation, tracking and reconstruction for leaves using both sequential multi-view RGB images and depth images. (Hideaki Uchiyama, Kyushu University) [Before 28/12/19]
  4. Leaf counting dataset - Dataset for estimating the growth stage of small plants. (Aarhus University) [Before 28/12/19]
  5. Leaf Segmentation Challenge - Tobacco and arabidopsis plant images (Hanno Scharr, Massimo Minervini, Andreas Fischbach, Sotirios A. Tsaftaris) [Before 28/12/19]
  6. Multi-species fruit flower detection - This dataset consists of four sets of flower images, from three different tree species: apple, peach, and pear, and accompanying ground truth images. (Philipe A. Dias, Amy Tabb, Henry Medeiros) [Before 28/12/19]
  7. Plant Phenotyping Datasets - plant data suitable for plant and leaf detection, segmentation, tracking, and species recognition (M. Minervini, A. Fischbach, H. Scharr, S. A. Tsaftaris) [Before 28/12/19]
  8. Plant seedlings dataset - High-resolution images of 12 weed species. (Aarhus University) [Before 28/12/19]

Attribute recognition

  1. Attribute Learning for Understanding Unstructured Social Activity - Database of videos containing 10 categories of unstructured social events to recognise, also annotated with 69 attributes. (Y. Fu Fudan/QMUL, T. Hospedales Edinburgh/QMUL) [Before 28/12/19]
  2. Animals with Attributes 2 - 37322 (freely licensed) images of 50 animal classes with 85 per-class binary attributes; see the attribute-prediction sketch after this list. (Christoph H. Lampert, IST Austria) [Before 28/12/19]
  3. Birds - This database contains 600 images (100 samples each) of six different classes of birds. (Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce) [Before 28/12/19]
  4. Butterflies - This database contains 619 images of seven different classes of butterflies. (Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce) [Before 28/12/19]
  5. CAER (Context-Aware Emotion Recognition) - Large scale image and video dataset for emotion recognition, and facial expression recognition (Lee, Kim, Kim, Park, and Sohn) [29/12/19]
  6. CALVIN research group datasets - object detection with eye tracking, imagenet bounding boxes, synchronised activities, stickman and body poses, youtube objects, faces, horses, toys, visual attributes, shape classes (CALVIN group) [Before 28/12/19]
  7. CelebA - Large-scale CelebFaces Attributes Dataset (Ziwei Liu, Ping Luo, Xiaogang Wang, Xiaoou Tang) [Before 28/12/19]
  8. DukeMTMC-attribute - 23 pedestrian attributes for DukeMTMC-reID (Lin, Zheng, Zheng, Wu and Yang) [Before 28/12/19]
  9. EMOTIC (EMOTIons in Context) - Images of people (34357) embedded in their natural environments, annotated with 2 distinct emotion representations. (Ronak Kosti, Agata Lapedriza, Jose Alvarez, Adria Recasens) [Before 28/12/19]
  10. HAT database of 27 human attributes (Gaurav Sharma, Frederic Jurie) [Before 28/12/19]
  11. LFW-10 dataset for learning relative attributes - A dataset of 10,000 pairs of face images with instance-level annotations for 10 attributes. (CVIT, IIIT Hyderabad) [Before 28/12/19]
  12. Market-1501-attribute - 27 visual attributes for 1501 shoppers. (Lin, Zheng, Zheng, Wu and Yang) [Before 28/12/19]
  13. Multi-Class Weather Dataset - Our multi-class benchmark dataset contains 65,000 images from 6 common categories for sunny, cloudy, rainy, snowy, haze and thunder weather. This dataset benefits weather classification and attribute recognition. (Di Lin) [Before 28/12/19]
  14. Person Recognition in Personal Photo Collections - we introduced three harder splits for evaluation and long-term attribute annotations and per-photo timestamp metadata. (Oh, Seong Joon and Benenson, Rodrigo and Fritz, Mario and Schiele, Bernt) [Before 28/12/19]
  15. UT-Zappos50K Shoes - Large scale shoe dataset consisting of 50,000 catalog images and over 50,000 pairwise relative attribute labels on 11 fine-grained attributes (Aron Yu, Mark Stephenson, Kristen Grauman, UT Austin) [Before 28/12/19]
  16. Visual Attributes Dataset - visual attribute annotations for over 500 object classes (animate and inanimate) which are all represented in ImageNet. Each object class is annotated with visual attributes based on a taxonomy of 636 attributes (e.g., has fur, made of metal, is round). [Before 28/12/19]
  17. The Visual Privacy (VISPR) Dataset - Privacy Multilabel Dataset (22k images, 68 privacy attributes) (Orekondy, Schiele, Fritz) [Before 28/12/19]
  18. WIDER Attribute Dataset - WIDER Attribute is a large-scale human attribute dataset, with 13789 images belonging to 30 scene categories, and 57524 human bounding boxes each annotated with 14 binary attributes. (Li, Yining and Huang, Chen and Loy, Chen Change and Tang, Xiaoou) [Before 28/12/19]
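
Several datasets above (e.g., Animals with Attributes 2, with 85 per-class binary attributes for 50 classes) follow the standard attribute-based recognition setup: each class is described by a fixed attribute vector, and an image is assigned to the class whose vector best matches the image's predicted attribute scores. Below is a minimal NumPy sketch of that nearest-prototype rule; the random matrices are stand-ins for a real class/attribute table and real classifier outputs.

    import numpy as np

    # Shapes follow the Animals-with-Attributes-2 setup described above:
    # 50 classes, each annotated with an 85-dimensional binary attribute vector.
    num_classes, num_attrs = 50, 85
    rng = np.random.default_rng(0)
    class_attributes = rng.integers(0, 2, size=(num_classes, num_attrs)).astype(float)

    def attribute_nearest_class(attr_scores, class_attributes):
        """Assign each image to the class whose attribute vector is closest
        to the image's predicted attribute scores (simplified zero-shot rule)."""
        # attr_scores: (n_images, num_attrs) predicted attribute probabilities
        dists = np.linalg.norm(attr_scores[:, None, :] - class_attributes[None, :, :], axis=2)
        return dists.argmin(axis=1)

    fake_scores = rng.random((4, num_attrs))  # stand-in for a classifier's outputs
    print(attribute_nearest_class(fake_scores, class_attributes))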

Autonomous Driving

  1. AMUSE - The automotive multi-sensor (AMUSE) dataset taken in real traffic scenes during multiple test drives. (Philipp Koschorrek et al.) [Before 28/12/19]
  2. ApolloScape - high resolution cameras and a Riegl acquisition system. Our dataset is collected in different cities under various traffic conditions. 74555 video frames and their pixel-level and instance-level annotations (Peking University / Baidu) [18/1/20]
  3. Argoverse - Two public datasets supported by highly detailed maps to test, experiment, and teach self-driving vehicles how to understand the world around them; more than 300,000 curated scenarios, 3D tracking annotations for 113 scenes and 324,557 interesting vehicle trajectories for motion forecasting (Chang, Lambert, Sangkloy, Singh, Bak, Hartnett, Wang, Carr, Lucey, Ramanan, Hays) [18/1/20]
  4. Autonomous Driving - Semantic segmentation, pedestrian detection, virtual-world data, far infrared, stereo, driver monitoring. (CVC research center and the UAB and UPC universities) [Before 28/12/19]
  5. Bosch Small Traffic Lights Dataset (BSTLD) - A dataset for traffic light detection, tracking, and classification. [Before 28/12/19]
  6. DrivingStereo - A Large-Scale Dataset for Stereo Matching in Autonomous Driving Scenarios. 180k stereo images covering a diverse set of driving scenarios (Yang, Song, Huang, Deng, Shi, Zhou) [Before 28/12/19]
  7. Boxy vehicle detection dataset - A vehicle detection dataset with 1.99 million annotated vehicles in 200,000 images. It contains AABB and keypoint labels. [Before 28/12/19]
  8. CASR: Cyclist Arm Sign Recognition - Small clips of ~10 seconds showing cyclists performing arm signs. The videos are acquired with a consumer-grade camera. There are 219 arm sign actions annotated. (Zhijie Fang, Antonio M. Lopez) [13/1/20]
  9. Ford Campus Vision and Lidar Data Set - time-registered data from professional (Applanix POS LV) and consumer (Xsens MTI-G) Inertial Measuring Unit (IMU), Velodyne 3D-lidar scanner, two push-broom forward looking Riegl lidars, and a Point Grey Ladybug3 omnidirectional camera system (Pandey, McBride, Eustice) [Before 28/12/19]
  10. FRIDA (Foggy Road Image DAtabase) Image Database - images for performance evaluation of visibility and contrast restoration algorithms. FRIDA: 90 synthetic images of 18 urban road scenes. FRIDA2: 330 synthetic images of 66 diverse road scenes, with viewpoint close to that of the vehicle's driver. (Tarel, Cord, Halmaoui, Gruyer, Hautiere) [Before 28/12/19]
  11. H3D - Honda Research 3D dataset - 360 degree LiDAR dataset (dense pointcloud from Velodyne-64), 160 crowded and highly interactive traffic scenes, 1,071,302 3D bounding box labels, 8 common classes of traffic participants (Patil, Malla, Gang, Chen) [18/1/20]
  12. House3D - House3D is a virtual 3D environment which consists of thousands of indoor scenes equipped with a diverse set of scene types, layouts and objects sourced from the SUNCG dataset. It consists of over 45k indoor 3D scenes, ranging from studios to two-storied houses with swimming pools and fitness rooms. All 3D objects are fully annotated with category labels. Agents in the environment have access to observations of multiple modalities, including RGB images, depth, segmentation masks and top-down 2D map views. The renderer runs at thousands of frames per second, making it suitable for large-scale RL training. (Yi Wu, Yuxin Wu, Georgia Gkioxari, Yuandong Tian, Facebook Research) [Before 28/12/19]
  13. India Driving Dataset (IDD) - unstructured driving conditions from India with 50,000 frames (10,000 semantic, and 40,000 coarse annotations) for training autonomous cars to see using object detection, scene-level and instance-level semantic segmentation (CVIT, IIIT Hyderabad and Intel) [Before 28/12/19]
  14. Joint Attention in Autonomous Driving (JAAD) - The dataset includes instances of pedestrians and cars intended primarily for the purpose of behavioural studies and detection in the context of autonomous driving. (Iuliia Kotseruba, Amir Rasouli and John K. Tsotsos) [Before 28/12/19]
  15. LISA Vehicle Detection Dataset - colour first person driving video under various lighting and traffic conditions (Sivaraman, Trivedi) [Before 28/12/19]
  16. LLAMAS Unsupervised dataset - A lane marker detection and segmentation dataset of 100,000 images with 3d lines, pixel level dashed markers, and curves for individual lines. [Before 28/12/19]
  17. Lost and Found Dataset - The Lost and Found Dataset addresses the problem of detecting unexpected small road hazards (often caused by lost cargo) for autonomous driving applications. (Sebastian Ramos, Peter Pinggera, Stefan Gehrig, Uwe Franke, Rudolf Mester, Carsten Rother) [Before 28/12/19]
  18. Multi Vehicle Stereo Event Camera Dataset - Multiple sequences containing a stereo pair of DAVIS 346b event cameras with ground truth poses, depth maps and optical flow. (Alex Zihao Zhu, Dinesh Thakur, Tolga Ozaslan, Bernd Pfrommer, Vijay Kumar, Kostas Daniilidis) [Before 28/12/19]
  19. nuTonomy scenes dataset (nuScenes) - The nuScenes dataset is a large-scale autonomous driving dataset. It features: Full sensor suite (1x LIDAR, 5x RADAR, 6x camera, IMU, GPS), 1000 scenes of 20s each, 1,440,000 camera images, 400,000 lidar sweeps, two diverse cities: Boston and Singapore, left versus right hand traffic, detailed map information, manual annotations for 25 object classes, 1.1M 3D bounding boxes annotated at 2Hz, attributes such as visibility, activity and pose; a 3D-box sketch follows this list. (Caesar et al) [Before 28/12/19]
  20. RESIDE (Realistic Single Image DEhazing) - The current largest-scale benchmark consisting of both synthetic and real-world hazy images, for image dehazing research. RESIDE highlights diverse data sources and image contents, and serves various training or evaluation purposes. (Boyi Li, Wenqi Ren, Dengpan Fu, Dacheng Tao, Dan Feng, Wenjun Zeng, Zhangyang Wang) [Before 28/12/19]
  21. semanticKITTI - A Dataset for Semantic Scene Understanding using LiDAR Sequences (Behley, Garbade, Milioto, Quenzel, Behnke, Stachniss, Gall) [18/1/20]
  22. SYNTHetic collection of Imagery and Annotations - created for the purpose of aiding semantic segmentation and related scene understanding problems in the context of driving scenarios. (Computer Vision Center, UAB) [Before 28/12/19]
  23. SYNTHIA - Large set (~half million) of virtual-world images for training autonomous cars to see. (ADAS Group at Computer Vision Center) [Before 28/12/19]
  24. TRoM: Tsinghua Road Markings - This is a dataset which contributes to the area of road marking segmentation for Automated Driving and ADAS. (Xiaolong Liu, Zhidong Deng, Lele Cao, Hongchao Lu) [Before 28/12/19]
  25. TUM City Campus - Urban point clouds taken by Mobile Laser Scanning (MLS) for classification, object extraction and change detection (Stilla, Hebel, Xu, Gehrung) [3/1/20]
  26. University of Michigan North Campus Long-Term Vision and LIDAR Dataset - 27 sessions spaced approximately biweekly over the course of 15 months, indoors and outdoors, varying trajectories, different times of the day across all four seasons. Includes: moving obstacles (e.g., pedestrians, bicyclists, and cars), changing lighting, varying viewpoint, seasonal and weather changes (e.g., falling leaves and snow), and long-term structural changes caused by construction. Includes ground-truth pose. (Carlevaris-Bianco, Ushani, Eustice) [Before 28/12/19]
  27. UZH-FPV Drone Racing Dataset - for visual inertial odometry and SLAM. 28 real-world first-person view sequences both indoors and outdoors, containing images, IMU, and events and ground truth (Delmerico, Cieslewski, Rebecq, Faessler, Scaramuzza) [Before 28/12/19]
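
Several of the driving datasets above annotate traffic participants with 3D bounding boxes, i.e. a metric centre, size, and heading angle (for example the 1.1M boxes in nuScenes or the 1,071,302 labels in H3D). The following minimal Python sketch shows one common in-memory representation of such a record and how it expands into its eight corner points; the field names are illustrative assumptions, not any dataset's actual schema.

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class Box3D:
        """Illustrative 3D box record (not a specific dataset's schema)."""
        center: np.ndarray  # (x, y, z) in metres
        size: np.ndarray    # (length, width, height) in metres
        yaw: float          # heading about the vertical axis, in radians
        category: str

        def corners(self):
            """Return the 8 box corners as an (8, 3) array in world coordinates."""
            l, w, h = self.size
            x = np.array([ 1,  1,  1,  1, -1, -1, -1, -1]) * l / 2
            y = np.array([ 1, -1, -1,  1,  1, -1, -1,  1]) * w / 2
            z = np.array([ 1,  1, -1, -1,  1,  1, -1, -1]) * h / 2
            pts = np.stack([x, y, z], axis=1)
            c, s = np.cos(self.yaw), np.sin(self.yaw)
            rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
            return pts @ rot.T + self.center

    box = Box3D(np.array([10.0, 2.0, 0.9]), np.array([4.5, 1.8, 1.6]), 0.3, "car")
    print(box.corners().round(2))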

Biological/Medical

  1. 2008 MICCAI MS Lesion Segmentation Challenge (National Institutes of Health Blueprint for Neuroscience Research) [Before 28/12/19]
  2. ASU DR-AutoCC Data - a Multiple-Instance Learning feature space for a diabetic retinopathy classification dataset (Ragav Venkatesan, Parag Chandakkar, Baoxin Li - Arizona State University) [Before 28/12/19]
  3. Aberystwyth Leaf Evaluation Dataset - Timelapse plant images with hand marked up leaf-level segmentations for some time steps, and biological data from plant sacrifice. (Bell, Jonathan; Dee, Hannah M.) [Before 28/12/19]
  4. ADP: Atlas of Digital Pathology - 17,668 histological patch images extracted from 100 slides annotated with up to 57 hierarchical tissue types (HTTs) from different organs - the aim is to provide training data for supervised multi-label learning of tissue types in a digitized whole slide image (Hosseini, Chan, Tse, Tang, Deng, Norouzi, Rowsell, Plataniotis, Damaskinos) [14/1/20]
  5. Annotated Spine CT Database for Benchmarking of Vertebrae Localization, 125 patients, 242 scans (Ben Glocker) [Before 28/12/19]
  6. BRATS - the identification and segmentation of tumor structures in multiparametric magnetic resonance images of the brain (TU Munchen et al.) [Before 28/12/19]
  7. Breast Ultrasound Dataset B - 2D Breast Ultrasound Images with 53 malignant lesions and 110 benign lesions. (UDIAT Diagnostic Centre, M.H. Yap, R. Marti) [Before 28/12/19]
  8. Calgary-Campinas Public Brain MR Dataset: T1-weighted brain MRI volumes acquired in 359 subjects on scanners from three different vendors (GE, Philips, and Siemens) and at two magnetic field strengths (1.5 T and 3 T). The scans correspond to older adult subjects. (Souza, Roberto, Oeslle Lucena, Julia Garrafa, David Gobbi, Marina Saluzzi, Simone Appenzeller, Leticia Rittner, Richard Frayne, and Roberto Lotufo) [Before 28/12/19]
  9. CAMEL colorectal adenoma dataset - image-level labels for weakly supervised learning containing 177 whole slide images (156 contain adenoma) gathered and labeled by pathologists (Song and Wang) [29/12/19]
  10. CheXpert - a large dataset of chest X-rays and competition for automated chest x-ray interpretation, which features uncertainty labels and radiologist-labeled reference standard evaluation sets (Irvin, Rajpurkar et al) [Before 28/12/19]
  11. Cholec80: 80 gallbladder laparoscopic videos annotated with phase and tool information. (Andru Putra Twinanda) [Before 28/12/19]
  12. CRCHistoPhenotypes - Labeled Cell Nuclei Data - colorectal cancer histology images consisting of nearly 30,000 dotted nuclei with over 22,000 labeled with the cell type (Rajpoot + Sirinukunwattana) [Before 28/12/19]
  13. Cavy Action Dataset - 16 sequences with 640 x 480 resolutions recorded at 7.5 frames per second (fps) with approximately 31621506 frames in total (272 GB) of interacting cavies (guinea pig) (Al-Raziqi and Denzler) [Before 28/12/19]
  14. Cell Tracking Challenge Datasets - 2D/3D time-lapse video sequences with ground truth (Ma et al., Bioinformatics 30:1609-1617, 2014) [Before 28/12/19]
  15. Computed Tomography Emphysema Database (Lauge Sorensen) [Before 28/12/19]
  16. COPD Machine Learning Dataset - A collection of feature datasets derived from lung computed tomography (CT) images, which can be used in diagnosis of chronic obstructive pulmonary disease (COPD). The images in this database are weakly labeled, i.e. per image, a diagnosis (COPD or no COPD) is given, but it is not known which parts of the lungs are affected. Furthermore, the images were acquired at different sites and with different scanners. These problems are related to two learning scenarios in machine learning, namely multiple instance learning or weakly supervised learning, and transfer learning or domain adaptation. (Veronika Cheplygina, Isabel Pino Pena, Jesper Holst Pedersen, David A. Lynch, Lauge S., Marleen de Bruijne) [Before 28/12/19]
  17. CREMI: MICCAI 2016 Challenge - 6 volumes of electron microscopy of neural tissue, neuron and synapse segmentation, synaptic partner annotation. (Jan Funke, Stephan Saalfeld, Srini Turaga, Davi Bock, Eric Perlman) [Before 28/12/19]
  18. CRIM13 Caltech Resident-Intruder Mouse dataset - 237 10 minute videos (25 fps) annotated with actions (13 classes) (Burgos-Artizzu, Dollar, Lin, Anderson and Perona) [Before 28/12/19]
  19. CVC colon DB - annotated video sequences of colonoscopy video. It contains 15 short colonoscopy sequences, coming from 15 different studies. In each sequence one polyp is shown. (Bernal, Sanchez, Vilarino) [Before 28/12/19]
  20. DIADEM: Digital Reconstruction of Axonal and Dendritic Morphology Competition (Allen Institute for Brain Science et al) [Before 28/12/19]
  21. DIARETDB1 - Standard Diabetic Retinopathy Database (Lappeenranta Univ of Technology) [Before 28/12/19]
  22. DRIVE: Digital Retinal Images for Vessel Extraction (Univ of Utrecht) [Before 28/12/19]
  23. DeformIt 2.0 - Image Data Augmentation Tool: Simulate novel images with ground truth segmentations from a single image-segmentation pair (Brian Booth and Ghassan Hamarneh) [Before 28/12/19]
  24. Deformable Image Registration Lab dataset - for objective and rigorous evaluation of deformable image registration (DIR) spatial accuracy performance. (Richard Castillo et al.) [Before 28/12/19]
  25. DERMOFIT Skin Cancer Dataset - 1300 lesions from 10 classes captured under identical controlled conditions. Lesion segmentation masks are included (Fisher, Rees, Aldridge, Ballerini, et al) [Before 28/12/19]
  26. Dermoscopy images (Eric Ehrsam) [Before 28/12/19]
  27. EATMINT (Emotional Awareness Tools for Mediated INTeraction) database - The EATMINT database contains multi-modal and multi-user recordings of affect and social behaviors in a collaborative setting. (Guillaume Chanel, Gaelle Molinari, Thierry Pun, Mireille Betrancourt) [Before 28/12/19]
  28. EPT29 - This database contains 4842 images of 1613 specimens of 29 taxa of EPTs (Tom et al.) [Before 28/12/19]
  29. EyePACS - retinal image database is comprised of over 3 million retinal images of diverse populations with various degrees of diabetic retinopathy (EyePACS) [Before 28/12/19]
  30. FIRE Fundus Image Registration Dataset - 134 retinal image pairs and ground truth for registration. (FORTH-ICS) [Before 28/12/19]
  31. FMD - Fluorescence Microscopy Denoising dataset - 12,000 real fluorescence microscopy images (Zhang, Zhu, Nichols, Wang, Zhang, Smith, Howard) [Before 28/12/19]
  32. FocusPath - Focus Quality Assessment for Digital Pathology (Microscopy) Images. 864 image patches are naturally blurred by 16 levels of out-of-focus lens provided with GT scores of focus levels. (Hosseini, Zhang, Plataniotis) [Before 28/12/19]
  33. Histology Image Collection Library (HICL) - The HICL is a compilation of 3870 histopathological images (so far) from various diseases, such as brain cancer, breast cancer and HPV (Human Papilloma Virus)-Cervical cancer. (Medical Image and Signal Processing (MEDISP) Lab., Department of Biomedical Engineering, School of Engineering, University of West Attica) [Before 28/12/19]
  34. Honeybee segmentation dataset - It is a dataset containing positions and orientation angles of hundreds of bees on a 2D surface of honey comb. (Bozek K, Hebert L, Mikheyev AS, Stephens GJ) [Before 28/12/19]
  35. IIT MBADA mice - Mice behavioral data. FLIR A315, spatial resolution of 320x240 px at 30fps, 50x50cm open arena, two experts for three different mice pairs, mice identities. (Italian Inst. of Technology, PAVIS lab) [Before 28/12/19]
  36. Indian Diabetic Retinopathy Image Dataset - This dataset consists of retinal fundus images annotated at pixel-level for lesions associated with Diabetic Retinopathy. Also, it provides the disease severity of diabetic retinopathy and diabetic macular edema. This dataset is useful for development and evaluation of image analysis algorithms for early detection of diabetic retinopathy. (Prasanna Porwal, Samiksha Pachade, Ravi Kamble, Manesh Kokare, Girish Deshmukh, Vivek Sahasrabuddhe, Fabrice Meriaudeau) [Before 28/12/19]
  37. IRMA (Image Retrieval in Medical Applications) - This collection compiles anonymous radiographs (Deserno TM, Ott B) [Before 28/12/19]
  38. IVDM3Seg - 24 3D multi-modality MRI data sets of at least 7 IVDs of the lower spine, collected from 12 subjects in two different stages (Zheng, Li, Belavy) [Before 28/12/19]
  39. JIGSAWS - JHU-ISI Surgical Gesture and Skill Assessment Working Set - a surgical activity dataset for human motion modeling, captured using the da Vinci Surgical System from eight surgeons with different levels of skill performing five repetitions of three elementary surgical tasks. It contains kinematic and video data, plus manual annotations. (Carol Reiley and Balazs Vagvolgyi) [Before 28/12/19]
  40. KID - A capsule endoscopy database for medical decision support (Anastasios Koulaouzidis and Dimitris Iakovidis) [Before 28/12/19]
  41. Leaf Segmentation Challenge - Tobacco and arabidopsis plant images (Hanno Scharr, Massimo Minervini, Andreas Fischbach, Sotirios A. Tsaftaris) [Before 28/12/19]
  42. LIDC-IDRI - Lung Image Database Consortium image collection (LIDC-IDRI) consists of diagnostic and lung cancer screening thoracic computed tomography (CT) scans with marked-up annotated lesions. [Before 28/12/19]
  43. LITS Liver Tumor Segmentation - 130 3D CT scans with segmentations of the liver and liver tumor. Public benchmark with leaderboard at Codalab.org; see the Dice-overlap sketch after this list. (Patrick Christ) [Before 28/12/19]
  44. Mammographic Image Analysis Homepage - a collection of databases links [Before 28/12/19]
  45. Medical image database - Database of ultrasound images of breast abnormalities with the ground truth. (Prof. Stanislav Makhanov, biomedsiit.com) [Before 28/12/19]
  46. MiniMammographic Database (Mammographic Image Analysis Society) [Before 28/12/19]
  47. MIT CBCL Automated Mouse Behavior Recognition datasets (Nicholas Edelman) [Before 28/12/19]
  48. Moth fine-grained recognition - 675 similar classes, 5344 images (Erik Rodner et al) [Before 28/12/19]
  49. Mouse Embryo Tracking Database - cell division event detection (Marcelo Cicconet, Kris Gunsalus) [Before 28/12/19]
  50. MUCIC: Masaryk University Cell Image Collection - 2D/3D synthetic images of cells/tissues for benchmarking (Masaryk University) [Before 28/12/19]
  51. NIH Chest X-ray Dataset - 112,120 X-ray images with disease labels from 30,805 unique patients. (NIH) [Before 28/12/19]
  52. OASIS - Open Access Series of Imaging Studies - 500+ MRI data sets of the brain (Washington University, Harvard University, Biomedical Informatics Research Network) [Before 28/12/19]
  53. Plant Phenotyping Datasets - plant data suitable for plant and leaf detection, segmentation, tracking, and species recognition (M. Minervini, A. Fischbach, H. Scharr, S. A. Tsaftaris) [Before 28/12/19]
  54. RatSI: Rat Social Interaction Dataset - 9 fully annotated (11 class) videos (15 minute, 25 FPS) of two rats interacting socially in a cage (Malte Lorbach, Noldus Information Technology) [Before 28/12/19]
  55. Retinal fundus images - Ground truth of vascular bifurcations and crossovers (Univ of Groningen) [Before 28/12/19]
  56. SCORHE - 1, 2 and 3 mouse behavior videos, 9 behaviors, (Ghadi H. Salem, et al, NIH) [Before 28/12/19]
  57. SLP (Simultaneously-collected multimodal Lying Pose) - large scale dataset on in-bed poses includes: 2 Data Collection Settings: (a) Hospital setting: 7 participants, and (b) Home setting: 102 participants (29 females, age range: 20-40). 4 Imaging Modalities: RGB (regular webcam), IR (FLIR LWIR camera), DEPTH (Kinect v2) and Pressure Map (Tekscan Pressure Sensing Map). 3 Cover Conditions: uncover, bed sheet, and blanket. Fully labeled poses with 14 joints. (Ostadabbas and Liu) [2/1/20]
  58. SNEMI3D - 3D Segmentation of neurites in EM images [Before 28/12/19]
  59. STructured Analysis of the Retina - 400+ retinal images, with ground truth segmentations and medical annotations [Before 28/12/19]
  60. Spine and Cardiac data (Digital Imaging Group of London Ontario, Shuo Li) [Before 28/12/19]
  61. Stonefly9 - This database contains 3826 images of 773 specimens of 9 taxa of Stoneflies (Tom et al.) [Before 28/12/19]
  62. Synthetic Migrating Cells - Six artificial migrating cells (neutrophils) over 98 time frames, various levels of Gaussian/Poisson noise and different paths characteristics with ground truth. (Dr Constantino Carlos Reyes-Aldasoro et al.) [Before 28/12/19]
  63. UBFC-RPPG Dataset - remote photoplethysmography (rPPG) video data and ground truth acquired with a CMS50E transmissive pulse oximeter (Bobbia, Macwan, Benezeth, Mansouri, Dubois) [Before 28/12/19]
  64. Uni Bremen Open, Abdominal Surgery RGB Dataset - Recording of a complete, open, abdominal surgery using a Kinect v2 that was mounted directly above the patient looking down at patient and staff. (Joern Teuber, Gabriel Zachmann, University of Bremen) [Before 28/12/19]
  65. Univ of Central Florida - DDSM: Digital Database for Screening Mammography (Univ of Central Florida) [Before 28/12/19]
  66. VascuSynth - 120 3D vascular tree like structures with ground truth (Mengliu Zhao, Ghassan Hamarneh) [Before 28/12/19]
  67. VascuSynth - Vascular Synthesizer generates vascular trees in 3D volumes. (Ghassan Hamarneh, Preet Jassi, Mengliu Zhao) [Before 28/12/19]
  68. York Cardiac MRI dataset (Alexander Andreopoulos) [Before 28/12/19]
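
A note on evaluation: many of the segmentation challenges listed above (brain tumours, liver lesions, MS lesions, retinal vessels) report the Dice overlap between a predicted binary mask and the reference annotation, Dice = 2|A∩B| / (|A| + |B|). A minimal NumPy sketch, with toy masks standing in for real predictions:

    import numpy as np

    def dice_coefficient(pred, target, eps=1e-7):
        """Dice overlap between two binary masks of the same shape."""
        pred = np.asarray(pred, dtype=bool)
        target = np.asarray(target, dtype=bool)
        intersection = np.logical_and(pred, target).sum()
        return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

    # Toy example: two partially overlapping square masks.
    a = np.zeros((64, 64)); a[10:40, 10:40] = 1
    b = np.zeros((64, 64)); b[20:50, 20:50] = 1
    print(round(dice_coefficient(a, b), 3))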

Camera calibration

  1. Catadioptric camera calibration images (Yalin Bastanlar) - a generic calibration sketch follows this list [Before 28/12/19]
  2. GoPro-Gyro Dataset - This dataset consists of a number of wide-angle rolling shutter video sequences with corresponding gyroscope measurements (Hannes et al.) [Before 28/12/19]
  3. LO-RANSAC - LO-RANSAC library for estimation of homography and epipolar geometry (K. Lebeda, J. Matas and O. Chum) [Before 28/12/19]
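
For context, calibration image sets are normally processed by detecting a planar target in every image and then fitting the camera parameters to those detections. The sketch below is a generic OpenCV checkerboard/pinhole workflow, shown only as an illustration; it does not implement the catadioptric or rolling-shutter models the datasets above actually target, and the checkerboard size and image path are assumptions.

    import glob
    import cv2
    import numpy as np

    # Assumed target: a checkerboard with 9x6 inner corners and unit square size.
    pattern = (9, 6)
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

    obj_points, img_points, image_size = [], [], None
    for path in glob.glob("calib_images/*.jpg"):  # assumed local path
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            obj_points.append(objp)
            img_points.append(corners)
            image_size = gray.shape[::-1]

    if image_size is not None:
        rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
            obj_points, img_points, image_size, None, None)
        print("RMS reprojection error:", rms)
        print("Camera matrix:\n", K)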

Face and Eye/Iris Databases

  1. 2D-3D face dataset - This dataset includes pairs of 2D face image and its corresponding 3D face geometry model with geometry details. (Yudong Guo, Juyong Zhang, Jianfei Cai, Boyi Jiang, Jianmin Zheng) [Before 28/12/19]
  2. 300 Videos in the Wild (300-VW) - 68 Facial Landmark Tracking (Chrysos, Antonakos, Zafeiriou, Snape, Shen, Kossaifi, Tzimiropoulos, Pantic) [Before 28/12/19]
  3. 300W-Style - enhanced version of 300W by applying three style changes to the original images. It is used to facilitate the analysis of the facial landmark detection problem. (Xuanyi Dong) [29/12/19]
  4. 3D Mask Attack Database (3DMAD) - 76500 frames of 17 persons using Kinect RGBD with eye positions (Sebastien Marcel) [Before 28/12/19]
  5. 3D facial expression - Binghamton University 3D Static and Dynamic Facial Expression Databases (Lijun Yin, Jeff Cohn, and teammates) [Before 28/12/19]
  6. AFLW-Style - enhanced version of AFLW by applying three style changes to the original images. It is used to facilitate the analysis of the facial landmark detection problem. (Xuanyi Dong) [29/12/19]
  7. AginG Faces in the Wild v2 (AGFW-v2) - AGFW-v2 consists of 36,299 facial images divided into 11 age groups with a span of five years between groups. On average, there are 3,300 images per group. Facial images in AGFW-v2 are not public figures and are less likely to have significant make-up or facial modifications, helping embed accurate aging effects during the learning process. (Chi Nhan Duong, Khoa Luu, Kha Gia Quach, Tien D. Bui) [Before 28/12/19]
  8. Audio-visual database for face and speaker recognition (Mobile Biometry MOBIO http://www.mobioproject.org/) [Before 28/12/19]
  9. Audiovisual Lombard grid speech corpus - a bi-view audiovisual Lombard speech corpus which can be used to support joint computational-behavioral studies in speech perception (Alghamdi, Maddock, Marxer, Barker and Brown) [31/12/19]
  10. BANCA face and voice database (Univ of Surrey) [Before 28/12/19]
  11. Binghamton Univ 3D static and dynamic facial expression database (Lijun Yin, Peter Gerhardstein and teammates) [Before 28/12/19]
  12. Binghamton-Pittsburgh 4D Spontaneous Facial Expression Database - consists of 2D spontaneous facial expression videos and FACS codes. (Lijun Yin et al.) [Before 28/12/19]
  13. BioID face database (BioID group) [Before 28/12/19]
  14. BioVid Heat Pain Database - This video (and biomedical signal) dataset contains facial and psychophysiological reactions of 87 study participants who were subjected to experimentally induced heat pain. (University of Magdeburg (Neuro-Information Technology group) and University of Ulm (Emotion Lab)) [Before 28/12/19]
  15. Biometric databases - biometric databases related to iris recognition (Adam Czajka) [Before 28/12/19]
  16. Biwi 3D Audiovisual Corpus of Affective Communication - 1000 high quality, dynamic 3D scans of faces, recorded while pronouncing a set of English sentences. [Before 28/12/19]
  17. Bosphorus 3D/2D Database of FACS annotated facial expressions, of head poses and of face occlusions (Bogazici University) [Before 28/12/19]
  18. CAER (Context-Aware Emotion Recognition) - Large scale image and video dataset for emotion recognition, and facial expression recognition (Lee, Kim, Kim, Park, and Sohn) [29/12/19]
  19. Caricature/Photomates dataset - a dataset with frontal faces and corresponding Caricature line drawings (Tayfun Akgul) [Before 28/12/19]
  20. CASIA-IrisV3 (Chinese Academy of Sciences, T. N. Tan, Z. Sun) [Before 28/12/19]
  21. CASIR Gaze Estimation Database - RGB and depth images (from Kinect V1.0) and ground truth values of facial features corresponding to experiments for gaze estimation benchmarking (Filipe Ferreira et al.) [Before 28/12/19]
  22. Celeb-DF - A new large-scale and challenging DeepFake video dataset, Celeb-DF, for the development and evaluation of DeepFake detection algorithms (Li, Yang, Sun, Qi and Lyu) [30/12/19]
  23. CMU Facial Expression Database (CMU/MIT) [Before 28/12/19]
  24. The CMU Multi-PIE Face Database - more than 750,000 images of 337 people recorded in up to four sessions over the span of five months. (Jeff Cohn et al.) [Before 28/12/19]
  25. CMU Pose, Illumination, and Expression (PIE) Database (Simon Baker) [Before 28/12/19]
  26. CMU/MIT Frontal Faces (CMU/MIT) [Before 28/12/19]
  27. CMU/MIT Frontal Faces (CMU/MIT) [Before 28/12/19]
  28. CoMA 3D face dataset - 20,466 meshes (3D head scans and registrations in FLAME topology) of extreme facial expressions captured from 12 different subjects (Ranjan, Bolkart, Sanyal, Black) [Before 28/12/19]
  29. CSSE Frontal intensity and range images of faces (Ajmal Mian) [Before 28/12/19]
  30. CelebA - Large-scale CelebFaces Attributes Dataset(Ziwei Liu, Ping Luo, Xiaogang Wang, Xiaoou Tang) [Before 28/12/19]
  31. Celebrities in Frontal-Profile in the Wild - 500+ images of celebrities in frontal and profile views (Sengupta, Cheng, Castillo, Patel, Chellappa, Jacobs) [Before 28/12/19]
  32. Cohn-Kanade AU-Coded Expression Database - 500+ expression sequences of 100+ subjects, coded by activated Action Units (Affect Analysis Group, Univ. of Pittsburgh) [Before 28/12/19]
  33. Cohn-Kanade AU-Coded Expression Database - for research in automatic facial image analysis and synthesis and for perceptual studies (Jeff Cohn et al.) [Before 28/12/19]
  34. Columbia Gaze Data Set - 5,880 images of 56 people over 5 head poses and 21 gaze directions (Brian A. Smith, Qi Yin, Steven K. Feiner, Shree K. Nayar) [Before 28/12/19]
  35. Computer Vision Laboratory Face Database (CVL Face Database) - Database contains 798 images of 114 persons, with 7 images per person and is freely available for research purposes. (Peter Peer etc.) [Before 28/12/19]
  36. Deep future gaze - This dataset consists of 57 sequences on search and retrieval tasks performed by 55 subjects. Each video clip lasts around 15 minutes, at 10 fps with a frame resolution of 480 by 640. Each subject is asked to search for a list of 22 items (including lanyard, laptop) and move them to the packing location (dining table). (National University of Singapore, Institute for Infocomm Research) [Before 28/12/19]
  37. DISFA+: Extended Denver Intensity of Spontaneous Facial Action Database - an extension of DISFA (M.H. Mahoor) [Before 28/12/19]
  38. DISFA: Denver Intensity of Spontaneous Facial Action Database - a non-posed facial expression database for those who are interested in developing computer algorithms for automatic action unit detection and their intensities described by FACS. (M.H. Mahoor) [Before 28/12/19]
  39. DHF1K - 1000 elaborately selected video sequences with fixation annotations from 17 viewers. (Prof. Jianbing Shen) [Before 28/12/19]
  40. EURECOM Facial Cosmetics Database - 389 images, 50 persons with/without make-up, annotations about the amount and location of applied makeup. (Jean-Luc DUGELAY et al) [Before 28/12/19]
  41. EURECOM Kinect Face Database - 52 people, 2 sessions, 9 variations, 6 facial landmarks. (Jean-Luc DUGELAY et al) [Before 28/12/19]
  42. EYEDIAP dataset - The EYEDIAP dataset was designed to train and evaluate gaze estimation algorithms from RGB and RGB-D data. It contains a diversity of participants, head poses, gaze targets and sensing conditions. (Kenneth Funes and Jean-Marc Odobez) [Before 28/12/19]
  43. Face2BMI Dataset - The Face2BMI dataset contains 2103 pairs of faces, with corresponding gender, height and previous and current body weights, which allows for training computer vision models that can predict body-mass index (BMI) from profile pictures. (Enes Kocabey, Ferda Ofli, Yusuf Aytar, Javier Marin, Antonio Torralba, Ingmar Weber) [Before 28/12/19]
  44. FDDB: Face Detection Data set and Benchmark - studying unconstrained face detection; see the detection-matching sketch after this list (University of Massachusetts Computer Vision Laboratory) [Before 28/12/19]
  45. FDDB-360 - face detection in 360 degree fisheye images (Fu, Alvar, Bajic, and Vaughan) [29/12/19]
  46. FG-Net Aging Database of faces at different ages (Face and Gesture Recognition Research Network) [Before 28/12/19]
  47. Face Recognition Grand Challenge datasets (FRVT - Face Recognition Vendor Test) [Before 28/12/19]
  48. FMTV - Laval Face Motion and Time-Lapse Video Database. 238 thermal/video subjects with a wide range of poses and facial expressions acquired over 4 years (Ghiass, Bendada, Maldague) [Before 28/12/19]
  49. Face Super-Resolution Dataset - Ground truth HR-LR face images captured with a dual-camera setup (Chengchao Qu etc.) [Before 28/12/19]
  50. FaceScrub - A Dataset With Over 100,000 Face Images of 530 People (50:50 male and female) (H.-W. Ng, S. Winkler) [Before 28/12/19]
  51. FaceTracer Database - 15,000 faces (Neeraj Kumar, P. N. Belhumeur, and S. K. Nayar) [Before 28/12/19]
  52. Facial Expression Dataset - This dataset consists of 242 facial videos (168,359 frames) recorded in real world conditions. (Daniel McDuff et al.) [Before 28/12/19]
  53. Florence 2D/3D Hybrid Face Dataset - bridges the gap between 2D, appearance-based recognition techniques, and fully 3D approaches (Bagdanov, Del Bimbo, and Masi) [Before 28/12/19]
  54. Facial Recognition Technology (FERET) Database (USA National Institute of Standards and Technology) [Before 28/12/19]
  55. Gi4E Database - eye-tracking database with 1300+ images acquired with a standard webcam, corresponding to different subjects gazing at different points on a screen, including ground-truth 2D iris and corner points (Villanueva, Ponz, Sesma-Sanchez, Mikel Porta, and Cabeza) [Before 28/12/19]
  56. Google Facial Expression Comparison dataset - a large-scale facial expression dataset consisting of face image triplets along with human annotations that specify which two faces in each triplet form the most similar pair in terms of facial expression, which is different from datasets that focus mainly on discrete emotion classification or action unit detection (Vemulapalli, Agarwala) [Before 28/12/19]
  57. Hannah and her sisters database - a dense audio-visual person-oriented ground-truth annotation of faces, speech segments, shot boundaries (Patrick Perez, Technicolor) [Before 28/12/19]
  58. Headspace dataset - The Headspace dataset is a set of 3D images of the full human head, consisting of 1519 subjects wearing tight fitting latex caps to reduce the effect of hairstyles. (Christian Duncan, Rachel Armstrong, Alder Hey Craniofacial Unit, Liverpool, UK) [Before 28/12/19]
  59. Hong Kong Face Sketch Database [Before 28/12/19]
  60. IDIAP Head Pose Database (IHPD) - The dataset contains a set of meeting videos along with the head groundtruth of individual participants (around 128min)(Sileye Ba and Jean-Marc Odobez) [Before 28/12/19]
  61. IARPA Janus Benchmark datasets - IJB-A, IJB-B, IJB-C, FRVT (NIST) [Before 28/12/19]
  62. IMDB-WIKI - 500k+ face images with age and gender labels (Rasmus Rothe, Radu Timofte, Luc Van Gool ) [Before 28/12/19]
  63. Indian Movie Face database (IMFDB) - a large unconstrained face database consisting of 34512 images of 100 Indian actors collected from more than 100 videos (Vijay Kumar and C V Jawahar) [Before 28/12/19]
  64. Iranian Face Database - IFDB is the first image database in the Middle East, containing color facial images annotated with age, pose, and expression, with subjects ranging from 2 to 85 years old. (Mohammad Mahdi Dehshibi) [Before 28/12/19]
  65. Japanese Female Facial Expression (JAFFE) Database (Michael J. Lyons) [Before 28/12/19]
  66. LIRIS Children Spontaneous Facial Expression Video Database - spontaneous / natural facial expressions of 12 children in diverse settings with variable video recording scenarios showing six universal or prototypic emotional expressions (happiness, sadness, anger, surprise, disgust and fear). Children are recorded in a constraint-free environment (no restriction on head movement, no restriction on hand movement, free sitting setting, no restriction of any sort) while they watched specially built / selected stimuli. This constraint-free environment allowed the recording of spontaneous / natural expressions of children as they occur. The database has been validated by 22 human raters. (Khan, Crenn, Meyer, Bouakaz) [29/12/19]
  67. LFW: Labeled Faces in the Wild - unconstrained face recognition [Before 28/12/19]
  68. LS3D-W - a large-scale 3D face alignment dataset annotated with 68 points containing faces captured in an "in-the-wild" setting. (Adrian Bulat, Georgios Tzimiropoulos) [Before 28/12/19]
  69. MAFA: MAsked FAces - 30,811 images with 35,806 labeled MAsked FAces, six main attributes of each masked face. (Shiming Ge, Jia Li, Qiting Ye, Zhao Luo) [Before 28/12/19]
  70. Makeup Induced Face Spoofing (MIFS) - 107 makeup-transformations attempting to spoof a target identity. Also other datasets. (Antitza Dantcheva) [Before 28/12/19]
  71. Mexculture142 - Mexican Cultural heritage objects and eye-tracker gaze fixations (Montoya Obeso, Benois-Pineau, Garcia-Vazquez, Ramirez Acosta) [Before 28/12/19]
  72. MIT CBCL Face Recognition Database (Center for Biological and Computational Learning) [Before 28/12/19]
  73. MIT Collation of Face Databases (Ethan Meyers) [Before 28/12/19]
  74. MIT eye tracking database (1003 images) (Judd et al) [Before 28/12/19]
  75. MMI Facial Expression Database - 2900 videos and high-resolution still images of 75 subjects, annotated for FACS AUs. [Before 28/12/19]
  76. MORPH (Craniofacial Longitudinal Morphological Face Database) (University of North Carolina Wilmington) [Before 28/12/19]
  77. MPIIGaze dataset - 213,659 samples with eye images and gaze targets under different illumination conditions and natural head movement, collected from 15 participants using their laptops during daily use. (Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling) [Before 28/12/19]
  78. Manchester Annotated Talking Face Video Dataset (Timothy Cootes) [Before 28/12/19]
  79. MegaFace - 1 million faces in bounding boxes (Kemelmacher-Shlizerman, Seitz, Nech, Miller, Brossard) [Before 28/12/19]
  80. Music video dataset - 8 music videos from YouTube for developing multi-face tracking algorithms in unconstrained environments (Shun Zhang, Jia-Bin Huang, Ming-Hsuan Yang) [Before 28/12/19]
  81. NIST Face Recognition Grand Challenge (FRGC) (NIST) [Before 28/12/19]
  82. NIST mugshot identification database (USA National Institute of Standards and Technology) [Before 28/12/19]
  83. NRC-IIT Facial Video Database - this database contains pairs of short video clips each showing a face of a computer user sitting in front of the monitor exhibiting a wide range of facial expressions and orientations (Dmitry Gorodnichy) [Before 28/12/19]
  84. Notre Dame Iris Image Dataset (Patrick J. Flynn) [Before 28/12/19]
  85. Notre Dame face, IR face, 3D face, expression, crowd, and eye biometric datasets (Notre Dame) [Before 28/12/19]
  86. ORL face database: 40 people with 10 views (ATT Cambridge Labs) [Before 28/12/19]
  87. OUI-Adience Faces - unfiltered faces for gender and age classification plus 3D faces (OUI) [Before 28/12/19]
  88. Oxford: faces, flowers, multi-view, buildings, object categories, motion segmentation, affine covariant regions, misc (Oxford Visual Geometry Group) [Before 28/12/19]
  89. Pandora - POSEidon: Face-from-Depth for Driver Pose (Borghi, Venturelli, Vezzani, Cucchiara) [Before 28/12/19]
  90. PubFig: Public Figures Face Database (Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, and Shree K. Nayar) [Before 28/12/19]
  91. QMUL-SurvFace - A large-scale face recognition benchmark dedicated for real-world surveillance face analysis and matching. (QMUL Computer Vision Group) [Before 28/12/19]
  92. Re-labeled Faces in the Wild - original images, but aligned using "deep funneling" method. (University of Massachusetts, Amherst) [Before 28/12/19]
  93. RT-GENE: Real-Time Eye Gaze Estimation in Natural Environments 122,531 images with the subjects' ground truth eye gaze and head pose labels under free-viewing conditions and large camera-subject distances (Fischer, Chang, Demiris, Imperial College London) [Before 28/12/19]
  94. S3DFM - Edinburgh Speech-driven 3D Facial Motion Database. 77 people with 10 repetitions of speaking a passphrase: 1 second of 500 frames per second 600x600 pixels of {IR intensity video, registered depth images} plus synchronized 44.1 kHz audio. There are an additional 26 people (10 repetitions) moving their heads while speaking (Zhang, Fisher) [Before 28/12/19]
  95. Salient features in gaze-aligned recordings of human visual input - TB of human gaze-contingent data "in the wild" (Frank Schumann etc.) [Before 28/12/19]
  96. SAMM Dataset of Micro-Facial Movements - The dataset contains 159 spontaneous micro-facial movements obtained from 32 participants from 13 different ethnicities. (A.Davison, C.Lansley, N.Costen, K.Tan, M.H.Yap) [Before 28/12/19]
  97. SCface - Surveillance Cameras Face Database (Mislav Grgic, Kresimir Delac, Sonja Grgic, Bozidar Klimpak) [Before 28/12/19]
  98. SiblingsDB - The SiblingsDB contains two datasets depicting images of individuals related by sibling relationships. (Politecnico di Torino/Computer Graphics & Vision Group) [Before 28/12/19]
  99. SoF dataset - 42,592 face images with glasses under different illumination conditions; provided with face region, facial landmarks, facial expression, subject ID, gender, and age information (Afifi, Abdelhamed) [29/12/19]
  100. Solving the Robot-World Hand-Eye(s) Calibration Problem with Iterative Methods - These datasets were generated for calibrating robot-camera systems. (Amy Tabb) [Before 28/12/19]
  101. Spontaneous Emotion Multimodal Database (SEM-db) - non-posed reactions to visual stimulus data recorded with HD RGB, depth and IR frames of the face, EEG signal and eye gaze data (Fernandez Montenegro, Gkelias, Argyriou) [Before 28/12/19]
  102. The UNBC-McMaster Shoulder Pain Expression Archive Database - the "Painful Data" archive (Lucey et al.) [Before 28/12/19]
  103. VOCASET - 4D face dataset with about 29 minutes of 3D head scans captured at 60 fps and synchronized audio from 12 speakers (Cudeiro, Bolkart, Laidlaw, Ranjan, Black) [Before 28/12/19]
  104. Trondheim Kinect RGB-D Person Re-identification Dataset (Igor Barros Barbosa) [Before 28/12/19]
  105. UB KinFace Database - University of Buffalo kinship verification and recognition database [Before 28/12/19]
  106. UBIRIS: Noisy Visible Wavelength Iris Image Databases (University of Beira) [Before 28/12/19]
  107. UMDFaces - About 3.7 million annotated video frames from 22,000 videos and 370,000 annotated still images. (Ankan Bansal et al.) [Before 28/12/19]
  108. UPNA Head Pose Database - head pose database, with 120 webcam videos containing guided-movement sequences and free-movement sequences, including ground-truth head pose and automatically annotated 2D facial points. (Ariz, Bengoechea, Villanueva, Cabeza) [Before 28/12/19]
  109. UPNA Synthetic Head Pose Database - a synthetic replica of the UPNA Head Pose Database, with 120 videos with their 2D ground truth landmarks projections, their corresponding head pose ground truth, 3D head models and camera parameters. (Larumbe, Segura, Ariz, Bengoechea, Villanueva, Cabeza) [Before 28/12/19]
  110. UTIRIS cross-spectral iris image databank (Mahdi Hosseini) [Before 28/12/19]
  111. UvA-NEMO Smile Database - 1240 smile videos (597 spontaneous and 643 posed) from 400 subjects, including age, gender, and kinship annotations (Gevers, Dibeklioglu, Salah) [Before 28/12/19]
  112. VGGFace2 - VGGFace2 is a large-scale face recognition dataset covering large variations in pose, age, illumination, ethnicity and profession. (Oxford Visual Geometry Group) [Before 28/12/19]
  113. VIPSL Database - VIPSL Database is for research on face sketch-photo synthesis and recognition, including 200 subjects (1 photo and 5 sketches per subject). (Nannan Wang) [Before 28/12/19]
  114. Visual Search Zero Shot Database - Collection of human eyetracking data in three increasingly complex visual search tasks: object arrays, natural images and Waldo images. (Kreiman lab) [Before 28/12/19]
  115. VT-KFER: A Kinect-based RGBD+Time Dataset for Spontaneous and Non-Spontaneous Facial Expression Recognition - 32 subjects, 1,956 sequences of RGBD, six facial expressions in 3 poses (Aly, Trubanova, Abbott, White, and Youssef) [Before 28/12/19]
  116. Washington Facial Expression Database (FERG-DB) - a database of 6 stylized (Maya) characters with 7 annotated facial expressions (Deepali Aneja, Alex Colburn, Gary Faigin, Linda Shapiro, and Barbara Mones) [Before 28/12/19]
  117. WebCaricature Dataset - The WebCaricature dataset is a large photograph-caricature dataset consisting of 6042 caricatures and 5974 photographs from 252 persons collected from the web. (Jing Huo, Wenbin Li, Yinghuan Shi, Yang Gao and Hujun Yin) [Before 28/12/19]
  118. WIDER FACE: A Face Detection Benchmark - 32,203 images with 393,703 labeled faces, 61 event classes (Shuo Yang, Ping Luo, Chen Change Loy, Xiaoou Tang) [Before 28/12/19]
  119. Wider-360 - Datasets for face and object detection in fisheye images (Fu, Bajic, and Vaughan) [29/12/19]
  120. XM2VTS Face video sequences (295): The extended M2VTS Database (XM2VTS) - (Surrey University) [Before 28/12/19]
  121. Yale Face Database - 11 expressions of 10 people (A. Georghiades) [Before 28/12/19]
  122. Yale Face Database B - 576 viewing conditions of 10 people (A. Georghiades) [Before 28/12/19]
  123. York 3D Ear Dataset - The York 3D Ear Dataset is a set of 500 3D ear images, synthesized from detailed 2D landmarking, and available in both Matlab format (.mat) and PLY format (.ply). (Nick Pears, Hang Dai, Will Smith, University of York) [Before 28/12/19]
  124. York Univ Eye Tracking Dataset (120 images) (Neil Bruce) [Before 28/12/19]
  125. YouTube Faces DB - 3,425 videos of 1,595 different people. (Wolf, Hassner, Maoz) [Before 28/12/19]
  126. Zurich Natural Image - the image material used for creating natural stimuli in a series of eye-tracking studies (Frey et al.) [Before 28/12/19]
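
Many of the face detection entries above (for example FDDB and WIDER FACE) are evaluated by matching predicted face boxes against ground-truth annotations. The sketch below is a minimal, illustrative version of that matching step in Python, assuming axis-aligned [x, y, width, height] boxes and the commonly used IoU threshold of 0.5; the function names are placeholders, and the official protocols of the individual benchmarks (e.g., FDDB's ellipse-based continuous scoring) differ in the details.

```python
# Minimal sketch of IoU-based detection matching (not any benchmark's official tool).
# Assumptions: boxes are [x, y, width, height] in pixels; IoU >= 0.5 counts as a match.

def iou(box_a, box_b):
    """Intersection-over-union of two [x, y, w, h] boxes."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def count_true_positives(detections, ground_truth, thresh=0.5):
    """Greedy one-to-one matching of detections to ground-truth boxes."""
    used = set()
    tp = 0
    for det in detections:
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(ground_truth):
            if j in used:
                continue
            score = iou(det, gt)
            if score > best_iou:
                best_j, best_iou = j, score
        if best_iou >= thresh:
            used.add(best_j)
            tp += 1
    return tp
```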

Fingerprints

  1. FVC fingerprint verification competition 2002 dataset (University of Bologna) [Before 28/12/19]
  2. FVC fingerprint verification competition 2004 dataset (University of Bologna) [Before 28/12/19]
  3. Fingerprint Manual Minutiae Marker (FM3) Databases (Mehmet Kayaoglu, Berkay Topcu and Umut Uludag) [Before 28/12/19]
  4. NIST fingerprint databases (USA National Institute of Standards and Technology) [Before 28/12/19]
  5. SPD2010 Fingerprint Singular Points Detection Competition (SPD 2010 committee) [Before 28/12/19]

General Images

  1. A Dataset for Real Low-Light Image Noise Reduction - It contains pixel and intensity aligned pairs of images corrupted by low-light camera noise and their low-noise counterparts. (J. Anaya, A. Barbu) [Before 28/12/19]
  2. A database of paintings related to Vincent van Gogh - This is the dataset VGDB-2016 built for the paper "From Impressionism to Expressionism: Automatically Identifying Van Gogh's Paintings" (Guilherme Folego and Otavio Gomes and Anderson Rocha) [Before 28/12/19]
  3. AMOS: Archive of Many Outdoor Scenes (20+m) (Nathan Jacobs) [Before 28/12/19]
  4. Aerial images - Building detection from aerial images using invariant color features and shadow information. (Beril Sirmacek) [Before 28/12/19]
  5. Approximated overlap error dataset - Image pairs with sparse sets of ground-truth matches for evaluating local image descriptors (Fabio Bellavia) [Before 28/12/19]
  6. AutoDA (Automatic Dataset Augmentation) - An automatically constructed image dataset including 12.5 million images with relevant textual information for the 1000 categories of ILSVRC2012 (Bai, Yang, Ma, Zhao) [Before 28/12/19]
  7. BGU Hyperspectral Image Database of Natural Scenes (Ohad Ben-Shahar and Boaz Arad) [Before 28/12/19]
  8. Brown Univ Large Binary Image Database (Ben Kimia) [Before 28/12/19]
  9. Butterfly-200 - Butterfly-200 is an image dataset for fine-grained image classification, which contains 25,279 images and covers four category levels: 200 species, 116 genera, 23 subfamilies, and 5 families. (Tianshui Chen) [Before 28/12/19]
  10. CIFAR-10 classes with different WB settings - 15,098 rendered images that reflect real in-camera white-balance settings (Afifi, Brown) [29/12/19]
  11. CMP Facade Database - Includes 606 rectified images of facades from various places with 12 architectural classes annotated. (Radim Tylecek) [Before 28/12/19]
  12. Caltech-UCSD Birds-200-2011 (Catherine Wah) [Before 28/12/19]
  13. Color correction dataset - Homography-based registered images for evaluating color correction algorithms for image stitching. (Fabio Bellavia) [Before 28/12/19]
  14. Columbia Multispectral Image Database (F. Yasuma, T. Mitsunaga, D. Iso, and S.K. Nayar) [Before 28/12/19]
  15. DAQUAR (Visual Turing Challenge) - A dataset containing questions and answers about real-world indoor scenes. (Mateusz Malinowski, Mario Fritz) [Before 28/12/19]
  16. Darmstadt Noise Dataset - 50 pairs of real noisy images and corresponding ground truth images (RAW and sRGB); see the PSNR sketch after this list (Tobias Plötz and Stefan Roth) [Before 28/12/19]
  17. Dataset of American Movie Trailers 2010-2014 - Contains links to 474 Hollywood movie trailers along with associated metadata (genre, budget, runtime, release, MPAA rating, screens released, sequel indicator) (USC Signal Analysis and Interpretation Lab) [Before 28/12/19]
  18. DIML Multimodal Benchmark - To evaluate matching performance under photometric and geometric variations, 100 images of 1200 x 800 size. (Yonsei University) [Before 28/12/19]
  19. DSLR Photo Enhancement Dataset (DPED) - 22K photos taken synchronously in the wild by three smartphones and one DSLR camera, useful for comparing high-quality images inferred from multiple low-quality images (Ignatov, Kobyshev, Timofte, Vanhoey, and Van Gool). [Before 28/12/19]
  20. Flickr-style - 80K Flickr photographs annotated with 20 curated style labels, and 85K paintings annotated with 25 style/genre labels (Sergey Karayev) [Before 28/12/19]
  21. Flickr1024: A Dataset for Stereo Image Super-resolution - 1024 high-quality images pairs and covers diverse senarios (Wang, Wang, Yang, An, Guo) [Before 28/12/19]
  22. Forth Multispectral Imaging Datasets - images from 23 spectral bands each from 5 paintings. Images are annotated with ground truth data. (Karamaoynas Polykarpos et al) [Before 28/12/19]
  23. General 100 Dataset - General-100 dataset contains 100 bmp-format images (with no compression), which are well-suited for super-resolution training (Chao Dong, Chen Change Loy, Xiaoou Tang) [Before 28/12/19]
  24. GOPRO dataset - Blurred image dataset with sharp image ground truth (Nah, Kim, and Lee) [Before 28/12/19]
  25. HIPR2 Image Catalogue of different types of images (Bob Fisher et al) [Before 28/12/19]
  26. HPatches - A benchmark and evaluation of handcrafted and learned local descriptors (Balntas, Lenc, Vedaldi, Mikolajczyk) [Before 28/12/19]
  27. Hyperspectral images for spatial distributions of local illumination in natural scenes - Thirty calibrated hyperspectral radiance images of natural scenes with probe spheres embedded for local illumination estimation. (Nascimento, Amano & Foster) [Before 28/12/19]
  28. Hyperspectral images of natural scenes - 2002 (David H. Foster) [Before 28/12/19]
  29. Hyperspectral images of natural scenes - 2004 (David H. Foster) [Before 28/12/19]
  30. ISPRS multi-platform photogrammetry dataset - 1: Nadir and oblique aerial images plus 2: Combined UAV and terrestrial images (Francesco Nex and Markus Gerke) [Before 28/12/19]
  31. Image & Video Quality Assessment at LIVE - used to develop picture quality algorithms (the University of Texas at Austin) [Before 28/12/19]
  32. ImageNet Large Scale Visual Recognition Challenges - Currently 200 object classes and 500+K images (Alex Berg, Jia Deng, Fei-Fei Li and others) [Before 28/12/19]
  33. ImageNet Linguistically organised (WordNet) Hierarchical Image Database - 10E7 images, 15K categories (Li Fei-Fei, Jia Deng, Hao Su, Kai Li) [Before 28/12/19]
  34. Improved 3D Sparse Maps for High-performance Structure from Motion with Low-cost Omnidirectional Robots - Evaluation Dataset - Data set used in research paper doi:10.1109/ICIP.2015.7351744 (Breckon, Toby P., Cavestany, Pedro) [Before 28/12/19]
  35. Konstanz visual quality databases - Large-scale image and video databases for the development and evaluation of visual quality assessment algorithms. (MMSP group, University of Konstanz) [Before 28/12/19]
  36. Kodak McMaster demosaic dataset - (Zhang, Wu, Buades, Li) [Before 28/12/19]
  37. LabelMeFacade Database - 945 labeled building images (Erik Rodner et al) [Before 28/12/19]
  38. Local illumination hyperspectral radiance images - Thirty hyperspectral radiance images of natural scenes with embedded probe spheres for local illumination estimates (Sérgio M. C. Nascimento, Kinjiro Amano, David H. Foster) [Before 28/12/19]
  39. McGill Calibrated Colour Image Database (Adriana Olmos and Fred Kingdom) [Before 28/12/19]
  40. Multiply Distorted Image Database - a database for evaluating the results of image quality assessment metrics on multiply distorted images. (Fei Zhou) [Before 28/12/19]
  41. NAS-Bench-102 - An algorithm-agnostic nas benchmark with detailed information (training/validation/test loss/accuracy etc) of 15,625 architectures on three datasets. (Xuanyi Dong) [29/12/19]
  42. NPRgeneral - A standardized collection of images for evaluating image stylization algorithms. (David Mould, Paul Rosin) [Before 28/12/19]
  43. nuTonomy scenes dataset (nuScenes) - The nuScenes dataset is a large-scale autonomous driving dataset. It features: Full sensor suite (1x LIDAR, 5x RADAR, 6x camera, IMU, GPS), 1000 scenes of 20s each, 1,440,000 camera images, 400,000 lidar sweeps, two diverse cities: Boston and Singapore, left versus right hand traffic, detailed map information, manual annotations for 25 object classes, 1.1M 3D bounding boxes annotated at 2Hz, attributes such as visibility, activity and pose. (Caesar et al) [Before 28/12/19]
  44. NYU Symmetry Database - 176 single-symmetry and 63 multiple-symmetry images (Marcelo Cicconet and Davi Geiger) [Before 28/12/19]
  45. OceanDark dataset - 100 low-lighting underwater images from underwater sites in the Northeast Pacific Ocean. 1400x1000 pixels, varying lighting and recording conditions (Ocean Networks Canada) [Before 28/12/19]
  46. OTCBVS Thermal Imagery Benchmark Dataset Collection (Ohio State Team) [Before 28/12/19]
  47. PAnorama Sparsely STructured Areas Datasets - the PASSTA datasets used for evaluation of the image alignment (Andreas Robinson) [Before 28/12/19]
  48. QMUL-OpenLogo - A logo detection benchmark for testing the model generalisation capability in detecting a variety of logo objects in natural scenes with the majority logo classes unlabelled. (QMUL Computer Vision Group) [Before 28/12/19]
  49. RESIDE (Realistic Single Image DEhazing) - The current largest-scale benchmark consisting of both synthetic and real-world hazy images, for image dehazing research. RESIDE highlights diverse data sources and image contents, and serves various training or evaluation purposes. (Boyi Li, Wenqi Ren, Dengpan Fu, Dacheng Tao, Dan Feng, Wenjun Zeng, Zhangyang Wang) [Before 28/12/19]
  50. Rijksmuseum Challenge 2014 - It consist of 100K art objects from the rijksmuseum and comes with an extensive xml files describing each object. (Thomas Mensink and Jan van Gemert) [Before 28/12/19]
  51. See in the Dark - 77 Gb of dark images (Chen, Chen, Xu, and Koltun) [Before 28/12/19]
  52. Smartphone Image Denoising Dataset (SIDD) - The Smartphone Image Denoising Dataset (SIDD) consists of about 30,000 noisy images with corresponding high-quality ground truth in both raw-RGB and sRGB spaces obtained from 10 scenes with different lighting conditions using five representative smartphone cameras. (Abdelrahman Abdelhamed, Stephen Lin, Michael S. Brown) [Before 28/12/19]
  53. Rendered WB dataset - 100,000+ rendered sRGB images with different white balance (WB) settings (Afifi, Price, Cohen, Brown) [29/12/19]
  54. Stanford Street View Image, Pose, and 3D Cities Dataset - a large scale dataset of street view images (25 million images and 118 matching image pairs) with their relative camera pose, 3D models of cities, and 3D metadata of images. (Zamir, Wekel, Agrawal, Malik, Savarese) [Before 28/12/19]
  55. TESTIMAGES - Huge and free collection of sample images designed for analysis and quality assessment of different kinds of displays (i.e. monitors, televisions and digital cinema projectors) and image processing techniques. (Nicola Asuni) [Before 28/12/19]
  56. Time-Lapse Hyperspectral Radiance Images of Natural Scenes - Four time-lapse sequences of 7-9 calibrated hyperspectral radiance images of natural scenes taken over the day. (Foster, D.H., Amano, K., & Nascimento, S.M.C.) [Before 28/12/19]
  57. Time-lapse hyperspectral radiance images - Four time-lapse sequences of 7-9 calibrated hyperspectral images of natural scenes, spectra at 10-nm intervals (David H. Foster, Kinjiro Amano, Sérgio M. C. Nascimento) [Before 28/12/19]
  58. Tiny Images Dataset - 79 million 32x32 color images (Fergus, Torralba, Freeman) [Before 28/12/19]
  59. TURBID Dataset - five different subsets of degraded images with their respective ground truth. Subsets Milk and DeepBlue have 20 images each and the subset Chlorophyll has 42 images (Amanda Duarte) [Before 28/12/19]
  60. UT Snap Angle 360° Dataset - A list of 360° videos of four activities (disney, parade, ski, concert) from youtube (Kristen Grauman, UT Austin) [Before 28/12/19]
  61. UT Snap Point Dataset - Human judgement on snap point quality of a subset of frames from UT Egocentric dataset and a newly collected mobile robot dataset (frames are also included) (Bo Xiong, Kristen Grauman, UT Austin) [Before 28/12/19]
  62. Visual Dialog - 120k human-human dialogs on COCO images, 10 rounds of QA per dialog (Das, Kottur, Gupta, Singh, Yadav, Moura, Parikh, Batra) [Before 28/12/19]
  63. Visual Question Answering - 254K images, 764K questions, ground truth (Agrawal, Lu, Antol, Mitchell, Zitnick, Batra, Parikh) [Before 28/12/19]
  64. Visual Question Generation - 15k images (including both object-centric and event-centric images), 75k natural questions asked about the images which can evoke further conversation (Nasrin Mostafazadeh, Ishan Misra, Jacob Devlin, Margaret Mitchell, Xiaodong He, Lucy Vanderwende) [Before 28/12/19]
  65. VQA Human Attention - 60k human attention maps for visual question answering i.e. where humans choose to look to answer questions about images (Das, Agrawal, Zitnick, Parikh, Batra) [Before 28/12/19]
  66. Wild Web tampered image dataset - A large collection of tampered images from Web and social media sources, including ground-truth annotation masks for tampering localization (Markos Zampoglou, Symeon Papadopoulos) [Before 28/12/19]
  67. YFCC100M: The New Data in Multimedia Research - This publicly available curated dataset of 100 million photos and videos is free and legal for all. (Bart Thomee, Yahoo Labs and Flickr in San Francisco,etc.) [Before 28/12/19]
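
Several of the restoration-oriented entries above (the low-light noise set, the Darmstadt Noise Dataset, SIDD, GOPRO and similar) pair degraded images with clean references, and results on such pairs are most often reported as PSNR. The snippet below is a minimal sketch of that computation, assuming 8-bit images loaded as NumPy arrays; it is illustrative and not the official evaluation code of any listed dataset.

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two images of identical shape.

    Assumes 8-bit data by default (max_val = 255); pass max_val=1.0 for
    images already scaled to [0, 1].
    """
    reference = reference.astype(np.float64)
    test = test.astype(np.float64)
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10((max_val ** 2) / mse)
```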

General RGBD and Depth Datasets

Note: there are 3D datasets elsewhere as well, e.g. in Objects, Scenes, and Actions.

See also: List of RGBD datasets.

  1. 3D60: 3D Vision Indoor Spherical Panoramas - A multimodal dataset of 360 spherical panoramas containing paired color images, depth and normal maps, as well as vertical and horizontal stereo pairs (with their assorted depth and normal maps as well) that can be used to train or evaluate a variety of 3D vision tasks. (Nikolaos Zioulis, Antonis Karakottas, Dimitrios Zarpalas, Petros Daras) [Before 28/12/19]
  2. 3D-Printed RGB-D Object Dataset - 5 objects with groundtruth CAD models and camera trajectories, recorded with various quality RGB-D sensors. (Siemens & TUM) [Before 28/12/19]
  3. 3DCOMET - 3DCOMET is a dataset for testing 3D data compression methods. (Miguel Cazorla, Javier Navarrete,Vicente Morell, Miguel Cazorla, Diego Viejo, Jose Garcia-Rodriguez, Sergio Orts.) [Before 28/12/19]
  4. 3D articulated body - 3D reconstruction of an articulated body with rotation and translation. Single camera, varying focal length. Every scene may have an articulated body moving. There are four kinds of data sets included. A sample reconstruction result is included which uses only four images of the scene. (Prof Jihun Park) [Before 28/12/19]
  5. A Dataset for Non-Rigid Reconstruction from RGB-D Data - Eight scenes for reconstructing non-rigid geometry from RGB-D data, each containing several hundred frames along with our results. (Matthias Innmann, Michael Zollhoefer, Matthias Niessner, Christian Theobalt, Marc Stamminger) [Before 28/12/19]
  6. A Large Dataset of Object Scans - 392 objects in 9 classes, hundreds of frames each (Choi, Zhou, Miller, Koltun) [Before 28/12/19]
  7. Articulated Object Challenge - 4 articulated objects consisting of rigid parts connected by 1D revolute and prismatic joints, 7000+ RGBD images with annotations for 6D pose estimation (Frank Michel, Alexander Krull, Eric Brachmann, Michael Y. Yang, Stefan Gumhold, Carsten Rother) [Before 28/12/19]
  8. BigBIRD - 100 objects, with 600 3D point clouds and 600 high-resolution color images per object, spanning all views (Singh, Sha, Narayan, Achim, Abbeel) [Before 28/12/19]
  9. CAESAR Civilian American and European Surface Anthropometry Resource Project - 4000 3D human body scans (SAE International) [Before 28/12/19]
  10. CIN 2D+3D object classification dataset - segmented color and depth images of objects from 18 categories of common household and office objects (Bjorn Browatzki et al) [Before 28/12/19]
  11. CoRBS - an RGB-D SLAM benchmark, providing the combination of real depth and color data together with a ground truth trajectory of the camera and a ground truth 3D model of the scene (Oliver Wasenmuller) [Before 28/12/19]
  12. CSIRO synthetic deforming people - synthetic RGBD dataset for evaluating non-rigid 3D reconstruction: 2 subjects and 4 camera trajectories (Elanattil and Moghadam) [Before 28/12/19]
  13. CTU Garment Folding Photo Dataset - Color and depth images from various stages of garment folding. (Sushkov R., Melkumov I., Smutný V. (Czech Technical University in Prague)) [Before 28/12/19]
  14. CTU Garment Sorting Dataset - Dataset of garment images, detailed stereo images, depth images and weights. (Petrik V., Wagner L. (Czech Technical University in Prague)) [Before 28/12/19]
  15. Clothing part dataset - The clothing part dataset consists of image and depth scans, acquired with a Kinect, of garments laying on a table, with over a thousand part annotations (collar, cuffs, hood, etc) using polygonal masks. (Arnau Ramisa, Guillem Alenyà, Francesc Moreno-Noguer and Carme Torras) [Before 28/12/19]
  16. Cornell-RGBD-Dataset - Office Scenes (Hema Koppula) [Before 28/12/19]
  17. CVSSP Dynamic RGBD Modelling 2015 - This dataset contains eight RGBD sequences of general dynamic scenes captured using the Kinect V1/V2 as well as two synthetic sequences. (Charles Malleson, CVSSP, University of Surrey) [Before 28/12/19]
  18. Deformable 3D Reconstruction Dataset - two single-stream RGB-D sequences of dynamically moving mechanical toys together with ground-truth 3D models in the canonical rest pose. (Siemens, TUM) [Before 28/12/19]
  19. Delft Windmill Interior and Exterior Laser Scanning Point Clouds (Beril Sirmacek) [Before 28/12/19]
  20. Diabetes60 - RGB-D images of 60 western dishes, home made. Data was recorded using a Microsoft Kinect V2. (Patrick Christ and Sebastian Schlecht) [Before 28/12/19]
  21. ETH3D - Benchmark for multi-view stereo and 3D reconstruction, covering a variety of indoor and outdoor scenes, with ground truth acquired by a high-precision laser scanner. (Thomas Schöps, Johannes L. Schönberger, Silvano Galliani, Torsten Sattler, Konrad Schindler, Marc Pollefeys, Andreas Geiger) [Before 28/12/19]
  22. EURECOM Kinect Face Database - 52 people, 2 sessions, 9 variations, 6 facial landmarks. (Jean-Luc DUGELAY et al) [Before 28/12/19]
  23. G4S meta rooms - RGB-D data 150 sweeps with 18 images per sweep. (John Folkesson et al.) [Before 28/12/19]
  24. Georgiatech-Metz Symphony Lake Dataset - 5 million RGBD outdoor images over 4 years from 121 surveys of a lakeshore. (Griffith and Pradalier) [Before 28/12/19]
  25. Goldfinch: GOogLe image-search Dataset for FINe grained CHallenges - a large-scale dataset for fine-grained bird (11K species), butterfly (14K species), aircraft (409 types), and dog (515 breeds) recognition. (Jonathan Krause, Benjamin Sapp, Andrew Howard, Howard Zhou, Alexander Toshev, Tom Duerig, James Philbin, Li Fei-Fei) [Before 28/12/19]
  26. Headspace dataset - The Headspace dataset is a set of 3D images of the full human head, consisting of 1519 subjects wearing tight fitting latex caps to reduce the effect of hairstyles. (Christian Duncan, Rachel Armstrong, Alder Hey Craniofacial Unit, Liverpool, UK) [Before 28/12/19]
  27. House3D - House3D is a virtual 3D environment which consists of thousands of indoor scenes equipped with a diverse set of scene types, layouts and objects sourced from the SUNCG dataset. It consists of over 45k indoor 3D scenes, ranging from studios to two-storied houses with swimming pools and fitness rooms. All 3D objects are fully annotated with category labels. Agents in the environment have access to observations of multiple modalities, including RGB images, depth, segmentation masks and top-down 2D map views. The renderer runs at thousands of frames per second, making it suitable for large-scale RL training. (Yi Wu, Yuxin Wu, Georgia Gkioxari, Yuandong Tian, Facebook Research) [Before 28/12/19]
  28. IMPART multi-view/multi-modal 2D+3D film production dataset - LIDAR, video, 3D models, spherical camera, RGBD, stereo, action, facial expressions, etc. (Univ. of Surrey) [Before 28/12/19]
  29. Industrial 3D Object Detection Dataset (MVTec ITODD) - depth and gray value data of 28 objects in 3500 labeled scenes for 3D object detection and pose estimation with a strong focus on industrial settings and applications (MVTec Software GmbH, Munich) [Before 28/12/19]
  30. Kinect v2 Dataset - Efficient Multi-Frequency Phase Unwrapping using Kernel Density Estimation (Felix etc.) [Before 28/12/19]
  31. KOMATSUNA dataset - The dataset is designed for instance segmentation, tracking and reconstruction of leaves, using both sequential multi-view RGB images and depth images. (Hideaki Uchiyama, Kyushu University) [Before 28/12/19]
  32. Make3D Laser+Image data - about 1000 RGB outdoor images with aligned laser depth images (Saxena, Chung, Ng, Sun) [Before 28/12/19]
  33. McGill-Reparti Artificial Perception Database - RGBD data from four cameras and unfiltered Vicon skeletal data of two human subjects performing simulated assembly tasks on a car door (Andrew Phan, Olivier St-Martin Cormier, Denis Ouellet, Frank P. Ferrie). [Before 28/12/19]
  34. Meta rooms - RGB-D data comprised of 28 aligned depth camera images collected by having a robot go to a specific place and do a 360-degree pan with various tilts. (John Folkesson et al.) [Before 28/12/19]
  35. METU Multi-Modal Stereo Datasets ("Benchmark Datasets for Multi-Modal Stereo-Vision") - The METU Multi-Modal Stereo Datasets include benchmark datasets for multi-modal stereo vision, composed of two datasets: (1) synthetically altered stereo image pairs from the Middlebury Stereo Evaluation Dataset and (2) visible-infrared image pairs captured from a Kinect device. (Dr. Mustafa Yaman, Dr. Sinan Kalkan) [Before 28/12/19]
  36. MHT RGB-D - collected by a robot every 5 min over 16 days by the University of Lincoln. (John Folkesson et al.) [Before 28/12/19]
  37. Moving INfants In RGB-D (MINI-RGBD) - A synthetic, realistic RGB-D data set for infant pose estimation containing 12 sequences of moving infants with ground truth joint positions. (N. Hesse, C. Bodensteiner, M. Arens, U. G. Hofmann, R. Weinberger, A. S. Schroeder) [Before 28/12/19]
  38. Multi-sensor 3D Object Dataset for Object Recognition with Full Pose Estimation (Alberto Garcia-Garcia, Sergio Orts-Escolano, Sergiu Oprea, et al.) [Before 28/12/19]
  39. NTU RGB+D Action Recognition Dataset - NTU RGB+D is a large scale dataset for human action recognition(Amir Shahroudy) [Before 28/12/19]
  40. nuTonomy scenes dataset (nuScenes) - The nuScenes dataset is a large-scale autonomous driving dataset. It features: Full sensor suite (1x LIDAR, 5x RADAR, 6x camera, IMU, GPS), 1000 scenes of 20s each, 1,440,000 camera images, 400,000 lidar sweeps, two diverse cities: Boston and Singapore, left versus right hand traffic, detailed map information, manual annotations for 25 object classes, 1.1M 3D bounding boxes annotated at 2Hz, attributes such as visibility, activity and pose. (Caesar et al) [Before 28/12/19]
  41. NYU Depth Dataset V2 - Indoor Segmentation and Support Inference from RGBD Images [Before 28/12/19]
  42. Oakland 3-D Point Cloud Dataset (Nicolas Vandapel) [Before 28/12/19]
  43. Pacman project - Synthetic RGB-D images of 400 objects from 20 classes. Generated from 3D mesh models (Vladislav Kramarev, Umit Rusen Aktas, Jeremy L. Wyatt.) [Before 28/12/19]
  44. Procedural Human Action Videos - This dataset contains about 40,000 videos for human action recognition that were generated using a 3D game engine. The dataset contains about 6 million frames which can be used to train and evaluate models not only for action recognition but also for depth map estimation, optical flow, instance segmentation, semantic segmentation, 3D and 2D pose estimation, and attribute learning. (Cesar Roberto de Souza) [Before 28/12/19]
  45. RGB-D-based Action Recognition Datasets - Paper that includes the list and links of different rgb-d action recognition datasets. (Jing Zhang, Wanqing Li, Philip O. Ogunbona, Pichao Wang, Chang Tang) [Before 28/12/19]
  46. RGB-D Part Affordance Dataset - RGB-D images and ground-truth affordance labels for 105 kitchen, workshop and garden tools, and 3 cluttered scenes (Myers, Teo, Fermuller, Aloimonos) [Before 28/12/19]
  47. ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes - ScanNet is a dataset of richly-annotated RGB-D scans of real-world environments containing 2.5M RGB-D images in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-level semantic segmentations. (Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Niessner) [Before 28/12/19]
  48. SceneNN: A Scene Meshes Dataset with aNNotations - RGB-D scene dataset with 100+ indoor scenes, labeled triangular mesh, voxel and pixel. (Hua, Pham, Nguyen, Tran, Yu, and Yeung) [Before 28/12/19]
  49. Semantic-8: 3D point cloud classification with 8 classes (ETH Zurich) [Before 28/12/19]
  50. Small office data sets - Kinect depth images every 5 seconds beginning in April 2014 and on-going. (John Folkesson et al.) [Before 28/12/19]
  51. Stereo and ToF dataset with ground truth - The dataset contains 5 different scenes acquired with a Time-of-flight sensor and a stereo setup. Ground truth information is also provided. (Carlo Dal Mutto, Pietro Zanuttigh, Guido M. Cortelazzo) [Before 28/12/19]
  52. SYNTHIA - Large set (~half million) of virtual-world images for training autonomous cars to see. (ADAS Group at Computer Vision Center) [Before 28/12/19]
  53. Taskonomy - Over 4.5 million real images each with ground truth for 25 semantic, 2D, and 3D tasks. (Zamir, Sax, Shen, Guibas, Malik, Savarese) [Before 28/12/19]
  54. TAU Agent Dataset - a high-resolution RGB-D dataset, created using Blender. Contains 530 high-resolution RGB images with corresponding pixel-wise ground truth depth maps (Haim, Elmalem, Giryes, Bronstein, and Marom) [30/12/19]
  55. THU-READ (Tsinghua University RGB-D Egocentric Action Dataset) - THU-READ is a large-scale dataset for action recognition in RGBD videos with pixel-level hand annotation. (Yansong Tang, Yi Tian, Jiwen Lu, Jianjiang Feng, Jie Zhou) [Before 28/12/19]
  56. TUM RGB-D Benchmark - Dataset and benchmark for the evaluation of RGB-D visual odometry and SLAM algorithms; see the depth back-projection sketch after this list (Jürgen Sturm, Nikolas Engelhard, Felix Endres, Wolfram Burgard and Daniel Cremers) [Before 28/12/19]
  57. UC-3D Motion Database - Available data types encompass high resolution Motion Capture, acquired with MVN Suit from Xsens and Microsoft Kinect RGB and depth images. (Institute of Systems and Robotics, Coimbra, Portugal) [Before 28/12/19]
  58. Uni Bremen Open, Abdominal Surgery RGB Dataset - Recording of a complete, open, abdominal surgery using a Kinect v2 that was mounted directly above the patient looking down at patient and staff. (Joern Teuber, Gabriel Zachmann, University of Bremen) [Before 28/12/19]
  59. USF Range Image Database - 400+ laser range finder and structured light camera images, many with ground truth segmentations (Adam et al.) [Before 28/12/19]
  60. Washington RGB-D Object Dataset - 300 common household objects and 14 scenes. (University of Washington and Intel Labs Seattle) [Before 28/12/19]
  61. Witham Wharf - For RGB-D of eight locations collect by robot every 10 min over ~10 days by the University of Lincoln. (John Folkesson et al.) [Before 28/12/19]
  62. York 3D Ear Dataset - The York 3D Ear Dataset is a set of 500 3D ear images, synthesized from detailed 2D landmarking, and available in both Matlab format (.mat) and PLY format (.ply). (Nick Pears, Hang Dai, Will Smith, University of York) [Before 28/12/19]
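
Most of the RGB-D entries above (for example the TUM RGB-D Benchmark, NYU Depth V2 and ScanNet) store depth as a scaled integer image together with pinhole camera intrinsics, and a common first step is back-projecting a depth map into a 3D point cloud. The sketch below illustrates that step under those assumptions; the 5000-units-per-metre scale follows the TUM RGB-D convention, other datasets often use 1000 (millimetres), and the function name and defaults are placeholders rather than any dataset's official tooling.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy, depth_scale=5000.0):
    """Back-project an H x W depth image into an N x 3 point cloud (camera frame).

    depth       -- integer depth image (e.g., 16-bit PNG as stored by TUM RGB-D)
    fx, fy      -- focal lengths in pixels
    cx, cy      -- principal point in pixels
    depth_scale -- depth units per metre (5000 for TUM RGB-D; often 1000 elsewhere)
    """
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    z = depth.astype(np.float64) / depth_scale          # depth in metres
    valid = z > 0                                       # zero marks missing depth
    x = (us - cx) * z / fx
    y = (vs - cy) * z / fy
    return np.stack([x[valid], y[valid], z[valid]], axis=-1)
```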

General Videos

  1. AlignMNIST - An artificially extended version of the MNIST handwritten digit dataset. (Søren Hauberg) [Before 28/12/19]
  2. Audio-Visual Event (AVE) dataset- AVE dataset contains 4143 YouTube videos covering 28 event categories and videos in AVE dataset are temporally labeled with audio-visual event boundaries. (Yapeng Tian, Jing Shi, Bochen Li, Zhiyao Duan, and Chenliang Xu) [Before 28/12/19]
  3. Dataset of Multimodal Semantic Egocentric Video (DoMSEV) - Labeled 80-hour Dataset of Multimodal Semantic Egocentric Videos (DoMSEV) covering a wide range of activities, scenarios, recorders, illumination and weather conditions. (UFMG, Michel Silva, Washington Ramos, João Ferreira, Felipe Chamone, Mario Campos, Erickson R. Nascimento) [Before 28/12/19]
  4. DAVIS: Video Object Segmentation dataset 2016 - A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation (F. Perazzi, J. Pont-Tuset, B. McWilliams, L. Van Gool, M. Gross, and A. Sorkine-Hornung) [Before 28/12/19]
  5. DAVIS: Video Object Segmentation dataset 2017 - The 2017 DAVIS Challenge on Video Object Segmentation (J. Pont-Tuset, F. Perazzi, S. Caelles, P. Arbelaez, A. Sorkine-Hornung, and L. Van Gool) [Before 28/12/19]
  6. EGO-CH - a large egocentric video dataset acquired by real visitors in two different cultural sites. The dataset includes more than 27 hours of video acquired by 70 different subjects. The overall dataset includes labels for 26 environments and over 200 Points of Interest (POIs). (Giovanni Maria Farinella) [31/12/19]
  7. FAIR-Play - 1,871 video clips (~5 hrs) and their corresponding binaural audio clips recorded in a music room (Gao and Grauman) [29/12/19]
  8. GoPro-Gyro Dataset - egocentric videos (Linkoping Computer Vision Laboratory) [Before 28/12/19]
  9. Image & Video Quality Assessment at LIVE - used to develop picture quality algorithms (the University of Texas at Austin) [Before 28/12/19]
  10. Large scale YouTube video dataset - 156,823 videos (2,907,447 keyframes) crawled from YouTube videos (Yi Yang) [Before 28/12/19]
  11. Movie Memorability Dataset - memorable movie clips and ground truth of detail memorability, 660 short movie excerpts extracted from 100 Hollywood-like movies (Cohendet, Yadati, Duong and Demarty) [Before 28/12/19]
  12. MovieQA - teach machines to understand stories by answering questions about them. 15000 multiple choice QAs, 400+ movies. (M. Tapaswi, Y. Zhu, R. Stiefelhagen, A. Torralba, R. Urtasun, and S. Fidler) [Before 28/12/19]
  13. Multispectral visible-NIR video sequences - Annotated multispectral video, visible + NIR (LE2I, Université de Bourgogne) [Before 28/12/19]
  14. Moments in Time Dataset - 1M 3-second videos annotated with action type, the largest dataset of its kind for action recognition and understanding in video. (Monfort, Oliva, et al.) [Before 28/12/19]
  15. Near duplicate video retrieval dataset - This database consists of 156,823 videos sequences (2,907,447 keyframes), which were crawled from YouTube during the period of July 2010 to September 2010. (Jingkuan Song, Yi Yang, Zi Huang, Heng Tao Shen, Richang Hong) [Before 28/12/19]
  16. PHD2: Personalized Highlight Detection Dataset - PHD2 is a dataset with personalized highlight information, which allows to train highlight detection models that use information about the user, when making predictions. (Ana Garcia del Molino, Michael Gygli) [Before 28/12/19]
  17. Sports-1M - Dataset for sports video classification containing 487 classes and 1.2M videos; see the frame-sampling sketch after this list (Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar and Li Fei-Fei) [Before 28/12/19]
  18. nuTonomy scenes dataset (nuScenes) - The nuScenes dataset is a large-scale autonomous driving dataset. It features: Full sensor suite (1x LIDAR, 5x RADAR, 6x camera, IMU, GPS), 1000 scenes of 20s each, 1,440,000 camera images, 400,000 lidar sweeps, two diverse cities: Boston and Singapore, left versus right hand traffic, detailed map information, manual annotations for 25 object classes, 1.1M 3D bounding boxes annotated at 2Hz, attributes such as visibility, activity and pose. (Caesar et al) [Before 28/12/19]
  19. REDS (REalistic and Dynamic Scenes) - high-quality realistic blurry video dataset with reference sharp frames (improved version of GOPRO) (Nah, Baik, Hong, Moon, Son, Timofte and Lee) [4/1/2020]
  20. Video Sequences - used for research on Euclidean upgrades based on minimal assumptions about the camera (Kenton McHenry) [Before 28/12/19]
  21. Video Stacking Dataset - A Virtual Tripod for Hand-held Video Stacking on Smartphones (Erik Ringaby etc.) [Before 28/12/19]
  22. VideoMem Dataset - The VideoMem or Video Memorability Database is a collection of sound-less video excerpts and their corresponding ground-truth memorability files. The memorability scores are computed based on the measurement of short-term and long-term memory performances when recognizing small video excerpts a few minutes after viewing them for the short-term case, and 24 to 72 hours later, for the long-term case. It is accompanied with video features extracted from the video excerpts. It is intended to be used for understanding the memorability of videos and for assessing the quality of methods for predicting the memorability of multimedia content. (Cohendet, Demarty, Duong and Engilberge) [6/1/20]
  23. YFCC100M videos - A benchmark on the video subset of YFCC100M which includes the videos, the video content features and the API to a state-of-the-art video content engine. (Lu Jiang) [Before 28/12/19]
  24. YFCC100M: The New Data in Multimedia Research - This publicly available curated dataset of 100 million photos and videos is free and legal for all. (Bart Thomee, Yahoo Labs and Flickr in San Francisco,etc.) [Before 28/12/19]
  25. YouTube-BoundingBoxes - 5.6 million accurate human-annotated BB from 23 object classes tracked across frames, from 240,000 YouTube videos, with a strong focus on the person class (1.3 million boxes) (Real, Shlens, Pan, Mazzocchi, Vanhoucke, Khan, Kakarla et al) [Before 28/12/19]
  26. YouTube-8M - Dataset for video classification in the wild, containing pre-extracted frame level features from 8M videos, and 4800 classes. (Sami Abu-El-Haija, Nisarg Kothari, Joonseok Lee, Paul Natsev,George Toderici, Balakrishnan Varadarajan, Sudheendra Vijayanarasimhan) [Before 28/12/19]
  27. YUP++ / Dynamic Scenes dataset - 20 outdoor scene classes, each with 60 colour videos (each 5 seconds, 480 pixels wide, 24-30 fps) from 60 different scenes. Half of the videos are with a static camera and half with a moving camera (Feichtenhofer, Pinz, Wildes) [Before 28/12/19]
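
For the large video classification entries above (for example Sports-1M and the Moments in Time Dataset), a typical preprocessing step is to sample a fixed number of frames per clip before feature extraction or training. The sketch below shows one way to do this with OpenCV; the function name and the choice of 16 frames are illustrative assumptions, not part of any listed dataset's pipeline.

```python
import cv2
import numpy as np

def sample_frames(video_path, num_frames=16):
    """Uniformly sample num_frames RGB frames from a video file."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # OpenCV loads BGR
    cap.release()
    return np.stack(frames) if frames else np.empty((0,))
```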
