I. Datasets for inpainting
1. Places2 dataset. Places365-Standard contains 1.8 million training images across 365 scene categories, with 50 images per category for validation and 900 per category for testing. It is available both in high resolution and as 256x256 low-resolution images.
Places2: A Large-Scale Database for Scene Understanding: http://places2.csail.mit.edu/download.html
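Inpainting pipelines built on Places2 typically pair each training image with a randomly placed hole mask. A minimal sketch of that masking step (numpy only; the 256x256 size and the rectangular hole are illustrative assumptions, not part of the dataset itself):

```python
import numpy as np

def random_hole_mask(h=256, w=256, hole=64, rng=None):
    """Binary mask: 1 = known pixel, 0 = hole to be inpainted."""
    rng = rng if rng is not None else np.random.default_rng()
    mask = np.ones((h, w), dtype=np.float32)
    top = rng.integers(0, h - hole)
    left = rng.integers(0, w - hole)
    mask[top:top + hole, left:left + hole] = 0.0
    return mask

def apply_mask(image, mask):
    """Zero out the hole region of an HxWx3 image array."""
    return image * mask[..., None]
```

Real inpainting methods often use free-form (brush-stroke) masks instead of rectangles; the pattern of pairing image and mask stays the same.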
2. CelebA dataset
CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversity, large quantity, and rich annotations, including
10,177 identities,
202,599 face images, and
5 landmark locations and 40 binary attribute annotations per image.
The dataset can be employed as the training and test sets for the following computer vision tasks: face attribute recognition, face recognition, face detection, landmark (or facial part) localization, and face editing & synthesis.
http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
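The binary attributes ship as a plain-text file (list_attr_celeba.txt): the first line is the image count, the second lists the 40 attribute names, and each following line gives a filename and +1/-1 flags. A minimal parser sketch (file layout as in the official release; verify against your copy):

```python
def parse_celeba_attrs(text):
    """Return (attr_names, {filename: {attr: bool}}) from list_attr_celeba.txt content."""
    lines = text.strip().splitlines()
    n = int(lines[0])          # first line: number of images
    names = lines[1].split()   # second line: attribute names
    table = {}
    for line in lines[2:2 + n]:
        parts = line.split()
        flags = [int(v) > 0 for v in parts[1:]]  # +1 -> True, -1 -> False
        table[parts[0]] = dict(zip(names, flags))
    return names, table
```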
II. Super-resolution (SR)
1. DIV2K dataset
We are making available a large, newly collected dataset, DIV2K, of RGB images with a large diversity of content.
The DIV2K dataset is divided into 800 training images, 100 validation images, and 100 test images.
DIV2K dataset: https://data.vision.ee.ethz.ch/cvl/DIV2K/
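In the official release, HR images are numbered 0001.png onward, and each bicubic LR image appends the scale to the stem (e.g. 0001x4.png under DIV2K_train_LR_bicubic/X4). A small sketch that builds HR/LR path pairs under that naming assumption (check it against the archives you actually download):

```python
import os

def div2k_pairs(root, split="train", scale=4, indices=range(1, 801)):
    """Yield (hr_path, lr_path) tuples for the DIV2K bicubic track."""
    hr_dir = os.path.join(root, f"DIV2K_{split}_HR")
    lr_dir = os.path.join(root, f"DIV2K_{split}_LR_bicubic", f"X{scale}")
    for i in indices:
        yield (os.path.join(hr_dir, f"{i:04d}.png"),
               os.path.join(lr_dir, f"{i:04d}x{scale}.png"))
```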
2. Flickr2K
https://cv.snu.ac.kr/research/EDSR/Flickr2k.tar
3. Manga109
This data set (hereafter referred to as Manga109) has been compiled by the Aizawa Yamasaki Matsui Laboratory, Department of Information and Communication Engineering, the Graduate School of Information Science and Technology, the University of Tokyo. The compilation is intended for use in academic research on the media processing of Japanese manga. Manga109 is composed of 109 manga volumes drawn by professional manga artists in Japan. These manga were commercially made available to the public between the 1970s and 2010s, and encompass a wide range of target readerships and genres (see the table in Explore for further details.) Most of the manga in the compilation are available at the manga library “Manga Library Z” (formerly the “Zeppan Manga Toshokan” library of out-of-print manga).
Manga109: http://www.manga109.org/en/
III. Classification, recognition, detection, and segmentation
1. MNIST
The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.
It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting.
MNIST handwritten digit database (Yann LeCun, Corinna Cortes, and Chris Burges): http://yann.lecun.com/exdb/mnist/
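The files are stored in the simple IDX format documented on that page: a big-endian header (magic number, counts, dimensions) followed by raw uint8 values. A minimal parser sketch (numpy assumed; the distributed files are gzipped, so read them with gzip.open(...).read() first):

```python
import struct
import numpy as np

def parse_idx_images(raw):
    """Parse an MNIST IDX image file already read into bytes."""
    magic, n, rows, cols = struct.unpack(">IIII", raw[:16])
    assert magic == 2051, "not an IDX image file"
    return np.frombuffer(raw, dtype=np.uint8, offset=16).reshape(n, rows, cols)

def parse_idx_labels(raw):
    """Parse an MNIST IDX label file already read into bytes."""
    magic, n = struct.unpack(">II", raw[:8])
    assert magic == 2049, "not an IDX label file"
    return np.frombuffer(raw, dtype=np.uint8, offset=8)
```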
2. The CIFAR-10/100 datasets
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.
CIFAR-100 is just like CIFAR-10, except it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. The 100 classes in CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs).
CIFAR-10 and CIFAR-100 datasets: http://www.cs.toronto.edu/~kriz/cifar.html
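As documented on the CIFAR page, each batch file is a Python pickle holding a dict with a b'data' array (N x 3072 uint8: 1024 red, then 1024 green, then 1024 blue values per image) and a b'labels' list. A loading sketch (numpy assumed):

```python
import pickle
import numpy as np

def load_cifar_batch(path):
    with open(path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")  # keys are bytes under Python 3
    # Reshape the flat rows into NCHW, then transpose to NHWC for viewing.
    images = batch[b"data"].reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    return images, batch[b"labels"]
```

For CIFAR-100 the label keys are b'fine_labels' and b'coarse_labels' instead of b'labels'.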
3. COCO
COCO is a large-scale object detection, segmentation, and captioning dataset. COCO has several features:
Object segmentation
Recognition in context
Superpixel stuff segmentation
330K images (>200K labeled)
1.5 million object instances
80 object categories
91 stuff categories
5 captions per image
250,000 people with keypoints
COCO - Common Objects in Context: https://cocodataset.org/#home
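COCO annotations are distributed as a single JSON file whose top-level images, annotations, and categories arrays are linked by ids; the usual entry point is the pycocotools COCO API, which wraps exactly this indexing. A standard-library sketch of that join:

```python
import json
from collections import defaultdict

def index_coco(ann_json):
    """Map image_id -> list of its annotation dicts, plus category_id -> name."""
    coco = json.loads(ann_json) if isinstance(ann_json, str) else ann_json
    cats = {c["id"]: c["name"] for c in coco["categories"]}
    by_image = defaultdict(list)
    for ann in coco["annotations"]:
        by_image[ann["image_id"]].append(ann)
    return by_image, cats
```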
4. ImageNet
The most highly-used subset of ImageNet is the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012-2017 image classification and localization dataset. This dataset spans 1000 object classes and contains 1,281,167 training images, 50,000 validation images and 100,000 test images. This subset is available on Kaggle.
ImageNet: https://image-net.org/download.php
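Models trained on this subset almost universally use a fixed evaluation preprocessing: resize, 224x224 center crop, then per-channel normalization. The mean/std constants below come from common pretrained-model recipes, not from the dataset itself; a numpy sketch of the crop-and-normalize step:

```python
import numpy as np

# Channel statistics conventionally used for ILSVRC-trained models (an assumption
# of the pretrained-model ecosystem, not shipped with the dataset).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def center_crop_normalize(img, size=224):
    """img: HxWx3 uint8 array. Center-crop to size x size and normalize per channel."""
    h, w, _ = img.shape
    top, left = (h - size) // 2, (w - size) // 2
    crop = img[top:top + size, left:left + size].astype(np.float32) / 255.0
    return (crop - IMAGENET_MEAN) / IMAGENET_STD
```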
5. Pascal VOC
The Pascal VOC challenge is a very popular dataset for building and evaluating algorithms for image classification, object detection, and segmentation. However, the official website is down very frequently.
Pascal VOC Dataset Mirror, which hosts the files for download at a reasonable rate: https://pjreddie.com/projects/pascal-voc-dataset-mirror/
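Pascal VOC labels each image with an XML file listing its objects and bounding boxes. A parsing sketch using only the standard library (tag names follow the standard VOC annotation layout):

```python
import xml.etree.ElementTree as ET

def parse_voc(xml_text):
    """Return [(class_name, (xmin, ymin, xmax, ymax)), ...] from a VOC annotation."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((name, coords))
    return objects
```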
6. PASCAL-Context dataset
This dataset is a set of additional annotations for PASCAL VOC 2010. It goes beyond the original PASCAL semantic segmentation task by providing annotations for the whole scene. The statistics section has a full list of 400+ labels.
PASCAL-Context Dataset: https://cs.stanford.edu/~roozbeh/pascal-context/
7. Fashion-MNIST
Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Zalando intends Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.
The original MNIST dataset contains a lot of handwritten digits. Members of the AI/ML/Data Science community love this dataset and use it as a benchmark to validate their algorithms. In fact, MNIST is often the first dataset researchers try. "If it doesn't work on MNIST, it won't work at all", they said. "Well, if it does work on MNIST, it may still fail on others."
Zalando seeks to replace the original MNIST dataset with Fashion-MNIST.
Fashion MNIST | Kaggle (an MNIST-like dataset of 70,000 28x28 labeled fashion images): https://www.kaggle.com/zalando-research/fashionmnist
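Because the files reuse MNIST's IDX format and split sizes, existing MNIST loaders work unchanged; only the label semantics differ. The ten class names, as given in the Fashion-MNIST README:

```python
# Label ids 0-9 as documented in the Fashion-MNIST README.
FASHION_MNIST_CLASSES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def class_name(label):
    """Map an integer label (0-9) to its class name."""
    return FASHION_MNIST_CLASSES[label]
```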
8. Caltech 101
Pictures of objects belonging to 101 categories, with about 40 to 800 images per category; most categories have about 50 images. Collected in September 2003 by Fei-Fei Li, Marco Andreetto, and Marc'Aurelio Ranzato. The size of each image is roughly 300 x 200 pixels.
We have carefully clicked outlines of each object in these pictures; these are included in 'Annotations.tar'. There is also a MATLAB script to view the annotations, 'show_annotations.m'.
Caltech 101: http://www.vision.caltech.edu/Image_Datasets/Caltech101/
9. Helen dataset (facial feature localization)
In our effort to build a facial feature localization algorithm that can operate reliably and accurately under a broad range of appearance variation, including pose, lighting, expression, occlusion, and individual differences, we realized the training set must include high-resolution examples so that, at test time, a high-resolution test image can be fit accurately. Although a number of face databases exist, we found none that met our requirements, particularly the resolution requirement.
http://www.ifp.illinois.edu/~vuongle2/helen/
10. LFW: Labeled Faces in the Wild (face recognition)
Labeled Faces in the Wild is a public benchmark for face verification, also known as pair matching. No matter what the performance of an algorithm on LFW, it should not be used to conclude that the algorithm is suitable for any commercial purpose. There are many reasons for this.
For all of these reasons, we would like to emphasize that LFW was published to help the research community make advances in face verification, not to provide a thorough vetting of commercial algorithms before deployment.
http://vis-www.cs.umass.edu/lfw/
11. The Cityscapes Dataset
We present a new large-scale dataset that contains a diverse set of stereo video sequences recorded in street scenes from 50 different cities, with high-quality pixel-level annotations of 5,000 frames, in addition to a larger set of 20,000 weakly annotated frames. The dataset is thus an order of magnitude larger than similar previous attempts.
Cityscapes Dataset – Semantic Understanding of Urban Street Scenes: https://www.cityscapes-dataset.com
12. ADE20K
The annotated images cover the scene categories from the SUN and Places databases, and each image comes with object segmentations and part segmentations.
ADE20K dataset: http://groups.csail.mit.edu/vision/datasets/ADE20K/
13. BSDS500
This new dataset is an extension of the BSDS300, where the original 300 images are used for training/validation and 200 fresh images, together with human annotations, are added for testing. Each image was segmented by five different subjects on average. Performance is evaluated by measuring precision/recall on detected boundaries and three additional region-based metrics.
UC Berkeley Computer Vision Group - Contour Detection and Image Segmentation - Resources
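Boundary benchmarks on BSDS500 summarize the precision/recall curve by the F-measure, the harmonic mean of the two. For reference:

```python
def f_measure(precision, recall):
    """F = 2PR / (P + R), the harmonic mean of boundary precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```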
IV. Photo effects and style
1. MIT-Adobe FiveK Dataset
We collected 5,000 photographs taken with SLR cameras by a set of different photographers. They are all in RAW format; that is, all the information recorded by the camera sensor is preserved. We made sure that these photographs cover a broad range of scenes, subjects, and lighting conditions. We then hired five photography students from an art school to adjust the tone of the photos. Each of them retouched all 5,000 photos using software dedicated to photo adjustment (Adobe Lightroom), on which they were extensively trained. We asked the retouchers to achieve visually pleasing renditions, akin to a postcard. The retouchers were compensated for their work.
MIT-Adobe FiveK dataset: https://data.csail.mit.edu/graphics/fivek/