Dataset
Overview
Structure
Image format
Annotation format
Result format
Pre-calculated features
HOG features
The file contains three sets of differently configured HOG (Histograms of Oriented Gradients) features. The sets contain feature vectors of length 1568, 1568, and 2916, respectively. The features were calculated using the source code from http://pascal.inrialpes.fr/soft/olt/. For detailed information on HOG, see:
N. Dalal and B. Triggs. Histograms of Oriented Gradients for Human Detection. IEEE Conference on Computer Vision and Pattern Recognition, pages 886-893, 2005
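To make the vector lengths above more tangible, the following sketch counts the entries of a HOG descriptor built from overlapping blocks of cells. The window size, cell size, block size, and bin counts used here are assumptions chosen for illustration. This page does not state the actual configurations, and other parameter choices can yield the same lengths.

```python
def hog_length(window, cell, block_cells, bins, stride_cells=1):
    """Length of a HOG descriptor for a square window.

    window: window side in pixels; cell: cell side in pixels;
    block_cells: block side in cells; stride_cells: block step in cells.
    """
    cells_per_side = window // cell
    blocks_per_side = (cells_per_side - block_cells) // stride_cells + 1
    # Every block contributes one histogram per cell it contains.
    return blocks_per_side ** 2 * block_cells ** 2 * bins


# Hypothetical parameter choices (not taken from this page):
print(hog_length(window=40, cell=5, block_cells=2, bins=8))  # 1568
print(hog_length(window=40, cell=4, block_cells=2, bins=9))  # 2916
```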
Haar-like features
The file contains one set of Haar-like features. For each image, 5 different types of Haar-like features were computed in different sizes for a total of 12 different features. The overall feature vector contains 11,584 features.
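As a rough illustration of what a single Haar-like feature measures, the sketch below evaluates a horizontal two-rectangle feature with an integral image (summed-area table). It is a generic textbook construction, not the code used to produce the file. The function names and the tiny sample image are made up for this example, and the five feature types and sizes mentioned above are not reproduced.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Left half minus right half of a w x h window (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Made-up 2 x 4 grayscale image: bright left half, dark right half.
img = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(img)
print(haar_two_rect_horizontal(ii, 0, 0, 4, 2))  # 32
```

Each rectangle sum needs only four table lookups, which is why features of many sizes can be evaluated cheaply once the table has been built.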
Hue Histograms
For each image in the training set, the file contains a 256-bin histogram of hue values (HSV color space).
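A minimal sketch of such a histogram, using only the Python standard library, is shown below. The binning scheme (hue scaled linearly to 256 bins) and the handling of achromatic pixels (which end up in bin 0 here) are assumptions; the page does not say how the provided histograms treat these cases.

```python
import colorsys


def hue_histogram(pixels, bins=256):
    """Histogram of hue values for an iterable of (r, g, b) tuples in 0..255."""
    hist = [0] * bins
    for r, g, b in pixels:
        h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # h lies in [0, 1); the clamp guards against a value of exactly 1.0.
        index = min(int(h * bins), bins - 1)
        hist[index] += 1
    return hist


# Made-up pixels: two red, one green, one blue.
pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 0, 0)]
hist = hue_histogram(pixels)
print(hist[0], hist[85], hist[170])  # 2 1 1
```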
Code snippets
Matlab
The Matlab example code provides functions to iterate over the datasets (both training and test) to read the images and the corresponding annotations.
Locations where you can easily hook in your training or classification method are marked in the code by dummy function calls.
Please have a look at the file Readme.txt in the ZIP file for more details.
C++
The C++ example code demonstrates how to train a linear classifier (LDA) using the Shark machine learning library.
This code uses the precalculated features. It was used to generate the baseline results.
Please have a look at the file Readme.txt in the ZIP file for more details.
Python
The Python example code provides a function to iterate over the training set to read the images and the corresponding class id.
The code depends on matplotlib. Please have a look at the file Readme.txt in the ZIP file for more details.
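For orientation, the sketch below shows how image filenames and class ids could be read from one annotation file. It is not the provided example code. It assumes a semicolon-separated file with a header row that contains 'Filename' and 'ClassId' columns, and the sample rows are made up. Loading the images themselves (for which the provided code uses matplotlib) is left out.

```python
import csv
import io


def read_annotations(csv_file):
    """Return a list of (filename, class_id) pairs from an annotation file.

    Assumes a semicolon-separated file with a header row containing
    'Filename' and 'ClassId' columns (an assumption, not confirmed here).
    """
    reader = csv.DictReader(csv_file, delimiter=";")
    return [(row["Filename"], int(row["ClassId"])) for row in reader]


# Made-up sample rows for illustration only.
sample = io.StringIO(
    "Filename;Width;Height;ClassId\n"
    "00000_00000.ppm;29;30;0\n"
    "00000_00001.ppm;30;30;0\n"
)
for filename, class_id in read_annotations(sample):
    print(filename, class_id)
```

With a real file, the `io.StringIO` object would be replaced by an open file handle, for example `open(path, newline="")`.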
Citation
The data is free to use. However, we cordially ask you to cite the following publication if you do:
J. Stallkamp, M. Schlipsing, J. Salmen, and C. Igel. The German Traffic Sign Recognition Benchmark: A multi-class classification competition. In Proceedings of the IEEE International Joint Conference on Neural Networks, pages 1453–1460. 2011.
@inproceedings{Stallkamp-IJCNN-2011,
    author = {Johannes Stallkamp and Marc Schlipsing and Jan Salmen and Christian Igel},
    booktitle = {IEEE International Joint Conference on Neural Networks},
    title = {The {G}erman {T}raffic {S}ign {R}ecognition {B}enchmark: A multi-class classification competition},
    year = {2011},
    pages = {1453--1460}
}
Thank you.
Result Analysis Application
We provide a simple application to facilitate result analysis. It allows you to compare different approaches, analyse the confusion matrices, and inspect which images were classified correctly. The software is supplied under GPLv2. It depends on Qt 4.7, which is available here in source code and binary form. Qt is licensed under LGPL. Qt is a trademark of Nokia Corporation.
The software is provided as source code and Win32 binary. The files can be found in the download section. The code is platform-independent, but it has only been tested on Microsoft Windows with Visual Studio, so there may be a few remaining issues where GCC is stricter than Visual Studio. We appreciate any comments, patches, and bug reports.
The project uses CMake, an open-source, cross-platform build system which allows you to generate project files/makefiles for your preferred compiler toolchain.
Here are some screenshots to get an idea of this tool.
Main window: Performance of one or more approaches
Compare multiple approaches and see on which images they erred
Confusion matrix: See which classes got confused.
Clicking cells, rows or columns in the confusion matrix shows which images were misclassified.
Here: All "Speed limit 60" images that were incorrectly classified as some other class.
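The kind of analysis shown in these screenshots can be approximated in a few lines. The sketch below counts (true class, predicted class) pairs and lists the images of one class that were assigned a different class. It is independent of the application itself, and the label lists and function names are hypothetical.

```python
from collections import Counter


def confusion_counts(true_ids, predicted_ids):
    """Count (true class, predicted class) pairs."""
    return Counter(zip(true_ids, predicted_ids))


def misclassified(true_ids, predicted_ids, true_class):
    """Indices of images of true_class that were assigned another class."""
    return [
        i
        for i, (t, p) in enumerate(zip(true_ids, predicted_ids))
        if t == true_class and p != true_class
    ]


# Hypothetical labels for six images.
true_ids = [3, 3, 3, 5, 5, 1]
predicted_ids = [3, 5, 2, 5, 5, 3]

counts = confusion_counts(true_ids, predicted_ids)
print(counts[(3, 5)])                             # 1
print(misclassified(true_ids, predicted_ids, 3))  # [1, 2]
```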
Downloads
Training dataset
This is the official GTSRB training set. If you intend to participate in the final competition session at IJCNN 2011, or if you want to publish experimental results based on GTSRB data, you must use this dataset for training.
The training data set contains 39,209 training images in 43 classes.
- Images and annotations: Download (263 MB)
- Three sets of different HOG features: Download (870 MB)
- Haar-like features: Download (944 MB)
- Hue histograms: Download (17 MB)
Test dataset
This is the official GTSRB test set. It was first published at IJCNN 2011 during the special session "Traffic Sign Recognition for Machine Learning". All experimental results that are reported on GTSRB data must use this dataset for testing (apart from the ones already published at IJCNN 2011). The structure of the dataset follows the test set that was published for the online competition (and is now part of the training data).
The test dataset contains 12,630 test images or the corresponding pre-calculated features in random order.
- Images and annotations: Download (84 MB)
- Three sets of different HOG features: Download (278 MB)
- Haar-like features: Download (304 MB)
- Hue histograms: Download (5 MB)
- Extended annotations including class ids: Download (98 kB)