I maintain this page to record things about computer vision that I have read, am working on, or plan to look at. I used to write short notes on the papers I read, which is a good way to remember and understand the authors' ideas. But I gradually found that I forgot much of what I had learnt, because besides papers I also draw knowledge from others' blogs, online courses, and reports, without recording it anywhere. I also need a place to keep a list of things that I should look at but cannot at the time I discover them. This page will serve much like a catalog.
Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells (PDF)
Deep Learning for Semantic Segmentation on Minimal Hardware (PDF)
TernausNetV2: Fully Convolutional Network for Instance Segmentation (PDF, Project/Code)
Stacked U-Nets: A No-Frills Approach to Natural Image Segmentation (PDF, Project/Code)
Deep Object Co-Segmentation (PDF)
Fusing Hierarchical Convolutional Features for Human Body Segmentation and Clothing Fashion Classification (PDF)
ShuffleSeg: Real-time Semantic Segmentation Network (PDF)
Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation (PDF, Project/Code)
Learning random-walk label propagation for weakly-supervised semantic segmentation (PDF)
Panoptic Segmentation (PDF, Reading Note)
Learning to Segment Every Thing (PDF, Project/Code)
Deep Extreme Cut: From Extreme Points to Object Segmentation (PDF)
Instance-aware Semantic Segmentation via Multi-task Network Cascades (PDF, Project/Code)
ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation (PDF, Reading Note)
Learning Deconvolution Network for Semantic Segmentation (PDF, Reading Note)
Semantic Object Parsing with Graph LSTM (PDF, Reading Note)
Bayesian SegNet: Model Uncertainty in Deep Convolutional Encoder-Decoder Architectures for Scene Understanding (PDF, Reading Note)
Learning to Segment Moving Objects in Videos (PDF, Reading Note)
Deep Structured Features for Semantic Segmentation (PDF)
We propose a highly structured neural network architecture for semantic segmentation of images that combines i) a Haar wavelet-based tree-like convolutional neural network (CNN), ii) a random layer realizing a radial basis function kernel approximation, and iii) a linear classifier. While stages i) and ii) are completely pre-specified, only the linear classifier is learned from data. Thanks to its high degree of structure, our architecture has a very small memory footprint and thus fits onto low-power embedded and mobile platforms. We apply the proposed architecture to outdoor scene and aerial image semantic segmentation and show that the accuracy of our architecture is competitive with conventional pixel classification CNNs. Furthermore, we demonstrate that the proposed architecture is data efficient in the sense of matching the accuracy of pixel classification CNNs when trained on a much smaller data set.
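Stage ii), the random layer realizing an RBF kernel approximation, is essentially the random Fourier features construction. A minimal NumPy sketch of that idea (my own illustration, not the paper's code; dimensions and bandwidth are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d, D = 5, 2000   # input dimension, number of random features
gamma = 0.5      # RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)

# Random Fourier features: sample frequencies from the kernel's
# spectral density N(0, 2*gamma*I) and phases uniformly from [0, 2*pi).
W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))
b = rng.uniform(0, 2 * np.pi, size=D)

def rbf_features(X):
    # z(x) such that z(x) . z(y) approximates k(x, y)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = 0.3 * rng.normal(size=(4, d))
Z = rbf_features(X)
approx = Z @ Z.T                                    # approximate kernel matrix
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
exact = np.exp(-gamma * sq_dists)                   # exact kernel matrix
```

Because the random layer is fixed, only the linear classifier on top of `Z` needs to be trained, which is what gives the architecture its small memory footprint.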
CNN-aware Binary Map for General Semantic Segmentation (PDF)
Learning to Refine Object Segments (PDF)
Clockwork Convnets for Video Semantic Segmentation (PDF, Project/Code)
Convolutional Gated Recurrent Networks for Video Segmentation (PDF)
Efficient Convolutional Neural Network with Binary Quantization Layer (PDF)
One-Shot Video Object Segmentation (PDF)
Fully Convolutional Instance-aware Semantic Segmentation (PDF, Project/Code, Reading Note)
Semantic Segmentation using Adversarial Networks (PDF)
Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes (PDF)
Deep Watershed Transform for Instance Segmentation (PDF)
InstanceCut: from Edges to Instances with MultiCut (PDF)
The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation (PDF)
Improving Fully Convolution Network for Semantic Segmentation (PDF)
Video Scene Parsing with Predictive Feature Learning (PDF)
Training Bit Fully Convolutional Network for Fast Semantic Segmentation (PDF)
Pyramid Scene Parsing Network (PDF, Reading Note)
Mining Pixels: Weakly Supervised Semantic Segmentation Using Image Labels (PDF)
FastMask: Segment Object Multi-scale Candidates in One Shot (PDF, Project/Code, Reading Note)
A New Convolutional Network-in-Network Structure and Its Applications in Skin Detection, Semantic Segmentation, and Artifact Reduction (PDF, Reading Note)
FusionSeg: Learning to combine motion and appearance for fully automatic segmentation of generic objects in videos (PDF)
Visual Saliency Prediction Using a Mixture of Deep Neural Networks (PDF)
PixelNet: Representation of the pixels, by the pixels, and for the pixels (PDF, Project/Code)
Super-Trajectory for Video Segmentation (PDF)
Understanding Convolution for Semantic Segmentation (PDF, Reading Note)
Adversarial Examples for Semantic Image Segmentation (PDF)
Large Kernel Matters – Improve Semantic Segmentation by Global Convolutional Network (PDF)
Deep Image Matting (PDF, Reading Note)
Mask R-CNN (PDF, Caffe Implementation, TuSimple Implementation on MXNet, TensorFlow Implementation, Reading Note)
Predicting Deeper into the Future of Semantic Segmentation (PDF)
Convolutional Oriented Boundaries: From Image Segmentation to High-Level Tasks (PDF, Project/Code)
One-Shot Video Object Segmentation (PDF, Project/Code)
Semantic Instance Segmentation via Deep Metric Learning (PDF)
Not All Pixels Are Equal: Difficulty-aware Semantic Segmentation via Deep Layer Cascade (PDF)
Semantically-Guided Video Object Segmentation (PDF)
Recurrent Multimodal Interaction for Referring Image Segmentation (PDF)
Loss Max-Pooling for Semantic Image Segmentation (PDF)
Reformulating Level Sets as Deep Recurrent Neural Network Approach to Semantic Segmentation (PDF)
Learning Video Object Segmentation with Visual Memory (PDF)
A Review on Deep Learning Techniques Applied to Semantic Segmentation (PDF)
BiSeg: Simultaneous Instance Segmentation and Semantic Segmentation with Fully Convolutional Networks (PDF)
Rethinking Atrous Convolution for Semantic Image Segmentation (PDF)
Discriminative Localization in CNNs for Weakly-Supervised Segmentation of Pulmonary Nodules (PDF)
Superpixel-based semantic segmentation trained by statistical process control (PDF)
The Devil is in the Decoder (PDF)
Semantic Segmentation with Reverse Attention (PDF)
Learning Deconvolution Network for Semantic Segmentation (PDF, Project/Code)
Depth Adaptive Deep Neural Network for Semantic Segmentation (PDF)
Semantic Instance Segmentation with a Discriminative Loss Function (PDF)
A Cost-Sensitive Visual Question-Answer Framework for Mining a Deep And-OR Object Semantics from Web Images (PDF)
ICNet for Real-Time Semantic Segmentation on High-Resolution Images (PDF, Project/Code)
Pyramid Scene Parsing Network (PDF, Project/Code, Reading Note)
Learning to Segment Instances in Videos with Spatial Propagation Network (PDF, Project/Code)
Learning Affinity via Spatial Propagation Networks (PDF, Project/Code)
https://arxiv.org/abs/1805.07883 (PDF)
Rethinking ImageNet Pre-training (PDF)
Learning From Positive and Unlabeled Data: A Survey (PDF)
Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks (PDF, Project/Code)
DropBlock: A regularization method for convolutional networks (PDF)
Differentiable Abstract Interpretation for Provably Robust Neural Networks (PDF, Project/Code)
Adding One Neuron Can Eliminate All Bad Local Minima (PDF)
Step Size Matters in Deep Learning (PDF)
Do Better ImageNet Models Transfer Better? (PDF)
Robust Classification with Convolutional Prototype Learning (PDF, Project/Code)
Fast Feature Extraction with CNNs with Pooling Layers (PDF)
Network Transplanting (PDF)
An Information-Theoretic View for Deep Learning (PDF)
Understanding Individual Neuron Importance Using Information Theory (PDF)
Understanding Convolutional Neural Network Training with Information Theory (PDF)
The unreasonable effectiveness of the forget gate (PDF)
Discovering Hidden Factors of Variation in Deep Networks (PDF)
Regularizing Deep Networks by Modeling and Predicting Label Structure (PDF)
Hierarchical Novelty Detection for Visual Object Recognition (PDF)
Guide Me: Interacting with Deep Networks (PDF)
Studying Invariances of Trained Convolutional Neural Networks (PDF)
Deep Residual Networks and Weight Initialization (PDF)
WNGrad: Learn the Learning Rate in Gradient Descent (PDF)
Understanding the Loss Surface of Neural Networks for Binary Classification (PDF)
Tell Me Where to Look: Guided Attention Inference Network (PDF)
Convolutional Neural Networks with Alternately Updated Clique (PDF, Project/Code)
Visual Interpretability for Deep Learning: a Survey (PDF)
Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey (PDF)
CNNs are Globally Optimal Given Multi-Layer Support (PDF)
Take it in your stride: Do we need striding in CNNs? (PDF)
Gradients explode - Deep Networks are shallow - ResNet explained (PDF)
Super-Convergence: Very Fast Training of Residual Networks Using Large Learning Rates (PDF, Project/Code)
Data Distillation: Towards Omni-Supervised Learning (PDF)
Peephole: Predicting Network Performance Before Training (PDF)
AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks (PDF)
Gradual Tuning: a better way of Fine Tuning the parameters of a Deep Neural Network (PDF)
CondenseNet: An Efficient DenseNet using Learned Group Convolutions (PDF, Project/Code)
Population Based Training of Neural Networks (PDF)
Knowledge Concentration: Learning 100K Object Classifiers in a Single CNN (PDF)
Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions (PDF)
Unleashing the Potential of CNNs for Interpretable Few-Shot Learning (PDF)
Non-local Neural Networks (PDF, Caffe2)
Log-DenseNet: How to Sparsify a DenseNet (PDF)
Don’t Decay the Learning Rate, Increase the Batch Size (PDF)
Guarding Against Adversarial Domain Shifts with Counterfactual Regularization (PDF)
UberNet: Training a ‘Universal’ Convolutional Neural Network for Low-, Mid-, and High-Level Vision using Diverse Datasets and Limited Memory (PDF, Project/Code)
What makes ImageNet good for transfer learning? (PDF, Project/Code, Reading Note)
The tremendous success of features learnt using the ImageNet classification task on a wide range of transfer tasks begs the question: what are the intrinsic properties of the ImageNet dataset that are critical for learning good, general-purpose features? This work provides an empirical investigation of various facets of this question: Is more pre-training data always better? How does feature quality depend on the number of training examples per class? Does adding more object classes improve performance? For the same data budget, how should the data be split into classes? Is fine-grained recognition necessary for learning good features? Given the same number of training classes, is it better to have coarse classes or fine-grained classes? Which is better: more classes or more examples per class?
Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units (PDF)
Densely Connected Convolutional Networks (PDF, Project/Code, Reading Note)
Decoupled Neural Interfaces using Synthetic Gradients (PDF)
Training directed neural networks typically requires forward-propagating data through a computation graph, followed by backpropagating error signal, to produce weight updates. All layers, or more generally, modules, of the network are therefore locked, in the sense that they must wait for the remainder of the network to execute forwards and propagate error backwards before they can be updated. In this work we break this constraint by decoupling modules by introducing a model of the future computation of the network graph. These models predict what the result of the modeled sub-graph will produce using only local information. In particular we focus on modeling error gradients: by using the modeled synthetic gradient in place of true backpropagated error gradients we decouple subgraphs, and can update them independently and asynchronously.
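The decoupling described above can be sketched in a toy NumPy example (my own illustration: a two-module linear network on a toy regression task, with a plain linear synthetic-gradient model; the paper's DNI modules are small neural networks and can also condition on labels):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))
y = X.sum(axis=1, keepdims=True)          # toy regression target

W1 = 0.1 * rng.normal(size=(8, 16))       # module 1
W2 = 0.1 * rng.normal(size=(16, 1))       # module 2
M = np.zeros((16, 16))                    # synthetic-gradient model: h -> predicted dL/dh
lr, n = 0.01, len(X)

mse0 = float(((X @ W1 @ W2 - y) ** 2).mean())
for _ in range(2000):
    h = X @ W1
    # Module 1 updates immediately from the *synthetic* gradient h @ M,
    # without waiting for module 2 to run forwards and backwards.
    W1 -= lr * X.T @ (h @ M) / n
    # Module 2 computes the true loss gradient as usual.
    err = h @ W2 - y
    g_true = err @ W2.T                   # true dL/dh
    W2 -= lr * h.T @ err / n
    # The synthetic-gradient model is regressed onto the true gradient.
    M -= lr * h.T @ ((h @ M) - g_true) / n
mse1 = float(((X @ W1 @ W2 - y) ** 2).mean())
```

Once `M` approximates the true gradient, module 1 can be updated asynchronously, which is the point of the paper.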
Rethinking the Inception Architecture for Computer Vision (PDF, Reading Note)
In this paper, several network design choices are discussed, including factorizing convolutions into smaller and asymmetric kernels, the utility of auxiliary classifiers, and reducing grid size using convolution stride rather than pooling.
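The parameter savings from these factorizations follow from a back-of-the-envelope count (biases ignored; C input and C output channels assumed, with C chosen arbitrarily for illustration):

```python
C = 64  # channels, arbitrary for illustration

params_5x5     = 5 * 5 * C * C              # one 5x5 convolution
params_two_3x3 = 2 * (3 * 3 * C * C)        # two stacked 3x3s, same 5x5 receptive field
params_asym    = (3 * 1 + 1 * 3) * C * C    # a 3x3 factorized into 3x1 followed by 1x3

print(params_two_3x3 / params_5x5)          # 0.72 -> 28% cheaper than a 5x5
print(params_asym / (3 * 3 * C * C))        # ~0.67 -> 33% cheaper than a 3x3
```

The same counting argument motivates replacing a pooling-then-convolution grid reduction with strided convolution branches, which avoids the representational bottleneck at no extra cost.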
Factorized Convolutional Neural Networks (PDF, Reading Note)
Do semantic parts emerge in Convolutional Neural Networks? (PDF, Reading Note)
A Critical Review of Recurrent Neural Networks for Sequence Learning (PDF)
Image Compression with Neural Networks (Project/Code)
Graph Convolutional Networks (Project/Code)
Understanding intermediate layers using linear classifier probes (PDF, Reading Note)
Learning What and Where to Draw (PDF, Project/Code)
On the interplay of network structure and gradient convergence in deep learning (PDF)
Deep Learning with Separable Convolutions (PDF)
Grad-CAM: Why did you say that? Visual Explanations from Deep Networks via Gradient-based Localization (PDF, Project/Code)
Optimization of Convolutional Neural Network using Microcanonical Annealing Algorithm (PDF)
Deep Pyramidal Residual Networks (PDF)
Impatient DNNs - Deep Neural Networks with Dynamic Time Budgets (PDF)
Uncertainty in Deep Learning (PDF, Project/Code)
This is the PhD Thesis of Yarin Gal.
Tensorial Mixture Models (PDF, Project/Code)
Multifaceted Feature Visualization: Uncovering the Different Types of Features Learned By Each Neuron in Deep Neural Networks (PDF)
Why Deep Neural Networks? (PDF)
Local Similarity-Aware Deep Feature Embedding (PDF)
A Review of 40 Years of Cognitive Architecture Research: Focus on Perception, Attention, Learning and Applications (PDF)
Professor Forcing: A New Algorithm for Training Recurrent Networks (PDF)
On the expressive power of deep neural networks (PDF)
What Is the Best Practice for CNNs Applied to Visual Instance Retrieval? (PDF)
Deep Convolutional Neural Network Design Patterns (PDF, Project/Code)
Tricks from Deep Learning (PDF)
A Connection between Generative Adversarial Networks, Inverse Reinforcement Learning, and Energy-Based Models (PDF)
Multi-Shot Mining Semantic Part Concepts in CNNs (PDF)
Aggregated Residual Transformations for Deep Neural Networks (PDF, Reading Note)
PolyNet: A Pursuit of Structural Diversity in Very Deep Networks (PDF)
On the Exploration of Convolutional Fusion Networks for Visual Recognition (PDF)
ResFeats: Residual Network Based Features for Image Classification (PDF)
Object Recognition with and without Objects (PDF)
LCNN: Lookup-based Convolutional Neural Network (PDF, Reading Note)
Inductive Bias of Deep Convolutional Networks through Pooling Geometry (PDF, Project/Code)
Wider or Deeper: Revisiting the ResNet Model for Visual Recognition (PDF, Reading Note)
Multi-Scale Context Aggregation by Dilated Convolutions (PDF, Project/Code)
Large-Margin Softmax Loss for Convolutional Neural Networks (PDF, mxnet Code, Caffe Code)
Adversarial Examples Detection in Deep Networks with Convolutional Filter Statistics (PDF)
Feedback Networks (PDF)
Visualizing Residual Networks (PDF)
Understanding trained CNNs by indexing neuron selectivity (PDF)
Benchmarking State-of-the-Art Deep Learning Software Tools (PDF, Project/Code)
Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models (PDF)
Visualizing Deep Neural Network Decisions: Prediction Difference Analysis (PDF, Project/Code)
ShaResNet: reducing residual network parameter number by sharing weights (PDF)
Deep Forest: Towards An Alternative to Deep Neural Networks (PDF, Project/Code)
All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation (PDF)
Genetic CNN (PDF)
Deformable Convolutional Networks (PDF)
Quality Resilient Deep Neural Networks (PDF)
How ConvNets model Non-linear Transformations (PDF)
Active Convolution: Learning the Shape of Convolution for Image Classification (PDF)
Multi-Scale Dense Convolutional Networks for Efficient Prediction (PDF, Project/Code)
Coordinating Filters for Faster Deep Neural Networks (PDF, Project/Code)
A Genetic Programming Approach to Designing Convolutional Neural Network Architectures (PDF)
On Generalization and Regularization in Deep Learning (PDF)
Interpretable Explanations of Black Boxes by Meaningful Perturbation (PDF)
Energy Propagation in Deep Convolutional Neural Networks (PDF)
Introspection: Accelerating Neural Network Training By Learning Weight Evolution (PDF)
Deeply-Supervised Nets (PDF)
Speeding up Convolutional Neural Networks By Exploiting the Sparsity of Rectifier Units (PDF)
Inception Recurrent Convolutional Neural Network for Object Recognition (PDF)
Residual Attention Network for Image Classification (PDF)
The Landscape of Deep Learning Algorithms (PDF)
Pixel Deconvolutional Networks (PDF)
Dilated Residual Networks (PDF)
A Kernel Redundancy Removing Policy for Convolutional Neural Network (PDF)
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour (PDF)
Learning Spatial Regularization with Image-level Supervisions for Multi-label Image Classification (PDF, Project/Code, Reading Note)
VisualBackProp: efficient visualization of CNNs (PDF)
Pruning Convolutional Neural Networks for Resource Efficient Inference (PDF, Project/Code)
Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly (PDF, Project/Code)
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices (PDF, Caffe Implementation)
Submanifold Sparse Convolutional Networks (PDF, Project/Code)
Dual Path Networks (PDF)
ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression (PDF, Project/Code, Reading Note)
Memory-Efficient Implementation of DenseNets (PDF)
Residual Attention Network for Image Classification (PDF, Project/Code)
An Effective Training Method For Deep Convolutional Neural Network (PDF)
Learning to Transfer (PDF)
Learning Efficient Convolutional Networks through Network Slimming (PDF, Project/Code)
Hierarchical loss for classification (PDF)
Convolutional Gaussian Processes (PDF, Code/Project)
Interpretable Convolutional Neural Networks (PDF)
What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? (PDF)
Porcupine Neural Networks: (Almost) All Local Optima are Global (PDF)
Generalization in Deep Learning (PDF)
A systematic study of the class imbalance problem in convolutional neural networks (PDF)
Interpretable Transformations with Encoder-Decoder Networks (PDF, Project/Code)
One pixel attack for fooling deep neural networks (PDF)
Data Augmentation for Computer Vision with PyTorch
Neural Network Distiller
Distiller is an open-source Python package for neural network compression research.
Neural Network Tools: Converter, Constructor and Analyser
For Caffe, PyTorch, TensorFlow, Darknet, and so on.
TensorFlow Examples
TensorFlow tutorials with implementations of popular machine learning algorithms. The tutorial is designed for easily diving into TensorFlow through examples, and is suitable for beginners who want clear and concise TensorFlow examples. For readability, it includes both notebooks and code with explanations.
TensorFlow Tutorials
These tutorials are intended for beginners in Deep Learning and TensorFlow. Each tutorial covers a single topic. The source-code is well-documented. There is a YouTube video for each tutorial.
Home Surveillance with Facial Recognition
Deep Learning algorithms with TensorFlow
This repository is a collection of various deep learning algorithms implemented using the TensorFlow library. The package is intended as a command line utility you can use to quickly train and evaluate popular deep learning models, and perhaps use them as a benchmark/baseline in comparison to your custom models/datasets.
TensorLayer
TensorLayer is a transparent library built on top of Google's TensorFlow, designed for use by both researchers and engineers. It provides a higher-level API to TensorFlow in order to speed up experimentation and development. TensorLayer is easy to extend and modify, and it ships with many examples and tutorials covering deep learning and reinforcement learning.
Easily Create High Quality Object Detectors with Deep Learning
Using dlib to train a CNN to detect.
Command Line Neural Network
Neuralcli provides a simple command line interface to a Python implementation of a simple classification neural network. It offers a quick and easy way to get instant feedback on a hypothesis, or to play around with one of the most popular concepts in machine learning today.
LSTM for Human Activity Recognition
Human activity recognition on a smartphones dataset using an LSTM RNN. The project is based on TensorFlow. An MXNet implementation is available as MXNET-Scala Human Activity Recognition.
YOLO in caffe
This is a Caffe implementation of YOLO: Real-Time Object Detection.
SSD: Single Shot MultiBox Object Detector in mxnet
MTCNN face detection and alignment in MXNet
This is a Python/MXNet implementation of Zhang's work.
CNTK Examples: Image/Detection/Fast R-CNN
Self Driving (Toy) Ferrari
Finding Lane Lines on the Road
Magenta
Magenta is a project from the Google Brain team that asks: Can we use machine learning to create compelling art and music? If so, how? If not, why not?
Adversarial Nets Papers
Classical papers about adversarial nets.
Mushreco
Take a photo of a mushroom and see which species it is; over 200 different species can be identified.
Neural Enhance
The neural network is hallucinating details based on its training from example images. It’s not reconstructing your photo exactly as it would have been if it was HD. That’s only possible in Hollywood — but using deep learning as “Creative AI” works and it is just as cool!
CNN Models by CVGJ
This repository contains convolutional neural network (CNN) models trained on ImageNet by Marcel Simon at the Computer Vision Group Jena (CVGJ) using the Caffe framework. Each model is in a separate subfolder and contains everything needed to reproduce the results. The repository currently contains the batch-normalization variants of AlexNet and VGG19, as well as the training code for Residual Networks (ResNet).
YOLO2
YOLOv2 uses a few tricks to improve training and increase performance. Like Overfeat and SSD we use a fully-convolutional model, but we still train on whole images, not hard negatives. Like Faster R-CNN we adjust priors on bounding boxes instead of predicting the width and height outright. However, we still predict the x and y coordinates directly. The full details are in our paper soon to be released on Arxiv, stay tuned!
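The box parameterization mentioned here (adjust priors rather than predict width and height outright; predict x and y directly) has a simple closed form: the center offsets are squashed with a sigmoid so they stay inside the predicting cell, while width and height scale the anchor prior exponentially. A small sketch of that decoding step (an illustration, not the darknet code):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    # YOLOv2-style parameterization: (cx, cy) is the cell's top-left corner
    # and (pw, ph) the anchor prior, all in grid-cell units.
    bx = cx + sigmoid(tx)        # x, y predicted directly, kept within the cell
    by = cy + sigmoid(ty)
    bw = pw * math.exp(tw)       # width/height adjust the prior
    bh = ph * math.exp(th)
    return bx, by, bw, bh
```

With all-zero predictions this recovers the anchor box centered in its cell, e.g. `decode_box(0, 0, 0, 0, cx=3, cy=4, pw=1.5, ph=2.0)` gives `(3.5, 4.5, 1.5, 2.0)`, which is why this parameterization is easier to learn than predicting coordinates outright.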
Lightened CNN for Deep Face Representation
The Deep Face Representation experiment uses a convolutional neural network to learn a robust feature for the face verification task.
Recurrent dreams and filling in
MTCNN in MXnet
openai-gemm
Open single and half precision gemm implementations. The main speedups over cublas are with small minibatch and in fp16 data formats.
Neural Style
style transfer with mxnet
Can Convolutional Neural Networks Crack Sudoku Puzzles?
cleverhans
This repository contains the source code for cleverhans, a Python library to benchmark machine learning systems' vulnerability to adversarial examples.
A deep learning traffic light detector using dlib and a few images from Google street view
Paints Chainer
Calculate deep convolutional neural networks on Cell Unit
Deep Video Analytics
Deep Video Analytics provides a platform for indexing and extracting information from videos and images. Deep learning detection and recognition algorithms are used for indexing individual frames/images along with detected objects. The goal of Deep Video Analytics is to become a quickly customizable platform for developing visual and video analytics applications, while benefiting from seamless integration with state-of-the-art models released by the vision research community.
Yolo_mark
Windows GUI for marking bounding boxes of objects in images for training YOLO v2
Yolo-Windows v2 - Windows version of Yolo Convolutional Neural Networks
An Unsupervised Distance Learning Framework for Multimedia Retrieval
awesome-deep-vision-web-demo
Mini Caffe
A minimal runtime core of Caffe: forward only, with GPU support and memory efficiency.
Picasso: A free open-source visualizer for Convolutional Neural Networks
Picasso is a free open-source (Eclipse Public License) DNN visualization tool that gives you partial occlusion and saliency maps with minimal fuss.
pix2code: Generating Code from a Graphical User Interface Screenshot
MTCNN-light
This repository is an implementation of MTCNN ("Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks") in C++ with no framework; it needs only OpenCV and OpenBLAS.
MobileNet-MXNet
This is an MXNet implementation of Google's MobileNets.
NoScope: 1000x Faster Deep Learning Queries over Video
Caffe2 C++ Tutorials and Examples
Web Image Downloader Tools
A Comprehensive Hands-on Guide to Transfer Learning with Real-World Applications in Deep Learning
Attention? Attention!
Intel’s Neural Compute Stick 2 is 8 times faster than its predecessor
How fast is my model?
Depthwise separable convolutions for machine learning
The Building Blocks of Interpretability
Setting the learning rate of your neural network.
The Root Cause of Slow Neural Net Training
Why is it hard to train deep neural networks? Degeneracy, not vanishing gradients, is the key
ResNet, AlexNet, VGG, Inception: Understanding various architectures of Convolutional Networks
Neural Networks For Recommender Systems
MIT Technology Review
A good place to keep up the trends.
LAB41
Lab41 is a Silicon Valley challenge lab where experts from the U.S. Intelligence Community (IC), academia, industry, and In-Q-Tel come together to gain a better understanding of how to work with — and ultimately use — big data.
Partnership on AI
Amazon, DeepMind/Google, Facebook, IBM, and Microsoft announced that they will create a non-profit organization that will work to advance public understanding of artificial intelligence technologies (AI) and formulate best practices on the challenges and opportunities within the field. Academics, non-profits, and specialists in policy and ethics will be invited to join the Board of the organization, named the Partnership on Artificial Intelligence to Benefit People and Society (Partnership on AI).
爱可可-爱生活: this teacher's recommendations are well worth a look.
Guide to deploying deep-learning inference networks and realtime object recognition tutorial for NVIDIA Jetson TX1
A Return to Machine Learning
This post is aimed at artists and other creative people who are interested in a survey of recent developments in machine learning research that intersect with art and culture. If you’ve been following ML research recently, you might find some of the experiments interesting but will want to skip most of the explanations.
ResNets, HighwayNets, and DenseNets, Oh My!
This post walks through the logic behind three recent deep learning architectures: ResNet, HighwayNet, and DenseNet. Each makes it possible to successfully train deeper networks by overcoming the limitations of traditional network design.
How to build a robot that “sees” with $100 and TensorFlow
I wanted to build a robot that could recognize objects. Years of experience building computer programs and doing test-driven development have turned me into a menace working on physical projects. In the real world, testing your buggy device can burn down your house, or at least fry your motor and force you to wait a couple of days for replacement parts to arrive.
Navigating the unsupervised learning landscape
Unsupervised learning is the Holy Grail of Deep Learning. The goal of unsupervised learning is to create general systems that can be trained with little data. Very little data.
Deconvolution and Checkerboard Artifacts
Facial Recognition on a Jetson TX1 in Tensorflow
Here’s a way to hack a facial recognition system together in a relatively short time on NVIDIA’s Jetson TX1.
Deep Learning with Generative and Generative Adversarial Networks – ICLR 2017 Discoveries
This blog post gives an overview of papers related to deep learning with generative and generative adversarial networks submitted to ICLR 2017.
Unsupervised Deep Learning – ICLR 2017 Discoveries
This blog post gives an overview of papers related to Unsupervised Deep Learning submitted to ICLR 2017.
You Only Look Twice — Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks
Deep Learning isn’t the brain
iSee: Using deep learning to remove eyeglasses from faces
Decoding The Thought Vector
Algorithmia will help you make your own AI-powered photo filters
Deep Learning Enables You to Hide Screen when Your Boss is Approaching
Dual Learning: A New Machine Learning Paradigm (对偶学习：一种新的机器学习范式)
How to Train a GAN? Tips and tricks to make GANs work
While research in Generative Adversarial Networks (GANs) continues to improve the fundamental stability of these models, we use a bunch of tricks to train them and make them stable day to day.
Highlights of IEEE Big Data 2016: Nearest Neighbours, Outliers and Deep Learning
Some CNN visualization tools and techniques
Besides this post, the others written by the author are also worth reading.
Deep Learning 2016: The Year in Review
GANs will change the world
colah’s blog
Analysis of Dropout
NIPS 2016 Review
[List] The 16 Most Popular Deep Learning Application Projects on GitHub, continuously updated (【榜单】GitHub 最受欢迎深度学习应用项目 Top 16)
Why use SVM?
TensorFlow Image Recognition on a Raspberry Pi
Building Your Own Deep Learning Box
Vehicle tracking using a support vector machine vs. YOLO
Understanding, generalisation, and transfer learning in deep neural networks
NVIDIA Announces The Jetson TX2, Powered By NVIDIA’s “Denver 2” CPU & Pascal Graphics
Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Learning?
Flexible Image Tagging with Fast0Tag
Eye Fidelity: How Deep Learning Will Help Your Smartphone Track Your Gaze
Using Deep Learning to Find Similar Dresses
Rules of Machine Learning: Best Practices for ML Engineering
Neural Network Architectures
A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN
Xiaolei's Machine Learning Notes (晓雷机器学习笔记)
Image Classification with 5 methods
How do Convolutional Neural Networks work?
What is the principle behind face recognition with deep convolutional neural networks? (基于深度卷积神经网络进行人脸识别的原理是什么)
10 Deep Learning projects based on Apache MXNet
Off the Convex Path
A Brief History of Neural Style Transfer (图像风格迁移(Neural Style)简史)
ML notes: Why the log-likelihood?
Generative Adversarial Networks (GANs): Engine and Applications
Compressing deep neural nets
Sigmoidal
A Gentle Introduction to the Bag-of-Words Model
Fantastic GANs and where to find them
Fantastic GANs and where to find them II
The Mapillary Vistas Dataset is the world's largest and most diverse publicly available street-level imagery dataset with pixel-accurate and instance-specific annotations, for empowering autonomous mobility and transport at the global scale.
The WebVision dataset is designed to facilitate the research on learning visual representation from noisy web data. Our goal is to disentangle the deep learning techniques from huge human labor on annotating large-scale vision dataset. We release this large scale web images dataset as a benchmark to advance the research on learning from web data, including weakly supervised visual representation learning, visual transfer learning, text and vision, etc.
The DukeMTMC4ReID dataset is a new large-scale real-world person re-id dataset based on DukeMTMC.
Person re-identification has drawn intensive attention in the computer vision community in recent decades. As far as we know, this page collects all public datasets that have been tested by person re-identification algorithms.
Netron is a viewer for neural network, deep learning and machine learning models.
Brings deep learning to small devices: an open-source deep learning platform for low-bit computation.
Albumentations is a fast image augmentation library and an easy-to-use wrapper around other libraries.
FeatherCNN
FeatherCNN is a high performance inference engine for convolutional neural networks.
Caffe
Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released under the BSD 2-Clause license.
Caffe2
Caffe2 is a deep learning framework made with expression, speed, and modularity in mind. It is an experimental refactoring of Caffe, and allows a more flexible way to organize computation.
Caffe on Intel
This fork of BVLC/Caffe is dedicated to improving the performance of the framework when running on CPUs, in particular Intel® Xeon processors (HSW+) and Intel® Xeon Phi processors.
TensorFlow
TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. TensorFlow also includes TensorBoard, a data visualization toolkit.
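The data-flow-graph idea the description refers to can be illustrated without TensorFlow itself. The following is a minimal plain-Python sketch (all names here, such as `Node`, `constant`, and `run`, are made up for illustration and are not TensorFlow API): operations are graph nodes, values flow along the edges, and the graph is built first and executed afterwards.

```python
# Illustrative data-flow graph: nodes are operations, edges carry values.
# None of these names are TensorFlow API; this only sketches the concept.

class Node:
    def __init__(self, op, inputs=()):
        self.op = op          # callable computing this node's value
        self.inputs = inputs  # upstream nodes whose outputs feed this op

def constant(value):
    return Node(lambda: value)

def add(a, b):
    return Node(lambda x, y: x + y, (a, b))

def mul(a, b):
    return Node(lambda x, y: x * y, (a, b))

def run(node):
    # Evaluate upstream nodes first, then apply this node's operation.
    args = [run(n) for n in node.inputs]
    return node.op(*args)

# Build the graph first (no computation happens here)...
x, y = constant(2.0), constant(3.0)
z = add(mul(x, y), y)   # z = x * y + y

# ...then execute it, mirroring TensorFlow's graph-then-run model.
print(run(z))  # 9.0
```

Because the whole computation is available as a graph before execution, a framework can place different nodes on different devices, which is what lets TensorFlow deploy the same graph to CPUs or GPUs without rewriting code.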
MXNet
MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix the flavours of symbolic programming and imperative programming to maximize efficiency and productivity. At its core is a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly. A graph optimization layer on top of that makes symbolic execution fast and memory efficient. The library is portable and lightweight, and it scales to multiple GPUs and multiple machines.
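The symbolic/imperative mix the description draws can be sketched in plain Python (this is not MXNet API, only an illustration of the distinction): imperative code computes results as soon as each statement runs, while symbolic code first builds a description of the computation that is evaluated later, which is what allows a scheduler to inspect, optimize, and parallelize it.

```python
# Hedged illustration of imperative vs. symbolic styles; not MXNet API.

# Imperative style: each statement executes immediately.
a = 2 + 3               # a is 5 the moment this line runs

# Symbolic style: describe the computation first, evaluate later.
# A plain function stands in for a symbolic expression graph here;
# a real framework would inspect and optimize this description.
expr = lambda x, y: (x + y) * y

# Nothing has been computed yet for expr; evaluation happens on demand,
# possibly batched or parallelized by the framework's scheduler.
result = expr(2, 3)     # (2 + 3) * 3 == 15
```

MXNet's claim is that mixing both styles lets you keep imperative code where flexibility matters (data loading, debugging) and symbolic graphs where speed and memory efficiency matter.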
neon
neon is Nervana’s Python-based deep learning framework and achieves the fastest performance on modern deep neural networks such as AlexNet, VGG and GoogLeNet. It is designed for ease of use and extensibility.
Piotr’s Computer Vision Matlab Toolbox
This toolbox is meant to facilitate the manipulation of images and video in Matlab. Its purpose is to complement, not replace, Matlab’s Image Processing Toolbox, and in fact it requires that the Matlab Image Toolbox be installed. Emphasis has been placed on code efficiency and code reuse. Thanks to everyone who has given me feedback - you’ve helped make this toolbox more useful and easier to use.
NVIDIA Developer
nvCaffe
A special branch of Caffe used on the TX1 that includes support for FP16.
dlib
Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems. It is used in both industry and academia in a wide range of domains including robotics, embedded devices, mobile phones, and large high performance computing environments. Dlib’s open source licensing allows you to use it in any application, free of charge.
OpenCV
OpenCV is released under a BSD license and hence it’s free for both academic and commercial use. It has C++, C, Python and Java interfaces and supports Windows, Linux, Mac OS, iOS and Android. OpenCV was designed for computational efficiency and with a strong focus on real-time applications.
CNNdroid
CNNdroid is an open source library for execution of trained convolutional neural networks on Android devices.
tiny-dnn
tiny-dnn is a C++11 implementation of deep learning. It is suitable for deep learning on limited computational resource, embedded systems and IoT devices.
An introduction to this toolkit: “Deep Learning with C++ - an introduction to tiny-dnn” by Taiga Nomi.
CaffeMex
A multi-GPU, memory-reduced MAT-Caffe for Linux and Windows.
ARCore ARCore is a platform for building augmented reality apps on Android. ARCore uses three key technologies to integrate virtual content with the real world as seen through your phone’s camera.
CNTK Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit.
ONNX ONNX is an open format to represent deep learning models. With ONNX, AI developers can more easily move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners.
PyToune is a Keras-like framework for PyTorch that handles much of the boilerplate code needed to train neural networks.
Deep Learning Studio - Desktop is a single-user solution from DeepCognition.ai that runs locally on your hardware. The desktop version allows you to train models on your GPU(s) without uploading data to the cloud. The platform supports transparent multi-GPU training for up to 4 GPUs; additional GPUs are supported in Deep Learning Studio - Enterprise.