Stacked Denoising and Stacked Convolutional Autoencoders

An Evaluation of Transformation Robustness for Spatial Data Representations


This post collects my notes on the TUM work on stacked denoising autoencoders (SDA) and stacked convolutional autoencoders (SCA): https://mediatum.ub.tum.de/doc/1381852/54858742554.pdf

As I understand it, the authors reach the following conclusions.

** SCA yield image features that are more useful than SDA features for a specific purpose, namely image classification with a subsequent SVM classifier.

** SCA features exhibit a higher degree of invariance to input transformations than the representations generated by an SDA.

** The likely reason for both results is that SCA preserve and exploit the spatial structure of the input data, and also include pooling, which by design leads to a degree of invariance.
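
To make the pooling point concrete, here is a minimal sketch of my own (not code from the thesis). It uses a 1D toy feature map and non-overlapping max pooling; the function name `max_pool_1d` and the toy values are assumptions for illustration only.

```python
def max_pool_1d(signal, size):
    """Non-overlapping max pooling over windows of `size` elements."""
    return [max(signal[i:i + size]) for i in range(0, len(signal) - size + 1, size)]

# A toy feature map with one strong activation.
original = [0, 0, 9, 0, 0, 0, 0, 0]
# The same activation moved by one position (stays inside the same pooling window).
shifted = [0, 0, 0, 9, 0, 0, 0, 0]
# The same activation moved by two positions (crosses into the next window).
shifted_far = [0, 0, 0, 0, 9, 0, 0, 0]

pooled_original = max_pool_1d(original, 2)
pooled_shifted = max_pool_1d(shifted, 2)
pooled_far = max_pool_1d(shifted_far, 2)

print(pooled_original)  # [0, 9, 0, 0]
print(pooled_shifted)   # [0, 9, 0, 0]
print(pooled_far)       # [0, 0, 9, 0]
```

The small shift disappears after pooling, while the larger one does not. That is why pooling gives only "a degree of" invariance rather than full invariance.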


1 Introduction 

Many of today's CV tasks rely on a standard procedure: feature extraction and representation. Once you have an overall feature space for images, you can use it much like an embedding in NLP problems, i.e. as an image embedding. Such a representation can then serve as a component in many unsupervised and reinforcement learning tasks.

However, feature representation in CV comes with many problems, such as varying light sources, object shapes, and complex object surfaces.

A good example of a traditional method is PCA. While PCA has frequently been used for feature extraction, its capabilities are limited because it cannot perform nonlinear transformations, whereas useful features are generally highly nonlinear functions of the raw input.

The remedy is the autoencoder, a "nonlinear generalization of PCA".
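
To see what "nonlinear generalization" means in practice, here is a small sketch of a single autoencoder forward pass in plain Python. It is my own illustration, not the thesis implementation: the layer sizes, the `tanh` activation, the random initial weights, and the tied decoder weights (decoder uses the transpose of the encoder matrix) are all assumptions. No training is performed.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def encode(x, W, b, activation=math.tanh):
    """Map the input to a lower-dimensional code: activation(W x + b)."""
    return [activation(z + bi) for z, bi in zip(matvec(W, x), b)]

def decode(h, W_out, b_out):
    """Map the code back to input space with a linear layer."""
    return [z + bi for z, bi in zip(matvec(W_out, h), b_out)]

def reconstruction_error(x, x_hat):
    """Squared error between the input and its reconstruction."""
    return sum((a - c) ** 2 for a, c in zip(x, x_hat))

random.seed(0)
n_in, n_hidden = 4, 2  # hypothetical sizes for illustration
W = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b = [0.0] * n_hidden
W_out = [list(col) for col in zip(*W)]  # assumption: tied weights (transpose of W)
b_out = [0.0] * n_in

x = [0.2, -0.1, 0.4, 0.3]
code = encode(x, W, b)                      # nonlinear code
x_hat = decode(code, W_out, b_out)          # reconstruction
error = reconstruction_error(x, x_hat)      # what training would minimize

# With an identity activation the encoder is just a linear projection,
# which is the PCA-like special case.
linear_code = encode(x, W, b, activation=lambda z: z)
```

Swapping the identity activation for `tanh` is the only difference between the two encoders here. That nonlinearity is what lets an autoencoder go beyond the linear projections PCA is limited to.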

Masci et al. [3] combine CNNs with SAEs to construct stacked convolutional autoencoders (SCA) for feature extraction. The approach promises to include the characteristics of CNNs, which currently significantly outperform all other solutions in the realm of image processing. In addition, SCA can be trained in an unsupervised fashion and then be used either for extracting features or for initializing the parameters of a traditional CNN.
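
As a rough picture of what one convolutional encoder layer does, the sketch below applies a convolution, a ReLU, and 2x2 max pooling to a tiny image in plain Python. This is only my illustration under stated assumptions: the hand-picked edge kernel, the ReLU activation, and the helper names are hypothetical, and in a real SCA the kernels would be learned rather than fixed.

```python
def conv2d_valid(image, kernel):
    """2D cross-correlation without padding (what deep learning libraries usually call convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(fmap):
    """Set negative activations to zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool_2x2(fmap):
    """Non-overlapping 2x2 max pooling; a trailing odd row or column is dropped."""
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, len(fmap[0]) - 1, 2)
        ]
        for r in range(0, len(fmap) - 1, 2)
    ]

# 5x5 toy image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked kernel that responds to dark-to-bright vertical edges (assumption, not learned).
kernel = [[-1, 1],
          [-1, 1]]

features = relu(conv2d_valid(image, kernel))  # 4x4 feature map
pooled = max_pool_2x2(features)               # 2x2 feature map

print(features)  # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)    # [[2, 0], [2, 0]]
```

The output is still a 2D grid, so the position of the edge (left half of the image) survives the encoding. A fully connected SDA layer, by contrast, would flatten the image into a vector first.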

[Figure: Common autoencoder approach. (a) Triangular shape. (b) Rectangular shape.]

[Figure: SCA architecture]

Conclusion


Because SCA are able to exploit spatial relations and come with max pooling layers that naturally increase transformation invariance, they may be a better choice than SDA for representation learning in the realm of visual data.
