Image Classification

    • Intro to Image Classification, data-driven approach, pipeline
    • Nearest Neighbor Classifier
      • k-Nearest Neighbor
    • Validation sets, Cross-validation, hyperparameter tuning
    • Pros/Cons of Nearest Neighbor

Intro to Image Classification, data-driven approach, pipeline

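Data-driven approach: instead of hand-coding rules for each class, give the computer many labeled examples and let a learning algorithm work out what each class looks like.
Training dataset: the labeled images used for learning.
Image classification pipeline: input (a training set of N images, each labeled with one of K classes) -> learning (train a classifier / learn a model from the training set) -> evaluation (predict labels for a new set of images and compare them with the true labels).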

Nearest Neighbor Classifier

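Compare two images pixel by pixel and add up the differences to get a distance, e.g. the L1 distance (sum of absolute differences) or the L2 distance (square root of the sum of squared differences). A test image gets the label of the closest training image.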
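
The distance computation and the nearest-neighbor lookup can be sketched in a few lines. This is a minimal pure-Python illustration, not an efficient implementation: the function names (`l1_distance`, `l2_distance`, `nearest_neighbor_predict`) and the tiny made-up "images" are assumptions chosen for the example, and each image is assumed to be already flattened into a list of pixel values.

```python
import math

def l1_distance(a, b):
    """Sum of absolute pixel-wise differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    """Square root of the sum of squared pixel-wise differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbor_predict(train_images, train_labels, query, distance=l1_distance):
    """Return the label of the training image closest to the query."""
    best_index = min(
        range(len(train_images)),
        key=lambda i: distance(train_images[i], query),
    )
    return train_labels[best_index]

# Toy 2x2 grayscale "images", flattened to 4 pixel values (made-up data).
train_images = [[0, 0, 0, 0], [255, 255, 255, 255], [10, 200, 10, 200]]
train_labels = ["dark", "bright", "striped"]
query = [20, 10, 0, 30]

print(l1_distance(train_images[0], query))
print(nearest_neighbor_predict(train_images, train_labels, query))
```

Passing `distance=l2_distance` switches the metric without changing anything else, which is why the choice of distance is usually treated as a hyperparameter.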

k-Nearest Neighbor

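Find the k closest training images and let them vote on the label of the test image.
Decision boundaries: larger values of k smooth the boundaries between classes and make the classifier more resistant to outliers.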
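
The voting step can be sketched as follows. This is an illustrative pure-Python version under stated assumptions: `knn_predict` is a hypothetical helper name, the distance is L1, the 2-D points are made-up data standing in for flattened images, and ties in the vote go to the label that appears first among the nearest neighbors.

```python
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    # Pair every training point's L1 distance to the query with its label.
    distances = [
        (sum(abs(x - y) for x, y in zip(point, query)), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Stable sort: the closest points come first.
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Made-up data: one "b" point sits in the middle of the "a" cluster.
train_points = [[0, 0], [0, 2], [2, 2], [1, 1], [8, 8], [9, 8]]
train_labels = ["a", "a", "a", "b", "b", "b"]
query = [1, 1]

print(knn_predict(train_points, train_labels, query, k=1))
print(knn_predict(train_points, train_labels, query, k=3))
```

With k=1 the single outlier decides the label, while with k=3 the surrounding "a" points outvote it, which is the smoothing effect on the decision boundary.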

Validation sets, Cross-validation, hyperparameter tuning

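Hyperparameters: choices such as k or the distance function that are set by hand rather than learned. The test set cannot be used to tweak them.
Generalization: the test set stands in for unseen data, so it should be used only once, at the very end.
Overfitting: hyperparameters tuned on the test set fit that set and give an overly optimistic estimate of performance.
Tuning hyperparameters: split the training set in two (a slightly smaller training set and a validation set) -> choose the k that performs best on the validation set.
Cross-validation: iterate over different validation sets and average the performance.
**In practice**: avoid cross-validation, since it is computationally expensive; usually use 50%-90% of the training data to train and the rest to validate.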
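
Choosing k by cross-validation can be sketched as below. This is a minimal pure-Python illustration, assuming an L1-distance kNN classifier; all names (`knn_predict`, `accuracy`, `cross_validate`, `choose_k`) and the one-dimensional toy data are hypothetical. The folds are formed by taking every `num_folds`-th example, which assumes the data order carries no pattern that this striding would interact with.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to query (L1 distance)."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: sum(abs(a - b) for a, b in zip(pair[0], query)),
    )
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

def accuracy(train_x, train_y, val_x, val_y, k):
    """Fraction of validation examples labeled correctly."""
    correct = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(val_x, val_y))
    return correct / len(val_x)

def cross_validate(xs, ys, k, num_folds=5):
    """Average validation accuracy over num_folds different validation sets."""
    scores = []
    for fold in range(num_folds):
        # Every num_folds-th example, starting at `fold`, is held out.
        val_idx = set(range(fold, len(xs), num_folds))
        train_x = [x for i, x in enumerate(xs) if i not in val_idx]
        train_y = [y for i, y in enumerate(ys) if i not in val_idx]
        val_x = [xs[i] for i in sorted(val_idx)]
        val_y = [ys[i] for i in sorted(val_idx)]
        scores.append(accuracy(train_x, train_y, val_x, val_y, k))
    return sum(scores) / len(scores)

def choose_k(xs, ys, candidates, num_folds=5):
    """Return the candidate k with the best average accuracy, plus all scores."""
    scores = {k: cross_validate(xs, ys, k, num_folds) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores

# Made-up, well-separated 1-D data: ten "low" points and ten "high" points.
xs = [[i] for i in range(10)] + [[i] for i in range(20, 30)]
ys = ["low"] * 10 + ["high"] * 10

best_k, scores = choose_k(xs, ys, candidates=[1, 3, 5])
print(best_k, scores)
```

The test set never appears in this procedure: only the training data is split and re-split. A single validation split is the special case of running one fold instead of averaging over all of them.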

Pros/Cons of Nearest Neighbor

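Pro: training takes no time, since the classifier just stores the training data.
Con: prediction is slow, because each test image must be compared against every training image.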
