GitHub: https://github.com/ZJUFangzh/cs231n
Personal blog: fangzh.top
Extract HOG and HSV features from the images.
For each image we compute a Histogram of Oriented Gradients (HOG) feature as well as a color histogram over the hue channel of the HSV (hue, saturation, value) color space. The HOG feature and the color histogram of each image are concatenated to form the final feature vector.
Roughly speaking, HOG captures the texture of an image while ignoring color information, and the color histogram represents color while ignoring texture (see the linked article for details). We therefore expect the combination of the two to work better than either feature on its own; verifying this guess is a good candidate for the bonus part later.
The hog_feature and color_histogram_hsv functions each operate on a single image and return a feature vector for that image. The extract_features function takes a set of images and a list of feature functions; it runs every feature function on every image and stores the results in a matrix whose rows are the concatenated features of the individual images.
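As a rough sketch of that idea (not the exact code in features.py; the name extract_features_sketch is made up for illustration), the feature-extraction loop looks roughly like this:

import numpy as np

def extract_features_sketch(imgs, feature_fns):
    # Compute the features of the first image to learn the total feature dimension
    first = [fn(imgs[0]).ravel() for fn in feature_fns]
    feats = np.zeros((len(imgs), sum(f.size for f in first)))
    feats[0] = np.concatenate(first)
    # Every row of the matrix is the concatenation of all features of one image
    for i in range(1, len(imgs)):
        feats[i] = np.concatenate([fn(imgs[i]).ravel() for fn in feature_fns])
    return feats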
features.py implements the two feature functions. hog_feature is adapted from the scikit-image hog interface and first converts the image to grayscale. color_histogram_hsv uses the matplotlib.colors.rgb_to_hsv interface to convert the image from RGB to HSV, extracts the hue channel, and bins the hue values into a histogram. For the details of how HOG works, a web search will turn up plenty of references.
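A minimal sketch of the hue-histogram part, assuming the input is an RGB image with values in [0, 255] (the function name and arguments below are illustrative, not the exact features.py signature):

import numpy as np
import matplotlib.colors

def hue_histogram_sketch(im, nbin=10):
    # rgb_to_hsv expects values in [0, 1]; channel 0 of the result is the hue
    hue = matplotlib.colors.rgb_to_hsv(im / 255.0)[:, :, 0]
    # Bin the hue values into nbin bins and normalize the histogram
    hist, _ = np.histogram(hue, bins=nbin, range=(0.0, 1.0), density=True)
    return hist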
If this line raises an error:
orientation_histogram[:,:,i] = uniform_filter(temp_mag, size=(cx, cy))[cx/2::cx, cy/2::cy].T
with "TypeError: slice indices must be integers or None or have an index method" (in Python 3, / always produces a float), change it to:
orientation_histogram[:,:,i] = uniform_filter(temp_mag, size=(cx, cy))[int(cx/2)::cx, int(cy/2)::cy].T
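Assuming cx and cy are plain integer cell sizes (as they are in features.py), the floor-division operator is an equally valid and slightly cleaner fix:

orientation_histogram[:,:,i] = uniform_filter(temp_mag, size=(cx, cy))[cx//2::cx, cy//2::cy].T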
This step replaces the original datasets with their extracted features: X_train_feats, X_val_feats, and X_test_feats.
from cs231n.features import *
num_color_bins = 10 # Number of bins in the color histogram
feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]
X_train_feats = extract_features(X_train, feature_fns, verbose=True)
X_val_feats = extract_features(X_val, feature_fns)
X_test_feats = extract_features(X_test, feature_fns)
# Preprocessing: Subtract the mean feature
mean_feat = np.mean(X_train_feats, axis=0, keepdims=True)
X_train_feats -= mean_feat
X_val_feats -= mean_feat
X_test_feats -= mean_feat
# Preprocessing: Divide by standard deviation. This ensures that each feature
# has roughly the same scale.
std_feat = np.std(X_train_feats, axis=0, keepdims=True)
X_train_feats /= std_feat
X_val_feats /= std_feat
X_test_feats /= std_feat
# Preprocessing: Add a bias dimension
X_train_feats = np.hstack([X_train_feats, np.ones((X_train_feats.shape[0], 1))])
X_val_feats = np.hstack([X_val_feats, np.ones((X_val_feats.shape[0], 1))])
X_test_feats = np.hstack([X_test_feats, np.ones((X_test_feats.shape[0], 1))])
The rest is the same as before; just swap the raw image arrays for their *_feats counterparts.
# Use the validation set to tune the learning rate and regularization strength
from cs231n.classifiers.linear_classifier import LinearSVM
learning_rates = [1e-9, 1e-8, 1e-7]
regularization_strengths = [5e4, 5e5, 5e6]
results = {}
best_val = -1
best_svm = None
# Override the defaults above with a finer grid around the values that worked best
learning_rates = [5e-9, 7.5e-9, 1e-8]
regularization_strengths = [(5+i)*1e6 for i in range(-3, 4)]
################################################################################
# TODO: #
# Use the validation set to set the learning rate and regularization strength. #
# This should be identical to the validation that you did for the SVM; save #
# the best trained classifer in best_svm. You might also want to play #
# with different numbers of bins in the color histogram. If you are careful #
# you should be able to get accuracy of near 0.44 on the validation set. #
################################################################################
for learning_rate in learning_rates:
    for regularization_strength in regularization_strengths:
        svm = LinearSVM()
        loss_hist = svm.train(X_train_feats, y_train, learning_rate=learning_rate,
                              reg=regularization_strength, num_iters=1500, verbose=False)
        y_train_pred = svm.predict(X_train_feats)
        y_val_pred = svm.predict(X_val_feats)
        y_train_acc = np.mean(y_train_pred == y_train)
        y_val_acc = np.mean(y_val_pred == y_val)
        results[(learning_rate, regularization_strength)] = [y_train_acc, y_val_acc]
        # Keep the classifier with the best validation accuracy
        if y_val_acc > best_val:
            best_val = y_val_acc
            best_svm = svm
################################################################################
# END OF YOUR CODE #
################################################################################
# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train accuracy: %f val accuracy: %f' % (
        lr, reg, train_accuracy, val_accuracy))
print('best validation accuracy achieved during cross-validation: %f' % best_val)
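Once the grid search is done, a quick sanity check is to evaluate the selected model on the test features (a sketch, assuming y_test was loaded alongside X_test):

# Evaluate the best SVM found during validation on the test set
y_test_pred = best_svm.predict(X_test_feats)
print('test accuracy: %f' % np.mean(y_test == y_test_pred))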
from cs231n.classifiers.neural_net import TwoLayerNet
input_dim = X_train_feats.shape[1]
hidden_dim = 500
num_classes = 10
net = TwoLayerNet(input_dim, hidden_dim, num_classes)
best_net = None
################################################################################
# TODO: Train a two-layer neural network on image features. You may want to #
# cross-validate various parameters as in previous sections. Store your best #
# model in the best_net variable. #
################################################################################
best_val = -1
results = {}  # reset so the SVM results above are not mixed into this table
learning_rates = [1.2e-3, 1.5e-3, 1.75e-3]
regularization_strengths = [1, 1.25, 1.5, 2]
for lr in learning_rates:
    for reg in regularization_strengths:
        # Re-initialize the network for every setting so the runs do not share weights
        net = TwoLayerNet(input_dim, hidden_dim, num_classes)
        loss_hist = net.train(X_train_feats, y_train, X_val_feats, y_val,
                              num_iters=1000, batch_size=200,
                              learning_rate=lr, learning_rate_decay=0.95,
                              reg=reg, verbose=False)
        y_train_pred = net.predict(X_train_feats)
        y_val_pred = net.predict(X_val_feats)
        y_train_acc = np.mean(y_train_pred == y_train)
        y_val_acc = np.mean(y_val_pred == y_val)
        results[(lr, reg)] = [y_train_acc, y_val_acc]
        if y_val_acc > best_val:
            best_val = y_val_acc
            best_net = net
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train accuracy: %f val accuracy: %f' % (
        lr, reg, train_accuracy, val_accuracy))
print('best validation accuracy achieved during cross-validation: %f' % best_val)
################################################################################
# END OF YOUR CODE #
################################################################################
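As with the SVM, the selected network can be evaluated on the test features once the search is finished (a sketch, assuming y_test is available):

# Evaluate the best two-layer net on the test features
test_acc = np.mean(best_net.predict(X_test_feats) == y_test)
print('test accuracy: %f' % test_acc)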