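Assignment 2_2 walks beginners through building, from scratch, a logistic regression (LR) classifier that recognizes whether an image contains a cat. Put simply, logistic regression is a binary classifier and can be viewed as a single-layer neural network with only one neuron.
Calling the load_dataset() function in that file returns the input data, the output labels, and the class names (train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes):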
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
### reshape each image of shape (num_px, num_px, 3) into a column vector of shape (num_px*num_px*3, 1)
### so that each column of the flattened matrix is one image
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
## reshape(m_train, -1) equals reshape(m_train, num_px*num_px*3); .T then puts one image per column
train_set_x_flatten = train_set_x_orig.reshape(m_train, -1).T
test_set_x_flatten = test_set_x_orig.reshape(m_test, -1).T
## scale pixel values to [0, 1] (a simple substitute for centering and standardizing)
train_set_x = train_set_x_flatten / 255.
test_set_x = test_set_x_flatten / 255.
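To make the shapes concrete, here is a minimal sketch of the same flatten-and-scale step on a tiny synthetic array. The array `images`, its size (2 images of 4x4 pixels), and the random seed are assumptions made for illustration; they stand in for train_set_x_orig and are not the real dataset.

```python
import numpy as np

# Synthetic stand-in for train_set_x_orig: 2 images, 4x4 pixels, 3 channels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 4, 4, 3), dtype=np.uint8)

m = images.shape[0]
flat = images.reshape(m, -1).T   # shape (4*4*3, m): one column per image
scaled = flat / 255.0            # pixel values now lie in [0, 1]

print(flat.shape)
# Column 0 holds exactly the pixels of image 0, in row-major order.
print(np.array_equal(flat[:, 0], images[0].ravel()))
```

The transpose matters: without it each image would be a row, and the later np.dot(w.T, x) would no longer line up with w of shape (num_px*num_px*3, 1).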
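Here $n_x$ is the number of features, and $\hat{y} = P(y = 1 \mid x)$ is the probability that y = 1 given the input x. As noted above, logistic regression is a binary classifier, so y takes values in {0, 1}. $\hat{y}$ is computed as follows:

$$\hat{y} = \sigma(w^T x + b)$$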
That is, $\hat{y}$ is computed with the sigmoid function, which serves here as the activation function:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$
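As a quick sanity check on the sigmoid, the following scalar sketch evaluates it at a few points. The helper name `sigmoid` and the sample inputs are chosen here for illustration only; the post's own vectorized version appears later in the listing.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)) for a scalar z."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-4.0, 0.0, 4.0):
    print(z, sigmoid(z))
```

Two properties are worth noticing: sigmoid(0) is exactly 0.5, which is why 0.5 is the natural decision threshold, and sigmoid(z) + sigmoid(-z) = 1, so large negative inputs map close to 0 and large positive inputs close to 1.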
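Given the training set $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}$ (m is the number of samples), we want $\hat{y}^{(i)} \approx y^{(i)}$, so we need to compute the loss and minimize it. Two concepts are involved here: the loss (error) function and the cost function.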
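The loss function is defined on a single sample:

$$L(\hat{y}, y) = -\left( y \log \hat{y} + (1 - y) \log(1 - \hat{y}) \right)$$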
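The cost function is defined on the whole sample set, and can be viewed as the mean of the loss function:

$$J(w, b) = \frac{1}{m} \sum_{i=1}^{m} L(\hat{y}^{(i)}, y^{(i)})$$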
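The difference between the per-sample loss and the averaged cost can be sketched in a few lines of plain Python. The helper names `loss` and `cost` and the three predictions below are made-up illustration values, not results from the cat dataset.

```python
import math

def loss(y_hat, y):
    """Cross-entropy loss for one sample with label y in {0, 1}."""
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

def cost(y_hats, ys):
    """Mean of the per-sample losses over the whole set."""
    return sum(loss(p, t) for p, t in zip(y_hats, ys)) / len(ys)

y_hats = [0.9, 0.2, 0.6]   # hypothetical predicted probabilities
ys = [1, 0, 0]             # hypothetical true labels

for p, t in zip(y_hats, ys):
    print("y_hat=%.1f y=%d loss=%.4f" % (p, t, loss(p, t)))
print("cost=%.4f" % cost(y_hats, ys))
```

The third sample (prediction 0.6 for a label of 0) is on the wrong side of 0.5 and contributes the largest loss, which is what drives the parameter updates during training.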
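So how do we find the best parameters? Forward propagation computes the cost for the current w and b, backward propagation computes the gradients used to update w and b, and the two steps repeat until the iterations stop. Updating w and b in this way is gradient descent. Taking the partial derivatives of the cost function J (Eq. (5)) with respect to w and b, each iteration updates w and b as follows:

$$\frac{\partial J}{\partial w} = \frac{1}{m} X (\hat{Y} - Y)^T, \qquad \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (\hat{y}^{(i)} - y^{(i)})$$

$$w := w - \alpha \frac{\partial J}{\partial w}, \qquad b := b - \alpha \frac{\partial J}{\partial b}$$

Here X is the $n_x \times m$ matrix whose columns are the samples, $\hat{Y}$ and Y are the $1 \times m$ row vectors of predictions and labels, and $\alpha$ is the learning rate.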
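For readers who want to see where the gradients come from, the chain rule gives a short derivation for a single sample. The intermediate symbol $z = w^T x + b$ is introduced here for convenience, with $\hat{y} = \sigma(z)$ and L the cross-entropy loss for one sample; this is a sketch of the standard argument.

```latex
\frac{\partial L}{\partial \hat{y}} = -\frac{y}{\hat{y}} + \frac{1 - y}{1 - \hat{y}},
\qquad
\frac{d\hat{y}}{dz} = \hat{y}(1 - \hat{y})

\frac{\partial L}{\partial z}
  = \left( -\frac{y}{\hat{y}} + \frac{1 - y}{1 - \hat{y}} \right) \hat{y}(1 - \hat{y})
  = -y(1 - \hat{y}) + (1 - y)\hat{y}
  = \hat{y} - y

\frac{\partial L}{\partial w} = x \, (\hat{y} - y),
\qquad
\frac{\partial L}{\partial b} = \hat{y} - y
```

Averaging these per-sample gradients over the m samples gives the gradients of the cost J with respect to w and b.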
import numpy as np

## definition of the sigmoid function
def sigmoid_function(z):
    s = 1 / (1 + np.exp(-z))
    return s
## initialize parameters w and b: w is a zero vector of shape (dim, 1) with dtype float64, b is 0
def initiolize_with_zeros(dim):
    w = np.zeros((dim, 1))
    b = 0.
    return w, b
def propagation(w, b, x, y):
    m = x.shape[1]  ## number of samples (one per column)
    ## forward propagation
    y_hat = sigmoid_function(np.dot(w.T, x) + b)
    y_diff = y_hat - y
    L = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))  ## loss function
    cost = np.sum(L) / m
    ## backward propagation
    dw = np.dot(x, y_diff.T) / m
    db = np.sum(y_diff) / m
    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())
    ## save as dictionary
    grads = {"dw" : dw, "db" : db}
    return grads, cost
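One way to gain confidence in the backward-propagation formulas is to compare them against finite differences. The sketch below is self-contained: `cost_and_grads` is a compact helper written for this check that mirrors the forward and backward formulas above, and the random data, seed, and step size `eps` are arbitrary assumptions.

```python
import numpy as np

def cost_and_grads(w, b, x, y):
    """Cross-entropy cost and its analytic gradients for logistic regression."""
    m = x.shape[1]
    y_hat = 1 / (1 + np.exp(-(np.dot(w.T, x) + b)))
    cost = -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)) / m
    dw = np.dot(x, (y_hat - y).T) / m
    db = np.sum(y_hat - y) / m
    return cost, dw, db

# Small random problem: 3 features, 5 samples.
rng = np.random.default_rng(1)
x = rng.normal(size=(3, 5))
y = np.array([[1, 0, 1, 0, 0]], dtype=float)
w = rng.normal(size=(3, 1)) * 0.1
b = 0.2

_, dw, db = cost_and_grads(w, b, x, y)

# Central differences: perturb one parameter at a time.
eps = 1e-6
num_dw = np.zeros_like(w)
for j in range(w.shape[0]):
    step = np.zeros_like(w)
    step[j, 0] = eps
    c_plus = cost_and_grads(w + step, b, x, y)[0]
    c_minus = cost_and_grads(w - step, b, x, y)[0]
    num_dw[j, 0] = (c_plus - c_minus) / (2 * eps)
num_db = (cost_and_grads(w, b + eps, x, y)[0]
          - cost_and_grads(w, b - eps, x, y)[0]) / (2 * eps)

max_err = max(np.max(np.abs(dw - num_dw)), abs(db - num_db))
print("largest gradient mismatch:", max_err)
```

A mismatch that is tiny compared with the gradient values suggests the analytic dw and db are consistent with the cost; a large mismatch usually points to a sign error or a missing 1/m factor.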
## optimization: learn w and b by minimizing the cost
## update parameters using gradient descent
def optimize(w, b, x, y, num_iterations, learning_rate):
    costs = []
    best_cost = float("inf")
    best_params = {"w" : w, "b" : b}
    decay = 0.999  ## decay of learning_rate
    for i in range(num_iterations):
        grads, cost = propagation(w, b, x, y)
        dw = grads["dw"]
        db = grads["db"]
        ## when the data set is big enough, save the params at the smallest cost
        ## (done before the update, because cost was computed from the current w and b)
        if cost < best_cost:
            best_cost = cost
            best_params["w"] = w
            best_params["b"] = b
        ## update params
        w = w - learning_rate * dw
        b = b - learning_rate * db
        learning_rate *= decay
        ## record cost every 100 iterations
        if i % 100 == 0:
            costs.append(cost)
            # print("cost after iteration %d: %+f" % (i, cost))
            # print("learning_rate:%f" % learning_rate)
    print("best cost : %f" % float(best_cost))
    params = {"w" : w, "b" : b, "learning_rate" : learning_rate,
              "best_w" : best_params["w"], "best_b" : best_params["b"]}
    grads = {"dw" : dw, "db" : db}
    return params, grads, costs
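The multiplicative decay in optimize shrinks the step size geometrically. The short sketch below uses the listing's values (decay 0.999, and the defaults of 2000 iterations and learning rate 0.05 from model) to show how small the learning rate has become by the end of training; the variable names are chosen here for illustration.

```python
learning_rate = 0.05
decay = 0.999
num_iterations = 2000

# Same per-iteration update as in optimize.
lr = learning_rate
for _ in range(num_iterations):
    lr *= decay

# Equivalent closed form: lr_0 * decay ** n.
closed_form = learning_rate * decay ** num_iterations
print(lr, closed_form)
```

With these settings the final learning rate is about 0.0068, so the last iterations take much smaller steps than the first ones. If the cost is still falling quickly when training ends, a decay closer to 1 may be worth trying.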
## step 1: calculate y_hat
def predict(w, b, x):
    y_hat = sigmoid_function(np.dot(w.T, x) + b)
    assert(y_hat.shape[1] == x.shape[1])
    ## step 2: threshold at 0.5 (y_hat <= 0.5 gives 0, otherwise 1)
    y_pred = np.zeros((1, y_hat.shape[1]))
    for i in range(y_hat.shape[1]):
        if y_hat[0, i] <= 0.5:
            y_pred[0, i] = 0
        else:
            y_pred[0, i] = 1
    return y_pred
## (4) merge all functions into a model
def model(x_train, y_train, x_test, y_test, num_iterations = 2000,
          learning_rate = 0.05):
    features_num = x_train.shape[0]
    w, b = initiolize_with_zeros(features_num)
    params, grads, costs = optimize(w, b, x_train, y_train,
                                    num_iterations, learning_rate)
    w, b, learning_rate = params["w"], params["b"], params["learning_rate"]
    y_pred_train = predict(w, b, x_train)
    y_pred_test = predict(w, b, x_test)
    accuracy_train = 100 - np.mean(np.abs(y_pred_train - y_train) * 100)
    accuracy_test = 100 - np.mean(np.abs(y_pred_test - y_test) * 100)
    ## predict y_hat with best_params
    best_w, best_b = params["best_w"], params["best_b"]
    best_y_pred_train = predict(best_w, best_b, x_train)
    best_y_pred_test = predict(best_w, best_b, x_test)
    best_accuracy_train = 100 - np.mean(np.abs(best_y_pred_train - y_train) * 100)
    best_accuracy_test = 100 - np.mean(np.abs(best_y_pred_test - y_test) * 100)
    ## comparison between the last w and b and the best w and b
    print("learning_rate : %f" % learning_rate)
    print("train accuracy -- %f%% : %f%%" % (accuracy_train, best_accuracy_train))
    print("test accuracy -- %f%% : %f%%" % (accuracy_test, best_accuracy_test))
    result = {"costs" : costs, "y_pred_test" : y_pred_test,
              "y_pred_train" : y_pred_train, "w" : w, "b" : b,
              "learning_rate" : learning_rate, "num_iterations" : num_iterations}
    return result
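The accuracy expression used in model can look opaque at first. For 0/1 labels, |prediction - label| is 1 exactly on the misclassified samples, so its mean is the error rate. The sketch below checks this on four made-up labels; `y_true` and `y_pred` are illustrative values, not outputs of the trained model.

```python
import numpy as np

y_true = np.array([[1, 0, 0, 1]])          # hypothetical labels
y_pred = np.array([[1., 0., 1., 1.]])      # hypothetical predictions (one mistake)

# Same formula as in model: 100 minus the mean absolute error in percent.
accuracy = 100 - np.mean(np.abs(y_pred - y_true) * 100)

# Direct count of matching entries, for comparison.
matches = np.mean(y_pred == y_true) * 100

print(accuracy, matches)
```

Both expressions give the same percentage here. The equivalence relies on predictions and labels being exactly 0 or 1, which predict guarantees through its 0.5 threshold.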