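(1) Implement the perceptron algorithm in a language you know well (preferably Python) and train it on the given dataset.
(2) Evaluate the trained perceptron model on the test set and save the predicted classes as a single row in CSV format.
(3) Briefly explain the principle of the algorithm, and record the key steps of the experiment together with the problems encountered and how they were solved.
(4) Try swapping datasets with the previous experiment to compare the two algorithms.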
Given a training set $D = \{(\mathbf x^{(1)},y^{(1)}),(\mathbf x^{(2)},y^{(2)}),\dots,(\mathbf x^{(N)},y^{(N)})\}$, where $\mathbf x^{(n)} \in X$, $y^{(n)} \in Y = \{+1,-1\}$, and $n = 1,2,\dots,N$, we build a linear classifier, that is, we learn a function formed by a linear combination of the attributes. For a given $D$-dimensional sample $\mathbf x = [x_1,\dots,x_D]^T$, the linear combination is

$$f(\mathbf x) = w_1x_1+w_2x_2+\dots+w_Dx_D+b$$

or, in vector form,

$$f(\mathbf x) = \mathbf w^T \mathbf x + b$$

where $\mathbf w$ is the $D$-dimensional weight vector and $b$ is the bias.
The perceptron passes $f(\mathbf x)$ through the sign function to obtain the predicted class:

$$\operatorname{sign}(f(\mathbf x)) = \operatorname{sign}(\mathbf w^T \mathbf x+b) = \begin{cases} +1, & f(\mathbf x) \ge 0 \\ -1, & f(\mathbf x) \lt 0 \end{cases}$$
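The decision rule above can be sketched in a few lines of NumPy. The function name `predict` and the demo values are illustrative assumptions, not part of the original experiment.

```python
import numpy as np

def predict(X, w, b):
    """Return +1 where w.x + b >= 0 and -1 otherwise (the sign convention above)."""
    scores = X @ w + b
    return np.where(scores >= 0, 1, -1)

# Illustrative values only: two 2-D samples and an arbitrary hyperplane.
X_demo = np.array([[2.0, 1.0], [0.0, 3.0]])
w_demo = np.array([1.0, -1.0])
b_demo = 0.5
pred_demo = predict(X_demo, w_demo, b_demo)
print(pred_demo)  # scores are 1.5 and -2.5, so the predictions are +1 and -1
```

Note that a score of exactly 0 is mapped to +1 here, matching the case split in the formula.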
From the above, the parameters to be learned are $\mathbf w$ and $b$. A sample is classified correctly when

$$y^{(n)}(\mathbf w^T \mathbf x^{(n)}+b)>0$$

and misclassified when

$$y^{(n)}(\mathbf w^T \mathbf x^{(n)}+b)<0$$

Let $Q$ denote the set of misclassified samples. The loss function is defined as

$$L(\mathbf w,b) = -\sum_{\mathbf x^{(n)}\in Q}y^{(n)}(\mathbf w^T \mathbf x^{(n)}+b)$$

and its gradients with respect to the parameters are

$$\frac{\partial L(\mathbf w,b)}{\partial \mathbf w} = -\sum_{\mathbf x^{(n)}\in Q}y^{(n)}\mathbf x^{(n)}, \qquad \frac{\partial L(\mathbf w,b)}{\partial b} = -\sum_{\mathbf x^{(n)}\in Q}y^{(n)}$$
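As a rough illustration of the loss and its gradient over the misclassified set $Q$, the sketch below evaluates both on a tiny made-up dataset. The helper name `perceptron_loss_and_grad` and the toy values are assumptions; samples with a score of exactly 0 are also treated as misclassified, as the training algorithm does.

```python
import numpy as np

def perceptron_loss_and_grad(X, y, w, b):
    """Loss -sum_{Q} y (w.x + b) and its gradients, with Q = {n : y_n (w.x_n + b) <= 0}."""
    margins = y * (X @ w + b)
    mis = margins <= 0                                    # boolean mask of the set Q
    loss = -np.sum(margins[mis])
    grad_w = -(y[mis][:, None] * X[mis]).sum(axis=0)      # -sum y_n x_n over Q
    grad_b = -np.sum(y[mis])                              # -sum y_n over Q
    return loss, grad_w, grad_b

# Toy data (illustrative only): only the second sample is misclassified.
X_toy = np.array([[1.0, 2.0], [2.0, 0.0], [-1.0, -1.0]])
y_toy = np.array([1.0, -1.0, -1.0])
loss_toy, grad_w_toy, grad_b_toy = perceptron_loss_and_grad(
    X_toy, y_toy, np.array([1.0, 0.0]), 0.0)
print(loss_toy, grad_w_toy, grad_b_toy)
```

Here the margins are 1, -2 and 1, so $Q$ contains only the second sample and the loss equals 2.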
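Using the gradient contributed by a single misclassified sample $(\mathbf x^{(n)}, y^{(n)})$ (stochastic gradient descent), the parameters $\mathbf w$ and $b$ are updated as:

$$\mathbf w \leftarrow \mathbf w + \eta y^{(n)}\mathbf x^{(n)} \\ b \leftarrow b + \eta y^{(n)}$$

where $\eta$ $(0 \lt \eta \le 1)$ is the step size, also called the learning rate.
Repeating this update over many iterations drives the loss $L(\mathbf w,b)$ down (not necessarily monotonically) until the convergence condition is met, for example $L(\mathbf w,b)=0$, meaning no sample is misclassified and the classification is complete.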
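A short check, not part of the original derivation, shows why one update helps the sample it was computed from. Substituting the updated parameters into the score of that sample and using $(y^{(n)})^2 = 1$ gives

$$y^{(n)}\Big(\big(\mathbf w + \eta y^{(n)}\mathbf x^{(n)}\big)^T \mathbf x^{(n)} + b + \eta y^{(n)}\Big) = y^{(n)}\big(\mathbf w^T \mathbf x^{(n)} + b\big) + \eta\big(\lVert \mathbf x^{(n)} \rVert^2 + 1\big)$$

So each update raises the score of the chosen sample by $\eta(\lVert \mathbf x^{(n)} \rVert^2 + 1) > 0$, moving it toward the correct side of the hyperplane. A single step may not be enough to fix that sample, and it can change the scores of other samples, which is why the update has to be repeated.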
The weight vector and the bias can be viewed as the parameters of a hyperplane $S$, and the perceptron can be understood as minimizing the distance from the misclassified samples to this hyperplane. The distance from a misclassified sample $\mathbf x^{(n)}$ to the hyperplane is:

$$-\frac {1}{\lVert \mathbf w \rVert}y^{(n)}(\mathbf w^T \mathbf x^{(n)}+b)$$

Summing over all misclassified samples gives the total distance:

$$\sum_{\mathbf x^{(n)} \in Q}-\frac {1}{\lVert \mathbf w \rVert}y^{(n)}(\mathbf w^T \mathbf x^{(n)}+b)$$

Dropping the factor $\frac {1}{\lVert \mathbf w \rVert}$ yields exactly the perceptron loss function, which explains where the definition of the loss comes from.
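The distance formula can be tried out numerically. The sketch below uses a made-up hyperplane and sample; the name `signed_distance` is an assumption introduced for illustration. It also shows that rescaling $(\mathbf w, b)$ leaves the geometric distance unchanged while the unnormalized term used in the loss scales with the parameters.

```python
import numpy as np

def signed_distance(X, y, w, b):
    """y * (w.x + b) / ||w||: positive for correct samples, negative for misclassified ones."""
    return y * (X @ w + b) / np.linalg.norm(w)

# Illustrative values: the origin labelled +1 lies on the wrong side of 3*x1 + 4*x2 - 5 = 0.
X_one = np.array([[0.0, 0.0]])
y_one = np.array([1.0])
w_h = np.array([3.0, 4.0])
b_h = -5.0

dist = -signed_distance(X_one, y_one, w_h, b_h)[0]                  # distance of the misclassified point
dist_scaled = -signed_distance(X_one, y_one, 2 * w_h, 2 * b_h)[0]   # same hyperplane, rescaled parameters
raw = -(y_one * (X_one @ w_h + b_h))[0]                             # loss term without 1/||w||
raw_scaled = -(y_one * (X_one @ (2 * w_h) + 2 * b_h))[0]
print(dist, dist_scaled, raw, raw_scaled)
```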
**Input:** training set $D = \{(\mathbf x^{(1)},y^{(1)}),(\mathbf x^{(2)},y^{(2)}),\dots,(\mathbf x^{(N)},y^{(N)})\}$, where $\mathbf x^{(n)} \in X$, $y^{(n)} \in Y$, $n = 1,2,\dots,N$; step size $\eta$ $(0 \lt \eta \le 1)$.

**Initialize:** $\mathbf w_0 = \mathbf 0$, $b_0 = 0$;

$\textbf{repeat}$

$\quad$ select a sample $(\mathbf x^{(n)},y^{(n)})$ from the training set;

$\quad \textbf{if} \ \ y^{(n)}(\mathbf w^T \mathbf x^{(n)}+b) \le 0 \ \ \textbf{then}$

$\quad\quad \mathbf w \leftarrow \mathbf w + \eta y^{(n)}\mathbf x^{(n)}$

$\quad\quad b \leftarrow b + \eta y^{(n)}$

$\quad \textbf{end if}$

$\textbf{until}$ the convergence condition is met;

**Output:** $\mathbf w, b$
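The perceptron algorithm is error-driven: it adjusts the parameters iteratively in a greedy-like manner, one misclassified sample at a time, until every training sample is classified correctly. If the given dataset is not linearly separable, however, the iteration never converges and no correct separating hyperplane is obtained.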
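The pseudocode can be sketched as a small function. This is a minimal illustration, not the implementation used in the experiment: it sweeps the samples in order rather than picking them at random, and the `max_epochs` cap and the returned `converged` flag are assumptions added so the loop also terminates on data that is not linearly separable.

```python
import numpy as np

def train_perceptron(X, y, eta=1.0, max_epochs=100):
    """Cyclic perceptron training; returns (w, b, converged)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x_n, y_n in zip(X, y):
            if y_n * (w @ x_n + b) <= 0:      # misclassified (or on the hyperplane)
                w = w + eta * y_n * x_n
                b = b + eta * y_n
                errors += 1
        if errors == 0:                        # a full pass without mistakes: converged
            return w, b, True
    return w, b, False                         # gave up after max_epochs passes

# Illustrative data: a linearly separable set and the XOR pattern, which is not separable.
X_sep = np.array([[2.0, 2.0], [1.0, 3.0], [-1.0, -1.0], [-2.0, 0.0]])
y_sep = np.array([1, 1, -1, -1])
w_sep, b_sep, ok_sep = train_perceptron(X_sep, y_sep)

X_xor = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_xor = np.array([-1, 1, 1, -1])
_, _, ok_xor = train_perceptron(X_xor, y_xor, max_epochs=50)
print(ok_sep, ok_xor)
```

On the separable set the function stops as soon as a full pass makes no mistakes; on the XOR pattern it stops only because of the epoch cap, which reflects the non-convergence described above.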
```python
import numpy as np

# Read the data from an external file (row format: a, b, c, d, label)
# and build the feature list data_set and the label list data_label.
data_set = []
data_label = []
with open('./iris/iris_train.csv') as f:
    for line in f.readlines():
        line = line.strip().split(',')
        # Convert the four feature columns to float.
        for i in range(len(line) - 1):
            line[i] = float(line[i])
        # For convenience, encode 'setosa' as 1 and 'versicolor' / 'virginica' as -1.
        if line[-1] == 'setosa':
            data_label.append(1)
        else:
            data_label.append(-1)
        data_set.append(line[:-1])
data = np.array(data_set)
label = np.array(data_label)
```
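```python
# Initialize the weights, bias, and step size.
w = np.zeros(4)
b = 0
alpha = 1
# Compute (data . w + b) * label; entries <= 0 mark misclassified samples.
score = (np.dot(data, w) + b) * label
idx = np.where(score <= 0)
# Perceptron loss: the negated sum of the scores of the misclassified samples.
loss = -np.sum(score[idx])
print('The Loss:%.4f' % loss)
```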
```python
# Solve for the weights and bias with stochastic gradient descent:
# repeat until no sample is misclassified.
iteration = 1
while score[idx].size != 0:
    # Pick one misclassified sample at random.
    point = np.random.randint(score[idx].shape[0])
    x = data[idx[0][point], :]
    y = label[idx[0][point]]
    # Perceptron update.
    w = w + alpha * y * x
    b = b + alpha * y
    # Recompute the scores and the set of misclassified samples.
    score = (np.dot(data, w) + b) * label
    idx = np.where(score <= 0)
    loss = -np.sum(score[idx])
    print('Iteration:%d w:%s b:%s loss:%.4f' % (iteration, w, b, loss))
    iteration = iteration + 1
```
```python
import matplotlib.pyplot as plt

# Plot the training data: the first three features are the 3-D coordinates
# and the fourth feature sets the marker size.
index_p = np.where(label == 1)
index_n = np.where(label != 1)
data_p = data[index_p]
data_n = data[index_n]
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter(data_p[:, 0], data_p[:, 1], data_p[:, 2], s=data_p[:, 3],
           c='r', marker='^', label='setosa')
ax.scatter(data_n[:, 0], data_n[:, 1], data_n[:, 2], s=data_n[:, 3],
           c='b', marker='o', label='versicolor and virginica')
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')
ax.legend()
plt.show()
print('\nPerceptron learning algorithm is over')
```
```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers the '3d' projection on older Matplotlib)


def build_training_set(file):
    """Read rows of the form a,b,c,d,label and return (features, labels)."""
    data_set = []
    data_label = []
    with open(file) as f:
        for line in f.readlines():
            line = line.strip().split(',')
            for i in range(len(line) - 1):
                line[i] = float(line[i])
            # For convenience, encode 'setosa' as 1 and 'versicolor' / 'virginica' as -1.
            if line[-1] == 'setosa':
                data_label.append(1)
            else:
                data_label.append(-1)
            data_set.append(line[:-1])
    data = np.array(data_set)
    label = np.array(data_label)
    return data, label


def build_test_set(testFile, targetsFile):
    """Read the test features and the separate file holding their true labels."""
    data_set = []
    data_label = []
    with open(testFile) as testfile:
        for line in testfile.readlines():
            line = line.strip().split(',')
            for i in range(len(line)):
                line[i] = float(line[i])
            data_set.append(line)
    with open(targetsFile) as targetsfile:
        for line in targetsfile.readlines():
            line = line.strip().split(',')
            if line[-1] == 'setosa':
                data_label.append(1)
            else:
                data_label.append(-1)
    data = np.array(data_set)
    label = np.array(data_label)
    return data, label


def training_parameters(data, label):
    """Train the perceptron and return the weights w and the bias b."""
    w = np.zeros(4)
    b = 0
    alpha = 1
    # Compute (data . w + b) * label; entries <= 0 mark misclassified samples.
    score = (np.dot(data, w) + b) * label
    idx = np.where(score <= 0)
    # Solve for the weights and bias with stochastic gradient descent.
    while score[idx].size != 0:
        # Pick one misclassified sample at random and apply the update.
        point = np.random.randint(score[idx].shape[0])
        x = data[idx[0][point], :]
        y = label[idx[0][point]]
        w = w + alpha * y * x
        b = b + alpha * y
        score = (np.dot(data, w) + b) * label
        idx = np.where(score <= 0)
    return w, b


def paint3d(data, label):
    """Scatter the first three features in 3-D; the fourth sets the marker size."""
    index_p = np.where(label == 1)
    index_n = np.where(label != 1)
    data_p = data[index_p]
    data_n = data[index_n]
    ax = plt.figure().add_subplot(111, projection='3d')
    ax.scatter(data_p[:, 0], data_p[:, 1], data_p[:, 2], s=data_p[:, 3],
               c='r', marker='^', label='setosa')
    ax.scatter(data_n[:, 0], data_n[:, 1], data_n[:, 2], s=data_n[:, 3],
               c='b', marker='o', label='versicolor and virginica')
    ax.set_xlabel('X Label')
    ax.set_ylabel('Y Label')
    ax.set_zlabel('Z Label')
    ax.legend()
    plt.show()
    print('\nPerceptron learning algorithm is over!\n')


trainingFile = './iris/iris_train.csv'
testFile = './iris/iris_test.csv'
targetsFile = './iris/iris_test_targets.csv'

trainingData, trainingLabel = build_training_set(trainingFile)
testData, targetsLabel = build_test_set(testFile, targetsFile)
trainingW, trainingB = training_parameters(trainingData, trainingLabel)
paint3d(trainingData, trainingLabel)

# Predict on the test set with the same convention as the sign function:
# +1 when the score is >= 0, otherwise -1.
testScore = np.dot(testData, trainingW) + trainingB
testLabel = np.where(testScore >= 0, 1, -1)

# A prediction is wrong when predicted and true labels have opposite signs.
wrongScore = testLabel * targetsLabel
wrongNum = np.count_nonzero(wrongScore < 0)
print('Number of test errors: %d' % wrongNum)
```
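Task (2) asks for the predictions to be saved as a single CSV row, which the script above does not do. The following is a minimal sketch of one way to do it with the standard `csv` module. The helper names `write_predictions_row` and `count_errors` and the demo labels are assumptions; an in-memory buffer stands in for a real file so the example has no side effects.

```python
import csv
import io

import numpy as np

def write_predictions_row(pred_labels, file_obj):
    """Write all predicted labels as one comma-separated row."""
    csv.writer(file_obj).writerow([int(v) for v in pred_labels])

def count_errors(pred_labels, true_labels):
    """Number of positions where the prediction differs from the true label."""
    return int(np.count_nonzero(np.asarray(pred_labels) != np.asarray(true_labels)))

# Illustrative labels only (not results from the experiment).
pred = np.array([1.0, -1.0, -1.0, 1.0])
true = np.array([1, -1, 1, 1])

buffer = io.StringIO()              # stands in for an open file
write_predictions_row(pred, buffer)
n_wrong = count_errors(pred, true)
print(repr(buffer.getvalue()), n_wrong)
```

To write to disk instead, the same helper could be called inside `with open(path, 'w', newline='') as f:`, where `path` is whatever output file name the assignment expects.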