Building a BP Neural Network in Python and Testing It on the MNIST Dataset

Official MNIST site: http://yann.lecun.com/exdb/mnist/

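MNIST is a very widely used (some would say overused) dataset of handwritten digits. It contains 60000 training samples and 10000 test samples, stored in a binary (byte) format. The files downloaded from the official site are compressed archives; the archives and the files extracted from them look like this: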

It is important to know which file holds which data. The mapping is:

t10k-images    :    test image set

t10k-labels    :    test label set

train-images    :    training image set

train-labels    :    training label set

I will not go into how labels and images correspond, since anyone reading this post probably already knows the dataset reasonably well.

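While building a neural network to test on this data, I consulted a lot of code online, built many different networks, and tried many ways of loading MNIST. Quite a few of those attempts failed. The approach below is the one that finally gave me the result I wanted.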

First, convert the MNIST dataset to CSV format (reference: https://blog.csdn.net/Albert201605/article/details/79893585).

My conversion code is as follows:

def convert(imgf, labelf, outf, n):
    """Write n samples to outf as CSV rows: the label, then 28*28 pixel values."""
    with open(imgf, 'rb') as f, open(labelf, 'rb') as l, open(outf, 'w') as o:
        f.read(16)  # skip the image file header
        l.read(8)   # skip the label file header
        images = []
        for i in range(n):
            image = [ord(l.read(1))]  # first column: the label
            for j in range(28 * 28):
                image.append(ord(f.read(1)))  # remaining columns: the pixels
            images.append(image)
        for image in images:
            o.write(','.join(str(pix) for pix in image) + '\n')


train_image_path = 'E:/College/Graduate_Paper/mnist_test/train-images.idx3-ubyte'
train_label_path = 'E:/College/Graduate_Paper/mnist_test/train-labels.idx1-ubyte'
test_image_path = 'E:/College/Graduate_Paper/mnist_test/t10k-images.idx3-ubyte'
test_label_path = 'E:/College/Graduate_Paper/mnist_test/t10k-labels.idx1-ubyte'

convert(train_image_path, train_label_path, 'E:/College/Graduate_Paper/mnist_test/mnist_train.csv', 60000)
convert(test_image_path, test_label_path, 'E:/College/Graduate_Paper/mnist_test/mnist_test.csv', 10000)

print('Convert finished!')
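The `f.read(16)` and `l.read(8)` calls in `convert` skip the file headers. As a rough sketch of what those headers contain, assuming the IDX layout described on the MNIST page (big-endian 32-bit fields: a magic number and an item count, plus row and column counts for image files), the parser below reads them explicitly instead of discarding them. The function names and the tiny synthetic byte strings are made up for illustration and are not real MNIST data.

```python
import struct


def parse_idx_labels(data: bytes) -> list:
    """Parse an IDX1 label file: 8-byte header, then one byte per label."""
    magic, count = struct.unpack('>II', data[:8])
    if magic != 2049:
        raise ValueError('not an IDX1 label file (magic=%d)' % magic)
    return list(data[8:8 + count])


def parse_idx_images(data: bytes) -> list:
    """Parse an IDX3 image file: 16-byte header, then rows*cols bytes per image."""
    magic, count, rows, cols = struct.unpack('>IIII', data[:16])
    if magic != 2051:
        raise ValueError('not an IDX3 image file (magic=%d)' % magic)
    size = rows * cols
    return [list(data[16 + i * size:16 + (i + 1) * size]) for i in range(count)]


# Synthetic stand-in for the real files: two 2x2 "images" and their labels.
image_bytes = struct.pack('>IIII', 2051, 2, 2, 2) + bytes([0, 255, 255, 0, 10, 20, 30, 40])
label_bytes = struct.pack('>II', 2049, 2) + bytes([7, 3])

labels = parse_idx_labels(label_bytes)
images = parse_idx_images(image_bytes)

# The i-th label belongs to the i-th image.
for label, pixels in zip(labels, images):
    print(label, pixels)
```

Checking the magic number and the count up front is a cheap way to catch the image and label paths being swapped before a long conversion starts.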

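After conversion, the file looks like this:

Even at this point, the samples are still not directly readable by eye: each row is just a label followed by 784 pixel values.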
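To make a CSV row readable by eye, one option is to reshape the pixel values back into a grid and print it as text. The sketch below assumes the row layout produced by `convert` (label first, then the pixels in row-major order). The helper names `parse_row` and `render`, the threshold of 128, and the 3x3 sample are illustrative choices, not part of the original code.

```python
def parse_row(line: str, side: int = 28):
    """Split one CSV line into (label, grid), where grid is side x side pixels."""
    values = [int(v) for v in line.strip().split(',')]
    label, pixels = values[0], values[1:]
    if len(pixels) != side * side:
        raise ValueError('expected %d pixels, got %d' % (side * side, len(pixels)))
    grid = [pixels[r * side:(r + 1) * side] for r in range(side)]
    return label, grid


def render(grid, threshold: int = 128) -> str:
    """Draw pixels at or above the threshold as '#', everything else as '.'."""
    return '\n'.join(
        ''.join('#' if p >= threshold else '.' for p in row) for row in grid
    )


# Synthetic 3x3 sample (not real MNIST): label 1 with a vertical stroke.
sample = '1,0,255,0,0,255,0,0,255,0\n'
label, grid = parse_row(sample, side=3)
print('label:', label)
print(render(grid))
```

For a real row from mnist_test.csv, calling `parse_row(line)` with the default `side=28` should be enough.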

Next, read the CSV data into a neural network for training and testing (reference: https://blog.csdn.net/ebzxw/article/details/81591437).

The code is as follows:

import numpy
import scipy.special


class neuralNetwork:

    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes
        self.lr = learningrate
        # weights are drawn from a normal distribution centred on 0
        self.wih = numpy.random.normal(0.0, pow(self.hnodes, -0.5), (self.hnodes, self.inodes))  # shape (200,784)
        self.who = numpy.random.normal(0.0, pow(self.onodes, -0.5), (self.onodes, self.hnodes))  # shape (10,200)
        # sigmoid activation
        self.activation_function = lambda x: scipy.special.expit(x)
        print('Neural network initialized')

    def train(self, inputs_list, targets_list):
        inputs = numpy.array(inputs_list, ndmin=2).T  # shape (784,1)
        targets = numpy.array(targets_list, ndmin=2).T  # shape (10,1)
        # forward pass
        hidden_inputs = numpy.dot(self.wih, inputs)  # shape (200,1)
        hidden_outputs = self.activation_function(hidden_inputs)
        final_inputs = numpy.dot(self.who, hidden_outputs)  # shape (10,1)
        final_outputs = self.activation_function(final_inputs)
        # propagate the errors backwards and update both weight matrices
        output_errors = targets - final_outputs  # shape (10,1)
        hidden_errors = numpy.dot(self.who.T, output_errors)  # shape (200,1)
        self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)),
                                        numpy.transpose(hidden_outputs))
        self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)),
                                        numpy.transpose(inputs))

    def query(self, inputs_list):
        inputs = numpy.array(inputs_list, ndmin=2).T
        hidden_inputs = numpy.dot(self.wih, inputs)
        hidden_outputs = self.activation_function(hidden_inputs)
        final_inputs = numpy.dot(self.who, hidden_outputs)
        final_outputs = self.activation_function(final_inputs)
        return final_outputs


# set the initial network parameters
input_nodes = 784    # 28 * 28 = 784
hidden_nodes = 200
output_nodes = 10
learning_rate = 0.1

n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)
print('Network parameters set')

# train the network
training_data_file = open('E:/College/Graduate_Paper/mnist_test/mnist_train.csv', 'r')
training_data_list = training_data_file.readlines()
training_data_file.close()

# epochs is the number of times the training data set is used for training
epochs = 5

for e in range(epochs):
    for record in training_data_list:
        all_values = record.split(',')
        # scale pixels from 0..255 into 0.01..1.00
        # (numpy.asfarray was removed in NumPy 2.0; asarray with dtype=float is equivalent)
        inputs = (numpy.asarray(all_values[1:], dtype=float) / 255.0 * 0.99) + 0.01
        # target is 0.99 for the correct digit and 0.01 for the others
        targets = numpy.zeros(output_nodes) + 0.01
        targets[int(all_values[0])] = 0.99
        n.train(inputs, targets)
    print('Network training finished')

    # test the network after each epoch
    print('Results after epoch %d:' % (e + 1))
    test_data_file = open('E:/College/Graduate_Paper/mnist_test/mnist_test.csv', 'r')
    test_data_list = test_data_file.readlines()
    test_data_file.close()

    scorecard = []
    for record in test_data_list:
        all_values = record.split(',')
        correct_label = int(all_values[0])
        inputs = (numpy.asarray(all_values[1:], dtype=float) / 255.0 * 0.99) + 0.01
        outputs = n.query(inputs)
        label = numpy.argmax(outputs)  # index of the largest output is the predicted digit
        if label == correct_label:
            scorecard.append(1)
        else:
            scorecard.append(0)
    print('Network testing finished')

    scorecard_array = numpy.asarray(scorecard)
    print('performance = ', scorecard_array.sum() / scorecard_array.size)
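The `who` update in `train` adds the learning rate times `(output_errors * final_outputs * (1 - final_outputs)) · hidden_outputs.T`. For a sigmoid output layer, that term should equal the negative gradient of half the summed squared error with respect to `who`. The sketch below checks this numerically with finite differences on a tiny random layer. The sizes, the random seed, the function names, and the plain NumPy `sigmoid` (standing in for `scipy.special.expit`) are assumptions made for illustration. The check covers the `who` update only, because the listing passes the raw output errors back through `who.T` for the hidden layer.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, standing in for scipy.special.expit."""
    return 1.0 / (1.0 + np.exp(-x))


def loss(who, hidden_outputs, targets):
    """Half the summed squared error of the output layer."""
    outputs = sigmoid(who @ hidden_outputs)
    return 0.5 * np.sum((targets - outputs) ** 2)


def update_direction(who, hidden_outputs, targets):
    """The term that train() scales by the learning rate and adds to who."""
    outputs = sigmoid(who @ hidden_outputs)
    errors = targets - outputs
    return (errors * outputs * (1.0 - outputs)) @ hidden_outputs.T


def numeric_gradient(who, hidden_outputs, targets, eps=1e-6):
    """Central-difference estimate of d(loss)/d(who), one weight at a time."""
    grad = np.zeros_like(who)
    for idx in np.ndindex(who.shape):
        step = np.zeros_like(who)
        step[idx] = eps
        plus = loss(who + step, hidden_outputs, targets)
        minus = loss(who - step, hidden_outputs, targets)
        grad[idx] = (plus - minus) / (2.0 * eps)
    return grad


# Tiny layer: 4 hidden nodes feeding 3 output nodes (arbitrary sizes).
rng = np.random.default_rng(0)
who = rng.normal(0.0, 0.5, (3, 4))
hidden_outputs = rng.uniform(0.1, 0.9, (4, 1))
targets = np.array([[0.99], [0.01], [0.01]])

analytic = update_direction(who, hidden_outputs, targets)
numeric = numeric_gradient(who, hidden_outputs, targets)

# The update direction should be the negative of the loss gradient.
print(np.allclose(analytic, -numeric, atol=1e-6))
```

If the two agree, each call to `train` moves `who` one small gradient-descent step, with `learning_rate` as the step size.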

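The output of running the training and testing script is shown below:

The effect of changing the parameters on recognition accuracy is shown below: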

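The test data is for reference only. Please credit the source when reposting. If you have questions, send me a private message (I do not log in often) and I will reply as soon as I see it. If anything here infringes your rights, contact me and I will take the post down.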
