CIFAR10 + Convolutional Neural Network in TensorFlow

Contents:

  • Code
    • Loading the dataset
    • Convolutional neural network 1
    • Convolutional neural network 2
  • Problems encountered
  • Summary

Classifying the CIFAR10 dataset with a convolutional neural network.

Platform: Linux
Python version: Python 3.6
TensorFlow version: 1.15.2
IDE: Colab

       This builds on the previous post, CIFAR10 + fully connected network in TensorFlow (worth reading first if needed); basics such as extracting the dataset are not covered again here.
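
As a minimal sketch (assuming the extracted cifar-10-batches-py folder is already in Google Drive, matching the paths used below), the dataset can be made visible to Colab like this:

# Mount Google Drive so the extracted CIFAR-10 folder is reachable.
# Assumes cifar-10-batches-py was already unpacked into My Drive.
from google.colab import drive
drive.mount('/content/drive')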

Code

Loading the dataset

%tensorflow_version 1.x
import tensorflow as tf
import numpy as np
import time
import os

print(tf.__version__)
!/opt/bin/nvidia-smi
TensorFlow 1.x selected.
1.15.2
Mon May 11 13:18:03 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   52C    P0    31W / 250W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
# Training set loader
def train_data_load(path):
  import pickle
  
  train_dataset = {'train_images':[], 'train_labels':[]}
  for batch_num in range(1, 6):
    with open('{0}/data_batch_{1}'.format(path, batch_num), 'rb') as f:
      data = pickle.load(f, encoding='bytes')
    train_dataset['train_images'].append(data[b'data'])
    train_dataset['train_labels'].append(data[b'labels'])
  train_dataset['train_images'] = np.concatenate(train_dataset['train_images'])
  train_dataset['train_labels'] = np.concatenate(train_dataset['train_labels'])
  return train_dataset


# Test set loader
def test_data_load(path):
  import pickle

  test_dataset = {'test_images':[], 'test_labels':[]}
  with open('{0}/test_batch'.format(path), 'rb') as f:
    data = pickle.load(f, encoding='bytes')
  test_dataset['test_images'] = data[b'data']
  test_dataset['test_labels'] = data[b'labels']
  
  return test_dataset


# Load the training and test sets
train_dataset = train_data_load('/content/drive/My Drive/cifar-10-batches-py')
test_dataset = test_data_load('/content/drive/My Drive/cifar-10-batches-py')

# Reshape to NHWC. Each CIFAR-10 row stores 3072 values channel-first
# (1024 R, then 1024 G, then 1024 B), so reshape to NCHW first and
# transpose to NHWC; a direct reshape to [-1,32,32,3] scrambles the pixels.
train_images = np.reshape(train_dataset['train_images'], [-1,3,32,32]).transpose(0,2,3,1)
train_labels = np.array(train_dataset['train_labels'])
test_images = np.reshape(test_dataset['test_images'], [-1,3,32,32]).transpose(0,2,3,1)
test_labels = np.array(test_dataset['test_labels'])

# Check the shapes
print(train_images.shape)
print(train_labels.shape)
print(test_images.shape)
print(test_labels.shape)
(50000, 32, 32, 3)
(50000,)
(10000, 32, 32, 3)
(10000,)
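
One step not in the original runs: the images are fed to the network as raw 0-255 values. A common tweak (a hedged suggestion, not part of the recorded results below) is to scale them to [0,1] first, which tends to make gradient descent better behaved:

# Optional preprocessing (not used in the recorded runs): scale pixels to [0,1].
train_images = train_images.astype(np.float32) / 255.0
test_images = test_images.astype(np.float32) / 255.0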

Convolutional neural network 1

Two convolutional layers (each followed by LRN), one max-pooling layer, and two fully connected layers.

# Placeholders
images_placeholder = tf.placeholder(tf.float32, [None,32,32,3])
labels_placeholder = tf.placeholder(tf.int64, [None])

# Weight initialization helper
def weights_variable(shape, stddev):
  return tf.Variable(tf.truncated_normal(shape, stddev=stddev))

# Bias initialization helper
def biases_variable(shape, stddev):
  return tf.Variable(tf.truncated_normal(shape, stddev=stddev))

# 2-D convolution helper
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')

# Max-pooling helper
def max_pool(x):
  return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')

# Local response normalization
def LRnorm(x):
  return tf.nn.lrn(x, 4, bias=1.0, alpha=0.001/9.0, beta=0.75)

# Convolutional layer 1
weights_1 = weights_variable([3,3,3,10], 0.1)
biases_1 = biases_variable([10], 0.1)
conv_1 = conv2d(images_placeholder, weights_1)
relu_1 = tf.nn.relu(conv_1 + biases_1)
LRnorm_1 = LRnorm(relu_1)
print(LRnorm_1.shape)

# Convolutional layer 2 (fed from the LRN output of layer 1)
weights_2 = weights_variable([3,3,10,10], 0.1)
biases_2 = biases_variable([10], 0.1)
conv_2 = conv2d(LRnorm_1, weights_2)
relu_2 = tf.nn.relu(conv_2 + biases_2)
LRnorm_2 = LRnorm(relu_2)
max_pool_2 = max_pool(LRnorm_2)
print(max_pool_2.shape)

# Fully connected layer 1
max_pool_2_flatten = tf.reshape(max_pool_2, [-1,16*16*10])
weights_fc1 = weights_variable([16*16*10,100], 0.1)
biases_fc1 = biases_variable([100], 0.1)
features_fc1 = tf.matmul(max_pool_2_flatten, weights_fc1) + biases_fc1
relu_fc1 = tf.nn.relu(features_fc1)
print(relu_fc1.shape)

# Fully connected layer 2
weights_fc2 = weights_variable([100,10], 0.1)
biases_fc2 = biases_variable([10], 0.1)
features_fc2 = tf.matmul(relu_fc1, weights_fc2) + biases_fc2
print(features_fc2.shape)
(?, 32, 32, 10)
(?, 16, 16, 10)
(?, 100)
(?, 10)
# Batch size and number of batches
batch_size = 2000
n_batch = 50000 // batch_size

# Loss function and optimizer
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=features_fc2, labels=labels_placeholder))
train_step = tf.train.GradientDescentOptimizer(3e-1).minimize(loss)

# Accuracy
correct_prediction = tf.equal(tf.argmax(features_fc2, 1), labels_placeholder)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Variable initialization op
init = tf.global_variables_initializer()


# Session
with tf.Session() as sess:
  sess.run(init)
  
  for epoch in range(1, 1001):
    if epoch % 100 == 0:
      start_time = time.time()

    for n in range(n_batch):
      index = slice(n * batch_size, (n + 1) * batch_size)
      images_batch, labels_batch = train_images[index], train_labels[index]
      _, cross_loss = sess.run([train_step, loss], feed_dict={images_placeholder:images_batch, labels_placeholder:labels_batch})
    acc = sess.run(accuracy, feed_dict={images_placeholder:test_images, labels_placeholder:test_labels})

    if epoch % 100 == 0:
      spend_time = time.time() - start_time
      print('epoch {0}: accuracy={1}, loss={2}, spend_time={3}s'.format(epoch, acc, cross_loss, spend_time))

epoch 100: accuracy=0.29809999465942383, loss=1.9919211864471436, spend_time=1.129845142364502s
epoch 200: accuracy=0.32330000400543213, loss=1.9389268159866333, spend_time=1.1280453205108643s
epoch 300: accuracy=0.3481999933719635, loss=1.8605142831802368, spend_time=1.1415705680847168s
epoch 400: accuracy=0.3637000024318695, loss=1.9034373760223389, spend_time=1.1427974700927734s
epoch 500: accuracy=0.3601999878883362, loss=1.8122082948684692, spend_time=1.14522385597229s
epoch 600: accuracy=0.3691999912261963, loss=1.8035513162612915, spend_time=1.1477584838867188s
epoch 700: accuracy=0.3700999915599823, loss=1.804998517036438, spend_time=1.1441717147827148s
epoch 800: accuracy=0.3939000070095062, loss=1.7281118631362915, spend_time=1.13543701171875s
epoch 900: accuracy=0.3878999948501587, loss=1.754409909248352, spend_time=1.1523475646972656s

Convolutional neural network 2

       The accuracy of the network above is not high, so it is modified here, introducing dropout to reduce overfitting.
       Four convolutional layers, two max-pooling layers, and two fully connected layers, using dropout.
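
Since network 1 was built into the same default graph, rebuilding in the same Colab runtime stacks duplicate variables on top of it. A small precaution (my addition, not in the original notebook) is to clear the graph first:

# Clear the default graph so network 2 does not pile onto network 1's
# variables when cells are re-run in the same runtime.
tf.reset_default_graph()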

# Placeholders
images_placeholder = tf.placeholder(tf.float32, [None,32,32,3])
labels_placeholder = tf.placeholder(tf.int64, [None])
keep_prob = tf.placeholder(tf.float32)

# Weight initialization helper
def weights_variable(shape, stddev):
  return tf.Variable(tf.truncated_normal(shape, stddev=stddev))

# Bias initialization helper
def biases_variable(shape, stddev):
  return tf.Variable(tf.truncated_normal(shape, stddev=stddev))

# 2-D convolution helper
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')

# Max-pooling helper
def max_pool(x):
  return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')


# Convolutional layer 1
weights_1_1 = weights_variable([3,3,3,32], 0.1)
biases_1_1 = biases_variable([32], 0.1)
conv_1_1 = conv2d(images_placeholder, weights_1_1) + biases_1_1
relu_1_1 = tf.nn.relu(conv_1_1)
dropout_1_1 = tf.nn.dropout(relu_1_1, keep_prob)
print(dropout_1_1.shape)

weights_1_2 = weights_variable([3,3,32,64], 0.1)
biases_1_2 = biases_variable([64], 0.1)
conv_1_2 = conv2d(dropout_1_1, weights_1_2) + biases_1_2
relu_1_2 = tf.nn.relu(conv_1_2)
dropout_1_2 = tf.nn.dropout(relu_1_2, keep_prob)
max_pool_1 = max_pool(dropout_1_2)
print(max_pool_1.shape)


# Convolutional layer 2
weights_2_1 = weights_variable([3,3,64,32], 0.01)
biases_2_1 = biases_variable([32], 0.01)
conv_2_1 = conv2d(max_pool_1, weights_2_1) + biases_2_1
relu_2_1 = tf.nn.relu(conv_2_1)
dropout_2_1 = tf.nn.dropout(relu_2_1, keep_prob)
print(dropout_2_1.shape)

weights_2_2 = weights_variable([3,3,32,16], 0.01)
biases_2_2 = biases_variable([16], 0.01)
conv_2_2 = conv2d(dropout_2_1, weights_2_2) + biases_2_2
relu_2_2 = tf.nn.relu(conv_2_2)
dropout_2_2 = tf.nn.dropout(relu_2_2, keep_prob)
max_pool_2 = max_pool(dropout_2_2)
print(max_pool_2.shape)


# Fully connected layers
max_pool_flatten = tf.reshape(max_pool_2, [-1,8*8*16])
weights_3 = weights_variable([8*8*16,128], 0.01)
biases_3 = biases_variable([128], 0.01)
relu_3 = tf.nn.relu(tf.matmul(max_pool_flatten, weights_3) + biases_3)
print(relu_3.shape)

weights_4 = weights_variable([128,10], 0.01)
biases_4 = biases_variable([10], 0.01)
logits = tf.matmul(relu_3, weights_4) + biases_4
print(logits.shape)
(?, 32, 32, 32)
(?, 16, 16, 64)
(?, 16, 16, 32)
(?, 8, 8, 16)
(?, 128)
(?, 10)
# Batch size and number of batches
batch_size = 2000
n_batch = 50000 // batch_size

# Loss function and optimizer
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels_placeholder))
train_step = tf.train.GradientDescentOptimizer(1.3e-3).minimize(loss)

# Accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), labels_placeholder)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Variable initialization op
init = tf.global_variables_initializer()


# Session
with tf.Session() as sess:
  sess.run(init)
  total_start = time.time()
  
  for epoch in range(1, 2501):
    if epoch % 100 == 0:
      start_time = time.time()

    for n in range(n_batch):
      index = slice(n * batch_size, (n + 1) * batch_size)
      images_batch, labels_batch = train_images[index], train_labels[index]
      _, cross_loss = sess.run([train_step, loss], feed_dict={images_placeholder:images_batch, labels_placeholder:labels_batch, keep_prob:0.5})
    
    if epoch % 100 == 0:
      acc = sess.run(accuracy, feed_dict={images_placeholder:test_images, labels_placeholder:test_labels, keep_prob:1.0})
      train_acc = sess.run(accuracy, feed_dict={images_placeholder:train_images[:10000], labels_placeholder:train_labels[:10000], keep_prob:1.0})
      spend_time = (time.time() - start_time)
      print('epoch {0}: acc={1}, train_acc={2} loss={3}, spend_time={4}s'.format(
          epoch, round(acc, 4), round(train_acc, 4), round(cross_loss, 4), round(spend_time, 4)))
  print('total time:', time.time() - total_start)
!/opt/bin/nvidia-smi
epoch 100: acc=0.44440001249313354, train_acc=0.46230000257492065 loss=1.5571999549865723, spend_time=3.045s
epoch 200: acc=0.5058000087738037, train_acc=0.5239999890327454 loss=1.3623000383377075, spend_time=3.0011s
epoch 300: acc=0.5404000282287598, train_acc=0.5665000081062317 loss=1.2599999904632568, spend_time=2.9813s
epoch 400: acc=0.5759999752044678, train_acc=0.6057999730110168 loss=1.1812000274658203, spend_time=2.9894s
epoch 500: acc=0.5843999981880188, train_acc=0.6220999956130981 loss=1.087499976158142, spend_time=2.9768s
epoch 600: acc=0.5942000150680542, train_acc=0.6414999961853027 loss=1.0327999591827393, spend_time=2.966s
epoch 700: acc=0.6067000031471252, train_acc=0.6574000120162964 loss=0.9991999864578247, spend_time=2.9895s
epoch 800: acc=0.6279000043869019, train_acc=0.6883999705314636 loss=0.9380000233650208, spend_time=2.9679s
epoch 900: acc=0.628000020980835, train_acc=0.6955000162124634 loss=0.9157000184059143, spend_time=2.9899s
epoch 1000: acc=0.6384999752044678, train_acc=0.7186999917030334 loss=0.9246000051498413, spend_time=2.9645s
epoch 1100: acc=0.6434999704360962, train_acc=0.7261000275611877 loss=0.8478999733924866, spend_time=2.9789s
epoch 1200: acc=0.6438000202178955, train_acc=0.7386999726295471 loss=0.8327999711036682, spend_time=2.9629s
epoch 1300: acc=0.6460999846458435, train_acc=0.7479000091552734 loss=0.7997999787330627, spend_time=2.9818s
epoch 1400: acc=0.6478000283241272, train_acc=0.7581999897956848 loss=0.8011999726295471, spend_time=2.9604s
epoch 1500: acc=0.6498000025749207, train_acc=0.7641000151634216 loss=0.7519999742507935, spend_time=2.967s
epoch 1600: acc=0.6527000069618225, train_acc=0.7731000185012817 loss=0.7523000240325928, spend_time=2.9902s
epoch 1700: acc=0.652899980545044, train_acc=0.7773000001907349 loss=0.720300018787384, spend_time=2.9887s
epoch 1800: acc=0.6473000049591064, train_acc=0.7809000015258789 loss=0.6916999816894531, spend_time=2.9718s
epoch 1900: acc=0.6546000242233276, train_acc=0.7960000038146973 loss=0.692799985408783, spend_time=2.9756s
epoch 2000: acc=0.6424000263214111, train_acc=0.7843999862670898 loss=0.6625000238418579, spend_time=2.9677s
epoch 2100: acc=0.6549000144004822, train_acc=0.8105000257492065 loss=0.6502000093460083, spend_time=2.9705s
epoch 2200: acc=0.6466000080108643, train_acc=0.8051000237464905 loss=0.6184999942779541, spend_time=2.9707s
epoch 2300: acc=0.6388000249862671, train_acc=0.7962999939918518 loss=0.6225000023841858, spend_time=2.9724s
epoch 2400: acc=0.6539999842643738, train_acc=0.8223999738693237 loss=0.578499972820282, spend_time=2.9974s
epoch 2500: acc=0.6381000280380249, train_acc=0.8055999875068665 loss=0.6011999845504761, spend_time=3.0051s
total time: 6383.969125747681
Mon May 11 15:52:57 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   66C    P0    43W / 250W |   9573MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Problems encountered

  1. Data handling gave some trouble at first: merging the five training batches kept failing until I referenced other people's code (see the sketch after this list).
  2. Network 1 originally used no LRN normalization, and both acc and loss stayed flat, possibly due to overfitting; adding normalization or dropout fixed it.
  3. Network 2 originally had only one fully connected layer, and dropping straight from 8x8x16 to 10 classes performed poorly, so an extra fully connected layer was added: 8x8x16 down to 128, then down to 10 classes.
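
For problem 1, the pattern that eventually worked (the same one used in train_data_load above) is to collect the per-batch arrays in a list and merge them with np.concatenate. A toy sketch with stand-in arrays:

# Merging the five CIFAR-10 training batches: gather each (10000, 3072)
# array in a list, then concatenate along axis 0 into (50000, 3072).
import numpy as np

batches = [np.zeros((10000, 3072)) for _ in range(5)]  # stand-ins for the five loaded batches
merged = np.concatenate(batches)                       # concatenates along axis 0 by default
print(merged.shape)                                    # (50000, 3072)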

Summary

       Classifying CIFAR10 with a convolutional network gives a sizeable improvement over the fully connected network, from roughly 27% up to around 65%. Although training ran for 2500 epochs, the network had essentially converged by around epoch 1000.
       Next, some published architectures will be used to push CIFAR10 classification accuracy further.
