Training a Convolutional Neural Network on Your Own Dataset

Textbooks that explain how to build a convolutional neural network, such as 《TensorFlow实战Google深度学习框架》, generally use the MNIST dataset in their examples, and the dataset is imported directly through TensorFlow's built-in wrapper:

import tensorflow.examples.tutorials.mnist.input_data as input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Inside this wrapper the training and test sets are already split, so once the convolutional network is built you can simply use

xs, ys = mnist.train.next_batch(batch_size)

to feed the network one batch at a time, which is quite unfriendly to a beginner like me.
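
So the first step in using my own dataset is to read the images and labels into numpy arrays myself. Below is a rough sketch of how this can be done; the folder-per-class layout, the fixed image size, and the use of Pillow are my own assumptions, not something required by TensorFlow:

import os
import numpy as np
from PIL import Image  # Pillow is only one option for reading images


def load_dataset(root_dir, width=28, height=28):
    """Read images from root_dir/<class_name>/* into numpy arrays.

    This is only a sketch: the folder-per-class layout and the fixed
    image size are assumptions about how your own data is organized.
    """
    class_names = sorted(os.listdir(root_dir))
    images, labels = [], []
    for label_index, class_name in enumerate(class_names):
        class_dir = os.path.join(root_dir, class_name)
        for file_name in os.listdir(class_dir):
            img = Image.open(os.path.join(class_dir, file_name)).resize((width, height))
            images.append(np.asarray(img, dtype=np.float32) / 255.0)  # scale to [0, 1]
            # one-hot label, in the same style as read_data_sets(..., one_hot=True)
            one_hot = np.zeros(len(class_names), dtype=np.float32)
            one_hot[label_index] = 1.0
            labels.append(one_hot)
    return np.array(images), np.array(labels)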

A neural network built with TensorFlow takes its input as a four-dimensional tensor, [batch_size, height, width, channel], to represent a batch of images. This layout does not match my intuition: the representation I am used to is [batch_size, channel, height, width], for example:

>>> import numpy as np
>>> a = np.linspace(1, 50, 50).reshape(1, 2, 5, 5)
>>> print(a)
[[[[ 1.  2.  3.  4.  5.]
   [ 6.  7.  8.  9. 10.]
   [11. 12. 13. 14. 15.]
   [16. 17. 18. 19. 20.]
   [21. 22. 23. 24. 25.]]

  [[26. 27. 28. 29. 30.]
   [31. 32. 33. 34. 35.]
   [36. 37. 38. 39. 40.]
   [41. 42. 43. 44. 45.]
   [46. 47. 48. 49. 50.]]]]

To convert a sample into the representation TensorFlow accepts, use np.transpose():

>>> a = np.linspace(1, 50, 50).reshape(1, 2, 5, 5)
>>> b = np.transpose(a, (0, 2, 3, 1))
>>> print(b)
[[[[ 1. 26.]
   [ 2. 27.]
   [ 3. 28.]
   [ 4. 29.]
   [ 5. 30.]]

  [[ 6. 31.]
   [ 7. 32.]
   [ 8. 33.]
   [ 9. 34.]
   [10. 35.]]

  [[11. 36.]
   [12. 37.]
   [13. 38.]
   [14. 39.]
   [15. 40.]]

  [[16. 41.]
   [17. 42.]
   [18. 43.]
   [19. 44.]
   [20. 45.]]

  [[21. 46.]
   [22. 47.]
   [23. 48.]
   [24. 49.]
   [25. 50.]]]]
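
As a sanity check, the transposed array can then be fed straight into a TF 1.x placeholder and convolution; the placeholder and filter shapes below are only illustrative:

import numpy as np
import tensorflow as tf  # TF 1.x style, matching the rest of this post

a = np.linspace(1, 50, 50).reshape(1, 2, 5, 5)   # [batch, channel, height, width]
b = np.transpose(a, (0, 2, 3, 1))                # [batch, height, width, channel]

x = tf.placeholder(tf.float32, [None, 5, 5, 2])  # NHWC, TensorFlow's default layout
w = tf.Variable(tf.truncated_normal([3, 3, 2, 4], stddev=0.1))  # 3x3 conv, 2 in / 4 out channels
conv = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(conv, feed_dict={x: b}).shape)  # (1, 5, 5, 4)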

Finally, here is a next_batch() implementation (tested, and very practical):

import numpy as np


class DataSet(object):

    def __init__(self, images, labels, num_examples):
        self._images = images
        self._labels = labels
        self._epochs_completed = 0  # number of completed passes over the data
        self._index_in_epochs = 0   # position in the current epoch, remembered between next_batch() calls
        self._num_examples = num_examples  # number of training samples

    def next_batch(self, batch_size, fake_data=False, shuffle=True):
        start = self._index_in_epochs

        # Shuffle the whole dataset once before the first epoch
        if self._epochs_completed == 0 and start == 0 and shuffle:
            index0 = np.arange(self._num_examples)
            np.random.shuffle(index0)
            self._images = np.array(self._images)[index0]
            self._labels = np.array(self._labels)[index0]

        if start + batch_size > self._num_examples:
            # Not enough samples left in this epoch: take what remains,
            # reshuffle, and fill the rest of the batch from the new epoch.
            self._epochs_completed += 1
            rest_num_examples = self._num_examples - start
            images_rest_part = self._images[start:self._num_examples]
            labels_rest_part = self._labels[start:self._num_examples]
            if shuffle:
                index = np.arange(self._num_examples)
                np.random.shuffle(index)
                self._images = self._images[index]
                self._labels = self._labels[index]
            start = 0
            self._index_in_epochs = batch_size - rest_num_examples
            end = self._index_in_epochs
            images_new_part = self._images[start:end]
            labels_new_part = self._labels[start:end]
            return np.concatenate((images_rest_part, images_new_part), axis=0), np.concatenate(
                (labels_rest_part, labels_new_part), axis=0)

        else:
            # Enough samples left in the current epoch: just advance the index.
            self._index_in_epochs += batch_size
            end = self._index_in_epochs
            return self._images[start:end], self._labels[start:end]


if __name__ == '__main__':
    inputs = ['a', 'b', '1', '2', '*', '3', 'c', '&', '#']
    outputs = ["Letter", "Letter", "Number", "Number", "Symbol", "Number", "Letter", "Symbol", "Symbol"]
    ds = DataSet(inputs, outputs, 9)
    for i in range(3):
        image_batch, label_batch = ds.next_batch(4)
        print(image_batch)
        print(label_batch)
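
In a real training script this DataSet class takes the place of mnist.train.next_batch(). Here is a minimal sketch of the feeding loop; the random arrays and the tiny softmax model are only stand-ins for your own images and network:

import numpy as np
import tensorflow as tf

# Stand-in data: 100 "images" of 5x5x2 with 3 classes (replace with your own arrays)
train_images = np.random.rand(100, 5, 5, 2).astype(np.float32)
train_labels = np.eye(3, dtype=np.float32)[np.random.randint(0, 3, 100)]
ds = DataSet(train_images, train_labels, 100)

x = tf.placeholder(tf.float32, [None, 5, 5, 2])
y_ = tf.placeholder(tf.float32, [None, 3])
logits = tf.layers.dense(tf.reshape(x, [-1, 5 * 5 * 2]), 3)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits))
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(50):
        xs, ys = ds.next_batch(20)  # used exactly like mnist.train.next_batch
        sess.run(train_step, feed_dict={x: xs, y_: ys})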

