DYF-AI

Tensorflow:VGG-net训练过程总结

前言：原本我是比较偏向数据挖掘方法的，后来找到的工作更偏向CV和NLP，所以也在慢慢的往CV和NLP靠，有什么说的不好，望指正。

1、VGG-net简介

VGG-net是牛津大学和Deepmind一起研发的一种深度卷积神经网络，在2014年的ILSVRC比赛在取得第二名的成绩，将Top-5的错误率将到7.3%，而当年排名第一的正是大名鼎鼎的Google-Net，后面有空在详细说。VGG-net的泛化能力非常好，在许多不同的数据集都有较好的表现。2019年的现在，VGG仍然经常被用来提取特征图像，VGG对于新入门的同学也是比较友好（没有太多复杂的结构）。

2、问题解答

（1）什么是感受野？

在卷积神经网络中，感受野（Receptive Field）的定义是卷积神经网络每一层输出的特征图（feature map）上的像素点在输入图片上映射的区域大小。再通俗点的解释是，特征图上的一个点对应输入图上的区域，如下图所示。

感受野

2）为什么说连续两个3*3的卷积核等价于一个5*5的卷积核，连续三个3*3的卷积核等价于一个7*7的卷积核？

首先解释为什么要使用更小的卷积核，每一张图片使用的卷积神经网络的核是共享的，使用更小的核可以降低模型的参数量，然而较小的核函数的感受野较小，为了使得3*3核的感受野和5*5的感受野一致，VGG提出了使用连续的两个3*3卷积核来替代5*5的卷积核，参数量从5*5=25，降为3*3*2=18（两个3*3的卷积核参数量）。网上我也找到相关的问题（1）的解释，没看太懂，下面说下我的理解：

padding为VALID，假设一幅图片是w*w(不是一般性，设为32*32)，卷积核大小设为k*k，步长设为S，padding的像素个数P（一般情况为0），输出的图片尺寸为N*N，其中N=((w-k+2P)/S)+1。

1）对于一个5*5的卷积核，步长S为1，P=0，卷积后的输出尺寸边长为((32-5+2*0))/1+1=28,图片为28*28。

2）对于一个3*3的卷积核，其它设置一样，输出的尺寸为30*30，对得到的30*30的图片继续使用3*3卷积核，得到的输出的图片尺寸为28*28。

因此，连续使用两次3*3的卷积核得到的图片尺寸和一次使用5*5卷积核的尺寸一样，但参数量更少。

3、VGG-net的网络结构

VGG16经典模型图

VGG参数量计算（参数量和卷积核大小和个数有关）

4、训练过程

（1）导入相关包，设置模型参数

# -*- coding:utf-8 -*-
import tensorflow as tf
import numpy as np
import time
import os
import sys
import random
import pickle
import tarfile

class_num = 10
image_size = 32
img_channels = 3
batch_size = 250
iterations = 200   # 训练集50000=250*200
total_epoch = 200
weight_decay = 0.0003
dropout_rate = 0.5
momentum_rate = 0.9
log_save_path = './vgg_16_logs'
model_save_path = './model/'

（2）下载数据集


def download_data():
    dirname = 'CAFIR-10_data/cifar-10-python'  # 解压后的文件夹
    origin = 'http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'
    fname = './CAFIR-10_data/cifar-10-python.tar.gz'
    #fname = './CAFIR-10_data'
    fpath = './' + dirname

    download = False
    if os.path.exists(fpath) or os.path.isfile(fname):
        download = False
        print("DataSet already exist!")
        import tarfile
        if fname.endswith("tar.gz"):
            tar = tarfile.open(fname, "r:gz")
            tar.extractall()
            tar.close()
            #tarfile.open(fpath, "r:gz").extractall(fname)
        elif fname.endswith("tar"):
            tar = tarfile.open(fname, "r:")
            tar.extractall()
            tar.close()
    else:
        download = True
    if download:
        print('Downloading data from', origin)
        import urllib.request
        import tarfile

        def reporthook(count, block_size, total_size):
            global start_time
            if count == 0:
                start_time = time.time()
                return
            duration = time.time() - start_time
            progress_size = int(count * block_size)
            speed = int(progress_size / (1024 * duration))
            percent = min(int(count*block_size*100/total_size),100)
            sys.stdout.write("\r...%d%%, %d MB, %d KB/s, %d seconds passed" %
                            (percent, progress_size / (1024 * 1024), speed, duration))
            sys.stdout.flush()
        urllib.request.urlretrieve(origin, fname, reporthook)
        print('Download finished. Start extract!', origin)
        if fname.endswith("tar.gz"):
            tar = tarfile.open(fname, "r:gz")
            tar.extractall()
            tar.close()
        elif fname.endswith("tar"):
            tar = tarfile.open(fname, "r:")
            tar.extractall()
            tar.close()

def unpickle(file):
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict

def load_data_one(file):
    batch = unpickle(file)
    data = batch[b'data']
    labels = batch[b'labels']
    print("Loading %s : %d." % (file, len(data)))
    return data, labels

def load_data(files, data_dir, label_count):
    global image_size, img_channels
    data, labels = load_data_one(data_dir + '/' + files[0])
    for f in files[1:]:
        data_n, labels_n = load_data_one(data_dir + '/' + f)
        data = np.append(data, data_n, axis=0)
        labels = np.append(labels, labels_n, axis=0)
    labels = np.array([[float(i == label) for i in range(label_count)] for label in labels])
    data = data.reshape([-1, img_channels, image_size, image_size])
    data = data.transpose([0, 2, 3, 1])
    return data, labels


def prepare_data():
    print("======Loading data======")
    download_data()
    #data_dir = './CAFIR-10_data/cifar-10-python/cifar-10-batches-py'
    data_dir = './cifar-10-batches-py'   # 数据集解压后的文件夹
    image_dim = image_size * image_size * img_channels
    meta = unpickle(data_dir + '/batches.meta')

    label_names = meta[b'label_names']
    label_count = len(label_names)
    train_files = ['data_batch_%d' % d for d in range(1, 6)]
    train_data, train_labels = load_data(train_files, data_dir, label_count)
    test_data, test_labels = load_data(['test_batch'], data_dir, label_count)

    print("Train data:", np.shape(train_data), np.shape(train_labels))
    print("Test data :", np.shape(test_data), np.shape(test_labels))
    print("======Load finished======")

    print("======Shuffling data======")
    indices = np.random.permutation(len(train_data))
    train_data = train_data[indices]
    train_labels = train_labels[indices]
    print("======Prepare Finished======")

    return train_data, train_labels, test_data, test_labels


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape, dtype=tf.float32)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool(input, k_size=1, stride=1, name=None):
    return tf.nn.max_pool(input, ksize=[1, k_size, k_size, 1], strides=[1, stride, stride, 1],
                          padding='SAME', name=name)

def batch_norm(input):
    return tf.contrib.layers.batch_norm(input, decay=0.9, center=True, scale=True, epsilon=1e-3,
                                        is_training=train_flag, updates_collections=None)

# random crop 随机裁剪,下面的数据增强
def _random_crop(batch, crop_shape, padding=None):
    oshape = np.shape(batch[0])

    if padding:
        oshape = (oshape[0] + 2*padding, oshape[1] + 2*padding)
    new_batch = []
    npad = ((padding, padding), (padding, padding), (0, 0))
    for i in range(len(batch)):
        new_batch.append(batch[i])
        if padding:
            new_batch[i] = np.lib.pad(batch[i], pad_width=npad,
                                      mode='constant', constant_values=0)
        nh = random.randint(0, oshape[0] - crop_shape[0])
        nw = random.randint(0, oshape[1] - crop_shape[1])
        new_batch[i] = new_batch[i][nh:nh + crop_shape[0],
                                    nw:nw + crop_shape[1]]
    return new_batch


def _random_flip_leftright(batch):
        for i in range(len(batch)):
            if bool(random.getrandbits(1)):
                batch[i] = np.fliplr(batch[i])
        return batch


def data_augmentation(batch):
    batch = _random_flip_leftright(batch)
    batch = _random_crop(batch, [32, 32], 4)
    return batch

def data_preprocessing(x_train,x_test):
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')

    x_train[:, :, :, 0] = (x_train[:, :, :, 0] - np.mean(x_train[:, :, :, 0])) / np.std(x_train[:, :, :, 0])
    x_train[:, :, :, 1] = (x_train[:, :, :, 1] - np.mean(x_train[:, :, :, 1])) / np.std(x_train[:, :, :, 1])
    x_train[:, :, :, 2] = (x_train[:, :, :, 2] - np.mean(x_train[:, :, :, 2])) / np.std(x_train[:, :, :, 2])

    x_test[:, :, :, 0] = (x_test[:, :, :, 0] - np.mean(x_test[:, :, :, 0])) / np.std(x_test[:, :, :, 0])
    x_test[:, :, :, 1] = (x_test[:, :, :, 1] - np.mean(x_test[:, :, :, 1])) / np.std(x_test[:, :, :, 1])
    x_test[:, :, :, 2] = (x_test[:, :, :, 2] - np.mean(x_test[:, :, :, 2])) / np.std(x_test[:, :, :, 2])

    return x_train, x_test

def learning_rate_schedule(epoch_num):
    if epoch_num < 81:
        return 0.01#0.1
    elif epoch_num < 121:
        return 0.001#0.01
    else:

def run_testing(sess, ep):
    acc = 0.0
    loss = 0.0
    pre_index = 0
    add = 1000
    for it in range(10):
        batch_x = test_x[pre_index:pre_index+add]
        batch_y = test_y[pre_index:pre_index+add]
        pre_index = pre_index + add
        loss_, acc_  = sess.run([cross_entropy, accuracy],
                                feed_dict={x: batch_x, y_: batch_y, keep_prob: 1.0, train_flag: False})
        loss += loss_ / 10.0
        acc += acc_ / 10.0
    summary = tf.Summary(value=[tf.Summary.Value(tag="test_loss", simple_value=loss),
                                tf.Summary.Value(tag="test_accuracy", simple_value=acc)])
    return acc, loss, summary


if __name__ == '__main__':

    train_x, train_y, test_x, test_y = prepare_data()
    train_x, test_x = data_preprocessing(train_x, test_x)

    # define placeholder x, y_ , keep_prob, learning_rate
    x = tf.placeholder(tf.float32,[None, image_size, image_size, 3])
    y_ = tf.placeholder(tf.float32, [None, class_num])
    keep_prob = tf.placeholder(tf.float32)
    learning_rate = tf.placeholder(tf.float32)
    train_flag = tf.placeholder(tf.bool)

    # build_network
    W_conv1_1 = tf.get_variable('conv1_1', shape=[3, 3, 3, 64], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv1_1 = bias_variable([64])
    output = tf.nn.relu(batch_norm(conv2d(x, W_conv1_1) + b_conv1_1))

    W_conv1_2 = tf.get_variable('conv1_2', shape=[3, 3, 64, 64], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv1_2 = bias_variable([64])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv1_2) + b_conv1_2))
    output = max_pool(output, 2, 2, "pool1")

    W_conv2_1 = tf.get_variable('conv2_1', shape=[3, 3, 64, 128], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv2_1 = bias_variable([128])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv2_1) + b_conv2_1))

    W_conv2_2 = tf.get_variable('conv2_2', shape=[3, 3, 128, 128], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv2_2 = bias_variable([128])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv2_2) + b_conv2_2))
    output = max_pool(output, 2, 2, "pool2")

    W_conv3_1 = tf.get_variable('conv3_1', shape=[3, 3, 128, 256], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv3_1 = bias_variable([256])
    output = tf.nn.relu( batch_norm(conv2d(output,W_conv3_1) + b_conv3_1))

    W_conv3_2 = tf.get_variable('conv3_2', shape=[3, 3, 256, 256], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv3_2 = bias_variable([256])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv3_2) + b_conv3_2))

    W_conv3_3 = tf.get_variable('conv3_3', shape=[3, 3, 256, 256], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv3_3 = bias_variable([256])
    output = tf.nn.relu( batch_norm(conv2d(output, W_conv3_3) + b_conv3_3))
    output = max_pool(output, 2, 2, "pool3")

    W_conv4_1 = tf.get_variable('conv4_1', shape=[3, 3, 256, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv4_1 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv4_1) + b_conv4_1))

    W_conv4_2 = tf.get_variable('conv4_2', shape=[3, 3, 512, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv4_2 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv4_2) + b_conv4_2))

    W_conv4_3 = tf.get_variable('conv4_3', shape=[3, 3, 512, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv4_3 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv4_3) + b_conv4_3))
    output = max_pool(output, 2, 2)

    W_conv5_1 = tf.get_variable('conv5_1', shape=[3, 3, 512, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv5_1 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv5_1) + b_conv5_1))

    W_conv5_2 = tf.get_variable('conv5_2', shape=[3, 3, 512, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv5_2 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv5_2) + b_conv5_2))

    W_conv5_3 = tf.get_variable('conv5_3', shape=[3, 3, 512, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv5_3 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv5_3) + b_conv5_3))
    #output = max_pool(output, 2, 2)

    # output = tf.contrib.layers.flatten(output)
    output = tf.reshape(output, [-1, 2*2*512])

    W_fc1 = tf.get_variable('fc1', shape=[2048, 4096], initializer=tf.contrib.keras.initializers.he_normal())
    b_fc1 = bias_variable([4096])
    output = tf.nn.relu(batch_norm(tf.matmul(output, W_fc1) + b_fc1) )
    output = tf.nn.dropout(output, keep_prob)

    W_fc2 = tf.get_variable('fc7', shape=[4096, 4096], initializer=tf.contrib.keras.initializers.he_normal())
    b_fc2 = bias_variable([4096])
    output = tf.nn.relu(batch_norm(tf.matmul(output, W_fc2) + b_fc2))
    output = tf.nn.dropout(output, keep_prob)

    W_fc3 = tf.get_variable('fc3', shape=[4096, 10], initializer=tf.contrib.keras.initializers.he_normal())
    b_fc3 = bias_variable([10])
    output = tf.nn.relu(batch_norm(tf.matmul(output, W_fc3) + b_fc3))

    # output  = tf.reshape(output,[-1,10])

    # loss function: cross_entropy
    # train_step: training operation
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=output))
    l2 = tf.add_n([tf.nn.l2_loss(var) for var in tf.trainable_variables()])
    train_step = tf.train.MomentumOptimizer(learning_rate, momentum_rate, use_nesterov=True).\
        minimize(cross_entropy + l2 * weight_decay)

    correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # initial an saver to save model
    saver = tf.train.Saver()
    with tf.Session() as sess:

        sess.run(tf.global_variables_initializer())
        summary_writer = tf.summary.FileWriter(log_save_path,sess.graph)

        # epoch = 164
        # make sure [bath_size * iteration = data_set_number]

        for ep in range(1, total_epoch+1):
            lr = learning_rate_schedule(ep)
            pre_index = 0
            train_acc = 0.0
            train_loss = 0.0
            start_time = time.time()

            print("\n epoch %d/%d:" % (ep, total_epoch))

            for it in range(1, iterations+1):
                batch_x = train_x[pre_index:pre_index+batch_size]
                batch_y = train_y[pre_index:pre_index+batch_size]

                batch_x = data_augmentation(batch_x)

                _, batch_loss = sess.run([train_step, cross_entropy],
                                         feed_dict={x: batch_x, y_: batch_y, keep_prob: dropout_rate,
                                                    learning_rate: lr, train_flag: True})
                batch_acc = accuracy.eval(feed_dict={x: batch_x, y_: batch_y, keep_prob: 1.0, train_flag: True})

                train_loss += batch_loss
                train_acc += batch_acc
                pre_index += batch_size

                if it == iterations:
                    train_loss /= iterations
                    train_acc /= iterations

                    loss_, acc_ = sess.run([cross_entropy, accuracy],
                                           feed_dict={x: batch_x, y_: batch_y, keep_prob: 1.0, train_flag: True})
                    train_summary = tf.Summary(value=[tf.Summary.Value(tag="train_loss", simple_value=train_loss),
                                               tf.Summary.Value(tag="train_accuracy", simple_value=train_acc)])

                    val_acc, val_loss, test_summary = run_testing(sess, ep)

                    summary_writer.add_summary(train_summary, ep)
                    summary_writer.add_summary(test_summary, ep)
                    summary_writer.flush()

                    print("iteration: %d/%d, cost_time: %ds, train_loss: %.4f, "
                          "train_acc: %.4f, test_loss: %.4f, test_acc: %.4f"
                          % (it, iterations, int(time.time()-start_time), train_loss, train_acc, val_loss, val_acc))
                else:
                    print("iteration: %d/%d, train_loss: %.4f, train_acc: %.4f"
                          % (it, iterations, train_loss / it, train_acc / it))

        save_path = saver.save(sess, model_save_path)
        print("Model saved in file: %s" % save_path)

代码汇总：

# -*- coding:utf-8 -*-
import tensorflow as tf
import numpy as np
import time
import os
import sys
import random
import pickle
import tarfile

class_num = 10
image_size = 32
img_channels = 3
batch_size = 250
iterations = 200   # 训练集50000=250*200
total_epoch = 200
weight_decay = 0.0003
dropout_rate = 0.5
momentum_rate = 0.9
log_save_path = './vgg_16_logs'
model_save_path = './model/'

def download_data():
    dirname = 'CAFIR-10_data/cifar-10-python'  # 解压后的文件夹
    origin = 'http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz'
    fname = './CAFIR-10_data/cifar-10-python.tar.gz'
    #fname = './CAFIR-10_data'
    fpath = './' + dirname

    download = False
    if os.path.exists(fpath) or os.path.isfile(fname):
        download = False
        print("DataSet already exist!")
        import tarfile
        if fname.endswith("tar.gz"):
            tar = tarfile.open(fname, "r:gz")
            tar.extractall()
            tar.close()
            #tarfile.open(fpath, "r:gz").extractall(fname)
        elif fname.endswith("tar"):
            tar = tarfile.open(fname, "r:")
            tar.extractall()
            tar.close()
    else:
        download = True
    if download:
        print('Downloading data from', origin)
        import urllib.request
        import tarfile

        def reporthook(count, block_size, total_size):
            global start_time
            if count == 0:
                start_time = time.time()
                return
            duration = time.time() - start_time
            progress_size = int(count * block_size)
            speed = int(progress_size / (1024 * duration))
            percent = min(int(count*block_size*100/total_size),100)
            sys.stdout.write("\r...%d%%, %d MB, %d KB/s, %d seconds passed" %
                            (percent, progress_size / (1024 * 1024), speed, duration))
            sys.stdout.flush()
        urllib.request.urlretrieve(origin, fname, reporthook)
        print('Download finished. Start extract!', origin)
        if fname.endswith("tar.gz"):
            tar = tarfile.open(fname, "r:gz")
            tar.extractall()
            tar.close()
        elif fname.endswith("tar"):
            tar = tarfile.open(fname, "r:")
            tar.extractall()
            tar.close()

def unpickle(file):
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict

def load_data_one(file):
    batch = unpickle(file)
    data = batch[b'data']
    labels = batch[b'labels']
    print("Loading %s : %d." % (file, len(data)))
    return data, labels

def load_data(files, data_dir, label_count):
    global image_size, img_channels
    data, labels = load_data_one(data_dir + '/' + files[0])
    for f in files[1:]:
        data_n, labels_n = load_data_one(data_dir + '/' + f)
        data = np.append(data, data_n, axis=0)
        labels = np.append(labels, labels_n, axis=0)
    labels = np.array([[float(i == label) for i in range(label_count)] for label in labels])
    data = data.reshape([-1, img_channels, image_size, image_size])
    data = data.transpose([0, 2, 3, 1])
    return data, labels

def prepare_data():
    print("======Loading data======")
    download_data()
    #data_dir = './CAFIR-10_data/cifar-10-python/cifar-10-batches-py'
    data_dir = './cifar-10-batches-py'   # 数据集解压后的文件夹
    image_dim = image_size * image_size * img_channels
    meta = unpickle(data_dir + '/batches.meta')

    label_names = meta[b'label_names']
    label_count = len(label_names)
    train_files = ['data_batch_%d' % d for d in range(1, 6)]
    train_data, train_labels = load_data(train_files, data_dir, label_count)
    test_data, test_labels = load_data(['test_batch'], data_dir, label_count)

    print("Train data:", np.shape(train_data), np.shape(train_labels))
    print("Test data :", np.shape(test_data), np.shape(test_labels))
    print("======Load finished======")

    print("======Shuffling data======")
    indices = np.random.permutation(len(train_data))
    train_data = train_data[indices]
    train_labels = train_labels[indices]
    print("======Prepare Finished======")

    return train_data, train_labels, test_data, test_labels


def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape, dtype=tf.float32)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool(input, k_size=1, stride=1, name=None):
    return tf.nn.max_pool(input, ksize=[1, k_size, k_size, 1], strides=[1, stride, stride, 1],
                          padding='SAME', name=name)

def batch_norm(input):
    return tf.contrib.layers.batch_norm(input, decay=0.9, center=True, scale=True, epsilon=1e-3,
                                        is_training=train_flag, updates_collections=None)

# random crop 随机裁剪,下面的数据增强
def _random_crop(batch, crop_shape, padding=None):
    oshape = np.shape(batch[0])

    if padding:
        oshape = (oshape[0] + 2*padding, oshape[1] + 2*padding)
    new_batch = []
    npad = ((padding, padding), (padding, padding), (0, 0))
    for i in range(len(batch)):
        new_batch.append(batch[i])
        if padding:
            new_batch[i] = np.lib.pad(batch[i], pad_width=npad,
                                      mode='constant', constant_values=0)
        nh = random.randint(0, oshape[0] - crop_shape[0])
        nw = random.randint(0, oshape[1] - crop_shape[1])
        new_batch[i] = new_batch[i][nh:nh + crop_shape[0],
                                    nw:nw + crop_shape[1]]
    return new_batch


def _random_flip_leftright(batch):
        for i in range(len(batch)):
            if bool(random.getrandbits(1)):
                batch[i] = np.fliplr(batch[i])
        return batch


def data_augmentation(batch):
    batch = _random_flip_leftright(batch)
    batch = _random_crop(batch, [32, 32], 4)
    return batch

def data_preprocessing(x_train,x_test):
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')

    x_train[:, :, :, 0] = (x_train[:, :, :, 0] - np.mean(x_train[:, :, :, 0])) / np.std(x_train[:, :, :, 0])
    x_train[:, :, :, 1] = (x_train[:, :, :, 1] - np.mean(x_train[:, :, :, 1])) / np.std(x_train[:, :, :, 1])
    x_train[:, :, :, 2] = (x_train[:, :, :, 2] - np.mean(x_train[:, :, :, 2])) / np.std(x_train[:, :, :, 2])

    x_test[:, :, :, 0] = (x_test[:, :, :, 0] - np.mean(x_test[:, :, :, 0])) / np.std(x_test[:, :, :, 0])
    x_test[:, :, :, 1] = (x_test[:, :, :, 1] - np.mean(x_test[:, :, :, 1])) / np.std(x_test[:, :, :, 1])
    x_test[:, :, :, 2] = (x_test[:, :, :, 2] - np.mean(x_test[:, :, :, 2])) / np.std(x_test[:, :, :, 2])

    return x_train, x_test



def learning_rate_schedule(epoch_num):
    if epoch_num < 81:
        return 0.1 # 0.01
    elif epoch_num < 121:
        return 0.01 #0.001
    else:
        return 0.001 #0.0001

def run_testing(sess, ep):
    acc = 0.0
    loss = 0.0
    pre_index = 0
    add = 1000
    for it in range(10):
        batch_x = test_x[pre_index:pre_index+add]
        batch_y = test_y[pre_index:pre_index+add]
        pre_index = pre_index + add
        loss_, acc_  = sess.run([cross_entropy, accuracy],
                                feed_dict={x: batch_x, y_: batch_y, keep_prob: 1.0, train_flag: False})
        loss += loss_ / 10.0
        acc += acc_ / 10.0
    summary = tf.Summary(value=[tf.Summary.Value(tag="test_loss", simple_value=loss),
                                tf.Summary.Value(tag="test_accuracy", simple_value=acc)])
    return acc, loss, summary


if __name__ == '__main__':

    train_x, train_y, test_x, test_y = prepare_data()
    train_x, test_x = data_preprocessing(train_x, test_x)

    # define placeholder x, y_ , keep_prob, learning_rate
    x = tf.placeholder(tf.float32,[None, image_size, image_size, 3])
    y_ = tf.placeholder(tf.float32, [None, class_num])
    keep_prob = tf.placeholder(tf.float32)
    learning_rate = tf.placeholder(tf.float32)
    train_flag = tf.placeholder(tf.bool)

    # build_network
    # 32x32x3
    W_conv1_1 = tf.get_variable('conv1_1', shape=[3, 3, 3, 64], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv1_1 = bias_variable([64])
    output = tf.nn.relu(batch_norm(conv2d(x, W_conv1_1) + b_conv1_1))

    W_conv1_2 = tf.get_variable('conv1_2', shape=[3, 3, 64, 64], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv1_2 = bias_variable([64])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv1_2) + b_conv1_2))
    output = max_pool(output, 2, 2, "pool1") # 16x16x64

    W_conv2_1 = tf.get_variable('conv2_1', shape=[3, 3, 64, 128], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv2_1 = bias_variable([128])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv2_1) + b_conv2_1))

    W_conv2_2 = tf.get_variable('conv2_2', shape=[3, 3, 128, 128], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv2_2 = bias_variable([128])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv2_2) + b_conv2_2))
    output = max_pool(output, 2, 2, "pool2") # 8x8x128

    W_conv3_1 = tf.get_variable('conv3_1', shape=[3, 3, 128, 256], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv3_1 = bias_variable([256])
    output = tf.nn.relu( batch_norm(conv2d(output,W_conv3_1) + b_conv3_1))

    W_conv3_2 = tf.get_variable('conv3_2', shape=[3, 3, 256, 256], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv3_2 = bias_variable([256])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv3_2) + b_conv3_2))

    W_conv3_3 = tf.get_variable('conv3_3', shape=[3, 3, 256, 256], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv3_3 = bias_variable([256])
    output = tf.nn.relu( batch_norm(conv2d(output, W_conv3_3) + b_conv3_3))
    output = max_pool(output, 2, 2, "pool3") # 4x4x256

    W_conv4_1 = tf.get_variable('conv4_1', shape=[3, 3, 256, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv4_1 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv4_1) + b_conv4_1))

    W_conv4_2 = tf.get_variable('conv4_2', shape=[3, 3, 512, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv4_2 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv4_2) + b_conv4_2))

    W_conv4_3 = tf.get_variable('conv4_3', shape=[3, 3, 512, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv4_3 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv4_3) + b_conv4_3))
    output = max_pool(output, 2, 2) # 2x2x512

    W_conv5_1 = tf.get_variable('conv5_1', shape=[3, 3, 512, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv5_1 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv5_1) + b_conv5_1))

    W_conv5_2 = tf.get_variable('conv5_2', shape=[3, 3, 512, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv5_2 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv5_2) + b_conv5_2))

    W_conv5_3 = tf.get_variable('conv5_3', shape=[3, 3, 512, 512], initializer=tf.contrib.keras.initializers.he_normal())
    b_conv5_3 = bias_variable([512])
    output = tf.nn.relu(batch_norm(conv2d(output, W_conv5_3) + b_conv5_3))
    ##output = max_pool(output, 2, 2)   # 这里应该要池化 ######3

    # output = tf.contrib.layers.flatten(output)
    output = tf.reshape(output, [-1, 2*2*512])
    ##output = tf.reshape(output, [-1, 1 * 1 * 512])

    ##W_fc1 = tf.get_variable('fc1', shape=[1 * 1 * 512, 4096], initializer=tf.contrib.keras.initializers.he_normal())
    W_fc1 = tf.get_variable('fc1', shape=[2 * 2 * 512, 4096], initializer=tf.contrib.keras.initializers.he_normal())
    b_fc1 = bias_variable([4096])
    output = tf.nn.relu(batch_norm(tf.matmul(output, W_fc1) + b_fc1) )
    output = tf.nn.dropout(output, keep_prob)

    W_fc2 = tf.get_variable('fc7', shape=[4096, 4096], initializer=tf.contrib.keras.initializers.he_normal())
    b_fc2 = bias_variable([4096])
    output = tf.nn.relu(batch_norm(tf.matmul(output, W_fc2) + b_fc2))
    output = tf.nn.dropout(output, keep_prob)

    W_fc3 = tf.get_variable('fc3', shape=[4096, 10], initializer=tf.contrib.keras.initializers.he_normal())
    b_fc3 = bias_variable([10])
    output = tf.nn.relu(batch_norm(tf.matmul(output, W_fc3) + b_fc3))

    # output  = tf.reshape(output,[-1,10])

    # loss function: cross_entropy
    # train_step: training operation
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=output))
    l2 = tf.add_n([tf.nn.l2_loss(var) for var in tf.trainable_variables()])
    train_step = tf.train.MomentumOptimizer(learning_rate, momentum_rate, use_nesterov=True).\
        minimize(cross_entropy + l2 * weight_decay)

    correct_prediction = tf.equal(tf.argmax(output, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # initial an saver to save model
    saver = tf.train.Saver()
    with tf.Session() as sess:

        sess.run(tf.global_variables_initializer())
        summary_writer = tf.summary.FileWriter(log_save_path,sess.graph)

        # epoch = 164
        # make sure [bath_size * iteration = data_set_number]

        for ep in range(1, total_epoch+1):
            lr = learning_rate_schedule(ep)
            pre_index = 0
            train_acc = 0.0
            train_loss = 0.0
            start_time = time.time()

            print("\n epoch %d/%d:" % (ep, total_epoch))

            for it in range(1, iterations+1):
                batch_x = train_x[pre_index:pre_index+batch_size]
                batch_y = train_y[pre_index:pre_index+batch_size]

                batch_x = data_augmentation(batch_x)

                _, batch_loss = sess.run([train_step, cross_entropy],
                                         feed_dict={x: batch_x, y_: batch_y, keep_prob: dropout_rate,
                                                    learning_rate: lr, train_flag: True})
                batch_acc = accuracy.eval(feed_dict={x: batch_x, y_: batch_y, keep_prob: 1.0, train_flag: True})

                train_loss += batch_loss
                train_acc += batch_acc
                pre_index += batch_size

                if it == iterations:
                    train_loss /= iterations
                    train_acc /= iterations

                    loss_, acc_ = sess.run([cross_entropy, accuracy],
                                           feed_dict={x: batch_x, y_: batch_y, keep_prob: 1.0, train_flag: True})
                    train_summary = tf.Summary(value=[tf.Summary.Value(tag="train_loss", simple_value=train_loss),
                                               tf.Summary.Value(tag="train_accuracy", simple_value=train_acc)])

                    val_acc, val_loss, test_summary = run_testing(sess, ep)

                    summary_writer.add_summary(train_summary, ep)
                    summary_writer.add_summary(test_summary, ep)
                    summary_writer.flush()

                    print("iteration: %d/%d, cost_time: %ds, train_loss: %.4f, "
                          "train_acc: %.4f, test_loss: %.4f, test_acc: %.4f"
                          % (it, iterations, int(time.time()-start_time), train_loss, train_acc, val_loss, val_acc))
                else:
                    print("iteration: %d/%d, train_loss: %.4f, train_acc: %.4f"
                          % (it, iterations, train_loss / it, train_acc / it))

        save_path = saver.save(sess, model_save_path)
        print("Model saved in file: %s" % save_path)

目前，在将代码改为tensorflow2.0版本，后面有时间在写一篇关于2.0的训练代码。

参考博文网址：https://blog.csdn/GOGO_YAO/article/details/80348200;

你可能感兴趣的:(图像分类)

番茄西红柿叶子病害分类数据集12882张11类别 futureflsl 数据集分类数据挖掘人工智能
数据集类型：图像分类用，不可用于目标检测无标注文件数据集格式：仅仅包含jpg图片，每个类别文件夹下面存放着对应图片图片数量(jpg文件个数)：12882分类类别数：11类别名称:["Bacterial_Spot_Bacteria","Early_Blight_Fungus","Healthy","Late_Blight_Water_Mold","Leaf_Mold_Fungus","Powdery
遥感影像的切片处理 sand&wich 计算机视觉 python 图像处理
在遥感影像分析中，经常需要将大尺寸的影像切分成小片段，以便于进行详细的分析和处理。这种方法特别适用于机器学习和图像处理任务，如对象检测、图像分类等。以下是如何使用Python和OpenCV库来实现这一过程，同时确保每个影像片段保留正确的地理信息。准备环境首先，确保安装了必要的Python库，包括numpy、opencv-python和xml.etree.ElementTree。这些库将用于图像处理
Python(PyTorch)和MATLAB及Rust和C++结构相似度指数测量导图亚图跨际 Python 交叉知识算法量化检查图像压缩质量低分辨率多光谱峰值信噪比端到端优化图像压缩手术机器人三维实景实时可微分渲染重建三维可视化
要点量化检查图像压缩质量低分辨率多光谱和高分辨率图像实现超分辨率分析图像质量图像索引/多尺度结构相似度指数和光谱角映射器及视觉信息保真度多种指标峰值信噪比和结构相似度指数测量结构相似性图像分类PNG和JPEG图像相似性近似算法图像压缩，视频压缩、端到端优化图像压缩、神经图像压缩、GPU变速图像压缩手术机器人深度估计算法重建三维可视化推理图像超分辨率算法模型三维实景实时可微分渲染算法MATLAB结构
CV、NLP、数据控掘推荐、量化海的那边- AI算法自然语言处理人工智能
下面是对CV（计算机视觉）、NLP（自然语言处理）、数据挖掘推荐和量化的简要概述及其应用领域的介绍：1.CV（计算机视觉，ComputerVision）定义：计算机视觉是一门让计算机能够从图像或视频中提取有用信息，并做出决策的学科。它通过模拟人类的视觉系统来识别、处理和理解视觉信息。主要任务：图像分类：识别图像中的物体并分类，比如猫、狗、车等。目标检测：在图像或视频中定位并识别多个对象，如人脸检测
基于Pytorch框架的CIFAR-10图像分类任务（附带完整代码）难得北窗高卧 pytorch 人工智能 python 深度学习
本文主要实现在pytorch框架下，训练CIFAR数据集，通过观察训练和验证的误差、准确率图像来进一步改善。保存最好的模型。测试集打印整体准确率和每一类别的准确率，并生成混淆矩阵，将其中每一个错误的图片并保存下来。语言：python实现方式：pytorch框架,CPU关键词:CIFAR-10数据集、Dataset和Dataloader、SummaryWriter画图、网络模型搭建、混淆矩阵、统计所
验证resneXt，densenet，mobilenet和SENet的特色结构 dfj77477 人工智能 python
简介图像分类对网络结构的要求，一个是精度，另一个是速度。这两个需求推动了网络结构的发展。resneXt：分组卷积，降低了网络参数个数。densenet：密集的跳连接。mobilenet：标准卷积分解成深度卷积和逐点卷积，即深度分离卷积。SENet：注意力机制。简单起见，使用了[1]的代码，注释掉layer4，作为基本框架resnet14。然后改变局部结构，验证分类效果。实验结果GPU：gtx107
基于深度学习的对抗样本生成与防御 SEU-WYL 深度学习dnn 深度学习人工智能
基于深度学习的对抗样本生成与防御是当前人工智能安全领域的关键研究方向。对抗样本是通过对输入数据进行微小扰动而产生的，能够导致深度学习模型做出错误预测。这对图像分类、自然语言处理、语音识别等应用构成了严重威胁，因此相应的防御措施也在不断发展。1.对抗样本生成对抗样本生成的方法主要有两大类：基于梯度的方法和基于优化的方法。1.1基于梯度的方法这些方法利用模型的梯度信息，通过细微的扰动来生成对抗样本，迫
【Python】成功解决TypeError: list indices must be integers or slices, not str 高斯小哥 BUG解决方案合集 python list 新手入门学习 debug
【Python】成功解决TypeError:listindicesmustbeintegersorslices,notstr欢迎进入我的个人主页，我是高斯小哥！博主档案：广东某985本硕，SCI顶刊一作，深耕深度学习多年，熟练掌握PyTorch框架。技术专长：擅长处理各类深度学习任务，包括但不限于图像分类、图像重构(去雾\去模糊\修复)、目标检测、图像分割、人脸识别、多标签分类、重识别(行人\车辆
Transformer+目标检测，这一篇入门就够了 BIT可达鸭 ▶深度学习-计算机视觉 transformer 深度学习目标检测计算机视觉自然语言处理
VisionTransformerforObjectDetection本文作者：Encoder-Decoder简介：Encoder-Decoder的缺陷：Attention机制：Self-Attention机制：Multi-HeadAttention：Transformer结构：图像分类之ViT：图像分类之PyramidViT：目标检测之DETR：目标检测之DeformableDETR：本文作者：
经典网络训练图像分类模型一三十度角阳光的问候分类数据挖掘人工智能
目录数据预处理部分：网络模块设置：网络模型保存与测试数据读取与预处理操作制作好数据源：读取标签对应的实际名字加载models中提供的模型，并且直接用训练的好权重当做初始化参数模型参数更新把模型输出层改成自己的设置哪些层需要训练优化器设置数据预处理部分：-数据增强：torchvision中transforms模块自带功能，比较实用-数据预处理：torchvision中transforms也帮我们实现
识别实验笔记和经验总结 Wils0nEdwards 笔记
1.跑对比实验之前，首先保证对比的公平性和可靠性！在进行图像分类模型对比实验时，为了确保对比的公平性和可靠性，以下几个因素需要重点考虑：数据集的一致性：数据集分割：确保训练集、验证集和测试集的划分是一致的。各模型使用相同的训练数据和测试数据。数据集大小：确保数据集的样本数量充足且具有代表性，避免数据集过小导致结果不具备普遍性。数据预处理：图像预处理方法：所有模型使用相同的预处理方法（如归一化、裁剪
[opencv]DNN图像分类 FL1623863129 opencv opencv dnn 分类
OpenCV是一个计算机视觉开源库，提供了处理图像和视频的能力。OpenCV的影响力非常大，有超过47000的社区用户，以及超过1400万次的下载量。其应用领域横跨图像处理、交互式艺术、视频监督、地图拼接和高级机器人等。作为一个有十几年历史的开源项目，OpenCV拥有广大的用户群体和开发者群体。在数字的世界中，一幅图像由多个点（像素）组成。图像处理就是对其中一个像素或者一个区域内的像素（块）进行处
快速使用transformers的pipeline实现各种深度学习任务 E寻数据 huggingface 计算机视觉 nlp 深度学习人工智能 python pipeline transformers
目录引言安装情感分析文本生成文本摘要图片分类实例分割目标检测音频分类自动语音识别视觉问答文档问题回答图文描述引言在这篇中文博客中，我们将深入探讨使用transformers库中的pipeline()函数，它为预训练模型提供了一个简单且快速的推理方法。pipeline()函数支持多种任务，包括文本分类、文本生成、摘要生成、图像分类、图像分割、对象检测、音频分类、自动语音识别、视觉问题回答、文档问题回
阿尔兹海默症-图像分类数据集数据集_深度学习分类数据挖掘人工智能 python 机器学习算法
阿尔兹海默症-图像分类数据集数据集：链接：https://pan.baidu.com/s/1gSUT74XrnHmg2Z11oZNd6A?pwd=wphh提取码：wphh数据集信息介绍：文件夹健康中的图片数量:8000文件夹早期轻度认知障碍中的图片数量:10000文件夹阿尔兹海默症中的图片数量:8000所有子文件夹中的图片总数量:26000阿尔兹海默症-图像分类数据集摘要阿尔兹海默症（Alzhei
基于深度学习的自适应架构 SEU-WYL 深度学习dnn 深度学习架构人工智能
基于深度学习的自适应架构是一种能够动态调整自身结构和参数的神经网络体系，以更好地适应不同的任务和环境需求。这类架构旨在提高模型的灵活性、效率和泛化能力，特别是在面对资源受限或任务多样化的情况下。以下是对该主题的详细介绍：1.背景与动机任务多样性：在现实世界中，模型可能需要处理各种不同的任务，如图像分类、物体检测、自然语言处理等。传统的固定架构模型往往难以在所有任务上都表现出色。资源受限环境：在边缘
[数据集][图像分类]河道污染分类数据集1923张4类别 FL1623863129 数据集分类数据挖掘人工智能
数据集类型：图像分类用，不可用于目标检测无标注文件数据集格式：仅仅包含jpg图片，每个类别文件夹下面存放着对应图片图片数量(jpg文件个数)：1922分类类别数：4类别名称:["lianghao","qingwei","yanzhong","zhongdu"]每个类别图片数：lianghao图片数：435qingwei图片数：423yanzhong图片数：577zhongdu图片数：487重要说明
线性代数在卷积神经网络（CNN）中的体现科学的N次方人工智能线性代数 cnn 人工智能
案例：深度学习中的卷积神经网络（CNN）在图像识别领域，卷积神经网络（ConvolutionalNeuralNetworks,CNN）是一个广泛应用深度学习模型，它在人脸识别、物体识别、医学图像分析等方面取得了显著成效。CNN中的核心操作——卷积，就是一个直接体现线性代数应用的例子。假设我们正在训练一个用于识别猫和狗的图像分类器，原始输入是一幅RGB彩色图片，可以将其视为一个高度、宽度和通道数（R
深入了解OpenCVSharp中常见的图像处理功能仰望大佬007 图像处理 opencv 计算机视觉 c#
深入了解OpenCVSharp中常见的图像处理功能前言1.图像加载与保存2.图像基本操作3.图像滤波4.边缘检测5.图像分割6.特征检测与描述子7.目标识别与跟踪8.图像融合与拼接9.形状匹配与模板匹配10.颜色空间转换与直方图11.图像转换与绘制12.图像分类与机器学习13.高级图像处理算法14.GPU加速与并行计算前言OpenCVSharp是C#语言中用于图像处理和计算机视觉的开源库，它提供了
[数据集][图像分类]鲜花分类数据集5735张102类别 FL1623863129 数据集计算机视觉
数据集类型：图像分类用，不可用于目标检测无标注文件数据集格式：仅仅包含jpg图片，每个类别文件夹下面存放着对应图片图片数量(jpg文件个数)：5735分类类别数：102类别名称:["0","1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20","21","22","23",
【深入了解PyTorch】PyTorch实战项目示例：深入探索图像分类、目标检测和情感分析 prince_zxill Python实战教程人工智能与机器学习教程 pytorch 分类目标检测
【深入了解PyTorch】PyTorch实战项目示例：深入探索图像分类、目标检测和情感分析PyTorch实战项目示例：深入探索图像分类、目标检测和情感分析项目一：图像分类数据集准备构建模型训练模型模型评估和预测项目二：目标检测数据集准备构建模型训练模型模型评估和预测项目三：情感分析数据集准备构建模型训练模型模型评估和预测
深度学习图像分类中，要求待分类图像中只有一类物体吗？如果这个图像中有两类物体，那么这个图像被分为哪一类？神笔馬良深度学习分类人工智能
问题描述：深度学习图像分类中，要求待分类图像中只有一类物体吗？如果这个图像中有两类物体，那么这个图像被分为哪一类？问题解答：在深度学习图像分类任务中，通常假设每张图像只包含一类物体。这是因为图像分类模型是针对特定类别的，模型训练的目标是学习如何将输入图像正确分类到这些预定义的类别中。因此，如果一张图像中包含多个类别的物体，那么根据通常的假设，该图像将被分为其中的主要类别或最突出的类别。具体来说，如
【深度学习】S2 数学基础 P6 概率论脚踏实地的大梦想家 #深度学习深度学习概率论
目录基本概率论概率论公理随机变量多个随机变量联合概率条件概率贝叶斯定理求和法则独立性期望与方差小结基本概率论机器学习本质上，就是做出预测。而概率论提供了一种量化和表达不确定性水平的方法，可以帮助我们量化对某个结果的确定性程度。在一个简单的图像分类任务中；如果我们非常确定图像中的对象是一只猫，那么我们可以说标签为“猫”的概率是1，即P(y=“猫”)=1P(y=“猫”)=1P(y=“猫”)=1;如果我
深度学习(16)--基于经典网络架构resnet训练图像分类模型 GodFishhh 深度学习深度学习 python 人工智能 pytorch
目录一.项目介绍二.项目流程详解2.1.引入所需的工具包2.2.数据读取和预处理2.3.加载resnet152模型2.4.初始化模型2.5.设置需要更新的参数2.6.训练模块设置2.7.再次训练所有层2.8.测试网络效果三.完整代码一.项目介绍使用PyTorch工具包调用经典网络架构resnet训练图像分类模型，用于分辨不同类型的花二.项目流程详解2.1.引入所需的工具包importosimpor
【AIGC】Stable Diffusion应用领域 AIGCExplore AIGC AIGC stable diffusion 人工智能
StableDiffusion是一个基于OpenAI的Diffusion模型的扩展版本，主要用于图像生成和处理任务。它并不是一个图像分类模型，而是一个生成式模型，可以生成高质量的图像。以下是StableDiffusion模型的主要功能和应用领域：图像生成：StableDiffusion可以生成各种类型的图像，包括人物肖像、风景、动物、静物等。它能够生成高分辨率、真实感和多样性的图像，具有良好的生成
ubuntu22.04@laptop OpenCV Get Started: 015_deep_learning_with_opencv_dnn_module lida2003 Linux opencv dnn 人工智能计算机视觉开源
ubuntu22.04@laptopOpenCVGetStarted:015_deep_learning_with_opencv_dnn_module1.源由2.应用Demo2.1C++应用Demo2.2Python应用Demo3.使用OpenCVDNN模块进行图像分类3.1导入模块并加载类名文本文件3.2从磁盘加载预训练DenseNet121模型3.3读取图像并准备为模型输入3.4通过模型进行前
【大厂AI课学习笔记】【2.2机器学习开发任务实例】（1）搭建一个机器学习模型 giszz 人工智能学习笔记人工智能学习笔记
今天学习的是，如何搭建一个机器学习模型。主要有以上的步骤：原始数据采集特征工程数据预处理特征提取特征转换（构造）预测识别（模型训练和测试）在实际工作中，特征比模型更重要。数据和特征的选择，已经决定了模型的天花板，模型算法只是去逼近这个上限。在上述的特征工程中：数据预处理，就是去除数据的噪声，例如文本中的错误、不再使用的词语等；特征提取，就是从原始数据中提取一些有效的特征。例如图像分类中，提取边缘、
Matlab DNN多层感知机进行图像分类——附源码分享我是狮子搏兔 Prediction matlab matlab dnn python
提示：麻烦点赞，拒绝白嫖文章目录前言一、数据来源二、训练+预测_一步到位源码1.DNN.m总结前言Python不香吗？非得用matlab来搞机器学习的东西？不是不是，matlab也有集成了许多机器学习算法，当然，都是一些非常基础的机器学习算法。深度学习还是得向python看齐。今天试用了一下matlab自带的DNN模型，封装在newff函数里，寥寥几行代码，非常简洁。提示：以下是本篇文章正文内容，
Seq2seq模型以及Beam Search 非洲小可爱自然语言处理 seq2seq bean search 贪心算法
seq2seq模型及BeamsearchSeq2Seq是一个Encoder-Deocder结构的模型，输入是一个序列，输出也是一个序列。Encoder将一个可变长度的输入序列变为固定长度的向量，Decoder将这个固定长度的向量解码成可变长度的输出序列。目标是最大化该目标函数：seq2seq模型种类onetoone结构，仅仅只是简单的给一个输入得到一个输出，此处并未体现序列的特征，例如图像分类场景
pytorch图像分类全流程(五)--图像分类算法精度评估指标已经大四了，继续努力 datawhale pytorch pytorch 分类深度学习
本次我们来学习图像分类算法精度的各种评估指标：precision、recall、accuracy、f1-score、AP、AUC。首先我们来学一个很重要的概念，混淆矩阵：1.精确率(Precision)：指的是所有被判定为正类（TP+FP）中，真实的正类（TP）占的比例。2.召回率(Recall)：指的是所有真实为正类（TP+FN）中，被判定为正类（TP）占的比例。3.准确率(accuracy)：
pytorch,cnn,rnn和yolo关系小小娱乐 pytorch cnn rnn
卷积神经网络（ConvolutionalNeuralNetworks,CNN）和YOLO（YouOnly卷积神经网络（ConvolutionalNeuralNetworks,CNN）和YOLO（YouOnlyLookOnce）都是深度学习中的重要技术，它们在处理图像数据方面有着广泛的应用。CNN是一种以卷积为核心的神经网络，被广泛用于图像分类、物体检测等任务。YOLO则是一种基于CNN的目标检测算
Java开发中，spring mvc 的线程怎么调用？小麦麦子 spring mvc
今天逛知乎，看到最近很多人都在问spring mvc 的线程http://www.maiziedu.com/course/java/ 的启动问题，觉得挺有意思的，那哥们儿问的也听仔细，下面的回答也很详尽，分享出来，希望遇对遇到类似问题的Java开发程序猿有所帮助。问题：在用spring mvc架构的网站上，设一线程在虚拟机启动时运行，线程里有一全局
maven依赖范围 bitcarter maven
1.test 测试的时候才会依赖，编译和打包不依赖，如junit不被打包 2.compile 只有编译和打包时才会依赖 3.provided 编译和测试的时候依赖，打包不依赖，如：tomcat的一些公用jar包 4.runtime 运行时依赖，编译不依赖 5.默认compile 依赖范围compile是支持传递的，test不支持传递 1.传递的意思是项目A，引用
Jaxb org.xml.sax.saxparseexception : premature end of file darrenzhu xml premature JAXB
如果在使用JAXB把xml文件unmarshal成vo(XSD自动生成的vo)时碰到如下错误： org.xml.sax.saxparseexception : premature end of file 很有可能时你直接读取文件为inputstream，然后将inputstream作为构建unmarshal需要的source参数。InputSource inputSource = new In
CSS Specificity 周凡杨 html 权重 Specificity css
有时候对于页面元素设置了样式，可为什么页面的显示没有匹配上呢？ because specificity CSS 的选择符是有权重的，当不同的选择符的样式设置有冲突时，浏览器会采用权重高的选择符设置的样式。规则： HTML标签的权重是1 Class 的权重是10 Id 的权重是100
java与servlet g21121 servlet
servlet 搞java web开发的人一定不会陌生，而且大家还会时常用到它。下面是java官方网站上对servlet的介绍： java官网对于servlet的解释写道 Java Servlet Technology Overview Servlets are the Java platform technology of choice for extending and enha
eclipse中安装maven插件 510888780 eclipse maven
1.首先去官网下载 Maven： http://www.apache.org/dyn/closer.cgi/maven/binaries/apache-maven-3.2.3-bin.tar.gz 下载完成之后将其解压，我将解压后的文件夹：apache-maven-3.2.3，并将它放在 D:\tools目录下，即 maven 最终的路径是：D:\tools\apache-mave
jpa@OneToOne关联关系布衣凌宇 jpa
Nruser里的pruserid关联到Pruser的主键id，实现对一个表的增删改，另一个表的数据随之增删改。 Nruser实体类 //***************************************************************** @Entity @Table(name="nruser") @DynamicInsert @Dynam
我的spring学习笔记11-Spring中关于声明式事务的配置 aijuans spring 事务配置
这两天学到事务管理这一块，结合到之前的terasoluna框架，觉得书本上讲的还是简单阿。我就把我从书本上学到的再结合实际的项目以及网上看到的一些内容，对声明式事务管理做个整理吧。我看得Spring in Action第二版中只提到了用TransactionProxyFactoryBean和<tx:advice/>,定义注释驱动这三种，我承认后两种的内容很好，很强大。但是实际的项目当中
java 动态代理简单实现 antlove java handler proxy dynamic service
dynamicproxy.service.HelloService package dynamicproxy.service; public interface HelloService { public void sayHello(); } dynamicproxy.service.impl.HelloServiceImpl package dynamicp
JDBC连接数据库百合不是茶 JDBC编程 JAVA操作oracle数据库
如果我们要想连接oracle公司的数据库，就要首先下载oralce公司的驱动程序，将这个驱动程序的jar包导入到我们工程中; JDBC链接数据库的代码和固定写法; 1,加载oracle数据库的驱动; &nb
单例模式中的多线程分析 bijian1013 java thread 多线程 java多线程
谈到单例模式，我们立马会想到饿汉式和懒汉式加载，所谓饿汉式就是在创建类时就创建好了实例，懒汉式在获取实例时才去创建实例，即延迟加载。饿汉式： package com.bijian.study; public class Singleton { private Singleton() { } // 注意这是private 只供内部调用 private static
javascript读取和修改原型特别需要注意原型的读写不具有对等性 bijian1013 JavaScript prototype
对于从原型对象继承而来的成员，其读和写具有内在的不对等性。比如有一个对象A，假设它的原型对象是B，B的原型对象是null。如果我们需要读取A对象的name属性值，那么JS会优先在A中查找，如果找到了name属性那么就返回；如果A中没有name属性，那么就到原型B中查找name，如果找到了就返回；如果原型B中也没有
【持久化框架MyBatis3六】MyBatis3集成第三方DataSource bit1129 dataSource
MyBatis内置了数据源的支持，如： <environments default="development"> <environment id="development"> <transactionManager type="JDBC" /> <data
我程序中用到的urldecode和base64decode,MD5 bitcarter c MD5 base64decode urldecode
这里是base64decode和urldecode，Md5在附件中。因为我是在后台所以需要解码： string Base64Decode(const char* Data,int DataByte,int& OutByte) { //解码表 const char DecodeTable[] = { 0, 0, 0, 0, 0, 0
腾讯资深运维专家周小军：QQ与微信架构的惊天秘密 ronin47
社交领域一直是互联网创业的大热门，从PC到移动端，从OICQ、MSN到QQ。到了移动互联网时代，社交领域应用开始彻底爆发，直奔黄金期。腾讯在过去几年里，社交平台更是火到爆，QQ和微信坐拥几亿的粉丝，QQ空间和朋友圈各种刷屏，写心得，晒照片，秀视频，那么谁来为企鹅保驾护航呢？支撑QQ和微信海量数据背后的架构又有哪些惊天内幕呢？本期大讲堂的内容来自今年2月份ChinaUnix对腾讯社交网络运营服务中心
java-69-旋转数组的最小元素。把一个数组最开始的若干个元素搬到数组的末尾，我们称之为数组的旋转。输入一个排好序的数组的一个旋转，输出旋转数组的最小元素 bylijinnan java
public class MinOfShiftedArray { /** * Q69 旋转数组的最小元素 * 把一个数组最开始的若干个元素搬到数组的末尾，我们称之为数组的旋转。输入一个排好序的数组的一个旋转，输出旋转数组的最小元素。 * 例如数组{3, 4, 5, 1, 2}为{1, 2, 3, 4, 5}的一个旋转，该数组的最小值为1。 */ publ
看博客，应该是有方向的 Cb123456 反省看博客
看博客，应该是有方向的: 我现在就复习以前的，在补补以前不会的，现在还不会的，同时完善完善项目，也看看别人的博客. 我刚突然想到的: 1.应该看计算机组成原理，数据结构，一些算法，还有关于android,java的。 2.对于我，也快大四了，看一些职业规划的，以及一些学习的经验，看看别人的工作总结的. 为什么要写
[开源与商业]做开源项目的人生活上一定要朴素,尽量减少对官方和商业体系的依赖 comsci 开源项目
为什么这样说呢？因为科学和技术的发展有时候需要一个平缓和长期的积累过程，但是行政和商业体系本身充满各种不稳定性和不确定性，如果你希望长期从事某个科研项目，但是却又必须依赖于某种行政和商业体系，那其中的过程必定充满各种风险。。。所以，为避免这种不确定性风险，我
一个 sql优化（[精华] 一个查询优化的分析调整全过程！很值得一看） cwqcwqmax9 sql
见 http://www.itpub.net/forum.php?mod=viewthread&tid=239011 Web翻页优化实例提交时间: 2004-6-18 15:37:49 回复发消息环境： Linux ve
Hibernat and Ibatis dashuaifu Hibernate ibatis
Hibernate VS iBATIS 简介 Hibernate 是当前最流行的O/R mapping框架，当前版本是3.05。它出身于sf.net，现在已经成为Jboss的一部分了 iBATIS 是另外一种优秀的O/R mapping框架，当前版本是2.0。目前属于apache的一个子项目了。相对Hibernate“O/R”而言，iBATIS 是一种“Sql Mappi
备份MYSQL脚本 dcj3sjt126com mysql
#!/bin/sh # this shell to backup mysql #[email protected] (QQ:1413161683 DuChengJiu) _dbDir=/var/lib/mysql/ _today=`date +%w` _bakDir=/usr/backup/$_today [ ! -d $_bakDir ] && mkdir -p
iOS第三方开源库的吐槽和备忘 dcj3sjt126com ios
转自 ibireme的博客做iOS开发总会接触到一些第三方库，这里整理一下，做一些吐槽。目前比较活跃的社区仍旧是Github，除此以外也有一些不错的库散落在Google Code、SourceForge等地方。由于Github社区太过主流，这里主要介绍一下Github里面流行的iOS库。首先整理了一份 Github上排名靠
html wlwmanifest.xml eoems html xml
所谓优化wp_head()就是把从wp_head中移除不需要元素，同时也可以加快速度。步骤：加入到function.php remove_action('wp_head', 'wp_generator'); //wp-generator移除wordpress的版本号，本身blog的版本号没什么意义，但是如果让恶意玩家看到，可能会用官网公布的漏洞攻击blog remov
浅谈Java定时器发展 hacksin java 并发 timer 定时器
java在jdk1.3中推出了定时器类Timer,而后在jdk1.5后由Dou Lea从新开发出了支持多线程的ScheduleThreadPoolExecutor，从后者的表现来看，可以考虑完全替代Timer了。 Timer与ScheduleThreadPoolExecutor对比： 1. Timer始于jdk1.3,其原理是利用一个TimerTask数组当作队列
移动端页面侧边导航滑入效果 ini jquery Web html5 css javascirpt
效果体验：http://hovertree.com/texiao/mobile/2.htm可以使用移动设备浏览器查看效果。效果使用到jquery-2.1.4.min.js，该版本的jQuery库是用于支持HTML5的浏览器上，不再兼容IE8以前的浏览器，现在移动端浏览器一般都支持HTML5，所以使用该jQuery没问题。HTML文件代码： <!DOCTYPE html> <h
AspectJ+Javasist记录日志 kane_xie aspectj javasist
在项目中碰到这样一个需求，对一个服务类的每一个方法，在方法开始和结束的时候分别记录一条日志，内容包括方法名，参数名+参数值以及方法执行的时间。 @Override public String get(String key) { // long start = System.currentTimeMillis(); // System.out.println("Be
redis学习笔记 MJC410621 redis NoSQL
1)nosql数据库主要由以下特点：非关系型的、分布式的、开源的、水平可扩展的。 1，处理超大量的数据 2，运行在便宜的PC服务器集群上， 3，击碎了性能瓶颈。 1)对数据高并发读写。 2)对海量数据的高效率存储和访问。 3)对数据的高扩展性和高可用性。 redis支持的类型： Sring 类型 set name lijie get name lijie set na
使用redis实现分布式锁 qifeifei
在多节点的系统中，如何实现分布式锁机制，其中用redis来实现是很好的方法之一，我们先来看一下jedis包中，有个类名BinaryJedis,它有个方法如下： public Long setnx(final byte[] key, final byte[] value) { checkIsInMulti(); client.setnx(key, value); ret
BI并非万能，中层业务管理报表要另辟蹊径张老师的菜大数据 BI 商业智能信息化
BI是商业智能的缩写，是可以帮助企业做出明智的业务经营决策的工具，其数据来源于各个业务系统，如ERP、CRM、SCM、进销存、HER、OA等。 BI系统不同于传统的管理信息系统，他号称是一个整体应用的解决方案，是融入管理思想的强大系统：有着系统整体的设计思想，支持对所有
安装rvm后出现rvm not a function 或者ruby -v后提示没安装ruby的问题 wudixiaotie function
1.在~/.bashrc最后加入 [[ -s "$HOME/.rvm/scripts/rvm" ]] && source "$HOME/.rvm/scripts/rvm" 2.重新启动terminal输入： rvm use ruby-2.2.1 --default 把当前安装的ruby版本设为默