TensorFlow 2.0 uses tf.keras as its core API, which greatly lowers the barrier to getting started. 2.0 mainly cleans up deprecated and duplicated APIs, and models can be built easily with Keras and eager execution.
pip install tensorflow==2.0.0-beta0  # preview release
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])
Before the model is ready for training, it needs a few more settings. These are added during the model's compile step:
Loss function — measures how accurate the model is during training. We want to minimize this function to "steer" the model toward the correct fit.
Optimizer — how the model is updated based on the data it sees and its loss function.
Metrics — used to monitor the training and testing steps. The following example uses accuracy, the fraction of images that are classified correctly.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5)
test_loss, test_acc = model.evaluate(test_images, test_labels)
predictions = model.predict(test_images)
The best way to prevent overfitting is to use more training data. A model trained on more data will naturally generalize better. When that isn't possible, the next-best solution is to use techniques like regularization. These place constraints on the quantity and type of information the model can store. If a network can only afford to memorize a small number of patterns, the optimization process will force it to focus on the most prominent patterns, which have a better chance of generalizing well.
l2_model = keras.models.Sequential([
    keras.layers.Dense(16, kernel_regularizer=keras.regularizers.l2(0.001),
                       activation=tf.nn.relu, input_shape=(NUM_WORDS,)),
    keras.layers.Dense(16, kernel_regularizer=keras.regularizers.l2(0.001),
                       activation=tf.nn.relu),
    keras.layers.Dense(1, activation=tf.nn.sigmoid)
])
dpt_model = keras.models.Sequential([
    keras.layers.Dense(16, activation=tf.nn.relu, input_shape=(NUM_WORDS,)),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(16, activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1, activation=tf.nn.sigmoid)
])
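To see the effect of these constraints, the regularized models can be trained the same way as a baseline and their validation losses compared. A minimal sketch, assuming multi-hot encoded review data of shape (num_reviews, NUM_WORDS) with binary labels, matching the input_shape=(NUM_WORDS,) above; the variable names are illustrative, not from the original post:
# Sketch: compile and fit the L2-regularized model so its validation loss
# can be compared against the dropout model trained the same way.
l2_model.compile(optimizer='adam',
                 loss='binary_crossentropy',
                 metrics=['accuracy'])
l2_history = l2_model.fit(train_data, train_labels,
                          epochs=20, batch_size=512,
                          validation_data=(test_data, test_labels),
                          verbose=2)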
Checkpoints can be saved automatically during and at the end of training. This lets you use a trained model without having to retrain it, or pick up training where you left off in case the training process was interrupted.
import os

checkpoint_path = "training_1/cp.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
# Create checkpoint callback
cp_callback = tf.keras.callbacks.ModelCheckpoint(checkpoint_path,
                                                 save_weights_only=True,
                                                 verbose=1)
model = create_model()
model.fit(train_images, train_labels, epochs=10,
          validation_data=(test_images, test_labels),
          callbacks=[cp_callback])  # pass callback to training
#------------------Manual save-----------------#
# Save the weights manually
model.save_weights('./checkpoints/my_checkpoint')
# Restore the weights
model = create_model()
model.load_weights('./checkpoints/my_checkpoint')
loss, acc = model.evaluate(test_images, test_labels)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))
You can also save the entire model to a single file that contains the weight values, the model's configuration, and even the optimizer's configuration. This lets you checkpoint a model and resume training later from exactly the same state, without access to the original code.
model = create_model()
model.fit(train_images, train_labels, epochs=5)
# Save entire model to a HDF5 file
model.save('my_model.h5')
#----------------USE--------------#
# Recreate the exact same model, including weights and optimizer.
new_model = keras.models.load_model('my_model.h5')
new_model.summary()
model.load_weights(checkpoint_path)  # the recreated model must have the same architecture as the original
loss,acc = model.evaluate(test_images, test_labels)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))
# Load the latest checkpoint weights
latest = tf.train.latest_checkpoint(checkpoint_dir)
model.load_weights(latest)
loss, acc = model.evaluate(test_images, test_labels)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))
Eager execution lets users run TensorFlow code without building a static graph.
import tensorflow as tf
# Eager execution is enabled by default in TensorFlow 2.0; in TensorFlow 1.x it
# had to be switched on explicitly with tf.compat.v1.enable_eager_execution().
print(tf.square(2) + tf.square(3))
TensorFlow differs from NumPy in that TensorFlow can be accelerated with GPUs and TPUs. TensorFlow automatically converts ndarrays to Tensors, and a Tensor can also be converted back to a NumPy array.
import numpy as np
ndarray = np.ones([3, 3])
print("TensorFlow operations convert numpy arrays to Tensors automatically")
tensor = tf.multiply(ndarray, 42)
print(tensor)
print("And NumPy operations convert Tensors to numpy arrays automatically")
print(np.add(tensor, 1))
print("The .numpy() method explicitly converts a Tensor to a numpy array")
print(tensor.numpy())
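As a quick sketch of the GPU/TPU acceleration mentioned above (assuming a TensorFlow 2.0 runtime; not part of the original post), you can list the visible GPUs and check which device a tensor was placed on:
# Sketch: list visible GPUs and inspect automatic device placement.
print("Visible GPUs:", tf.config.experimental.list_physical_devices('GPU'))
x = tf.random.uniform([1000, 1000])
print("Tensor placed on:", x.device)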
Automatic differentiation:
x = tf.ones((2, 2))
with tf.GradientTape() as t:
    t.watch(x)
    y = tf.reduce_sum(x)
    z = tf.multiply(y, y)
# Derivative of z with respect to the original input tensor x
dz_dx = t.gradient(z, x)
for i in [0, 1]:
    for j in [0, 1]:
        assert dz_dx[i][j].numpy() == 8.0
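GradientTape can also be nested to take higher-order derivatives; a small sketch, not part of the original post:
# Sketch: second derivative of y = x^3 at x = 1 via nested tapes.
x = tf.Variable(1.0)
with tf.GradientTape() as t2:
    with tf.GradientTape() as t1:
        y = x * x * x
    dy_dx = t1.gradient(y, x)      # 3 * x^2 = 3.0
d2y_dx2 = t2.gradient(dy_dx, x)    # 6 * x   = 6.0
assert dy_dx.numpy() == 3.0
assert d2y_dx2.numpy() == 6.0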
The training data has shape (60000, 28, 28), the test data has shape (10000, 28, 28), and the labels are integers from 0 to 9.
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
print(tf.__version__)
## Get Train Data
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
print('train_image_shape {}'.format(train_images.shape))
print('train_labels_shape {}'.format(train_labels.shape))
print('test_image_shape {}'.format(test_images.shape))
print('test_labels_shape {}'.format(test_labels.shape))
train_images = train_images / 255.0
test_images = test_images / 255.0
#----------Plot 25 Train Picture------------#
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# for i in range(25):
#     plt.subplot(5,5,i+1)
#     plt.xticks([])  # e.g. plt.xticks([-1,0,1],['-1','0','1']): first argument gives the x-axis positions, the second the labels to display
#     plt.yticks([])
#     plt.imshow(train_images[i])
#     plt.xlabel(class_names[train_labels[i]])
# plt.show()
#--------------Setting Model-------------------#
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=10)
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('\nTest accuracy', test_acc)
predictions = model.predict(test_images)
print('prediction [0] {}'.format(predictions[0]))
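Each prediction is a 10-way probability vector, so the predicted class is simply the index with the highest probability. A quick check, not in the original code:
# Sketch: compare the most confident prediction with the true label for the first test image.
print('predicted label:', np.argmax(predictions[0]))
print('true label:', test_labels[0])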
#----------------Plot Image------------------#
def plot_image(i, predictions_array, true_label, img):
    predictions_array, true_label, img = predictions_array[i], true_label[i], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    plt.imshow(img, cmap=plt.cm.binary)
    predicted_label = np.argmax(predictions_array)
    color = 'blue' if predicted_label == true_label else 'red'
    plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                         100 * np.max(predictions_array),
                                         class_names[true_label]), color=color)
def plot_value_array(i, predictions_array, true_label):
    predictions_array, true_label = predictions_array[i], true_label[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    thisplot = plt.bar(range(10), predictions_array, color="#777777")
    plt.ylim([0, 1])
    predicted_label = np.argmax(predictions_array)
    thisplot[predicted_label].set_color('red')
    thisplot[true_label].set_color('blue')
# Plot the first X test images, their predicted labels, and the true labels.
# Correct predictions are shown in blue, incorrect predictions in red.
num_rows = 5
num_cols = 3
num_images = num_rows*num_cols
plt.figure(figsize=(2*2*num_cols, 2*num_rows))
for i in range(num_images):
    plt.subplot(num_rows, 2*num_cols, 2*i+1)
    plot_image(i, predictions, test_labels, test_images)
    plt.subplot(num_rows, 2*num_cols, 2*i+2)
    plot_value_array(i, predictions, test_labels)
plt.show()
import tensorflow as tf
from tensorflow import keras
# Download the dataset. train_data and test_data each have shape (25000,). The reviews have already been converted to integers, where each integer stands for a specific word in a dictionary, and the reviews have different lengths.
imdb = keras.datasets.imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
The reviews (arrays of integers) must be converted to tensors before they can be fed into the neural network. This conversion can be done in either of two ways:
One-hot encode the arrays, turning them into vectors of 0s and 1s. For example, the sequence [3, 5] becomes a 10,000-dimensional vector that is all zeros except for indices 3 and 5, which are ones. This then becomes the first layer of the network, a Dense layer that can handle floating-point vector data. This approach is memory intensive, however, requiring a matrix of size num_words * num_reviews.
Alternatively, we can pad the arrays so they all have the same length, then create an integer tensor of shape max_length * num_reviews. The first layer of the network can then be an Embedding layer that handles this shape.
In this tutorial we use the second approach.
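For reference, a minimal sketch of the first (one-hot / multi-hot) approach, which this tutorial does not use; the helper name is illustrative, not from the original post:
# Sketch: multi-hot encode the integer sequences into a (num_reviews, num_words) matrix.
import numpy as np

def multi_hot_sequences(sequences, dimension=10000):
    results = np.zeros((len(sequences), dimension))
    for i, word_indices in enumerate(sequences):
        results[i, word_indices] = 1.0  # set the positions of the words that occur to 1
    return results

# train_data_multi_hot = multi_hot_sequences(train_data, dimension=10000)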
# A dictionary mapping words to an integer index
word_index = imdb.get_word_index()
# The first indices are reserved
word_index = {k: (v + 3) for k, v in word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2  # unknown
word_index["<UNUSED>"] = 3
train_data = keras.preprocessing.sequence.pad_sequences(train_data,
                                                        value=word_index["<PAD>"],
                                                        padding='post',
                                                        maxlen=256)
test_data = keras.preprocessing.sequence.pad_sequences(test_data,
                                                       value=word_index["<PAD>"],
                                                       padding='post',
                                                       maxlen=256)
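To sanity-check the mapping, the word index can be reversed so an encoded review can be turned back into text. A small sketch, assuming the reserved tokens defined above; not from the original post:
# Sketch: map integers back to words and decode the first training review.
reverse_word_index = dict([(value, key) for (key, value) in word_index.items()])

def decode_review(text):
    return ' '.join([reverse_word_index.get(i, '?') for i in text])

print(decode_review(train_data[0]))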
vocab_size = 10000
Stack the layers in order to build the classifier:
model = keras.Sequential()
model.add(keras.layers.Embedding(vocab_size, 16))
model.add(keras.layers.GlobalAveragePooling1D())
model.add(keras.layers.Dense(16, activation=tf.nn.relu))
model.add(keras.layers.Dense(1, activation=tf.nn.sigmoid))
model.summary()
model.compile(optimizer=tf.optimizers.Adam(),
              loss='binary_crossentropy',
              metrics=['accuracy'])
x_val = train_data[:10000]
partial_x_train = train_data[10000:]
y_val = train_labels[:10000]
partial_y_train = train_labels[10000:]
history = model.fit(partial_x_train,
                    partial_y_train,
                    epochs=40,
                    batch_size=512,
                    validation_data=(x_val, y_val),
                    verbose=1)
results = model.evaluate(test_data, test_labels)
print('results: {}'.format(results))
history_dict = history.history
print(history_dict.keys())
import matplotlib.pyplot as plt
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)
# "bo" is for "blue dot"
plt.plot(epochs, loss, 'bo', label='Training loss')
# b is for "solid blue line"
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
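The same history object also records accuracy, so a matching accuracy curve can be plotted; a short sketch reusing the acc/val_acc arrays extracted above (not in the original post):
# Sketch: plot training and validation accuracy over epochs.
plt.clf()   # clear the previous (loss) figure
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()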
My WeChat official account: 深度学习与先进智能决策 (Deep Learning and Advanced Intelligent Decision-Making)
WeChat ID: MultiAgent1024
About the account: it covers reinforcement learning, computer vision, deep learning, machine learning and related topics, sharing notes and takeaways from my studies. Your follow is welcome, and I look forward to learning and improving together!