RNN循环神经网络

用RNN(循环神经网络)实现连续数据的预测(以股票预测为例)

回顾卷积神经网络

卷积就是特征提取器，就是C(卷积)B(批标准化)A(激活)P(池化)D(随机丢弃)

卷积神经网络：借助卷积核提取空间特征后，送入全连接网络，实现离散数据的分类

循环神经网络

循环核

参数时间共享，循环层提取时间信息

前向传播时：记忆体内存储的状态信息ht，在每个时刻都被刷新，三个参数矩阵wxh、whh、why自始自终都是固定不变的。

反向传播时：三个参数矩阵wxh、whh、why被梯度下降法更新

循环核时间步展开

按照时间步展开，就是把循环核按照时间轴方向展开，每个时刻记忆体状态信息ht被更新，记忆体周围的参数矩阵wxh、whh和why是固定不变的，我们训练优化的就是这些参数矩阵，训练完成后，使用效果最好的参数矩阵，执行前向传播，输出预测结果，其实，这和我们人类的预测是一致的，你脑中的记忆体都会根据当前的输入而更新。当前的预测推理，是根据你以往的知识积累，用固化下来的参数矩阵进行的推理判断

循环神经网络：就是借助循环核提取时间特征后，再将提取到的时间特征信息送入全连接网络，实现连续数据的预测。

是整个循环网络的末层，从公式来看，就是一个全连接网络，借助全连接网络，实现连续数据预测

循环计算层

每个循环核构成一层循环计算层，循环计算层的层数是向输出方向增长的，每个循环计算层中的每个每个循环核中记忆体的个数是根据你的需求任意指定的

TF描述循环计算层(带大家计算循环计算层的前向传播)

tf.keras.layers.SimpleRNN(记忆体个数,activation='激活函数',return_sequences=是否每个时刻输出ht到下一层)

activation='激活函数'(不写，默认使用tanh)

return_sequences=True 各时间步输出ht

return_sequences=False 仅最后时间步输出ht(默认)

例：SimpleRNN(3,return_sequences=True)

此api对于送入循环层的数据维度是有要求的，要求送入循环层的数据是三维的。第一维是送入样本的总数量，第二维是循环核按时间步展开的步数，第三维是每个时间步输入特征的个数

循环计算过程

字母预测:输入a预测出b，输入b预测出c，输入c预测出d，输入d预测出e，输入e预测出a

使用独热码对a、b、c、d、e编码为10000、01000、00100、00010、00001

实践：ABCDE字母预测

One-hot
Embedding

实践：股票预测

RNN
LSTM
GRU

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, SimpleRNN
import matplotlib.pyplot as plt
import os

input_word = "abcde"
w_to_id = {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4}  # 单词映射到数值id的词典
id_to_onehot = {0: [1., 0., 0., 0., 0.], 1: [0., 1., 0., 0., 0.], 2: [0., 0., 1., 0., 0.], 3: [0., 0., 0., 1., 0.],
                4: [0., 0., 0., 0., 1.]}  # id编码为one-hot

x_train = [id_to_onehot[w_to_id['a']], id_to_onehot[w_to_id['b']], id_to_onehot[w_to_id['c']],
           id_to_onehot[w_to_id['d']], id_to_onehot[w_to_id['e']]]
y_train = [w_to_id['b'], w_to_id['c'], w_to_id['d'], w_to_id['e'], w_to_id['a']]

np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)

# 使x_train符合SimpleRNN输入要求：[送入样本数， 循环核时间展开步数， 每个时间步输入特征个数]。
# 此处整个数据集送入，送入样本数为len(x_train)；输入1个字母出结果，循环核时间展开步数为1; 表示为独热码有5个输入特征，每个时间步输入特征个数为5
x_train = np.reshape(x_train, (len(x_train), 1, 5))
y_train = np.array(y_train)

model = tf.keras.Sequential([
    SimpleRNN(3),
    Dense(5, activation='softmax')
])

model.compile(optimizer=tf.keras.optimizers.Adam(0.01),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/rnn_onehot_1pre1.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True,
                                                 monitor='loss')  # 由于fit没有给出测试集，不计算测试集准确率，根据loss，保存最优模型

history = model.fit(x_train, y_train, batch_size=32, epochs=100, callbacks=[cp_callback])

model.summary()

# print(model.trainable_variables)
file = open('./weights.txt', 'w')  # 参数提取
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

###############################################    show   ###############################################

# 显示训练集和验证集的acc和loss曲线
acc = history.history['sparse_categorical_accuracy']
loss = history.history['loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.title('Training Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.title('Training Loss')
plt.legend()
plt.show()

############### predict #############

preNum = int(input("input the number of test alphabet:"))
for i in range(preNum):
    alphabet1 = input("input test alphabet:")
    alphabet = [id_to_onehot[w_to_id[alphabet1]]]
    # 使alphabet符合SimpleRNN输入要求：[送入样本数， 循环核时间展开步数， 每个时间步输入特征个数]。此处验证效果送入了1个样本，送入样本数为1；输入1个字母出结果，所以循环核时间展开步数为1; 表示为独热码有5个输入特征，每个时间步输入特征个数为5
    alphabet = np.reshape(alphabet, (1, 1, 5))
    result = model.predict(alphabet)
    pred = tf.argmax(result, axis=1)
    pred = int(pred)
    tf.print(alphabet1 + '->' + input_word[pred])

-------------load the model-----------------
Epoch 1/100
1/1 [==============================] - 0s 65ms/step - loss: 0.0397 - sparse_categorical_accuracy: 1.0000
Epoch 2/100
1/1 [==============================] - 0s 57ms/step - loss: 0.0395 - sparse_categorical_accuracy: 1.0000
Epoch 3/100
1/1 [==============================] - 0s 56ms/step - loss: 0.0393 - sparse_categorical_accuracy: 1.0000
Epoch 4/100
1/1 [==============================] - 0s 53ms/step - loss: 0.0391 - sparse_categorical_accuracy: 1.0000
Epoch 95/100
1/1 [==============================] - 0s 59ms/step - loss: 0.0251 - sparse_categorical_accuracy: 1.0000
Epoch 96/100
1/1 [==============================] - 0s 56ms/step - loss: 0.0250 - sparse_categorical_accuracy: 1.0000
Epoch 97/100
1/1 [==============================] - 0s 57ms/step - loss: 0.0249 - sparse_categorical_accuracy: 1.0000
Epoch 98/100
1/1 [==============================] - 0s 56ms/step - loss: 0.0248 - sparse_categorical_accuracy: 1.0000
Epoch 99/100
1/1 [==============================] - 0s 51ms/step - loss: 0.0247 - sparse_categorical_accuracy: 1.0000
Epoch 100/100
1/1 [==============================] - 0s 64ms/step - loss: 0.0246 - sparse_categorical_accuracy: 1.0000
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
simple_rnn (SimpleRNN)       multiple                  27        
_________________________________________________________________
dense (Dense)                multiple                  20        
=================================================================
Total params: 47
Trainable params: 47
Non-trainable params: 0
_________________________________________________________________

output_1_1.png

input the number of test alphabet:5
input test alphabet:a
a->b
input test alphabet:b
b->c
input test alphabet:d
d->e
input test alphabet:c
c->d
input test alphabet:a
a->b

循环计算过程

输入abcd输出e，输入bcde输出a，输入cdea输出b，输入deab输出c，输入eabc输出d

连续输入四个字母，预测下一个字母为例描述循环核按时间展开后，循环计算过程。
下面为代码

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, SimpleRNN
import matplotlib.pyplot as plt
import os

input_word = "abcde"
w_to_id = {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4}  # 单词映射到数值id的词典
id_to_onehot = {0: [1., 0., 0., 0., 0.], 1: [0., 1., 0., 0., 0.], 2: [0., 0., 1., 0., 0.], 3: [0., 0., 0., 1., 0.],
                4: [0., 0., 0., 0., 1.]}  # id编码为one-hot

x_train = [
    [id_to_onehot[w_to_id['a']], id_to_onehot[w_to_id['b']], id_to_onehot[w_to_id['c']], id_to_onehot[w_to_id['d']]],
    [id_to_onehot[w_to_id['b']], id_to_onehot[w_to_id['c']], id_to_onehot[w_to_id['d']], id_to_onehot[w_to_id['e']]],
    [id_to_onehot[w_to_id['c']], id_to_onehot[w_to_id['d']], id_to_onehot[w_to_id['e']], id_to_onehot[w_to_id['a']]],
    [id_to_onehot[w_to_id['d']], id_to_onehot[w_to_id['e']], id_to_onehot[w_to_id['a']], id_to_onehot[w_to_id['b']]],
    [id_to_onehot[w_to_id['e']], id_to_onehot[w_to_id['a']], id_to_onehot[w_to_id['b']], id_to_onehot[w_to_id['c']]],
]
y_train = [w_to_id['e'], w_to_id['a'], w_to_id['b'], w_to_id['c'], w_to_id['d']]

np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)

# 使x_train符合SimpleRNN输入要求：[送入样本数， 循环核时间展开步数， 每个时间步输入特征个数]。
# 此处整个数据集送入，送入样本数为len(x_train)；输入4个字母出结果，循环核时间展开步数为4; 表示为独热码有5个输入特征，每个时间步输入特征个数为5
x_train = np.reshape(x_train, (len(x_train), 4, 5))
y_train = np.array(y_train)

model = tf.keras.Sequential([
    SimpleRNN(3),
    Dense(5, activation='softmax')
])

model.compile(optimizer=tf.keras.optimizers.Adam(0.01),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/rnn_onehot_4pre1.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True,
                                                 monitor='loss')  # 由于fit没有给出测试集，不计算测试集准确率，根据loss，保存最优模型

history = model.fit(x_train, y_train, batch_size=32, epochs=100, callbacks=[cp_callback])

model.summary()

# print(model.trainable_variables)
file = open('./weights.txt', 'w')  # 参数提取
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

###############################################    show   ###############################################

# 显示训练集和验证集的acc和loss曲线
acc = history.history['sparse_categorical_accuracy']
loss = history.history['loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.title('Training Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.title('Training Loss')
plt.legend()
plt.show()

############### predict #############

preNum = int(input("input the number of test alphabet:"))
for i in range(preNum):
    alphabet1 = input("input test alphabet:")
    alphabet = [id_to_onehot[w_to_id[a]] for a in alphabet1]
    # 使alphabet符合SimpleRNN输入要求：[送入样本数， 循环核时间展开步数， 每个时间步输入特征个数]。此处验证效果送入了1个样本，送入样本数为1；输入4个字母出结果，所以循环核时间展开步数为4; 表示为独热码有5个输入特征，每个时间步输入特征个数为5
    alphabet = np.reshape(alphabet, (1, 4, 5))
    result = model.predict(alphabet)
    pred = tf.argmax(result, axis=1)
    pred = int(pred)
    tf.print(alphabet1 + '->' + input_word[pred])

-------------load the model-----------------
Epoch 1/100
1/1 [==============================] - 0s 65ms/step - loss: 0.2530 - sparse_categorical_accuracy: 1.0000
Epoch 2/100
1/1 [==============================] - 0s 53ms/step - loss: 0.2490 - sparse_categorical_accuracy: 1.0000
Epoch 3/100
1/1 [==============================] - 0s 51ms/step - loss: 0.2451 - sparse_categorical_accuracy: 1.0000
Epoch 4/100
1/1 [==============================] - 0s 57ms/step - loss: 0.2412 - sparse_categorical_accuracy: 1.0000
Epoch 5/100
1/1 [==============================] - 0s 65ms/step - loss: 0.2375 - sparse_categorical_accuracy: 1.0000
Epoch 96/100
1/1 [==============================] - 0s 75ms/step - loss: 0.0873 - sparse_categorical_accuracy: 1.0000
Epoch 97/100
1/1 [==============================] - 0s 59ms/step - loss: 0.0866 - sparse_categorical_accuracy: 1.0000
Epoch 98/100
1/1 [==============================] - 0s 55ms/step - loss: 0.0859 - sparse_categorical_accuracy: 1.0000
Epoch 99/100
1/1 [==============================] - 0s 57ms/step - loss: 0.0852 - sparse_categorical_accuracy: 1.0000
Epoch 100/100
1/1 [==============================] - 0s 56ms/step - loss: 0.0846 - sparse_categorical_accuracy: 1.0000
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
simple_rnn_2 (SimpleRNN)     multiple                  27        
_________________________________________________________________
dense_2 (Dense)              multiple                  20        
=================================================================
Total params: 47
Trainable params: 47
Non-trainable params: 0
_________________________________________________________________

output_3_1.png

input the number of test alphabet:5
input test alphabet:abcd
abcd->e
input test alphabet:bcda
bcda->b
input test alphabet:cdab
cdab->c
input test alphabet:dabc
dabc->d
input test alphabet:eabc
eabc->d

Embedding--一种编码方法

独热码：数据量大过于稀疏，映射之间是独立的，没有表现出关联性

Embedding：是一种单词编码方法，用低维向量实现了编码。这种编码通过神经网络训练优化，能表达出单词间的相关性

tf.keras.layers.Embedding(词汇表大小，编码维度)

编码维度就是用几个数字表达一个单词，对1-100进行编码，[4]编码为[0.25,0.1,0.11]

例：tf.keras.layers.Embedding(100,3)

入Embedding时，x_train维度：[送入样本数，循环核时间展开步数]

# Embedding编码实现输入一个字母预测一个字母例子
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, SimpleRNN, Embedding
import matplotlib.pyplot as plt
import os

input_word = "abcde"
w_to_id = {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4}  # 单词映射到数值id的词典

x_train = [w_to_id['a'], w_to_id['b'], w_to_id['c'], w_to_id['d'], w_to_id['e']]
y_train = [w_to_id['b'], w_to_id['c'], w_to_id['d'], w_to_id['e'], w_to_id['a']]

np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)

# 使x_train符合Embedding输入要求：[送入样本数， 循环核时间展开步数] ，
# 此处整个数据集送入所以送入，送入样本数为len(x_train)；输入1个字母出结果，循环核时间展开步数为1。
x_train = np.reshape(x_train, (len(x_train), 1))
y_train = np.array(y_train)

model = tf.keras.Sequential([
    Embedding(5, 2),
    SimpleRNN(3),
    Dense(5, activation='softmax')
])

model.compile(optimizer=tf.keras.optimizers.Adam(0.01),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/run_embedding_1pre1.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True,
                                                 monitor='loss')  # 由于fit没有给出测试集，不计算测试集准确率，根据loss，保存最优模型

history = model.fit(x_train, y_train, batch_size=32, epochs=100, callbacks=[cp_callback])

model.summary()

# print(model.trainable_variables)
file = open('./weights.txt', 'w')  # 参数提取
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

###############################################    show   ###############################################

# 显示训练集和验证集的acc和loss曲线
acc = history.history['sparse_categorical_accuracy']
loss = history.history['loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.title('Training Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.title('Training Loss')
plt.legend()
plt.show()

############### predict #############

preNum = int(input("input the number of test alphabet:"))
for i in range(preNum):
    alphabet1 = input("input test alphabet:")
    alphabet = [w_to_id[alphabet1]]
    # 使alphabet符合Embedding输入要求：[送入样本数， 循环核时间展开步数]。
    # 此处验证效果送入了1个样本，送入样本数为1；输入1个字母出结果，循环核时间展开步数为1。
    alphabet = np.reshape(alphabet, (1, 1))
    result = model.predict(alphabet)
    pred = tf.argmax(result, axis=1)
    pred = int(pred)
    tf.print(alphabet1 + '->' + input_word[pred])

-------------load the model-----------------
Epoch 1/100
1/1 [==============================] - 0s 59ms/step - loss: 0.5450 - sparse_categorical_accuracy: 1.0000
Epoch 2/100
1/1 [==============================] - 0s 55ms/step - loss: 0.5380 - sparse_categorical_accuracy: 1.0000
Epoch 3/100
1/1 [==============================] - 0s 48ms/step - loss: 0.5312 - sparse_categorical_accuracy: 1.0000
Epoch 96/100
1/1 [==============================] - 0s 57ms/step - loss: 0.2340 - sparse_categorical_accuracy: 1.0000
Epoch 97/100
1/1 [==============================] - 0s 58ms/step - loss: 0.2324 - sparse_categorical_accuracy: 1.0000
Epoch 98/100
1/1 [==============================] - 0s 54ms/step - loss: 0.2308 - sparse_categorical_accuracy: 1.0000
Epoch 99/100
1/1 [==============================] - 0s 70ms/step - loss: 0.2292 - sparse_categorical_accuracy: 1.0000
Epoch 100/100
1/1 [==============================] - 0s 61ms/step - loss: 0.2275 - sparse_categorical_accuracy: 1.0000
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding (Embedding)        (None, None, 2)           10        
_________________________________________________________________
simple_rnn (SimpleRNN)       (None, 3)                 18        
_________________________________________________________________
dense (Dense)                (None, 5)                 20        
=================================================================
Total params: 48
Trainable params: 48
Non-trainable params: 0
_________________________________________________________________

output_5_1.png

input the number of test alphabet:5
input test alphabet:a
a->b
input test alphabet:c
c->d
input test alphabet:d
d->e
input test alphabet:e
e->a
input test alphabet:b
b->c

# Embedding编码实现输入四个字母预测一个字母例子
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, SimpleRNN, Embedding
import matplotlib.pyplot as plt
import os

input_word = "abcdefghijklmnopqrstuvwxyz"
w_to_id = {'a': 0, 'b': 1, 'c': 2, 'd': 3, 'e': 4,
           'f': 5, 'g': 6, 'h': 7, 'i': 8, 'j': 9,
           'k': 10, 'l': 11, 'm': 12, 'n': 13, 'o': 14,
           'p': 15, 'q': 16, 'r': 17, 's': 18, 't': 19,
           'u': 20, 'v': 21, 'w': 22, 'x': 23, 'y': 24, 'z': 25}  # 单词映射到数值id的词典

training_set_scaled = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
                       11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
                       21, 22, 23, 24, 25]

x_train = []
y_train = []

for i in range(4, 26):
    x_train.append(training_set_scaled[i - 4:i])
    y_train.append(training_set_scaled[i])

np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)

# 使x_train符合Embedding输入要求：[送入样本数， 循环核时间展开步数] ，
# 此处整个数据集送入所以送入，送入样本数为len(x_train)；输入4个字母出结果，循环核时间展开步数为4。
x_train = np.reshape(x_train, (len(x_train), 4))
y_train = np.array(y_train)

model = tf.keras.Sequential([
    Embedding(26, 2),
    SimpleRNN(10),
    Dense(26, activation='softmax')
])

model.compile(optimizer=tf.keras.optimizers.Adam(0.01),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/rnn_embedding_4pre1.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True,
                                                 monitor='loss')  # 由于fit没有给出测试集，不计算测试集准确率，根据loss，保存最优模型

history = model.fit(x_train, y_train, batch_size=32, epochs=100, callbacks=[cp_callback])

model.summary()

file = open('./weights.txt', 'w')  # 参数提取
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

###############################################    show   ###############################################

# 显示训练集和验证集的acc和loss曲线
acc = history.history['sparse_categorical_accuracy']
loss = history.history['loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.title('Training Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.title('Training Loss')
plt.legend()
plt.show()

################# predict ##################

preNum = int(input("input the number of test alphabet:"))
for i in range(preNum):
    alphabet1 = input("input test alphabet:")
    alphabet = [w_to_id[a] for a in alphabet1]
    # 使alphabet符合Embedding输入要求：[送入样本数， 时间展开步数]。
    # 此处验证效果送入了1个样本，送入样本数为1；输入4个字母出结果，循环核时间展开步数为4。
    alphabet = np.reshape(alphabet, (1, 4))
    result = model.predict([alphabet])
    pred = tf.argmax(result, axis=1)
    pred = int(pred)
    tf.print(alphabet1 + '->' + input_word[pred])

Epoch 1/100
1/1 [==============================] - 0s 60ms/step - loss: 3.2579 - sparse_categorical_accuracy: 0.0455
Epoch 2/100
1/1 [==============================] - 0s 56ms/step - loss: 3.2364 - sparse_categorical_accuracy: 0.0455
Epoch 3/100
1/1 [==============================] - 0s 51ms/step - loss: 3.2139 - sparse_categorical_accuracy: 0.0909
Epoch 4/100
1/1 [==============================] - 0s 54ms/step - loss: 3.1891 - sparse_categorical_accuracy: 0.0455
Epoch 5/100
1/1 [==============================] - 0s 53ms/step - loss: 3.1619 - sparse_categorical_accuracy: 0.0455
Epoch 96/100
1/1 [==============================] - 0s 80ms/step - loss: 0.2158 - sparse_categorical_accuracy: 1.0000
Epoch 97/100
1/1 [==============================] - 0s 74ms/step - loss: 0.2103 - sparse_categorical_accuracy: 1.0000
Epoch 98/100
1/1 [==============================] - 0s 66ms/step - loss: 0.2050 - sparse_categorical_accuracy: 1.0000
Epoch 99/100
1/1 [==============================] - 0s 74ms/step - loss: 0.1998 - sparse_categorical_accuracy: 1.0000
Epoch 100/100
1/1 [==============================] - 0s 66ms/step - loss: 0.1948 - sparse_categorical_accuracy: 1.0000
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, None, 2)           52        
_________________________________________________________________
simple_rnn_1 (SimpleRNN)     (None, 10)                130       
_________________________________________________________________
dense_1 (Dense)              (None, 26)                286       
=================================================================
Total params: 468
Trainable params: 468
Non-trainable params: 0
_________________________________________________________________

output_6_1.png

input the number of test alphabet:3
input test alphabet:abcd
abcd->e
input test alphabet:opqr
opqr->s
input test alphabet:wxyz
wxyz->o

RNN实现股票预测(使用60天的开盘价预测第61天的开盘价)

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dropout, Dense, SimpleRNN
import matplotlib.pyplot as plt
import os
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import math

maotai = pd.read_csv('./SH600519.csv')  # 读取股票文件

training_set = maotai.iloc[0:2426 - 300, 2:3].values  # 前(2426-300=2126)天的开盘价作为训练集,表格从0开始计数，2:3 是提取[2:3)列，前闭后开,故提取出C列开盘价
test_set = maotai.iloc[2426 - 300:, 2:3].values  # 后300天的开盘价作为测试集

# 归一化
sc = MinMaxScaler(feature_range=(0, 1))  # 定义归一化：归一化到(0，1)之间
training_set_scaled = sc.fit_transform(training_set)  # 求得训练集的最大值，最小值这些训练集固有的属性，并在训练集上进行归一化
test_set = sc.transform(test_set)  # 利用训练集的属性对测试集进行归一化

x_train = []
y_train = []

x_test = []
y_test = []

# 测试集：csv表格中前2426-300=2126天数据
# 利用for循环，遍历整个训练集，提取训练集中连续60天的开盘价作为输入特征x_train，第61天的数据作为标签，for循环共构建2426-300-60=2066组数据。
for i in range(60, len(training_set_scaled)):
    x_train.append(training_set_scaled[i - 60:i, 0])
    y_train.append(training_set_scaled[i, 0])
# 对训练集进行打乱
np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)
# 将训练集由list格式变为array格式
x_train, y_train = np.array(x_train), np.array(y_train)

# 使x_train符合RNN输入要求：[送入样本数， 循环核时间展开步数， 每个时间步输入特征个数]。
# 此处整个数据集送入，送入样本数为x_train.shape[0]即2066组数据；输入60个开盘价，预测出第61天的开盘价，循环核时间展开步数为60; 每个时间步送入的特征是某一天的开盘价，只有1个数据，故每个时间步输入特征个数为1
x_train = np.reshape(x_train, (x_train.shape[0], 60, 1))
# 测试集：csv表格中后300天数据
# 利用for循环，遍历整个测试集，提取测试集中连续60天的开盘价作为输入特征x_train，第61天的数据作为标签，for循环共构建300-60=240组数据。
for i in range(60, len(test_set)):
    x_test.append(test_set[i - 60:i, 0])
    y_test.append(test_set[i, 0])
# 测试集变array并reshape为符合RNN输入要求：[送入样本数， 循环核时间展开步数， 每个时间步输入特征个数]
x_test, y_test = np.array(x_test), np.array(y_test)
x_test = np.reshape(x_test, (x_test.shape[0], 60, 1))

model = tf.keras.Sequential([
    SimpleRNN(80, return_sequences=True),
    Dropout(0.2),
    SimpleRNN(100),
    Dropout(0.2),
    Dense(1)
])

model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss='mean_squared_error')  # 损失函数用均方误差
# 该应用只观测loss数值，不观测准确率，所以删去metrics选项，一会在每个epoch迭代显示时只显示loss值

checkpoint_save_path = "./checkpoint/rnn_stock.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True,
                                                 monitor='val_loss')

history = model.fit(x_train, y_train, batch_size=64, epochs=50, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])

model.summary()

file = open('./weights.txt', 'w')  # 参数提取
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

loss = history.history['loss']
val_loss = history.history['val_loss']

plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()

################## predict ######################
# 测试集输入模型进行预测
predicted_stock_price = model.predict(x_test)
# 对预测数据还原---从（0，1）反归一化到原始范围
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
# 对真实数据还原---从（0，1）反归一化到原始范围
real_stock_price = sc.inverse_transform(test_set[60:])
# 画出真实数据和预测数据的对比曲线
plt.plot(real_stock_price, color='red', label='MaoTai Stock Price')
plt.plot(predicted_stock_price, color='blue', label='Predicted MaoTai Stock Price')
plt.title('MaoTai Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('MaoTai Stock Price')
plt.legend()
plt.show()

##########evaluate##############
# calculate MSE 均方误差 ---> E[(预测值-真实值)^2] (预测值减真实值求平方后求均值)
mse = mean_squared_error(predicted_stock_price, real_stock_price)
# calculate RMSE 均方根误差--->sqrt[MSE]    (对均方误差开方)
rmse = math.sqrt(mean_squared_error(predicted_stock_price, real_stock_price))
# calculate MAE 平均绝对误差----->E[|预测值-真实值|](预测值减真实值求绝对值后求均值）
mae = mean_absolute_error(predicted_stock_price, real_stock_price)
print('均方误差: %.6f' % mse)
print('均方根误差: %.6f' % rmse)
print('平均绝对误差: %.6f' % mae)

Epoch 1/50
33/33 [==============================] - 3s 82ms/step - loss: 0.1162 - val_loss: 0.0434
Epoch 2/50
33/33 [==============================] - 2s 75ms/step - loss: 0.0262 - val_loss: 0.0044
Epoch 46/50
33/33 [==============================] - 3s 106ms/step - loss: 0.0012 - val_loss: 0.0011
Epoch 47/50
33/33 [==============================] - 3s 98ms/step - loss: 0.0013 - val_loss: 0.0098
Epoch 48/50
33/33 [==============================] - 3s 105ms/step - loss: 0.0013 - val_loss: 0.0010
Epoch 49/50
33/33 [==============================] - 3s 99ms/step - loss: 0.0012 - val_loss: 0.0042
Epoch 50/50
33/33 [==============================] - 3s 100ms/step - loss: 0.0011 - val_loss: 0.0081
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
simple_rnn_2 (SimpleRNN)     multiple                  6560      
_________________________________________________________________
dropout (Dropout)            multiple                  0         
_________________________________________________________________
simple_rnn_3 (SimpleRNN)     multiple                  18100     
_________________________________________________________________
dropout_1 (Dropout)          multiple                  0         
_________________________________________________________________
dense_2 (Dense)              multiple                  101       
=================================================================
Total params: 24,761
Trainable params: 24,761
Non-trainable params: 0
_________________________________________________________________

output_8_1.png

output_8_2.png

均方误差: 4064.391444
均方根误差: 63.752580
平均绝对误差: 59.361189

用LSTM实现股票预测

传统循环网络RNN可以通过记忆体实现短期记忆进行连续数据的预测，但是当连续数据的序列变长时，会使展开时间步过长。在反向传播更新参数时，梯度要按照时间步连续相乘，会导致梯度消失。所以在1997年Hochreitere等人提出了长短期记忆网络LSTM

LSTM计算过程

长短期记忆网络引入了三个门限，输入门

、遗忘门

、输出门

，引入了表征长期记忆的细胞态

，引入了等待存入长期记忆的候选态

输入门(门限)：

遗忘门(门限)：

输出门(门限)：

细胞态(长期记忆)：

记忆体(短期记忆)：

候选态(归纳出的新知识)：

TF描述LSTM层

tf.keras.layers.LSTM(记忆体个数,return_sequences=是否返回输出)

return_sequences=True 各时间步输出ht

return_sequences=False 仅最后时间步输出ht(默认)

举个例子

model=tf.keras.Sequential([
    LSTM(80,return_sequences=True),
    Dropout(0.2),
    LSTM(100),
    Dropout(0.2),
    Dense(1)
])

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dropout, Dense, LSTM
import matplotlib.pyplot as plt
import os
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import math

maotai = pd.read_csv('./SH600519.csv')  # 读取股票文件

training_set = maotai.iloc[0:2426 - 300, 2:3].values  # 前(2426-300=2126)天的开盘价作为训练集,表格从0开始计数，2:3 是提取[2:3)列，前闭后开,故提取出C列开盘价
test_set = maotai.iloc[2426 - 300:, 2:3].values  # 后300天的开盘价作为测试集

# 归一化
sc = MinMaxScaler(feature_range=(0, 1))  # 定义归一化：归一化到(0，1)之间
training_set_scaled = sc.fit_transform(training_set)  # 求得训练集的最大值，最小值这些训练集固有的属性，并在训练集上进行归一化
test_set = sc.transform(test_set)  # 利用训练集的属性对测试集进行归一化

x_train = []
y_train = []

x_test = []
y_test = []

# 测试集：csv表格中前2426-300=2126天数据
# 利用for循环，遍历整个训练集，提取训练集中连续60天的开盘价作为输入特征x_train，第61天的数据作为标签，for循环共构建2426-300-60=2066组数据。
for i in range(60, len(training_set_scaled)):
    x_train.append(training_set_scaled[i - 60:i, 0])
    y_train.append(training_set_scaled[i, 0])
# 对训练集进行打乱
np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)
# 将训练集由list格式变为array格式
x_train, y_train = np.array(x_train), np.array(y_train)

# 使x_train符合RNN输入要求：[送入样本数， 循环核时间展开步数， 每个时间步输入特征个数]。
# 此处整个数据集送入，送入样本数为x_train.shape[0]即2066组数据；输入60个开盘价，预测出第61天的开盘价，循环核时间展开步数为60; 每个时间步送入的特征是某一天的开盘价，只有1个数据，故每个时间步输入特征个数为1
x_train = np.reshape(x_train, (x_train.shape[0], 60, 1))
# 测试集：csv表格中后300天数据
# 利用for循环，遍历整个测试集，提取测试集中连续60天的开盘价作为输入特征x_train，第61天的数据作为标签，for循环共构建300-60=240组数据。
for i in range(60, len(test_set)):
    x_test.append(test_set[i - 60:i, 0])
    y_test.append(test_set[i, 0])
# 测试集变array并reshape为符合RNN输入要求：[送入样本数， 循环核时间展开步数， 每个时间步输入特征个数]
x_test, y_test = np.array(x_test), np.array(y_test)
x_test = np.reshape(x_test, (x_test.shape[0], 60, 1))

model = tf.keras.Sequential([
    LSTM(80, return_sequences=True),
    Dropout(0.2),
    LSTM(100),
    Dropout(0.2),
    Dense(1)
])

model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss='mean_squared_error')  # 损失函数用均方误差
# 该应用只观测loss数值，不观测准确率，所以删去metrics选项，一会在每个epoch迭代显示时只显示loss值

checkpoint_save_path = "./checkpoint/LSTM_stock.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True,
                                                 monitor='val_loss')

history = model.fit(x_train, y_train, batch_size=64, epochs=50, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])

model.summary()

file = open('./weights.txt', 'w')  # 参数提取
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

loss = history.history['loss']
val_loss = history.history['val_loss']

plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()

################## predict ######################
# 测试集输入模型进行预测
predicted_stock_price = model.predict(x_test)
# 对预测数据还原---从（0，1）反归一化到原始范围
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
# 对真实数据还原---从（0，1）反归一化到原始范围
real_stock_price = sc.inverse_transform(test_set[60:])
# 画出真实数据和预测数据的对比曲线
plt.plot(real_stock_price, color='red', label='MaoTai Stock Price')
plt.plot(predicted_stock_price, color='blue', label='Predicted MaoTai Stock Price')
plt.title('MaoTai Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('MaoTai Stock Price')
plt.legend()
plt.show()

##########evaluate##############
# calculate MSE 均方误差 ---> E[(预测值-真实值)^2] (预测值减真实值求平方后求均值)
mse = mean_squared_error(predicted_stock_price, real_stock_price)
# calculate RMSE 均方根误差--->sqrt[MSE]    (对均方误差开方)
rmse = math.sqrt(mean_squared_error(predicted_stock_price, real_stock_price))
# calculate MAE 平均绝对误差----->E[|预测值-真实值|](预测值减真实值求绝对值后求均值）
mae = mean_absolute_error(predicted_stock_price, real_stock_price)
print('均方误差: %.6f' % mse)
print('均方根误差: %.6f' % rmse)
print('平均绝对误差: %.6f' % mae)

Epoch 1/50
33/33 [==============================] - 1s 38ms/step - loss: 0.0135 - val_loss: 0.0493
Epoch 2/50
33/33 [==============================] - 1s 25ms/step - loss: 0.0013 - val_loss: 0.0140
Epoch 3/50
33/33 [==============================] - 1s 25ms/step - loss: 0.0011 - val_loss: 0.0070
Epoch 4/50
33/33 [==============================] - 1s 22ms/step - loss: 0.0013 - val_loss: 0.0204
Epoch 5/50
33/33 [==============================] - 1s 26ms/step - loss: 0.0011 - val_loss: 0.0066
Epoch 49/50
33/33 [==============================] - 1s 22ms/step - loss: 6.8864e-04 - val_loss: 0.0059
Epoch 50/50
33/33 [==============================] - 1s 22ms/step - loss: 5.6762e-04 - val_loss: 0.0061
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm (LSTM)                  multiple                  26240     
_________________________________________________________________
dropout_2 (Dropout)          multiple                  0         
_________________________________________________________________
lstm_1 (LSTM)                multiple                  72400     
_________________________________________________________________
dropout_3 (Dropout)          multiple                  0         
_________________________________________________________________
dense_3 (Dense)              multiple                  101       
=================================================================
Total params: 98,741
Trainable params: 98,741
Non-trainable params: 0
_________________________________________________________________

![[图片上传中...(output_12_1.png-a5d73f-1601466475068-0)]

output_12_2.png

](https://upload-images.jianshu.io/upload_images/4469078-a2c3276490817af9.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

output_10_2.png

均方误差: 3084.908334
均方根误差: 55.541951
平均绝对误差: 49.045563

用GRU实现股票预测

在2014年cho等人简化了LSTM结构，提出了GRU网络，GRU使记忆体ht融合了长期记忆和短期记忆

更新门：

重置门：

记忆体：

候选隐藏层：

TF描述GRU层

tf.keras.layers.GRU(记忆体个数，return_sequences=是否返回输出)

return_sequences=True 各时间步输出ht,一般最后一层用false，中间层用true