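Problem:
On the number line,
[0, 100) is class 1
[100, 500) is class 2
[500, 1000] is class 3
The goal is to build a neural network (NN) that performs this classification.
Imports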
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
import numpy as np
Data generation (written the quick way, since I am not very fluent in Python yet)
def get_rand(low=0, high=1, size=(1, 1)):
    # Uniform random samples in [low, high); size is an int or a shape tuple
    if isinstance(size, int):
        size = (size,)
    return low + (high - low) * np.random.rand(*size)
data_len = 1000
data_x = get_rand(low=1, high=1000, size=(data_len, 1))
#np.random.randint(3, size=[10, 1])
data_y = np.zeros(data_len, dtype=int)  # integer class index for each sample
for i in range(data_len):
    if data_x[i, 0] < 100:
        data_y[i] = 0
    elif data_x[i, 0] < 500:
        data_y[i] = 1
    else:
        data_y[i] = 2
train_x = np.array(data_x)
train_y = keras.utils.to_categorical(data_y, num_classes=3)
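The labelling loop above can also be written without an explicit Python loop. The following is a minimal sketch, assuming the class boundaries 100 and 500 from the problem statement. `label_points` and `one_hot` are hypothetical helper names that do not appear in the original code, and indexing into `np.eye` is used here as a NumPy-only stand-in for `keras.utils.to_categorical`.

```python
import numpy as np

def label_points(x, boundaries=(100, 500)):
    """Map each value to a class index: <100 -> 0, <500 -> 1, otherwise 2."""
    # np.digitize counts, for each value, how many boundaries are <= it
    return np.digitize(np.ravel(x), bins=boundaries)

def one_hot(labels, num_classes=3):
    """One-hot encode integer labels (NumPy stand-in for to_categorical)."""
    return np.eye(num_classes, dtype="float32")[labels]

x = np.array([[5.0], [99.9], [100.0], [499.0], [500.0], [999.0]])
y = label_points(x)
print(y)           # class index per sample
print(one_hot(y))  # one row per sample, one column per class
```

Values exactly on a boundary (100 or 500) fall into the upper class, which matches the `<` comparisons in the loop.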
Model and training 1:
Uses the plain (non-generator) fit, feeding the training arrays above directly.
batch_size=100
model = Sequential()
model.add(Dense(100, input_dim=1, activation='sigmoid'))
model.add(Dropout(0.2))
model.add(Dense(3, activation='sigmoid'))
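#model.add(Dense(3, activation='softmax'))
# Note: softmax is the usual output activation with categorical_crossentropy for a multi-class problem.
# For a binary problem, the last layer uses sigmoid and the loss is binary_crossentropy.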
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x=train_x, y=train_y, epochs=50, batch_size=batch_size )
Model and training 2:
Uses the generator-based fit_generator, which needs a generator function.
def generate_datas(batch_size=100):
    # Floor division: if data_len is not a multiple of batch_size, the leftover samples are never yielded.
    batches = data_len // batch_size
    # Yield batches forever; one pass of this loop covers one epoch when steps_per_epoch equals batches.
    while True:
        for i in range(batches):
            tmp_x = train_x[i*batch_size : (i+1)*batch_size]
            tmp_y = train_y[i*batch_size : (i+1)*batch_size]
            #print('\ni=', i, '\tx=', tmp_x.shape, '\ty=', tmp_y.shape)
            yield (tmp_x, tmp_y)
batch_size = 100
model = Sequential()
model.add(Dense(100, input_dim=1, activation='sigmoid'))
model.add(Dropout(0.2))
#model.add(Dense(1, activation='sigmoid'))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit_generator(generator=generate_datas(batch_size=batch_size), epochs=50, steps_per_epoch=data_len//batch_size )
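# steps_per_epoch is the number of batches drawn from the generator before an epoch is counted as finished.
# Here data_len//batch_size = 10, so each epoch sees every sample exactly once.
# Note: newer Keras releases deprecate fit_generator; model.fit accepts a generator directly.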
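To make the relationship between the generator and steps_per_epoch concrete, here is a small NumPy-only sketch. It is not the generator used for the run in this post: `batch_generator` is a hypothetical variant that also yields the final partial batch and reshuffles on every pass, and the 250-sample arrays are stand-in data chosen so that the last batch is smaller than the others.

```python
import math
import numpy as np

def batch_generator(x, y, batch_size=100, shuffle=True, seed=0):
    """Yield (x, y) batches forever; the last batch of a pass may be smaller."""
    rng = np.random.default_rng(seed)
    n = len(x)
    while True:  # each iteration of this loop is one full pass over the data
        order = rng.permutation(n) if shuffle else np.arange(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]  # slicing past the end is safe in NumPy
            yield x[idx], y[idx]

# Stand-in data: 250 samples, so one pass yields batches of 100, 100 and 50
x = np.arange(250, dtype="float32").reshape(-1, 1)
y = np.zeros((250, 3), dtype="float32")
steps_per_epoch = math.ceil(len(x) / 100)  # batches needed to see every sample once

gen = batch_generator(x, y, batch_size=100)
sizes = [len(next(gen)[0]) for _ in range(steps_per_epoch)]
print(steps_per_epoch, sizes)  # 3 [100, 100, 50]
```

Using `math.ceil` instead of floor division means no sample is skipped when the data length is not a multiple of the batch size.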
Results:
Without a generator:
Epoch 40/50
1000/1000 [==============================] - 0s 203us/step - loss: 0.2436 - acc: 0.9270
Epoch 41/50
1000/1000 [==============================] - 0s 78us/step - loss: 0.2358 - acc: 0.9220
Epoch 42/50
1000/1000 [==============================] - 0s 47us/step - loss: 0.2239 - acc: 0.9370
Epoch 43/50
1000/1000 [==============================] - 0s 94us/step - loss: 0.2213 - acc: 0.9380
Epoch 44/50
1000/1000 [==============================] - 0s 109us/step - loss: 0.2158 - acc: 0.9320
Epoch 45/50
1000/1000 [==============================] - 0s 203us/step - loss: 0.2113 - acc: 0.9410
Epoch 46/50
1000/1000 [==============================] - 0s 78us/step - loss: 0.2117 - acc: 0.9390
Epoch 47/50
1000/1000 [==============================] - 0s 78us/step - loss: 0.2226 - acc: 0.9160
Epoch 48/50
1000/1000 [==============================] - 0s 47us/step - loss: 0.2191 - acc: 0.9170
Epoch 49/50
1000/1000 [==============================] - 0s 125us/step - loss: 0.1989 - acc: 0.9450
Epoch 50/50
1000/1000 [==============================] - 0s 63us/step - loss: 0.2013 - acc: 0.9370
With a generator:
Epoch 40/50
10/10 [==============================] - 0s 42ms/step - loss: 0.2319 - acc: 0.9350
Epoch 41/50
10/10 [==============================] - 0s 6ms/step - loss: 0.2224 - acc: 0.9410
Epoch 42/50
10/10 [==============================] - 0s 5ms/step - loss: 0.2272 - acc: 0.9380
Epoch 43/50
10/10 [==============================] - 0s 5ms/step - loss: 0.2243 - acc: 0.9380
Epoch 44/50
10/10 [==============================] - 0s 14ms/step - loss: 0.2259 - acc: 0.9350
Epoch 45/50
10/10 [==============================] - 0s 14ms/step - loss: 0.2237 - acc: 0.9260
Epoch 46/50
10/10 [==============================] - 0s 5ms/step - loss: 0.2220 - acc: 0.9320
Epoch 47/50
10/10 [==============================] - 0s 5ms/step - loss: 0.2134 - acc: 0.9450
Epoch 48/50
10/10 [==============================] - 0s 9ms/step - loss: 0.2220 - acc: 0.9280
Epoch 49/50
10/10 [==============================] - 0s 5ms/step - loss: 0.2199 - acc: 0.9300
Epoch 50/50
10/10 [==============================] - 0s 3ms/step - loss: 0.1961 - acc: 0.9420
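Both approaches converge at about the same rate: after 50 epochs the loss is around 0.20 and the accuracy around 0.94 in each case. The reason to fit with a generator is to avoid problems such as running out of memory when the data set is too large to hold at once.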
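In this post the generator slices arrays that are already in memory, so it does not reduce memory use by itself. The benefit appears when each batch is created or loaded on demand. The sketch below illustrates that idea under stated assumptions: `synthetic_batches` is a hypothetical generator that samples fresh points and labels them per batch, so only one batch exists at a time. It uses NumPy only and the same class boundaries (100 and 500) as the problem statement.

```python
import numpy as np

def synthetic_batches(batch_size=100, low=1.0, high=1000.0, seed=0):
    """Yield freshly generated (x, one-hot y) batches; nothing is stored between batches."""
    rng = np.random.default_rng(seed)
    while True:
        x = rng.uniform(low, high, size=(batch_size, 1))
        labels = np.digitize(x[:, 0], bins=(100, 500))  # 0, 1 or 2
        y = np.eye(3, dtype="float32")[labels]          # one-hot encoding
        yield x, y

gen = synthetic_batches(batch_size=4)
batch_x, batch_y = next(gen)
print(batch_x.shape, batch_y.shape)  # (4, 1) (4, 3)
```

A generator of this shape could be passed to the training call in place of generate_datas, with steps_per_epoch chosen freely because there is no fixed data set size.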