TensorFlow 2.x: Building a Regression Model

Building a Regression Model

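Regression is a predictive modeling technique that studies the relationship between a dependent variable (the target) and one or more independent variables (the predictors). It is commonly used for predictive analysis, time-series modeling, and investigating causal relationships between variables. Examples include the relationship between reckless driving and the number of road traffic accidents, and house price prediction.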
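To make the idea concrete, here is a minimal sketch of the simplest kind of regression, a straight-line least-squares fit with NumPy. The toy data (y = 2x + 1) is invented for illustration and is not part of the housing dataset used in the rest of this post.

```python
import numpy as np

# Toy data, invented for illustration: the target depends linearly on x (y = 2x + 1).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # independent variable (predictor)
y = 2.0 * x + 1.0                         # dependent variable (target)

# Fit a degree-1 polynomial by least squares; coefficients are returned highest degree first.
slope, intercept = np.polyfit(x, y, deg=1)

# Use the fitted relationship to predict the target for an unseen input.
prediction = slope * 6.0 + intercept
print(slope, intercept, prediction)
```

The neural network built below does the same job, learning a mapping from predictors to a target, but with eight input features and a non-linear hidden layer.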

Code example:

import matplotlib as mpl
import matplotlib.pyplot as plt 
%matplotlib inline    
# Render figures inline in the notebook
import numpy as np
import sklearn   
import pandas as pd 
import os 
import sys 
import time 
import tensorflow as tf 
from tensorflow import keras 
from sklearn.datasets import fetch_california_housing  # California housing dataset bundled with sklearn

housing = fetch_california_housing()
print(housing.DESCR)
print(housing.data.shape)
print(housing.target.shape)

(20640, 8)
(20640,)

import pprint
# Use pprint to show a few rows; it prints nested data structures in a more readable layout
pprint.pprint(housing.data[0:5])
pprint.pprint(housing.target[0:5])
array([[ 8.32520000e+00,  4.10000000e+01,  6.98412698e+00,
         1.02380952e+00,  3.22000000e+02,  2.55555556e+00,
         3.78800000e+01, -1.22230000e+02],
       [ 8.30140000e+00,  2.10000000e+01,  6.23813708e+00,
         9.71880492e-01,  2.40100000e+03,  2.10984183e+00,
         3.78600000e+01, -1.22220000e+02],
       [ 7.25740000e+00,  5.20000000e+01,  8.28813559e+00,
         1.07344633e+00,  4.96000000e+02,  2.80225989e+00,
         3.78500000e+01, -1.22240000e+02],
       [ 5.64310000e+00,  5.20000000e+01,  5.81735160e+00,
         1.07305936e+00,  5.58000000e+02,  2.54794521e+00,
         3.78500000e+01, -1.22250000e+02],
       [ 3.84620000e+00,  5.20000000e+01,  6.28185328e+00,
         1.08108108e+00,  5.65000000e+02,  2.18146718e+00,
         3.78500000e+01, -1.22250000e+02]])
array([4.526, 3.585, 3.521, 3.413, 3.422])

# Use train_test_split to split the dataset.
# test_size controls the split ratio; the default is 0.25, i.e. train:test = 3:1.
from sklearn.model_selection import train_test_split  

# Split the loaded dataset into a training set and a test set
x_train_all, x_test, y_train_all, y_test = train_test_split(housing.data, housing.target, random_state = 1) 

# Split the training set once more, into a training set and a validation set
x_train, x_valid, y_train, y_valid = train_test_split(x_train_all, y_train_all, random_state=2)

print(x_train.shape, y_train.shape)
print(x_valid.shape, y_valid.shape)
print(x_test.shape, y_test.shape)

(11610, 8) (11610,)
(3870, 8) (3870,)
(5160, 8) (5160,)
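The shapes above follow from the default test_size of 0.25 applied twice. The arithmetic can be sketched in plain Python. The helper name split_sizes is hypothetical, and the rounding rule (test part rounded up) is my understanding of how scikit-learn treats a fractional test_size; for these sizes the products are whole numbers, so rounding does not matter.

```python
import math

def split_sizes(n_samples, test_fraction=0.25):
    """Return (n_train, n_test) for a fractional split (hypothetical helper)."""
    n_test = math.ceil(n_samples * test_fraction)  # held-out part, rounded up
    return n_samples - n_test, n_test

n_train_all, n_test = split_sizes(20640)      # first split: train_all vs. test
n_train, n_valid = split_sizes(n_train_all)   # second split: train vs. validation
print(n_train, n_valid, n_test)               # 11610 3870 5160
```

So the final proportions are roughly 56% training, 19% validation, and 25% test.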

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
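# Standardize the features (zero mean, unit variance per column).

# StandardScaler expects 2-D input. The housing features are already 2-D
# (shape [None, 8]), so no reshape is needed here; image data would first have
# to be flattened, e.g. x_train: [None, 28, 28] -> [None, 784].
# fit_transform learns the mean and standard deviation from the training set;
# transform reuses them for the validation and test sets. For more on
# fit_transform vs. transform, see my other TensorFlow blog posts.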
x_train_scaled = scaler.fit_transform(x_train)
x_valid_scaled = scaler.transform(x_valid)
x_test_scaled = scaler.transform(x_test)
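The reason for calling fit_transform on the training set and only transform on the other two is that the scaling statistics should come from the training data alone. Below is a sketch of the underlying computation in NumPy; it is not scikit-learn's actual implementation, and the tiny arrays and the names x_train_demo and x_valid_demo are made up for illustration.

```python
import numpy as np

# Tiny made-up feature column, for illustration only.
x_train_demo = np.array([[1.0], [2.0], [3.0]])
x_valid_demo = np.array([[2.0], [4.0]])

# "fit": learn per-column statistics from the training data only.
mean = x_train_demo.mean(axis=0)
std = x_train_demo.std(axis=0)  # population standard deviation (ddof=0)

# "transform": apply the same training statistics to every split.
x_train_demo_scaled = (x_train_demo - mean) / std
x_valid_demo_scaled = (x_valid_demo - mean) / std

print(x_train_demo_scaled.ravel())
print(x_valid_demo_scaled.ravel())
```

The scaled training column has mean 0 and standard deviation 1, while the validation column generally does not, which is expected because it is scaled with the training statistics.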

# Note: once the data has been standardized, use the scaled arrays from here on.
# Build the network with the Sequential model: tf.keras.models.Sequential()

model = keras.models.Sequential([
    # keras.layers.Flatten(input_shape=x_train.shape[1:]),  # not needed: the data is already flat
    keras.layers.Dense(30, activation="relu",input_shape = x_train.shape[1:]),
    keras.layers.Dense(1),
])
model.summary()
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_6 (Dense)              (None, 30)                270       
_________________________________________________________________
dense_7 (Dense)              (None, 1)                 31        
=================================================================
Total params: 301
Trainable params: 301
Non-trainable params: 0
_________________________________________________________________
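The parameter counts in the summary can be checked by hand: a Dense layer has one weight per (input, unit) pair plus one bias per unit. A small sketch, where dense_params is a hypothetical helper and not a Keras function:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus biases (n_units) of a fully connected layer."""
    return n_inputs * n_units + n_units

hidden = dense_params(8, 30)   # 8 input features -> 30 hidden units
output = dense_params(30, 1)   # 30 hidden units -> 1 regression output
print(hidden, output, hidden + output)  # 270 31 301
```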
# Compile the model
model.compile(loss="mean_squared_error",  # loss function: mean squared error (MSE)
              optimizer="sgd",            # optimizer: stochastic gradient descent
              )
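For reference, mean squared error is the average of the squared differences between targets and predictions; no square root is taken. A minimal sketch in plain Python, with made-up numbers and a hypothetical helper name mse:

```python
def mse(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)

# Made-up targets and predictions: errors are 0.5 and -1.0.
loss_value = mse([3.0, 2.0], [2.5, 3.0])
print(loss_value)  # 0.625
```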


# fit() trains the model and returns a History object, stored here in history
history = model.fit(x_train_scaled, y_train, epochs=50, 
                    validation_data=(x_valid_scaled, y_valid), 
                    ) 
Train on 11610 samples, validate on 3870 samples
Epoch 1/50
11610/11610 [==============================] - 2s 167us/sample - loss: 0.3132 - val_loss: 0.3154
Epoch 2/50
11610/11610 [==============================] - 1s 106us/sample - loss: 0.3109 - val_loss: 0.3166
Epoch 3/50
11610/11610 [==============================] - 1s 97us/sample - loss: 0.3138 - val_loss: 0.3184
Epoch 4/50
11610/11610 [==============================] - 1s 101us/sample - loss: 0.3107 - val_loss: 0.3133
Epoch 5/50
11610/11610 [==============================] - 1s 94us/sample - loss: 0.3150 - val_loss: 0.3164
Epoch 6/50
11610/11610 [==============================] - 1s 95us/sample - loss: 0.3243 - val_loss: 0.3166
Epoch 7/50
11610/11610 [==============================] - 1s 96us/sample - loss: 0.3161 - val_loss: 0.3155
Epoch 8/50
11610/11610 [==============================] - 1s 99us/sample - loss: 0.3143 - val_loss: 0.3162
Epoch 9/50
11610/11610 [==============================] - 1s 103us/sample - loss: 0.3118 - val_loss: 0.3161
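The History object returned by fit() has a history attribute, a dict that maps each metric name to a list with one value per epoch. The sketch below uses a stand-in dict, history_dict, filled with the loss values from the log above rather than a real History object, and picks the epoch with the lowest validation loss.

```python
# Stand-in for history.history, filled with the values printed in the log above.
history_dict = {
    "loss":     [0.3132, 0.3109, 0.3138, 0.3107, 0.3150, 0.3243, 0.3161, 0.3143, 0.3118],
    "val_loss": [0.3154, 0.3166, 0.3184, 0.3133, 0.3164, 0.3166, 0.3155, 0.3162, 0.3161],
}

val_loss = history_dict["val_loss"]
best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
best_epoch = best_index + 1  # epochs are numbered from 1 in the log
print(best_epoch, val_loss[best_index])  # 4 0.3133
```

With a real run, history.history can be used in place of history_dict; wrapping it in a pandas DataFrame and calling plot() is a common way to draw the learning curves.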
model.evaluate(x_test_scaled, y_test)

0.33426328003406525
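Since the model was compiled with only a loss and no extra metrics, evaluate() returns the test loss, which here is the mean squared error. Taking the square root gives an error in the units of the target. The conversion to dollars below assumes the target is the median house value in units of $100,000, as stated in the dataset description printed by housing.DESCR.

```python
import math

test_mse = 0.33426328003406525   # value returned by model.evaluate above
test_rmse = math.sqrt(test_mse)  # root mean squared error, in units of the target

# Assumption: the target is expressed in units of $100,000 (per the dataset description).
rmse_dollars = test_rmse * 100_000
print(round(test_rmse, 3), round(rmse_dollars))
```

Under that assumption, the typical prediction error on the test set is on the order of $58,000.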
