- From the figure we can see that this function maps the interval $(0,1)$ onto the interval $(-\infty, +\infty)$. If we simply replace $y$ with a linear function of the input, we obtain a mapping from a linear transformation of the input onto the interval $(0,1)$.
From the discussion above, we first compute the inverse of the logit function, which gives the following result (writing logistic(z) as $g(z)$): $g(z) = \mathrm{logit}^{-1}(z) = \frac{1}{1+e^{-z}}$. After this transformation, $\mathrm{logit}^{-1}(z)$ is exactly the probability $p$ of the binomial outcome mentioned above.
What we find most elegant about the sigmoid function is that its derivative takes a very simple, convenient form: $g'(z) = g(z)\,(1 - g(z))$.
(Note: $z$ can of course be replaced by a multi-dimensional linear regression such as $z = \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n = \sum_{i=0}^{n} \theta_i x_i$; the reader can work through the derivation.)
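For readers who want to verify the two results above (the inverse of the logit and the derivative of the sigmoid), here is the short derivation; it is standard algebra using only the definitions already given:

$$\mathrm{logit}(p) = \ln\frac{p}{1-p} = z \;\Longrightarrow\; \frac{p}{1-p} = e^{z} \;\Longrightarrow\; p = \frac{e^{z}}{1+e^{z}} = \frac{1}{1+e^{-z}} = g(z)$$

$$g'(z) = \frac{d}{dz}\left(1+e^{-z}\right)^{-1} = \frac{e^{-z}}{\left(1+e^{-z}\right)^{2}} = g(z)\bigl(1-g(z)\bigr)$$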
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import tensorflow as tf
data = pd.read_csv('CHD.csv',header=0)
print(data.describe()) # Based on these statistics we can adjust the x/y axis ranges below
# Original data
plt.figure() # Create a new figure
# Axis ranges: x from 0 to 70, y from -0.2 to 1.2
plt.axis([0,70,-0.2,1.2])
plt.title('Original data')
plt.scatter(data['age'],data['chd'])
# Center the data to zero mean
plt.figure()
plt.axis([-30,30,-0.2,1.2])
plt.title('Zero mean')
plt.scatter(data['age'] - 44.38,data['chd'])
plt.figure()
plt.axis([-5,5,-0.2,1.2])
plt.title('Scaled by std dev')
plt.scatter((data['age'] - 44.38)/11.721327,data['chd'])
print('\n', (data['age']/11.721327).describe())
age chd
count 100.000000 100.00000
mean 44.380000 0.43000
std 11.721327 0.49757
min 20.000000 0.00000
25% 34.750000 0.00000
50% 44.000000 0.00000
75% 55.000000 1.00000
max 69.000000 1.00000
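As a quick sanity check on the scaling above: full standardization subtracts the mean and then divides by the standard deviation, so the resulting column should have mean close to 0 and std close to 1. A minimal sketch, reusing the data DataFrame already loaded:

# Sanity check: (age - mean) / std should have mean ~0 and std ~1
standardized = (data['age'] - data['age'].mean()) / data['age'].std()
print(standardized.describe())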
# Parameters
learning_rate = 0.2
# Split training into 5 iteration loops over the data, i.e. 5 epochs; each epoch is processed in batches of size batch_size
training_epochs = 5
batch_size = 100
display_step = 1 # Plot/log once every display_step epochs
sess = tf.Session()
b = np.zeros((100,2)) # 100 rows x 2 columns, all zeros
# One-hot encoding; this is just a quick demonstration of how tf.one_hot works
print('One-hot encoding of the list [1,3,2,4]:\n\n', sess.run(tf.one_hot(indices=[1,3,2,4],depth=5,on_value=1,off_value=0,axis=1,name='a')))
One-hot encoding of the list [1,3,2,4]:
[[0 1 0 0 0]
[0 0 0 1 0]
[0 0 1 0 0]
[0 0 0 0 1]]
# tf Graph Input
x = tf.placeholder('float',[None,1]) # First column of the data; placeholder for the 1D feature. None lets the batch size adapt to whatever is fed
y = tf.placeholder('float',[None,2]) # Second column of the data; placeholder for the 2 classes
# Create model
# Set model weights. Here we create the initial variables and placeholders for the dataflow graph; x and y are both float
W = tf.Variable(tf.zeros([1,2])) # [[0, 0]], shape (1,2)
b = tf.Variable(tf.zeros([2])) # [0,0]
# Construct model
# Note: tf.nn provides low-level neural-network ops, including convolution (conv), pooling,
# normalization, losses, classification ops, embeddings, RNNs, evaluation, etc.
# tf.layers provides higher-level building blocks (mostly convolution-related) that wrap tf.nn; tf.nn is the lower-level API.
# We create the activation function and apply it on top of the linear model
activation = tf.nn.softmax(tf.matmul(x,W) + b)
# Minimize error using cross entropy
# We choose cross entropy as the loss and define the optimizer op, using gradient descent
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(activation),reduction_indices=1)) # Cross entropy
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost) # Gradient Descent
# Initializing the variables
init = tf.global_variables_initializer()
# Launch the graph
with tf.Session() as sess:
    tf.train.write_graph(sess.graph,'/Users/duzhe/Downloads/chen/Jupter-notebook/VSCODE_tensorflow/My_pratise/graph/Logist_one/','graph.pbtxt')
    sess.run(init)
    writer = tf.summary.FileWriter('/Users/duzhe/Downloads/chen/Jupter-notebook/VSCODE_tensorflow/My_pratise/graph/Logist_one/',sess.graph)
    # Initialize the drawing graph structure
    graphnumber = 321
    # Generate a new graph
    plt.figure(1)
    # Iterate through all the epochs
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = 400 // batch_size
        # Loop over all batches
        for i in range(total_batch):
            # Transform the label array into a one-hot format
            # Per the API, with axis=-1 the output shape is features x depth
            # depth is normally chosen as max(indices) + 1
            temp = tf.one_hot(indices=data['chd'], depth=2, on_value=1, off_value=0, axis=-1, name='a')
            # The shape declared in the placeholder and the shape fed must match
            # batch_xs, batch_ys = (data['age'].T - 44.38) / 11.721327, temp
            batch_xs, batch_ys = (np.transpose([data['age']]) - 44.38) / 11.721327, temp
            # Fit training using batch data
            sess.run(optimizer, feed_dict={x: batch_xs.astype(float), y: batch_ys.eval()})
            # Accumulate the loss over the batches of this epoch
            avg_cost += sess.run(cost, feed_dict={x: batch_xs.astype(float), y: batch_ys.eval()})
        # Display logs per epoch step
        if epoch % display_step == 0:
            print('Epoch:', '%05d' % (epoch + 1), 'cost=', '{:.8f}'.format(avg_cost))
            # Generate a new subplot, and add it to the complete figure
            trX = np.linspace(-30, 30, 100)
            print(b.eval())
            print(W.eval())
            # Convert the two-class softmax parameters back to a single sigmoid
            # slope/intercept, undoing the std-dev scaling applied to age
            Wdos = 2 * W.eval()[0][0] / 11.721327
            bdos = 2 * b.eval()[0]
            # Generate the probability function (the generalized sigmoid)
            trY = np.exp(-(Wdos * trX) + bdos) / (1 + np.exp(-(Wdos * trX) + bdos))
            # Draw the samples and the probability function, without the normalization
            plt.subplot(graphnumber)
            graphnumber = graphnumber + 1
            # Plot a scatter draw of the random datapoints
            plt.scatter(data['age'], data['chd'])
            # Plot the fitted probability curve, shifted back to the original age scale
            plt.plot(trX + 44.38, trY)
            plt.grid(True)
    # Plot the final graph
    plt.savefig('test.svg')
Epoch: 00001 cost= 0.66692954
[ 0.014 -0.014]
[[-0.05061885 0.05061885]]
Epoch: 00001 cost= 1.31255966
[ 0.02660366 -0.02660367]
[[-0.09623589 0.09623589]]
···· (some results omitted here)
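To make the cross-entropy cost defined above concrete, here is a small NumPy-only sketch of the same softmax + cross-entropy computation on a toy batch; the logits and labels are made up purely for illustration:

import numpy as np
# Toy batch: 2 samples, 2 classes (same shapes as x.dot(W) + b in the graph above)
logits = np.array([[0.5, -0.5], [-1.0, 1.0]])
y_true = np.array([[1, 0], [0, 1]])  # one-hot labels
softmax = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
# Mean over the batch of -sum(y * log(p)) per sample, mirroring
# tf.reduce_mean(-tf.reduce_sum(y * tf.log(activation), reduction_indices=1))
cost = np.mean(-np.sum(y_true * np.log(softmax), axis=1))
print(cost)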
import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow.contrib.learn as skflow
from sklearn import datasets, metrics, preprocessing
# Read data
data = pd.read_csv('./CHD.csv',header=0)
print(data.describe())
age chd
count 100.000000 100.00000
mean 44.380000 0.43000
std 11.721327 0.49757
min 20.000000 0.00000
25% 34.750000 0.00000
50% 44.000000 0.00000
75% 55.000000 1.00000
max 69.000000 1.00000
# Define the model, sklearn-style
def my_model(X, y):
    return skflow.models.logistic_regression(X, y)
# Standardization scaler
scaler = preprocessing.StandardScaler()
X = scaler.fit_transform(data['age'].astype(float).values.reshape(-1, 1))
print(scaler.get_params())
{'copy': True, 'with_mean': True, 'with_std': True}
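Note that StandardScaler performs exactly the manual zero-mean / unit-std transformation used in the first example. A minimal sketch to confirm the equivalence (StandardScaler uses the population standard deviation, hence ddof=0):

# StandardScaler == (x - mean) / population std
manual = (data['age'] - data['age'].mean()) / data['age'].std(ddof=0)
auto = preprocessing.StandardScaler().fit_transform(data['age'].astype(float).values.reshape(-1, 1))
print(np.allclose(manual.values, auto.ravel()))  # True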
age = tf.feature_column.numeric_column('age')
# Build our own estimator
classifier = skflow.Estimator(model_fn=my_model,model_dir='/Users/duzhe/Downloads/chen/Jupter-notebook/VSCODE_tensorflow/My_pratise/skflow/Logistic/')
classifier.fit(X, data['chd'].astype(float))
print(classifier.get_tensor_value('logistic_regression/bias:0'))
print(classifier.get_tensor_value('logistic_regression/weight:0'))
score = metrics.accuracy_score(data['chd'].astype(float), classifier.predict(X))
print('Accuracy: %f' % score)
import tensorflow as tf
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from sklearn import datasets, metrics,preprocessing
from sklearn.utils import shuffle
# Load the dataset
data = pd.read_csv('./CHD.csv',header=0)
print(data.describe())
# Normalize the input data
scaler = preprocessing.StandardScaler()
# print(data['age'].shape, type(data['age']))
# print(data['age'].values.reshape(-1,1).shape, type(data['age'].values.reshape(-1,1)))
# Convert the Series to a numpy array of shape (100, 1)
X = scaler.fit_transform(data['age'].values.reshape(-1, 1))
# Shuffle the data
x, y = shuffle(X, data['chd'])
# Define the model as a logistic regression with a single sigmoid unit
model = Sequential()
# One output unit; the input feature has one dimension
model.add(Dense(1, activation='sigmoid', input_dim=1))
# Use the rmsprop optimizer and cross entropy as the loss
model.compile(optimizer='rmsprop', loss='binary_crossentropy')
# Fit the model with the first 90 elements, splitting them 67%/33% into training/validation sets
# verbose: logging mode; 0 = silent, 1 = progress bar, 2 = one line per epoch
# epochs: integer; training stops once this many epochs have run
model.fit(x[:90], y[:90], epochs=100, validation_split=0.33, shuffle=True, verbose=2)
# Evaluate the model with the last 10 elements
score = model.evaluate(x[90:],y[90:],verbose=2)
print(model.metrics_names)
print(score)
age chd
count 100.000000 100.00000
mean 44.380000 0.43000
std 11.721327 0.49757
min 20.000000 0.00000
25% 34.750000 0.00000
50% 44.000000 0.00000
75% 55.000000 1.00000
max 69.000000 1.00000
Train on 60 samples, validate on 30 samples
Epoch 1/100
- 0s - loss: 1.1083 - val_loss: 0.9735
Epoch 2/100
- 0s - loss: 1.1058 - val_loss: 0.9721
Epoch 3/100
- 0s - loss: 1.1040 - val_loss: 0.9710
Epoch 4/100
- 0s - loss: 1.1027 - val_loss: 0.9701
Epoch 5/100
- 0s - loss: 1.1012 - val_loss: 0.9691
···· (some results omitted here)
- 0s - loss: 1.0101 - val_loss: 0.9033
Epoch 100/100
- 0s - loss: 1.0092 - val_loss: 0.9026
['loss']
1.1765038967132568
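Since the model was compiled with only a loss and no metrics, evaluate returns just the loss value shown above. If we also wanted a hard accuracy on the 10 held-out samples, one possible follow-up (reusing x, y, model, and metrics from above) is to threshold the predicted probabilities at 0.5:

# Threshold the sigmoid outputs at 0.5 to get hard class predictions
probs = model.predict(x[90:])
preds = (probs.ravel() > 0.5).astype(int)
print(metrics.accuracy_score(y[90:], preds))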
Note: references for this article (many thanks to the excellent bloggers below, from whom I learned a great deal): https://blog.csdn.net/behamcheung/article/details/71911133