Predicting Customer Churn with Machine Learning: A Neural Network Model


A neural network, like a random forest, is just one of many machine learning methods.

TensorFlow is one framework for implementing neural networks. As its website puts it, it is "an open-source machine learning framework for everyone". There are of course other frameworks to choose from, such as Caffe, PyTorch, and Keras.

TensorFlow basics:
For the basic concepts behind implementing a neural network in TensorFlow, see the companion articles in this series:

  • Deep learning models: activation functions (Activation Function)
  • Deep learning strategies: loss functions (Loss Function / Cost Function)
  • Deep learning: algorithms (Algorithm)
  • Deep learning: evaluation metrics (F1)
  • Deep learning training: Batch
  • Deep learning visualization: TensorBoard

Now on to part three of the rebooted series.

1. Read the data and convert the format

Read the data, one-hot encode the label, then convert the DataFrames to ndarrays.

# Imports used below (pandas / numpy / tensorflow were implicit in the original)
import numpy as np
import pandas as pd
import tensorflow as tf

# Read the data
data = pd.read_csv("haoma11yue_after_onehot_and_RobustScaler.csv", index_col=0, parse_dates=True)
print(data.shape)  # (43777, 70)

# Split into X and y
from sklearn.model_selection import train_test_split
X = data.loc[:, data.columns != 'yonghuzhuangtai']
y = data.loc[:, data.columns == 'yonghuzhuangtai']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.22, random_state=0)

# One-hot encode the label
y_train_one_hot = pd.get_dummies(y_train['yonghuzhuangtai'], prefix='yonghuzhuangtai')
y_test_one_hot = pd.get_dummies(y_test['yonghuzhuangtai'], prefix='yonghuzhuangtai')

# DataFrame to ndarray
y_train_one_hot_n = y_train_one_hot.values
X_train_n         = X_train.values

y_test_one_hot_n = y_test_one_hot.values
X_test_n         = X_test.values

Next, we build the flow graph step by step.



2. Placeholders for the inputs

A placeholder only fixes the number of columns; the number of rows is left as None, i.e. dynamic.

# ------- Input/output dimensions ----------
features = X_train.shape[1]  # input layer: number of features
numClasses = 2  # output layer: one-hot (sparse) representation

# Declare the shapes of X (input layer) and Y (output layer) as placeholders
with tf.name_scope("inputs"):
    X_input = tf.placeholder(tf.float32, shape=[None, features], name="X_input")
    y_true  = tf.placeholder(tf.float32, shape=[None, numClasses], name="y_true")

3. The model: a neural network

  1. First define a function that builds a single layer of the network, so that stacking multiple layers later is just a matter of calling it. It would also be worth digging into how the initial values of the weights and biases affect the results; a hedged alternative initialization is sketched after the function below.
# add_layer: adds one layer to the network
# layoutname: name of the layer
# inputs: the input to this layer, i.e. the previous layer
# in_size: input dimension, i.e. the number of neurons in the previous layer
# out_size: output dimension, i.e. the number of neurons in this layer
# activation_function: the activation function

def add_layer(layoutname, inputs, in_size, out_size, activation_function=None):
    with tf.name_scope(layoutname):
        with tf.name_scope('weights'):
            Weights = tf.Variable(tf.random_normal([in_size, out_size], stddev=0.1), name='W')  # stddev=0.1 was found by accident to speed things up noticeably; this really comes down to how W is initialized.
            tf.summary.histogram('Weights', Weights)  # summary.histogram records a distribution; summary.scalar could record single values instead

        with tf.name_scope('biases'):
            biases = tf.Variable(tf.constant(0.1, shape=[out_size]), name='b')
            tf.summary.histogram('Biases', biases)

        with tf.name_scope('Wx_plus_b'):
            Wx_plus_b = tf.add(tf.matmul(inputs, Weights), biases)

    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs
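
As a hedged alternative (my assumption, not what this post actually trains with), the weights could be initialized with Glorot/Xavier-style scaling instead of a fixed stddev of 0.1, so that the initial scale adapts to the width of each layer:

# Hedged sketch: Glorot/Xavier-style uniform initialization for the weights,
# where the range depends on the fan-in (in_size) and fan-out (out_size).
limit = np.sqrt(6.0 / (in_size + out_size))
Weights = tf.Variable(tf.random_uniform([in_size, out_size], minval=-limit, maxval=limit), name='W')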
  2. Build the neural network. Here we stack three hidden layers; together with the input and output layers the network has five layers in total. The activation function is tf.nn.relu for now; later it would be worth trying tf.tanh and comparing.
num_HiddenNeurons1 = 50  # first hidden layer
num_HiddenNeurons2 = 40  # second hidden layer
num_HiddenNeurons3 = 20  # third hidden layer

with tf.name_scope('first_hidden_layer'):
    first_hidden_layer = add_layer("first_hidden_layer", X_input, features, num_HiddenNeurons1, activation_function=tf.nn.relu)

with tf.name_scope('second_hidden_layer'):
    second_hidden_layer = add_layer("second_hidden_layer", first_hidden_layer, num_HiddenNeurons1, num_HiddenNeurons2, activation_function=tf.nn.relu)

with tf.name_scope('third_hidden_layer'):
    third_hidden_layer = add_layer("third_hidden_layer", second_hidden_layer, num_HiddenNeurons2, num_HiddenNeurons3, activation_function=tf.nn.relu)

with tf.name_scope('prediction'):
    y_prediction = add_layer('prediction', third_hidden_layer, num_HiddenNeurons3, numClasses, activation_function=None)


# y_prediction is not a probability distribution yet; it has not been through softmax
# [[ 84.97052765  47.09545517]
#  [ 84.97052765  47.09545517]]

with tf.name_scope('prediction_softmax'):
    y_prediction_softmax = tf.nn.softmax(y_prediction)

with tf.name_scope('Save'):
    saver = tf.train.Saver(max_to_keep=4)  # keep the four most recent checkpoints

4. The strategy: loss function

For now the loss is the mean of the squared error; later it would be worth trying softmax_cross_entropy_with_logits to see the difference.
There is actually an open question here: the loss function sets the direction of the optimization, and mean squared error arguably reflects precision more than anything, yet the final judging metric is F1. Is there a loss function that directly targets F1? (A hedged sketch of one option follows the code below.)

with tf.name_scope("loss"):
# Structural risk = empirical risk + regularization; the empirical risk could use cross-entropy and the regularization L2.
# Regularization is not used for now; results seem a bit better without it (or perhaps it just isn't being used well yet).
# ------------------------------------------------ square --------------------------------------------------------
    loss = tf.reduce_mean(tf.square(y_true - y_prediction_softmax))
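
On the F1 question above: there is no built-in F1 loss, but one hedged option (an assumption of mine, not what this post trains with) is a differentiable "soft F1", built from soft counts of true positives, false positives and false negatives and minimized as 1 - F1. This assumes column 1 of the one-hot labels is the positive (churn) class:

# Hedged sketch, not used in this post: a differentiable "soft F1" loss.
# Assumes y_true and y_prediction_softmax are [batch, 2] tensors and that
# column 1 corresponds to the positive ("churn") class.
def soft_f1_loss(y_true, y_prob, epsilon=1e-7):
    pos_true = y_true[:, 1]
    pos_prob = y_prob[:, 1]
    tp = tf.reduce_sum(pos_prob * pos_true)            # soft true positives
    fp = tf.reduce_sum(pos_prob * (1.0 - pos_true))    # soft false positives
    fn = tf.reduce_sum((1.0 - pos_prob) * pos_true)    # soft false negatives
    soft_f1 = 2.0 * tp / (2.0 * tp + fp + fn + epsilon)
    return 1.0 - soft_f1                               # minimizing this pushes F1 up

# loss = soft_f1_loss(y_true, y_prediction_softmax)    # would replace the squared-error loss above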

5. The algorithm

with tf.name_scope("train"):
# ---------------------------------- Exponentially decaying learning rate ---------------------------------------
# exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)
# decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)   (integer division when `staircase` is True)

    Iterations = 0
    learning_rate = tf.train.exponential_decay(learning_rate=0.1, global_step=Iterations, decay_steps=10000, decay_rate=0.99, staircase=True)

# ---------------------------------- The optimizer ---------------------------------------
    opt = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(loss)
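
Note that Iterations above is a plain Python int fixed at 0, so the decayed learning rate is effectively a constant 0.1. A hedged sketch of how the decay could actually take effect (my assumption, not how this post was run): make global_step a non-trainable tf.Variable and let the optimizer increment it on every step.

# Hedged sketch: wiring the exponential decay to a real global_step variable (not the post's setup)
global_step = tf.Variable(0, trainable=False, name="global_step")
learning_rate = tf.train.exponential_decay(learning_rate=0.1, global_step=global_step,
                                           decay_steps=10000, decay_rate=0.99, staircase=True)
opt = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)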


6. Evaluation

Still the usual metrics: F1, recall, and precision, just implemented with tensors; F1 remains the one we care about most.

def tf_confusion_metrics(model, actual_classes, session, feed_dict):
    predictions = tf.argmax(model, 1)
    actuals = tf.argmax(actual_classes, 1)

    ones_like_actuals = tf.ones_like(actuals)  # tf.ones_like: A `Tensor` with all elements set to 1.
    zeros_like_actuals = tf.zeros_like(actuals)
    ones_like_predictions = tf.ones_like(predictions)
    zeros_like_predictions = tf.zeros_like(predictions)

    # true positives: prediction and ground truth are both 1
    tp_op = tf.reduce_sum(                               # tf.reduce_sum counts the 1s
    tf.cast(                                             # tf.cast: casts a tensor to a new type, turning True back into 1.0
      tf.logical_and(                                    # tf.logical_and: elementwise AND of "actually positive" and "predicted positive"
        tf.equal(actuals, ones_like_actuals),            # tf.equal returns a bool tensor, i.e. turns the 1s into True
        tf.equal(predictions, ones_like_predictions)
      ), 
      "float"
    )
    )

    # true negatives: prediction and ground truth are both 0
    tn_op = tf.reduce_sum(
    tf.cast(
      tf.logical_and(
        tf.equal(actuals, zeros_like_actuals), 
        tf.equal(predictions, zeros_like_predictions)
      ), 
      "float"
    )
    )

    # false positives: actually 0, predicted 1
    fp_op = tf.reduce_sum(
    tf.cast(
      tf.logical_and(
        tf.equal(actuals, zeros_like_actuals), 
        tf.equal(predictions, ones_like_predictions)
      ), 
      "float"
    )
    )

    # false negatives: actually 1, predicted 0
    fn_op = tf.reduce_sum(
    tf.cast(
      tf.logical_and(
        tf.equal(actuals, ones_like_actuals), 
        tf.equal(predictions, zeros_like_predictions)
      ), 
      "float"
    )
    )

    tp, tn, fp, fn = \
    session.run(
      [tp_op, tn_op, fp_op, fn_op], 
      feed_dict
    )

    with tf.name_scope("confusion_matrix"):
        with tf.name_scope("precision"):
            if((float(tp) + float(fp)) == 0):
                precision = 0
            else:
                precision = float(tp)/(float(tp) + float(fp))
            tf.summary.scalar("Precision",precision)
            
        with tf.name_scope("recall"):
            if((float(tp) + float(fn)) ==0):
                recall = 0
            else:
                recall = float(tp) / (float(tp) + float(fn))
            tf.summary.scalar("Recall",recall)

        with tf.name_scope("f1_score"):
            if((precision + recall) ==0):
                f1_score = 0
            else:   
                f1_score = (2 * (precision * recall)) / (precision + recall)
            tf.summary.scalar("F1_score",f1_score)
            
        with tf.name_scope("accuracy"):
            accuracy = (float(tp) + float(tn))  /  (float(tp) + float(fp) + float(fn) + float(tn))
            tf.summary.scalar("Accuracy",accuracy)

    print ('F1 Score = ', f1_score, ', Precision = ', precision,', Recall = ', recall, ', Accuracy = ', accuracy)


Besides the TensorFlow implementation, the same metrics can also be computed with sklearn.

import sklearn as sk
import numpy as np
from sklearn.metrics import confusion_matrix

# Print the score metrics: precision, recall, F1, etc.
# y_pred_onehot_score: the network's predictions, after softmax
# y_true_onehot_score: the network's ground-truth labels, one-hot encoded
def scores_all(y_pred_onehot_score, y_true_onehot_score):

    y_pred_score = np.argmax(y_pred_onehot_score, axis=1)  # undo the one-hot encoding
    y_true_score = np.argmax(y_true_onehot_score, axis=1)  # undo the one-hot encoding

#     print("precision:",sk.metrics.precision_score(y_true_score,y_pred_score), \
#           "recall:",sk.metrics.recall_score(y_true_score,y_pred_score), \
#           "f1:",sk.metrics.f1_score(y_true_score,y_pred_score))

    print("f1:",sk.metrics.f1_score(y_true_score,y_pred_score))
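
If a fuller breakdown is wanted, sklearn's classification_report prints per-class precision, recall and F1 in one call. A small sketch (y_true_score and y_pred_score are the de-one-hot-encoded arrays computed inside scores_all above):

# Sketch: per-class precision / recall / F1 in a single call
from sklearn.metrics import classification_report
print(classification_report(y_true_score, y_pred_score, digits=4))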

7. Batching

The batch iterator implementation.

# -------------- Function description -----------------
# sourceData_feature : the feature part of the training set
# sourceData_label   : the label part of the training set
# batch_size  : how thick to slice the beef (i.e. the batch size)
# num_epochs  : how many times the beef gets re-cooked (i.e. the number of epochs)
# shuffle     : whether to shuffle the data

def batch_iter(sourceData_feature,sourceData_label, batch_size, num_epochs, shuffle=True):
    
    data_size = len(sourceData_feature)
    
    num_batches_per_epoch = int(data_size / batch_size)  # number of samples / batch size; the leftover tail is dropped
    
    for epoch in range(num_epochs):
        # Shuffle the data at each epoch
        if shuffle:
            shuffle_indices = np.random.permutation(np.arange(data_size))
            
            shuffled_data_feature = sourceData_feature[shuffle_indices]
            shuffled_data_label   = sourceData_label[shuffle_indices]
        else:
            shuffled_data_feature = sourceData_feature
            shuffled_data_label = sourceData_label

        for batch_num in range(num_batches_per_epoch):   # batch_num runs from 0 to num_batches_per_epoch - 1
            start_index = batch_num * batch_size
            end_index = min((batch_num + 1) * batch_size, data_size)

            yield (shuffled_data_feature[start_index:end_index] , shuffled_data_label[start_index:end_index])
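
A minimal usage sketch (illustrative only): the generator yields (feature, label) tuples of batch_size rows, repeating num_epochs passes over the shuffled data.

# Usage sketch: two epochs of 1000-row batches over the training arrays
for batch_X, batch_y in batch_iter(X_train_n, y_train_one_hot_n, batch_size=1000, num_epochs=2):
    pass  # batch_X holds 1000 feature rows, batch_y the matching one-hot labels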

8. Training

Run the iterative training loop.


batchSize = 1000   # the actual beef-slice thickness (batch size)
epoch_count = 200  # number of training epochs
Iterations = 0     # counter for the number of iterations

print("how many steps would train: ", (epoch_count * int((len(X_train_n)/batchSize))))
print('---------------------------start training------------------------------')

# sess
sess = tf.Session()
merged = tf.summary.merge_all()  #Merges all summaries collected in the default graph.

# Where to save the values tracked during training (e.g. loss, weights, biases)
writer_val = tf.summary.FileWriter("logs/val", sess.graph)
# Where to save the time and memory consumed by each part of the run
writer_timeandplace = tf.summary.FileWriter("logs/timeandplace", sess.graph)

sess.run(tf.global_variables_initializer())

# Iterate. Note that batch_iter uses yield (it is a generator), which is why the for statement looks like this.
for (batchInput, batchLabels) in batch_iter(X_train_n, y_train_one_hot_n, batchSize, epoch_count, shuffle=True):

    if Iterations%1000 == 0:  
        # -------------------------------- Train and record run metadata -----------------------------------------------
        run_options = tf.RunOptions(trace_level = tf.RunOptions.FULL_TRACE) # configure what run-time information to record
        run_metadata = tf.RunMetadata()  # proto that receives the run-time information
        
        # train
        trainingopt,trainingLoss,merged_r,y_prediction_softmax_r,y_prediction_r= \
                    sess.run([opt,loss,merged,y_prediction_softmax,y_prediction],  \
                    feed_dict={X_input:batchInput, y_true:batchLabels}, options =run_options, run_metadata = run_metadata)    
#         print(batchInput[0:5,:])
#         print(y_prediction_r[0:5,:])
#         print(y_prediction_softmax_r[0:5,:])
#         print(batchLabels[0:5,:])
        # record the summaries
        writer_val.add_summary(merged_r, Iterations)
        # write each node's run-time information to the log file
        writer_timeandplace.add_run_metadata(run_metadata, 'Iterations%03d' % Iterations)
        
        # print progress
        print("step %d, %d people leave in this batch, loss is %g" \
              %(Iterations, sum(np.argmax(batchLabels,axis = 1)) ,trainingLoss ))
        
        print('--------------------train scores------------------')        
        tf_confusion_metrics(y_prediction_softmax_r, batchLabels, sess, feed_dict={X_input:batchInput, y_true:batchLabels})
#         scores_all(y_prediction_softmax_r ,batchLabels)
        
        # performance on the test set
        trainingLoss, y_prediction_softmax_r = sess.run([loss,y_prediction_softmax], feed_dict = {X_input: X_test_n, y_true:y_test_one_hot_n})
        print('**********************test score**********************')
        tf_confusion_metrics(y_prediction_softmax_r, y_test_one_hot_n, sess, feed_dict = {X_input: X_test_n, y_true:y_test_one_hot_n})
#         scores_all(y_prediction_softmax_r ,y_test_one_hot_n)
        
    else:
        # train
        trainingopt, trainingLoss, merged_r = sess.run([opt,loss,merged], feed_dict={X_input: batchInput, y_true:batchLabels})
        # record the summaries
        writer_val.add_summary(merged_r, Iterations)
        
    
    if Iterations%3000 == 0:  # save the model every 3000 iterations
        saver.save(sess, 'tf_model/my_test_model',global_step=Iterations)
           
    Iterations=Iterations+1    

writer_val.close()
writer_timeandplace.close()
# sess.close()

Training result: after 64,000 training steps the F1 is roughly 56.77%; the second-to-last evaluation was 59.31%.



9. Inspecting the run in TensorBoard

Loss: it quickly settles into oscillating around 0.0140.
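
To view these curves, TensorBoard can be started with `tensorboard --logdir=logs` (pointing at the directory the FileWriters above write to) and opened in a browser, typically at http://localhost:6006.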



10. Optimization

  1. Fine-tune the initial values of the weights and biases; the effect was not significant.
Weights=tf.Variable(tf.random_normal([in_size,out_size], mean =0, stddev=0.2),name='W')
biases = tf.Variable(tf.random_normal(shape=[out_size], mean =0, stddev=0.2),name='b')
  2. Try changing the network structure: add layers or swap the activation function.
  3. Change the loss function; this is the change I most want to make, tying it to F1, and also adding regularization.
  4. Oversample the minority class to roughly 10:1 (a hedged sketch follows this list).
  5. Look at the variance and the bias.
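
A hedged sketch of point 4 (the label encoding is my assumption, not stated in the post): randomly duplicate minority-class rows of the training split until the majority:minority ratio is roughly 10:1, before the one-hot and ndarray conversion.

# Hedged sketch: random oversampling of the minority (churn) class to roughly 10:1.
# Assumes the positive class is encoded as 1 in the 'yonghuzhuangtai' column.
import numpy as np

pos_idx = np.where(y_train['yonghuzhuangtai'].values == 1)[0]
neg_idx = np.where(y_train['yonghuzhuangtai'].values == 0)[0]
target_pos = len(neg_idx) // 10                                    # aim for a 10:1 ratio
extra = np.random.choice(pos_idx, size=max(0, target_pos - len(pos_idx)), replace=True)
resampled_idx = np.concatenate([neg_idx, pos_idx, extra])
np.random.shuffle(resampled_idx)

X_train_over = X_train.iloc[resampled_idx]
y_train_over = y_train.iloc[resampled_idx]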

Going forward: keep working on feature optimization, collect more features, and keep tuning the various machine learning methods.
There is also the question of how to implement and present all of this.
And, more importantly, go back to the underlying mathematics: statistics, linear algebra, optimization, and so on.

A game to play with the algorithm: sneak the label into the features and see whether the algorithm can find it.

How would one draw the two-dimensional decision boundary of the result?
