Predicting Customer Churn with Machine Learning: A Neural Network Model


A neural network, like a random forest, is just one of many machine learning methods.

TensorFlow is one framework for implementing neural networks. As its website puts it, it is "an open-source machine learning framework for everyone". There are of course other frameworks to choose from, such as Caffe, PyTorch, and Keras.

TensorFlow basics:
For the basic concepts behind implementing a neural network in TensorFlow, see the companion articles in this series:

  • Deep learning models: activation functions (Activation Function)
  • Deep learning strategies: loss functions (Loss Function / Cost Function)
  • Deep learning: algorithms (Algorithm)
  • Deep learning: evaluation metrics (F1)
  • Deep learning training: Batch
  • Deep learning visualization: TensorBoard

Now on to part three of the rebooted series.

1. Read the data and convert the format

Read the data, one-hot encode the label, then convert the DataFrames to ndarrays.

# Imports used below (pandas / numpy / tensorflow were implicit in the original)
import numpy as np
import pandas as pd
import tensorflow as tf

# Read the data
data = pd.read_csv("haoma11yue_after_onehot_and_RobustScaler.csv", index_col=0, parse_dates=True)
print(data.shape)  # (43777, 70)

# Split into X and y
from sklearn.model_selection import train_test_split
X = data.loc[:, data.columns != 'yonghuzhuangtai']
y = data.loc[:, data.columns == 'yonghuzhuangtai']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.22, random_state=0)

# One-hot encode the label
y_train_one_hot = pd.get_dummies(y_train['yonghuzhuangtai'], prefix='yonghuzhuangtai')
y_test_one_hot = pd.get_dummies(y_test['yonghuzhuangtai'], prefix='yonghuzhuangtai')

# DataFrame to ndarray
y_train_one_hot_n = y_train_one_hot.values
X_train_n         = X_train.values

y_test_one_hot_n = y_test_one_hot.values
X_test_n         = X_test.values

Next, we build the flow graph step by step.



2. Placeholders for the inputs

A placeholder only fixes the number of columns; the number of rows is left as None, i.e. dynamic.

# ------- Input/output dimensions ----------
features = X_train.shape[1]  # input layer: number of features
numClasses = 2  # output layer: one-hot (sparse) representation

# Declare the shapes of X (input layer) and Y (output layer) as placeholders
with tf.name_scope("inputs"):
    X_input = tf.placeholder(tf.float32, shape=[None, features], name="X_input")
    y_true  = tf.placeholder(tf.float32, shape=[None, numClasses], name="y_true")

3. The model: a neural network

  1. First define a function that builds a single layer of the network, so that stacking multiple layers later is just a matter of calling it. It would also be worth digging into how the initial values of the weights and biases affect the results; a hedged alternative initialization is sketched after the function below.
# add_layer: adds one layer to the network
# layoutname: name of the layer
# inputs: the input to this layer, i.e. the previous layer
# in_size: input dimension, i.e. the number of neurons in the previous layer
# out_size: output dimension, i.e. the number of neurons in this layer
# activation_function: the activation function

def add_layer(layoutname, inputs, in_size, out_size, activation_function=None):
    with tf.name_scope(layoutname):
        with tf.name_scope('weights'):
            Weights = tf.Variable(tf.random_normal([in_size, out_size], stddev=0.1), name='W')  # stddev=0.1 was found by accident to speed things up noticeably; this really comes down to how W is initialized.
            tf.summary.histogram('Weights', Weights)  # summary.histogram records a distribution; summary.scalar could record single values instead

        with tf.name_scope('biases'):
            biases = tf.Variable(tf.constant(0.1, shape=[out_size]), name='b')
            tf.summary.histogram('Biases', biases)

        with tf.name_scope('Wx_plus_b'):
            Wx_plus_b = tf.add(tf.matmul(inputs, Weights), biases)

    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs
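
As a hedged alternative (my assumption, not what this post actually trains with), the weights could be initialized with Glorot/Xavier-style scaling instead of a fixed stddev of 0.1, so that the initial scale adapts to the width of each layer:

# Hedged sketch: Glorot/Xavier-style uniform initialization for the weights,
# where the range depends on the fan-in (in_size) and fan-out (out_size).
limit = np.sqrt(6.0 / (in_size + out_size))
Weights = tf.Variable(tf.random_uniform([in_size, out_size], minval=-limit, maxval=limit), name='W')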
  2. Build the neural network. Here we stack three hidden layers; together with the input and output layers the network has five layers in total. The activation function is tf.nn.relu for now; later it would be worth trying tf.tanh and comparing.
num_HiddenNeurons1 = 50  # first hidden layer
num_HiddenNeurons2 = 40  # second hidden layer
num_HiddenNeurons3 = 20  # third hidden layer

with tf.name_scope('first_hidden_layer'):
    first_hidden_layer = add_layer("first_hidden_layer", X_input, features, num_HiddenNeurons1, activation_function=tf.nn.relu)

with tf.name_scope('second_hidden_layer'):
    second_hidden_layer = add_layer("second_hidden_layer", first_hidden_layer, num_HiddenNeurons1, num_HiddenNeurons2, activation_function=tf.nn.relu)

with tf.name_scope('third_hidden_layer'):
    third_hidden_layer = add_layer("third_hidden_layer", second_hidden_layer, num_HiddenNeurons2, num_HiddenNeurons3, activation_function=tf.nn.relu)

with tf.name_scope('prediction'):
    y_prediction = add_layer('prediction', third_hidden_layer, num_HiddenNeurons3, numClasses, activation_function=None)


# y_prediction is not a probability distribution yet; it has not been through softmax
# [[ 84.97052765  47.09545517]
#  [ 84.97052765  47.09545517]]

with tf.name_scope('prediction_softmax'):
    y_prediction_softmax = tf.nn.softmax(y_prediction)

with tf.name_scope('Save'):
    saver = tf.train.Saver(max_to_keep=4)  # keep the four most recent checkpoints

4. The strategy: loss function

For now the loss is the mean of the squared error; later it would be worth trying softmax_cross_entropy_with_logits to see the difference.
There is actually an open question here: the loss function sets the direction of the optimization, and mean squared error arguably reflects precision more than anything, yet the final judging metric is F1. Is there a loss function that directly targets F1? (A hedged sketch of one option follows the code below.)

with tf.name_scope("loss"):
# Structural risk = empirical risk + regularization; the empirical risk could use cross-entropy and the regularization L2.
# Regularization is not used for now; results seem a bit better without it (or perhaps it just isn't being used well yet).
# ------------------------------------------------ square --------------------------------------------------------
    loss = tf.reduce_mean(tf.square(y_true - y_prediction_softmax))
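
On the F1 question above: there is no built-in F1 loss, but one hedged option (an assumption of mine, not what this post trains with) is a differentiable "soft F1", built from soft counts of true positives, false positives and false negatives and minimized as 1 - F1. This assumes column 1 of the one-hot labels is the positive (churn) class:

# Hedged sketch, not used in this post: a differentiable "soft F1" loss.
# Assumes y_true and y_prediction_softmax are [batch, 2] tensors and that
# column 1 corresponds to the positive ("churn") class.
def soft_f1_loss(y_true, y_prob, epsilon=1e-7):
    pos_true = y_true[:, 1]
    pos_prob = y_prob[:, 1]
    tp = tf.reduce_sum(pos_prob * pos_true)            # soft true positives
    fp = tf.reduce_sum(pos_prob * (1.0 - pos_true))    # soft false positives
    fn = tf.reduce_sum((1.0 - pos_prob) * pos_true)    # soft false negatives
    soft_f1 = 2.0 * tp / (2.0 * tp + fp + fn + epsilon)
    return 1.0 - soft_f1                               # minimizing this pushes F1 up

# loss = soft_f1_loss(y_true, y_prediction_softmax)    # would replace the squared-error loss above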

5. The algorithm

with tf.name_scope("train"):
# ---------------------------------- Exponentially decaying learning rate ---------------------------------------
# exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)
# decayed_learning_rate = learning_rate * decay_rate ^ (global_step / decay_steps)   (integer division when `staircase` is True)

    Iterations = 0
    learning_rate = tf.train.exponential_decay(learning_rate=0.1, global_step=Iterations, decay_steps=10000, decay_rate=0.99, staircase=True)

# ---------------------------------- The optimizer ---------------------------------------
    opt = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(loss)
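
Note that Iterations above is a plain Python int fixed at 0, so the decayed learning rate is effectively a constant 0.1. A hedged sketch of how the decay could actually take effect (my assumption, not how this post was run): make global_step a non-trainable tf.Variable and let the optimizer increment it on every step.

# Hedged sketch: wiring the exponential decay to a real global_step variable (not the post's setup)
global_step = tf.Variable(0, trainable=False, name="global_step")
learning_rate = tf.train.exponential_decay(learning_rate=0.1, global_step=global_step,
                                           decay_steps=10000, decay_rate=0.99, staircase=True)
opt = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)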


6. Evaluation

Still the usual metrics: F1, recall, and precision, just implemented with tensors; F1 remains the one we care about most.

def tf_confusion_metrics(model, actual_classes, session, feed_dict):
    predictions = tf.argmax(model, 1)
    actuals = tf.argmax(actual_classes, 1)

    ones_like_actuals = tf.ones_like(actuals)  # tf.ones_like: A `Tensor` with all elements set to 1.
    zeros_like_actuals = tf.zeros_like(actuals)
    ones_like_predictions = tf.ones_like(predictions)
    zeros_like_predictions = tf.zeros_like(predictions)

    # true positives: prediction and ground truth are both 1
    tp_op = tf.reduce_sum(                               # tf.reduce_sum counts the 1s
    tf.cast(                                             # tf.cast: casts a tensor to a new type, turning True back into 1.0
      tf.logical_and(                                    # tf.logical_and: elementwise AND of "actually positive" and "predicted positive"
        tf.equal(actuals, ones_like_actuals),            # tf.equal returns a bool tensor, i.e. turns the 1s into True
        tf.equal(predictions, ones_like_predictions)
      ), 
      "float"
    )
    )

    # true negatives: prediction and ground truth are both 0
    tn_op = tf.reduce_sum(
    tf.cast(
      tf.logical_and(
        tf.equal(actuals, zeros_like_actuals), 
        tf.equal(predictions, zeros_like_predictions)
      ), 
      "float"
    )
    )

    # false positives: actually 0, predicted 1
    fp_op = tf.reduce_sum(
    tf.cast(
      tf.logical_and(
        tf.equal(actuals, zeros_like_actuals), 
        tf.equal(predictions, ones_like_predictions)
      ), 
      "float"
    )
    )

    # false negatives: actually 1, predicted 0
    fn_op = tf.reduce_sum(
    tf.cast(
      tf.logical_and(
        tf.equal(actuals, ones_like_actuals), 
        tf.equal(predictions, zeros_like_predictions)
      ), 
      "float"
    )
    )

    tp, tn, fp, fn = \
    session.run(
      [tp_op, tn_op, fp_op, fn_op], 
      feed_dict
    )

    with tf.name_scope("confusion_matrix"):
        with tf.name_scope("precision"):
            if((float(tp) + float(fp)) == 0):
                precision = 0
            else:
                precision = float(tp)/(float(tp) + float(fp))
            tf.summary.scalar("Precision",precision)
            
        with tf.name_scope("recall"):
            if((float(tp) + float(fn)) ==0):
                recall = 0
            else:
                recall = float(tp) / (float(tp) + float(fn))
            tf.summary.scalar("Recall",recall)

        with tf.name_scope("f1_score"):
            if((precision + recall) ==0):
                f1_score = 0
            else:   
                f1_score = (2 * (precision * recall)) / (precision + recall)
            tf.summary.scalar("F1_score",f1_score)
            
        with tf.name_scope("accuracy"):
            accuracy = (float(tp) + float(tn))  /  (float(tp) + float(fp) + float(fn) + float(tn))
            tf.summary.scalar("Accuracy",accuracy)

    print ('F1 Score = ', f1_score, ', Precision = ', precision,', Recall = ', recall, ', Accuracy = ', accuracy)


Besides the TensorFlow implementation, the same metrics can also be computed with sklearn.

import sklearn as sk
import numpy as np
from sklearn.metrics import confusion_matrix

# Print the score metrics: precision, recall, F1, etc.
# y_pred_onehot_score: the network's predictions, after softmax
# y_true_onehot_score: the network's ground-truth labels, one-hot encoded
def scores_all(y_pred_onehot_score, y_true_onehot_score):

    y_pred_score = np.argmax(y_pred_onehot_score, axis=1)  # undo the one-hot encoding
    y_true_score = np.argmax(y_true_onehot_score, axis=1)  # undo the one-hot encoding

#     print("precision:",sk.metrics.precision_score(y_true_score,y_pred_score), \
#           "recall:",sk.metrics.recall_score(y_true_score,y_pred_score), \
#           "f1:",sk.metrics.f1_score(y_true_score,y_pred_score))

    print("f1:",sk.metrics.f1_score(y_true_score,y_pred_score))
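
If a fuller breakdown is wanted, sklearn's classification_report prints per-class precision, recall and F1 in one call. A small sketch (y_true_score and y_pred_score are the de-one-hot-encoded arrays computed inside scores_all above):

# Sketch: per-class precision / recall / F1 in a single call
from sklearn.metrics import classification_report
print(classification_report(y_true_score, y_pred_score, digits=4))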

7. Batching

The batch iterator implementation.

# -------------- Function description -----------------
# sourceData_feature : the feature part of the training set
# sourceData_label   : the label part of the training set
# batch_size  : how thick to slice the beef (i.e. the batch size)
# num_epochs  : how many times the beef gets re-cooked (i.e. the number of epochs)
# shuffle     : whether to shuffle the data

def batch_iter(sourceData_feature,sourceData_label, batch_size, num_epochs, shuffle=True):
    
    data_size = len(sourceData_feature)
    
    num_batches_per_epoch = int(data_size / batch_size)  # number of samples / batch size; the leftover tail is dropped
    
    for epoch in range(num_epochs):
        # Shuffle the data at each epoch
        if shuffle:
            shuffle_indices = np.random.permutation(np.arange(data_size))
            
            shuffled_data_feature = sourceData_feature[shuffle_indices]
            shuffled_data_label   = sourceData_label[shuffle_indices]
        else:
            shuffled_data_feature = sourceData_feature
            shuffled_data_label = sourceData_label

        for batch_num in range(num_batches_per_epoch):   # batch_num runs from 0 to num_batches_per_epoch - 1
            start_index = batch_num * batch_size
            end_index = min((batch_num + 1) * batch_size, data_size)

            yield (shuffled_data_feature[start_index:end_index] , shuffled_data_label[start_index:end_index])
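
A minimal usage sketch (illustrative only): the generator yields (feature, label) tuples of batch_size rows, repeating num_epochs passes over the shuffled data.

# Usage sketch: two epochs of 1000-row batches over the training arrays
for batch_X, batch_y in batch_iter(X_train_n, y_train_one_hot_n, batch_size=1000, num_epochs=2):
    pass  # batch_X holds 1000 feature rows, batch_y the matching one-hot labels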

8. Training

Run the iterative training loop.


batchSize = 1000   # the actual beef-slice thickness (batch size)
epoch_count = 200  # number of training epochs
Iterations = 0     # counter for the number of iterations

print("how many steps would train: ", (epoch_count * int((len(X_train_n)/batchSize))))
print('---------------------------start training------------------------------')

# sess
sess = tf.Session()
merged = tf.summary.merge_all()  #Merges all summaries collected in the default graph.

# Where to save the values tracked during training (e.g. loss, weights, biases)
writer_val = tf.summary.FileWriter("logs/val", sess.graph)
# Where to save the time and memory consumed by each part of the run
writer_timeandplace = tf.summary.FileWriter("logs/timeandplace", sess.graph)

sess.run(tf.global_variables_initializer())

# Iterate. Note that batch_iter uses yield (it is a generator), which is why the for statement looks like this.
for (batchInput, batchLabels) in batch_iter(X_train_n, y_train_one_hot_n, batchSize, epoch_count, shuffle=True):

    if Iterations%1000 == 0:  
        # -------------------------------- Train and record run metadata -----------------------------------------------
        run_options = tf.RunOptions(trace_level = tf.RunOptions.FULL_TRACE) # configure what run-time information to record
        run_metadata = tf.RunMetadata()  # proto that receives the run-time information
        
        # train
        trainingopt,trainingLoss,merged_r,y_prediction_softmax_r,y_prediction_r= \
                    sess.run([opt,loss,merged,y_prediction_softmax,y_prediction],  \
                    feed_dict={X_input:batchInput, y_true:batchLabels}, options =run_options, run_metadata = run_metadata)    
#         print(batchInput[0:5,:])
#         print(y_prediction_r[0:5,:])
#         print(y_prediction_softmax_r[0:5,:])
#         print(batchLabels[0:5,:])
        # record the summaries
        writer_val.add_summary(merged_r, Iterations)
        # write each node's run-time information to the log file
        writer_timeandplace.add_run_metadata(run_metadata, 'Iterations%03d' % Iterations)
        
        # print progress
        print("step %d, %d people leave in this batch, loss is %g" \
              %(Iterations, sum(np.argmax(batchLabels,axis = 1)) ,trainingLoss ))
        
        print('--------------------train scores------------------')        
        tf_confusion_metrics(y_prediction_softmax_r, batchLabels, sess, feed_dict={X_input:batchInput, y_true:batchLabels})
#         scores_all(y_prediction_softmax_r ,batchLabels)
        
        # performance on the test set
        trainingLoss, y_prediction_softmax_r = sess.run([loss,y_prediction_softmax], feed_dict = {X_input: X_test_n, y_true:y_test_one_hot_n})
        print('**********************test score**********************')
        tf_confusion_metrics(y_prediction_softmax_r, y_test_one_hot_n, sess, feed_dict = {X_input: X_test_n, y_true:y_test_one_hot_n})
#         scores_all(y_prediction_softmax_r ,y_test_one_hot_n)
        
    else:
        # train
        trainingopt, trainingLoss, merged_r = sess.run([opt,loss,merged], feed_dict={X_input: batchInput, y_true:batchLabels})
        # record the summaries
        writer_val.add_summary(merged_r, Iterations)
        
    
    if Iterations%3000 == 0:  # save the model every 3000 iterations
        saver.save(sess, 'tf_model/my_test_model',global_step=Iterations)
           
    Iterations=Iterations+1    

writer_val.close()
writer_timeandplace.close()
# sess.close()

Training result: after 64,000 training steps the F1 is roughly 56.77%; the second-to-last evaluation was 59.31%.



9. Inspecting the run in TensorBoard

Loss: it quickly settles into oscillating around 0.0140.
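
To view these curves, TensorBoard can be started with `tensorboard --logdir=logs` (pointing at the directory the FileWriters above write to) and opened in a browser, typically at http://localhost:6006.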



10. Optimization

  1. Fine-tune the initial values of the weights and biases; the effect was not significant.
Weights=tf.Variable(tf.random_normal([in_size,out_size], mean =0, stddev=0.2),name='W')
biases = tf.Variable(tf.random_normal(shape=[out_size], mean =0, stddev=0.2),name='b')
  2. Try changing the network structure: add layers or swap the activation function.
  3. Change the loss function; this is the change I most want to make, tying it to F1, and also adding regularization.
  4. Oversample the minority class to roughly 10:1 (a hedged sketch follows this list).
  5. Look at the variance and the bias.
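
A hedged sketch of point 4 (the label encoding is my assumption, not stated in the post): randomly duplicate minority-class rows of the training split until the majority:minority ratio is roughly 10:1, before the one-hot and ndarray conversion.

# Hedged sketch: random oversampling of the minority (churn) class to roughly 10:1.
# Assumes the positive class is encoded as 1 in the 'yonghuzhuangtai' column.
import numpy as np

pos_idx = np.where(y_train['yonghuzhuangtai'].values == 1)[0]
neg_idx = np.where(y_train['yonghuzhuangtai'].values == 0)[0]
target_pos = len(neg_idx) // 10                                    # aim for a 10:1 ratio
extra = np.random.choice(pos_idx, size=max(0, target_pos - len(pos_idx)), replace=True)
resampled_idx = np.concatenate([neg_idx, pos_idx, extra])
np.random.shuffle(resampled_idx)

X_train_over = X_train.iloc[resampled_idx]
y_train_over = y_train.iloc[resampled_idx]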

Going forward: keep working on feature optimization, collect more features, and keep tuning the various machine learning methods.
There is also the question of how to implement and present all of this.
And, more importantly, go back to the underlying mathematics: statistics, linear algebra, optimization, and so on.

A game to play with the algorithm: sneak the label into the features and see whether the algorithm can find it.

How would one draw the two-dimensional decision boundary of the result?
