Attribute Reduction

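  • The raw data contains too many attributes. Following the airline customer value LRFMC model, keep only the six attributes the model depends on.

  • Drop the remaining attributes that the model does not use, such as the membership card number.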

import numpy as np
import pandas as pd

def reduction_data(data):
    # Keep only the six attributes needed by the LRFMC model.
    data = data[['LOAD_TIME', 'FFP_DATE', 'LAST_TO_END', 'FLIGHT_COUNT', 'SEG_KM_SUM', 'avg_discount']]
    d_ffp = pd.to_datetime(data['FFP_DATE'])
    d_load = pd.to_datetime(data['LOAD_TIME'])
    # Time between joining the program and the end of the observation window.
    res = d_load - d_ffp
    data2 = data.copy()
    # L: membership length in months, approximating one month as 30 days.
    data2['L'] = res / np.timedelta64(30, 'D')
    # R: time from the last flight to the end of the observation window.
    data2['R'] = data['LAST_TO_END']
    # F: number of flights in the observation window.
    data2['F'] = data['FLIGHT_COUNT']
    # M: total distance flown in the observation window.
    data2['M'] = data['SEG_KM_SUM']
    # C: average discount rate.
    data2['C'] = data['avg_discount']
    data3 = data2[['L', 'R', 'F', 'M', 'C']]
    return data3

data3 = reduction_data(data)
print(data3)
------------ Data after processing by the code above ------------
                L    R    F       M         C
0       90.200000    1  210  580717  0.961639
1       86.566667    7  140  293678  1.252314
2       87.166667   11  135  283712  1.254676
3       68.233333   97   23  281336  1.090870
4       60.533333    5  152  309928  0.970658
5       74.700000   79   92  294585  0.967692
6       97.700000    1  101  287042  0.965347
7       48.400000    3   73  287230  0.962070
8       34.266667    6   56  321489  0.828478
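
To make the L calculation concrete, here is a minimal self-contained sketch on a two-row frame. The rows are made up for illustration and are not taken from the airline data set. `to_lrfmc` and `MEMBER_NO` are hypothetical names introduced here, and the sketch assumes, like the function above, that one month is approximated as 30 days. It uses `rename` as one possible alternative to copying each column by hand.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; the real data set has many more columns.
toy = pd.DataFrame({
    'MEMBER_NO': [1, 2],                        # unused attribute, dropped below
    'FFP_DATE': ['2014-01-01', '2013-03-31'],   # date the member joined
    'LOAD_TIME': ['2014-03-02', '2014-03-31'],  # end of the observation window
    'LAST_TO_END': [3, 40],
    'FLIGHT_COUNT': [12, 5],
    'SEG_KM_SUM': [15000, 6200],
    'avg_discount': [0.9, 0.75],
})

def to_lrfmc(df):
    """Illustrative variant: build the L, R, F, M, C columns using rename."""
    out = df.rename(columns={
        'LAST_TO_END': 'R',
        'FLIGHT_COUNT': 'F',
        'SEG_KM_SUM': 'M',
        'avg_discount': 'C',
    })
    # Membership length as a timedelta, then expressed in 30-day months.
    span = pd.to_datetime(out['LOAD_TIME']) - pd.to_datetime(out['FFP_DATE'])
    out['L'] = span / np.timedelta64(30, 'D')
    # Selecting the five model columns drops everything else, e.g. MEMBER_NO.
    return out[['L', 'R', 'F', 'M', 'C']]

lrfmc = to_lrfmc(toy)
print(lrfmc)
```

The first row spans 60 days and the second 365 days, so L is those day counts divided by 30. Dividing the whole timedelta column at once avoids a Python-level call per row.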
