2018-04-03 Appetizer Data Learning Series - Feature Creation for Machine Learning

Imports

Prerequisites:

# Import the libraries we will be using
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline
sns.set(style='ticks', palette='Set2')
plt.rcParams['figure.figsize'] = 10, 8

import sys
sys.path.append("..")
from ds_utils.features_pipeline import pipeline_from_config

We use a mail-response dataset from a real direct marketing campaign. Each record represents an individual who received a direct marketing offer. The offer was a solicitation for a charitable donation.
The columns (features) are:

Feature      Description
income       household income
Firstdate    date associated with the first gift by this individual
Lastdate     date associated with the most recent gift
Amount       average amount given by this individual over all periods (incl. zeros)
rfaf2        frequency code
rfaa2        donation amount code
pepstrfl     flag indicating a star donor
glast        amount of last gift
gavr         amount of average gift

The target variable is class. It equals 1 if the individual gave in this campaign and 0 otherwise.
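Because the target is a 0/1 flag, it is usually worth checking how many records fall into each class before building features. The following is a minimal sketch on a hypothetical toy frame: the rows and the helper name class_balance are made up for illustration, and only the column names follow the description above. Note that class is a Python keyword, so the column has to be read with df["class"] rather than df.class.

```python
import pandas as pd

# Hypothetical toy rows; only the column names follow the description above.
toy_df = pd.DataFrame({
    "income": [3, 5, 1, 4, 2, 6],
    "glast": [10.0, 25.0, 5.0, 15.0, 20.0, 12.0],
    "class": [0, 0, 1, 0, 0, 0],
})

def class_balance(df, target="class"):
    """Return the share of each target value, as fractions that sum to 1."""
    # Bracket access is required because `class` is a reserved word in Python.
    return df[target].value_counts(normalize=True).sort_index()

balance = class_balance(toy_df)
print(balance)
```

The same helper can be called on the real data frame once it is loaded, assuming its target column is also named class.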

# Load the data
mailing_url = "https://gist.githubusercontent.com/anonymous/5275f1f59be561ec9734c90d80d176b9/raw/f92227f9b8cdca188c1e89094804b8e46f14f30b/-"
mailing_df = pd.read_csv(mailing_url)
# Let's take a look at the data
mailing_df.head(5)
[Image 1: output of mailing_df.head(5)]
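Beyond looking at the first few rows, a per-column summary can help decide which features need cleaning or encoding. This is a rough sketch, not part of the original notebook: summarize_columns is a hypothetical helper, and the toy values below are invented so the example runs without downloading the data.

```python
import numpy as np
import pandas as pd

def summarize_columns(df):
    """One row per column: dtype, missing count, and number of distinct values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "n_unique": df.nunique(),  # NaN is not counted as a distinct value
    })

# Hypothetical toy rows; the real values in the mailing data may look different.
toy_df = pd.DataFrame({
    "income": [3.0, np.nan, 1.0, 4.0],
    "rfaa2": ["D", "E", "D", "F"],
    "class": [0, 1, 0, 0],
})

summary = summarize_columns(toy_df)
print(summary)
```

Calling summarize_columns(mailing_df) would give the same kind of overview for the real data. Columns with a text dtype or with missing values are the natural candidates for the feature creation steps.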
