Python Data Visualization Practice

First, some practice plotting with Python's matplotlib and seaborn modules:

%matplotlib inline
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
np.random.seed(sum(map(ord, "aesthetics")))

def sinplot(flip=1):
    x = np.linspace(0, 16, 100)
    for i in range(1, 8):
        plt.plot(x, np.sin(x + i * .5) * (8 - i) * flip)
sinplot()

Here is the result:

[Image 1: Python Data Visualization Practice]

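Now switch to the seaborn module:

import seaborn as sns
sns.set()  # seaborn 0.8+ no longer restyles plots on import, so apply its default theme explicitly
sinplot()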

[Image 2: Python Data Visualization Practice]

It immediately looks much more polished!
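If you only want the seaborn look for a single figure rather than for the whole session, one option is the `sns.axes_style` context manager. The sketch below is only an illustration: it assumes a non-interactive backend (`Agg`) so that it runs outside a notebook, and it passes the axes into `sinplot` explicitly, which the original function does not do.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available, so use a non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

def sinplot(ax, flip=1):
    # Same curves as above, but drawn on an explicit axes object
    x = np.linspace(0, 16, 100)
    for i in range(1, 8):
        ax.plot(x, np.sin(x + i * .5) * (8 - i) * flip)

# Axes styles take effect when the axes are created,
# so the figure has to be created inside the context.
with sns.axes_style("darkgrid"):
    fig, ax = plt.subplots()
    sinplot(ax)

n_lines = len(ax.lines)  # one line per curve
```

Figures created after the `with` block go back to whatever style was active before it.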

Next, I follow an expert's notebook on Kaggle to analyze and process some data.

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="white",color_codes=True)
train=pd.read_csv("input/train.csv")
test=pd.read_csv("input/test.csv")
train.tail(5)

Running this code raised an error complaining about a key, and I have not figured out why: KeyError: u'no item named TARGET'

df=pd.DataFrame(train.TARGET.value_counts())
df['Percentage']=100*df['TARGET']/train.shape[0]
df
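One possible explanation for the KeyError, offered as a guess rather than a diagnosis: the label pandas gives to the result of `value_counts()` depends on the pandas version (in pandas 2.0 and later it is `count`, not the original column name), so `df['TARGET']` may simply not exist. A minimal sketch that avoids depending on that label is shown here. The tiny `train` frame is made up and stands in for train.csv, and the `Count` column name is my own choice.

```python
import pandas as pd

# Hypothetical stand-in for train.csv; only the TARGET column matters here
train = pd.DataFrame({"TARGET": [0, 0, 0, 1]})

# Keep the counts as a Series and name the output columns explicitly,
# so nothing depends on how pandas labels the value_counts() result.
counts = train["TARGET"].value_counts()
df = pd.DataFrame({
    "Count": counts,
    "Percentage": 100 * counts / len(train),
})
print(df)
```

The index of `df` holds the TARGET values (0 and 1), and each row gives how often that value occurs and its share of all rows.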

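Count how many features are zero in each row (TARGET, the last column, is excluded):

x=train.iloc[:,:-1].copy()  # all columns except the last one (TARGET); copy so we do not write into a slice of train
y=train.TARGET

x['n0']=(x==0).sum(axis=1)  # number of zero-valued entries in each row
train['n0']=x['n0']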
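To see what the row-wise zero count does, here is a small sketch on a made-up frame. The column names `var_a`, `var_b`, and `var_c` are invented for illustration; only the layout (features first, TARGET last) mirrors the real data.

```python
import pandas as pd

# Toy frame standing in for train.csv; feature names are hypothetical, TARGET is last
train = pd.DataFrame({
    "var_a": [0, 5, 0],
    "var_b": [0, 4, 0],
    "var_c": [1, 0, 0],
    "TARGET": [0, 1, 0],
})

features = train.iloc[:, :-1]        # every column except TARGET
is_zero = features == 0              # boolean frame: True where a feature is 0
n0 = is_zero.sum(axis=1)             # True counts as 1, summed across each row
print(n0.tolist())                   # [2, 1, 3]
```

Because the comparison runs on `features` only, a zero in TARGET does not affect the count.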

Next, tally the number of bank products each customer holds (the num_var4 column) and plot it as a histogram:

train.num_var4.hist(bins=100)
plt.xlabel('Number of bank products')
plt.ylabel('Number of customers in train')
plt.title('Most customers have 1 product with the bank')
plt.show()

[Image 3: Python Data Visualization Practice]
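The plot title claims that most customers have one product. A claim like that can also be checked numerically instead of being read off the histogram. The sketch below uses made-up values in place of `train.num_var4`, so the numbers it prints say nothing about the real dataset.

```python
import pandas as pd

# Made-up values standing in for train.num_var4 (products per customer)
num_products = pd.Series([1, 1, 1, 0, 2, 1, 3, 0, 1, 2], name="num_var4")

# Share of customers at each product count, ordered by product count
share = num_products.value_counts(normalize=True).sort_index()
most_common = share.idxmax()         # product count with the largest share

print(share)
print("Most common number of products:", most_common)
```

On the real data, the same two lines applied to `train.num_var4` would show whether a single product is in fact the most common case and what fraction of customers it covers.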
