Original source: https://www.sohu.com/a/201013669_797291
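This article demonstrates the corr() method of the pandas DataFrame object. The method computes the pairwise correlation coefficients between all columns of a DataFrame and supports three measures: the Pearson correlation coefficient, the Kendall Tau correlation coefficient, and the Spearman rank correlation.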
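To make concrete what corr() returns, here is a minimal sketch that computes the Pearson coefficient by hand and compares it with the value pandas reports. The toy columns `x` and `y` and the helper `pearson_by_hand` are assumptions made up for this illustration; they are not part of the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, chosen only so the arithmetic is easy to follow.
df = pd.DataFrame({'x': [1.0, 2.0, 3.0, 4.0, 5.0],
                   'y': [2.0, 1.0, 4.0, 3.0, 5.0]})


def pearson_by_hand(a, b):
    """Pearson r: sum of co-deviations over the root of the product of squared deviations."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da = a - a.mean()  # deviations of a from its mean
    db = b - b.mean()  # deviations of b from its mean
    return (da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum())


manual = pearson_by_hand(df['x'], df['y'])
matrix = df.corr()  # method='pearson' is the default

print(manual)
print(matrix.loc['x', 'y'])
```

Both printed values should agree up to floating-point rounding, since the off-diagonal entry of the matrix is the Pearson coefficient of the two columns.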
import numpy as np
import pandas as pd


def Pearson(df):  # compute the Pearson correlation coefficients
    return df.corr()  # method='pearson' is the default


def Kendall(df):  # compute the Kendall Tau correlation coefficients
    return df.corr(method='kendall')


def Spearman(df):  # compute the Spearman rank correlations
    return df.corr(method='spearman')


if __name__ == "__main__":
    # randint arguments: low (inclusive), high (exclusive), size
    df = pd.DataFrame({'A': np.random.randint(1, 100, 10),
                       'B': np.random.randint(1, 100, 10),
                       'C': np.random.randint(1, 100, 10)})
    print(df)
    print("Pearson")
    print(Pearson(df))
    print("Kendall Tau")
    print(Kendall(df))
    print("Spearman:")
    print(Spearman(df))
Output (the input data is random, so the numbers differ from run to run):
    A   B   C
0   5  30   6
1  11  91  42
2  55  15  58
3  36  88  11
4  28  19  21
5  12  57  79
6  87   6  80
7  16  16  47
8  14  78  32
9  45  56  89
Pearson
          A         B         C
A  1.000000 -0.446958  0.499898
B -0.446958  1.000000 -0.202255
C  0.499898 -0.202255  1.000000
Kendall Tau
          A         B         C
A  1.000000 -0.422222  0.333333
B -0.422222  1.000000 -0.288889
C  0.333333 -0.288889  1.000000
Spearman:
          A         B         C
A  1.000000 -0.563636  0.527273
B -0.563636  1.000000 -0.345455
C  0.527273 -0.345455  1.000000
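Because the script draws random data, rerunning it will not give these exact numbers. As a rough sketch, the DataFrame can instead be rebuilt from the values printed above, which should reproduce the Pearson and Spearman matrices up to rounding. The sketch also illustrates that the Spearman rank correlation is the Pearson coefficient computed on the per-column ranks. The variable names here are illustrative and do not come from the original article.

```python
import pandas as pd

# Values copied from the printed DataFrame above, so the run is reproducible.
df = pd.DataFrame({
    'A': [5, 11, 55, 36, 28, 12, 87, 16, 14, 45],
    'B': [30, 91, 15, 88, 19, 57, 6, 16, 78, 56],
    'C': [6, 42, 58, 11, 21, 79, 80, 47, 32, 89],
})

pearson = df.corr(method='pearson')
spearman = df.corr(method='spearman')

# Spearman computed the long way: rank each column, then take Pearson on the ranks.
spearman_via_ranks = df.rank().corr(method='pearson')

print(pearson.round(6))
print(spearman.round(6))

# Largest absolute difference between the two Spearman computations.
max_diff = (spearman - spearman_via_ranks).abs().max().max()
print(max_diff)
```

The printed difference should be zero or very close to it, which shows that the two routes to the Spearman matrix agree on this data.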