Daily study notes: selecting the non-NaN values from a numpy array; testing whether two groups of data differ significantly

Selecting the non-NaN values

import numpy as np
x = np.array([1,2,3,4,5,np.nan,3,4,np.nan])
# np.isnan(x) marks the NaN entries; negating the mask keeps everything else
x = x[np.logical_not(np.isnan(x))]
print(x)

[1. 2. 3. 4. 5. 3. 4.]
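The same filter can be written a little more compactly. The sketch below is only an illustration: the variable names `clean`, `mean_ignoring_nan`, and `nan_count` are made up for this example, and it assumes you sometimes want a summary statistic rather than a filtered copy.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, np.nan, 3, 4, np.nan])

# `~` inverts a boolean mask, so this is equivalent to
# x[np.logical_not(np.isnan(x))] from the snippet above.
clean = x[~np.isnan(x)]

# NaN-aware reductions skip NaN without building a filtered copy.
mean_ignoring_nan = np.nanmean(x)

# Counting the NaN entries is a quick sanity check before filtering.
nan_count = int(np.isnan(x).sum())

print(clean)
print(mean_ignoring_nan, nan_count)
```

Note that boolean indexing always returns a flattened 1-D result, so for a 2-D array the row/column structure is lost after filtering.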

Testing whether two groups of data differ significantly

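  • Use a t-test: https://blog.csdn.net/qq_33039859/article/details/74625879
  • Use a chi-square test: https://zhuanlan.zhihu.com/p/288338508
  • Common significance thresholds for the p-value are < 0.05, < 0.01, and < 0.005. The smaller the p-value, the stronger the evidence that the two groups differ; it does not by itself measure how large the difference is.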
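The chi-square test in the list above is meant for categorical counts rather than continuous measurements. As a rough sketch, assuming the data can be arranged as a contingency table, `scipy.stats.chi2_contingency` can be used as follows. The counts in `table` are invented purely for illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 contingency table (made-up counts):
# rows are the two groups, columns are the two outcome categories.
table = np.array([
    [30, 10],
    [15, 25],
])

# Returns the test statistic, the p-value, the degrees of freedom,
# and the counts expected if group and outcome were independent.
chi2, p, dof, expected = chi2_contingency(table)

print("chi2 =", chi2, "p =", p, "dof =", dof)
print(expected)
```

For a 2x2 table SciPy applies Yates' continuity correction by default (`correction=True`). The rest of this post uses the t-test, which compares the means of two samples.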
import numpy as np
from scipy.stats import ttest_ind

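# x and y are identical; z holds the same values in reverse order.
# ttest_ind compares means, so ordering has no effect on the result.
x = np.array([1,2,3,4,5,6,7,8])
y = np.array([1,2,3,4,5,6,7,8])
z = np.array([8,7,6,5,4,3,2,1])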

print(ttest_ind(x,y))
print(ttest_ind(y,z))
print(ttest_ind(x,z))

Ttest_indResult(statistic=0.0, pvalue=1.0)
Ttest_indResult(statistic=0.0, pvalue=1.0)
Ttest_indResult(statistic=0.0, pvalue=1.0)
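All three results are identical because the three arrays contain the same values and therefore have the same mean (4.5), which gives a statistic of 0 and a p-value of 1. To see the opposite case, here is a small sketch that adds a group with a clearly different mean and maps each p-value onto the thresholds listed earlier. The `shifted` array and the `significance_label` helper are hypothetical and introduced only for this illustration.

```python
import numpy as np
from scipy.stats import ttest_ind

x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
z = x[::-1]        # same values in reverse order, so the same mean
shifted = x + 10   # hypothetical group whose mean is 10 higher


def significance_label(p):
    """Return the strictest of the common thresholds that p passes."""
    for alpha in (0.005, 0.01, 0.05):
        if p < alpha:
            return f"p < {alpha}"
    return "not significant at 0.05"


same = ttest_ind(x, z)
diff = ttest_ind(x, shifted)

print(significance_label(same.pvalue))  # not significant at 0.05
print(significance_label(diff.pvalue))  # p < 0.005
```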

import numpy as np
from scipy.stats import ttest_ind

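# np.random.random_integers is deprecated; np.random.randint excludes the
# upper bound, so 1001 keeps 1000 as a possible value.
x = np.random.randint(0, 1001, size=10)
y = np.random.randint(0, 1001, size=10)
z = np.random.randint(0, 1001, size=15)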

print(x,"\n",y,"\n",z)
print(ttest_ind(x,y))
print(ttest_ind(y,z))
print(ttest_ind(x,z))

[ 20 374 326 461 736 664 488 216 406 569]
[580 559 605 96 378 167 822 583 549 925]
[106 311 6 105 959 469 999 950 253 988 27 59 186 627 87]
Ttest_indResult(statistic=-0.9539930303387232, pvalue=0.3527241236351204)
Ttest_indResult(statistic=0.8363315018252558, pvalue=0.4115735907403949)
Ttest_indResult(statistic=0.1269712528308796, pvalue=0.9000666802890523)
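None of these p-values is below 0.05, which is what you would expect for samples drawn from the same distribution. The two topics in this post can also be combined: if the samples contain NaN, `ttest_ind` can drop them itself via `nan_policy="omit"`. The sketch below is illustrative only. The arrays `a` and `b` reuse numbers from the output above with two entries replaced by NaN, and `equal_var=False` (Welch's t-test) is an assumption made here because the groups differ in size and may differ in variance; the original examples use the default `equal_var=True`.

```python
import numpy as np
from scipy.stats import ttest_ind

# Illustrative samples: values taken from the run above, with one entry
# in each array replaced by NaN to simulate missing measurements.
a = np.array([20, 374, 326, np.nan, 736, 664, 488, 216, 406, 569])
b = np.array([106, 311, 6, 105, 959, 469, np.nan, 950,
              253, 988, 27, 59, 186, 627, 87])

# Option 1: filter NaN by hand, then run Welch's t-test.
manual = ttest_ind(a[~np.isnan(a)], b[~np.isnan(b)], equal_var=False)

# Option 2: let SciPy ignore NaN entries during the computation.
auto = ttest_ind(a, b, equal_var=False, nan_policy="omit")

print(manual.pvalue, auto.pvalue)
```

Both options should give the same result. With the default `nan_policy="propagate"`, a single NaN in either sample makes the statistic and the p-value NaN.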
