Python pandas Series: handling missing values

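There are two ways to handle missing values in a Series:
1. Drop the missing values
2. Fill the missing values with a replacement value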

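#Dropping missing values#
import pandas as pd
import numpy as np
sr = pd.Series([33, np.nan, 32, np.nan], index=['a', 'b', 'c', 'd'])
a = sr.isnull()    # True where the value is NaN
b = sr.notnull()   # True where the value is not NaN
sr1 = sr[b]        # drop NaN, method 1: boolean indexing with the notnull() mask
sr2 = sr.dropna()  # drop NaN, method 2: dropna() returns a new Series, sr is unchanged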

print('sr=',sr,'\n')
print('a=',a,'\n')
print('b=',b,'\n')
print('sr1=',sr1,'\n')
print('sr2=',sr2,'\n')

sr= a    33.0
b     NaN
c    32.0
d     NaN
dtype: float64 

a= a    False
b     True
c    False
d     True
dtype: bool 

b= a     True
b    False
c     True
d    False
dtype: bool 

sr1= a    33.0
c    32.0
dtype: float64 

sr2= a    33.0
c    32.0
dtype: float64 
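
Before dropping anything, it can help to check how many values are actually missing. The snippet below is a minimal sketch using the same `sr` as above; the names `n_missing` and `dropped` are chosen here for illustration only. It also checks that the two dropping methods give the same result and that the original Series keeps its length.

```python
import numpy as np
import pandas as pd

sr = pd.Series([33, np.nan, 32, np.nan], index=['a', 'b', 'c', 'd'])

# isnull() gives a boolean mask; summing it counts the True entries
n_missing = int(sr.isnull().sum())

# dropna() returns a new Series and leaves sr untouched
dropped = sr.dropna()

print(n_missing)                           # number of NaN values
print(len(sr), len(dropped))               # length before and after dropping
print(dropped.equals(sr[sr.notnull()]))    # both methods agree
```

Because `dropna()` does not modify `sr`, the result has to be assigned to a variable (as `sr2` is above) to be used later.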

#Filling missing values#
sr = pd.Series([33, np.nan, 32, np.nan], index=['a', 'b', 'c', 'd'])
sr1 = sr.fillna(0)          # fill missing values with 0
sr2 = sr.fillna(sr.mean())  # fill missing values with the mean; sr.mean() skips NaN

print('sr=',sr,'\n')
print('sr1=',sr1,'\n')
print('sr2=',sr2,'\n')
sr= a    33.0
b     NaN
c    32.0
d     NaN
dtype: float64 

sr1= a    33.0
b     0.0
c    32.0
d     0.0
dtype: float64 

sr2= a    33.0
b    32.5
c    32.0
d    32.5
dtype: float64 
