Data processing and analysis with Python

  • Concatenating along an axis
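import numpy as np

arr = np.arange(12).reshape((3,4))

np.concatenate([arr,arr],axis = 1)   # join side by side, result shape (3, 8)

np.concatenate([arr,arr],axis = 0)   # stack vertically, result shape (6, 4)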
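To make the effect of the axis argument concrete, the sketch below checks the shape of each result. The names wide and tall are introduced here purely for illustration.

```python
import numpy as np

arr = np.arange(12).reshape((3, 4))

# axis=1 joins the arrays side by side: 3 rows, columns double to 8
wide = np.concatenate([arr, arr], axis=1)

# axis=0 stacks the arrays vertically: rows double to 6, 4 columns
tall = np.concatenate([arr, arr], axis=0)

print(wide.shape)  # (3, 8)
print(tall.shape)  # (6, 4)
```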
  • The concat function
# For Series with no overlapping index
import pandas as pd
from pandas import Series, DataFrame

s1 = Series([0,1],index = ['a','b'])

s2 = Series([2,3,4],index = ['c','d','e']) 

s3 = Series([5,6],index = ['f','g'])

pd.concat([s1,s2,s3])  

By default, concat works along axis=0 and produces a new Series. With axis=1, the result is a DataFrame:

pd.concat([s1,s2,s3],axis = 1)  
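As a rough illustration of the difference between the two axes, the following sketch compares the type and shape of each result. The names stacked and side_by_side are illustrative only.

```python
import pandas as pd

s1 = pd.Series([0, 1], index=['a', 'b'])
s2 = pd.Series([2, 3, 4], index=['c', 'd', 'e'])
s3 = pd.Series([5, 6], index=['f', 'g'])

# axis=0 (the default): values and index labels are glued end to end
stacked = pd.concat([s1, s2, s3])

# axis=1: each Series becomes a column; labels missing from a column are NaN
side_by_side = pd.concat([s1, s2, s3], axis=1)

print(type(stacked).__name__, stacked.shape)
print(type(side_by_side).__name__, side_by_side.shape)
```

Because the three indexes do not overlap, every row of the DataFrame has exactly one non-missing value.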

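In this case the Series share no labels on the other axis, so the result uses the union of the indexes. Passing join = 'inner' keeps only the intersection, which is empty here: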
pd.concat([s1,s2,s3],axis = 1,join = 'inner') 

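Older pandas versions accepted join_axes to specify the index used on the other axis. It was removed in pandas 1.0, and reindexing the result selects the same labels:

# s4 is not defined earlier in this post; assumed here to be a Series that overlaps s1
s4 = pd.concat([s1 * 5,s3])

pd.concat([s1,s4],axis = 1).reindex(['a','c','b','e'])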
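The join options can be compared side by side in a small sketch. Two assumptions apply: s4 is never defined in the text, so one that overlaps s1 is built here, and reindex is used in place of join_axes, which pandas 1.0 and later no longer accept.

```python
import pandas as pd

s1 = pd.Series([0, 1], index=['a', 'b'])
s3 = pd.Series([5, 6], index=['f', 'g'])

# Assumption: s4 is not defined in the text; this one shares 'a' and 'b' with s1
s4 = pd.concat([s1 * 5, s3])

# Default outer join: union of the labels a, b, f, g
outer = pd.concat([s1, s4], axis=1)

# Inner join: only the labels present in both inputs (a, b)
inner = pd.concat([s1, s4], axis=1, join='inner')

# Explicit label selection; 'c' and 'e' exist in neither input, so they are NaN
picked = pd.concat([s1, s4], axis=1).reindex(['a', 'c', 'b', 'e'])

print(outer)
print(inner)
print(picked)
```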

The keys argument creates a hierarchical index on the concatenation axis:

pd.concat([s1,s2,s3],keys = ['one','two','three']) 

With axis = 1, the keys become the DataFrame column headers:
pd.concat([s1,s2,s3],keys = ['one','two','three'],axis = 1)
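A short sketch of what the keys buy you: each input can be recovered from the result by its key. The variable names result and wide are illustrative.

```python
import pandas as pd

s1 = pd.Series([0, 1], index=['a', 'b'])
s2 = pd.Series([2, 3, 4], index=['c', 'd', 'e'])
s3 = pd.Series([5, 6], index=['f', 'g'])

# Outer index level holds the keys, inner level holds the original labels
result = pd.concat([s1, s2, s3], keys=['one', 'two', 'three'])

print(result.index.nlevels)   # 2
print(result.loc['two'])      # the piece that came from s2
print(result.unstack())       # inner level moved to the columns

# With axis=1 the same keys label the columns instead
wide = pd.concat([s1, s2, s3], keys=['one', 'two', 'three'], axis=1)
print(list(wide.columns))     # ['one', 'two', 'three']
```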

The same logic applies to DataFrame objects:

df1 = DataFrame(np.arange(6).reshape(3,2),index = ['a','b','c'],columns = ['one','two'])

df2 = DataFrame(5 + np.arange(4).reshape(2,2),index = ['a','c'],columns = ['three','four'])

pd.concat([df1,df2],axis = 1,keys = ['level1','level2']) 

If a dict is passed instead of a list, its keys are used as the keys option:

pd.concat({'level1' : df1,'level2' : df2},axis = 1)

pd.concat({'level1' : df1,'level2' : df2},axis = 0) 
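One way to convince yourself that the dict form is shorthand for the keys argument is to compare the two results directly. This is a minimal sketch; the names from_list and from_dict are illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(6).reshape(3, 2),
                   index=['a', 'b', 'c'], columns=['one', 'two'])
df2 = pd.DataFrame(5 + np.arange(4).reshape(2, 2),
                   index=['a', 'c'], columns=['three', 'four'])

# Explicit keys with a list of frames
from_list = pd.concat([df1, df2], axis=1, keys=['level1', 'level2'])

# Same frames passed as a dict: the dict keys play the role of keys
from_dict = pd.concat({'level1': df1, 'level2': df2}, axis=1)

print(from_dict)
print(from_list.equals(from_dict))
```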

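Two parameters manage the hierarchical index: keys supplies the labels for the new outer level, and names gives each level a name: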

pd.concat([df1,df2],axis = 1,keys = ['level1','level2'],names = ['upper','lower'])
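Named levels are mainly useful for selection afterwards. The sketch below, with the illustrative variable result, shows selecting by key and by level name.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(6).reshape(3, 2),
                   index=['a', 'b', 'c'], columns=['one', 'two'])
df2 = pd.DataFrame(5 + np.arange(4).reshape(2, 2),
                   index=['a', 'c'], columns=['three', 'four'])

result = pd.concat([df1, df2], axis=1,
                   keys=['level1', 'level2'], names=['upper', 'lower'])

# The column levels now carry the given names
print(list(result.columns.names))   # ['upper', 'lower']

# Select every column that came from df2; row 'b' is NaN because df2 lacks it
print(result['level2'])

# Select a column by referring to the inner level by name
print(result.xs('three', axis=1, level='lower'))
```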

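When the DataFrame row index carries no meaningful information, pass ignore_index = True to discard it and get a fresh integer index: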

df1 = DataFrame(np.random.randn(3,4),columns = ['a','b','c','d']) 

df2 = DataFrame(np.random.randn(2,3),columns = ['b','d','a']) 

pd.concat([df1,df2],ignore_index = True)  
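To see what ignore_index changes, compare the row labels with and without it. A minimal sketch; kept and fresh are illustrative names.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.randn(3, 4), columns=['a', 'b', 'c', 'd'])
df2 = pd.DataFrame(np.random.randn(2, 3), columns=['b', 'd', 'a'])

# Without ignore_index the original row labels are kept, so they repeat
kept = pd.concat([df1, df2])
print(list(kept.index))    # [0, 1, 2, 0, 1]

# With ignore_index=True the rows are renumbered from 0
fresh = pd.concat([df1, df2], ignore_index=True)
print(list(fresh.index))   # [0, 1, 2, 3, 4]

# Columns are aligned by name; df2 has no 'c', so its two rows are NaN there
print(fresh['c'].isna().sum())   # 2
```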

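Combining overlapping data: combine_first patches missing values in df1 with the values from df2, aligned by label.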

df1 = DataFrame({'a':[1.,np.nan,5.,np.nan],'b':[np.nan,2.,np.nan,6.],'c':range(2,18,4)})

df2 = DataFrame({'a':[5.,4.,np.nan,3.,7.],'b':[np.nan,3.,4.,6.,8.]})  

df1.combine_first(df2)      
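The patching behaviour is easiest to follow with the values written out. In this sketch (the name result is illustrative), a value from df2 is used only where df1 is missing or has no such row.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a': [1., np.nan, 5., np.nan],
                    'b': [np.nan, 2., np.nan, 6.],
                    'c': range(2, 18, 4)})
df2 = pd.DataFrame({'a': [5., 4., np.nan, 3., 7.],
                    'b': [np.nan, 3., 4., 6., 8.]})

result = df1.combine_first(df2)

# Column 'a': df1 wins where it has a value, df2 fills the gaps and row 4
print(result['a'].tolist())   # [1.0, 4.0, 5.0, 3.0, 7.0]

# Row 0 of 'b' stays NaN because both frames are missing there
print(result['b'].tolist())

# Column 'c' exists only in df1, so the extra row 4 has no value
print(result['c'].tolist())
```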
