The pandas DataFrame

Creating a DataFrame
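
As a minimal sketch of three common ways to build a DataFrame: from a dict of columns, from a list of row records, and from CSV text. The variable names and the two sample rows are illustrative only; the values are borrowed from the data.csv used later in this post.

```python
import io

import pandas as pd

# From a dict that maps column names to lists of values
df_from_dict = pd.DataFrame({"user": ["a01", "a02"], "vol": [42, 42], "prc": [72, 72]})

# From a list of row records (one dict per row)
df_from_records = pd.DataFrame(
    [
        {"user": "a01", "vol": 42, "prc": 72},
        {"user": "a02", "vol": 42, "prc": 72},
    ]
)

# From CSV text; pd.read_csv also accepts a file path such as 'data.csv'
csv_text = "user,vol,prc\na01,42,72\na02,42,72\n"
df_from_csv = pd.read_csv(io.StringIO(csv_text))

print(df_from_dict)
```

All three calls should produce the same two-row table; which one to use depends mostly on where the data already lives.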

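Accessing DataFrame data

  • df.loc[row, col], df.iloc[row, col]: indexing by label and by integer position
  • df[col][row]: chained indexing, column first and then row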
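
The access styles listed above can be compared side by side. This is a small sketch on made-up data; the row labels r0 to r2 are hypothetical and chosen so that label-based and position-based lookups are easy to tell apart.

```python
import pandas as pd

df = pd.DataFrame(
    {"user": ["a01", "a02", "a03"], "vol": [42, 42, 42], "indc": ["B", "S", "B"]},
    index=["r0", "r1", "r2"],  # hypothetical row labels
)

# Label-based: row label "r1", column label "indc"
by_label = df.loc["r1", "indc"]

# Position-based: second row, third column (both counted from 0)
by_position = df.iloc[1, 2]

# Chained: select the column first, then the row label inside that column
chained = df["indc"]["r1"]

# A boolean mask and a list of columns can be combined in a single .loc call
buys = df.loc[df["indc"] == "B", ["user", "vol"]]

print(by_label, by_position, chained)
print(buys)
```

Chained indexing is fine for reading a value. For assignment, a single df.loc[...] call is the safer choice, because a chained assignment may end up writing to a temporary copy.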

Adding rows and columns
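
A brief sketch of one way to add a column and a row. The column name amount and the extra row for user a03 are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"user": ["a01", "a02"], "vol": [42, 42], "prc": [72, 72]})

# New column: assign to a column name that does not exist yet
df["amount"] = df["vol"] * df["prc"]

# New row: build a one-row DataFrame and concatenate it;
# ignore_index=True renumbers the rows 0, 1, 2, ...
new_row = pd.DataFrame([{"user": "a03", "vol": 42, "prc": 72, "amount": 42 * 72}])
df = pd.concat([df, new_row], ignore_index=True)

print(df)
```

Each pd.concat call builds a new DataFrame, so when many rows need to be added it is usually cheaper to collect them in a list first and concatenate once.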

Example

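# Running total of signed volume per user
import pandas as pd
import numpy as np

df = pd.read_csv('data.csv')
# Sort so that the running total follows date order within each user
df = df.sort_values(['user', 'date'])
# Split the rows by indicator
df_B = df[df['indc'] == 'B']
df_S = df[df['indc'] == 'S']
# 'B' rows count as +vol, all other rows as -vol
df['vol-sign'] = np.where(df['indc'] == 'B', df['vol'], -df['vol'])
# Cumulative sum of the signed volume within each user
df['cde'] = df.groupby('user')['vol-sign'].cumsum()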
Contents of data.csv:
user,vol,prc,date,indc,cde
a01,42,72,2019.07.22,B,
a01,42,72,2019.07.20,B,
a01,42,72,2019.07.22,S,
a01,42,72,2019.07.22,B,
a02,42,72,2019.07.22,B,
a02,42,72,2019.07.22,B,
a02,42,72,2019.07.20,S,
a03,42,72,2019.07.22,B,
a03,42,72,2019.07.20,B,
a03,42,72,2019.07.22,S,
a03,42,72,2019.07.22,B,
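
To try the example without a separate file, the same rows can be fed to pd.read_csv through io.StringIO. This is only a sketch: the reading of B and S as buy and sell is an assumption (the post does not say what they mean), and the final_position summary is an addition for illustration.

```python
import io

import numpy as np
import pandas as pd

# Same rows as data.csv, embedded as text
CSV_TEXT = """user,vol,prc,date,indc,cde
a01,42,72,2019.07.22,B,
a01,42,72,2019.07.20,B,
a01,42,72,2019.07.22,S,
a01,42,72,2019.07.22,B,
a02,42,72,2019.07.22,B,
a02,42,72,2019.07.22,B,
a02,42,72,2019.07.20,S,
a03,42,72,2019.07.22,B,
a03,42,72,2019.07.20,B,
a03,42,72,2019.07.22,S,
a03,42,72,2019.07.22,B,
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))
df = df.sort_values(["user", "date"])

# Assumption: 'B' adds volume and 'S' removes it
df["vol-sign"] = np.where(df["indc"] == "B", df["vol"], -df["vol"])
df["cde"] = df.groupby("user")["vol-sign"].cumsum()

# The last running total in each group is that user's net volume
final_position = df.groupby("user")["cde"].last()
print(final_position)
```

The empty cde column from the file is read as missing values and then overwritten by the cumulative sum.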

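Note

DataFrames are well suited to whole-column (vectorized) operations. When a computation has to proceed row by row, they are very slow.
In that case, convert the data back to NumPy arrays and do the work there.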

Take this problem, for example:
for i in range(1, 10000000):
    df.iloc[i, 3] = df.iloc[i-1, 3] * df.iloc[i, 1] + df.iloc[i, 2]
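
Each row in this recurrence depends on the previous row's result, so it cannot be written as a single vectorized expression. One way to follow the "convert back to NumPy" advice is sketched below: pull the three columns out as arrays, run the loop on the arrays, and write the result back once. The function name run_recurrence and the demo data are hypothetical, and the sketch assumes the columns at positions 1, 2 and 3 are numeric and that column names are unique.

```python
import pandas as pd


def run_recurrence(df: pd.DataFrame) -> pd.DataFrame:
    """Compute x[i] = x[i-1] * a[i] + b[i] for columns at positions 1, 2, 3."""
    a = df.iloc[:, 1].to_numpy(dtype=float)
    b = df.iloc[:, 2].to_numpy(dtype=float)
    x = df.iloc[:, 3].to_numpy(dtype=float).copy()  # copy so the input is untouched

    # Plain array indexing avoids the per-call overhead of df.iloc
    for i in range(1, len(x)):
        x[i] = x[i - 1] * a[i] + b[i]

    out = df.copy()
    out[out.columns[3]] = x  # write the whole column back in one step
    return out


# Hypothetical demo data: only row 0 of column x is used as the starting value
demo = pd.DataFrame(
    {
        "key": ["r0", "r1", "r2", "r3"],
        "a": [1.0, 2.0, 3.0, 0.5],
        "b": [0.0, 1.0, -1.0, 2.0],
        "x": [1.0, 0.0, 0.0, 0.0],
    }
)
result = run_recurrence(demo)
print(result)
```

The loop is still a Python loop, so it is not free, but it works on raw arrays instead of going through the DataFrame indexer on every step. If that is still too slow, a JIT compiler such as Numba is one possible next step.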