pandas Notes: Summary, Part 1


import pandas as pd
import numpy as np
df=pd.read_csv('2002年-2018年上海机动车拍照拍卖.csv')
df.head()


df['ratio']=df['Total number of license issued']/df['Total number of applicants']
df[df['ratio']<0.05].head()


df.drop(columns='ratio',inplace=True)
df_3 = df.copy()
## Split Date on '-' into a year part and a month part
df_3['Year'] = df_3['Date'].apply(lambda x : 2000+int(x.split('-')[0]))
df_3['Month'] = df_3['Date'].apply(lambda x : x.split('-')[1])
newcolumns = ['Year','Month']+list(df.columns[0:5])
df_4 = df_3.reindex(columns=newcolumns).copy()
df_4.head()
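
The same split can also be done with the vectorized string accessor instead of two `apply` calls. The following is a minimal sketch on made-up rows; it assumes, as the code above implies, that `Date` holds values such as `2-Jan` (two-digit year offset, then month).

```python
import pandas as pd

# Made-up sample rows; the "2-Jan" date format is an assumption about the CSV.
sample = pd.DataFrame({"Date": ["2-Jan", "2-Feb", "18-Dec"]})

# Split once on '-' and expand the pieces into separate columns.
parts = sample["Date"].str.split("-", expand=True)
sample["Year"] = 2000 + parts[0].astype(int)
sample["Month"] = parts[1]
print(sample)
```

Splitting once and reusing `parts` avoids parsing each date string twice.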


## Group by year
df_5 = df_4.groupby('Year')['lowest price '].agg([('Maximum','max'),('mean','mean'),('0.75Quantile',lambda x:x.quantile(0.75))])
df_5.head()
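
The per-year summary can also be written with keyword-style named aggregation. This is only a sketch on made-up numbers; the output name `q75` is a hypothetical stand-in, because a label such as `0.75Quantile` is not a valid Python keyword and would have to be passed through `**{...}` instead.

```python
import pandas as pd

# Made-up prices, only to illustrate the call; the column name keeps the
# trailing space used by the notes above.
toy = pd.DataFrame({
    "Year": [2002, 2002, 2003, 2003],
    "lowest price ": [100, 200, 300, 500],
})

# Each keyword becomes an output column; the value names the aggregation.
summary = toy.groupby("Year")["lowest price "].agg(
    Maximum="max",
    mean="mean",
    q75=lambda s: s.quantile(0.75),  # hypothetical column name
)
print(summary)
```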


df_6 = df_4.copy()
## First melt to long format, then pivot the months into columns
df_6 = df_6.melt(id_vars=['Year', 'Month'],value_vars=['Total number of license issued', 'lowest price ', 'avg price', 'Total number of applicants'])
df_6 = pd.pivot_table(df_6,index=['Year', 'variable'],columns='Month',values='value').rename_axis(index={
     'variable':'统计项目'})
df_6.head(10)
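
One thing to watch with this melt-then-pivot pattern: `pivot_table` sorts the column labels, so month names come out alphabetically rather than in calendar order. A small sketch with made-up values shows the reshape and one possible way to restore the order with `reindex`.

```python
import pandas as pd

# Made-up values for two months of a single year.
toy = pd.DataFrame({
    "Year": [2002, 2002],
    "Month": ["Jan", "Feb"],
    "avg price": [14000, 15000],
    "lowest price ": [13000, 14500],
})

# Long format: one row per (Year, Month, variable).
long = toy.melt(id_vars=["Year", "Month"],
                value_vars=["avg price", "lowest price "])

# Wide format: months become columns (sorted alphabetically: Feb, Jan).
wide = long.pivot_table(index=["Year", "variable"],
                        columns="Month", values="value")

# Put the months back in calendar order.
wide = wide.reindex(columns=["Jan", "Feb"])
print(wide)
```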


df_7= df_4[['Year','Month','lowest price ','avg price']].copy()
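## Previous month: pair each row with the row above it in the date-sorted data.
## The left frame keeps the current row's Year, so January is not labeled with the previous year.
df_7=df_7.iloc[1:].reset_index(drop=True).join(df_7,rsuffix='_lastmonth',how='inner')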
df_7['lowest_diff'] = df_7['lowest price ']-df_7['lowest price _lastmonth']
df_7['avg_diff'] = df_7['avg price']-df_7['avg price_lastmonth']
df_7 = df_7[(df_7['lowest_diff']*df_7['avg_diff'])<0][['Year','Month']]
df_7
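
A shorter route to the same month-over-month comparison is `Series.diff()`, which subtracts the previous row directly and leaves `Year` and `Month` on the current row. The sketch below uses made-up prices and assumes the rows are already in chronological order.

```python
import pandas as pd

# Made-up prices, assumed to be sorted chronologically.
toy = pd.DataFrame({
    "Year": [2002, 2002, 2002, 2003],
    "Month": ["Oct", "Nov", "Dec", "Jan"],
    "lowest price ": [100, 110, 105, 120],
    "avg price": [120, 125, 130, 128],
})

# Change relative to the previous row (NaN for the first row).
toy["lowest_diff"] = toy["lowest price "].diff()
toy["avg_diff"] = toy["avg price"].diff()

# A negative product means the two changes have opposite signs.
mask = toy["lowest_diff"] * toy["avg_diff"] < 0
opposite = toy.loc[mask, ["Year", "Month"]]
print(opposite)
```

The first row has no predecessor, so its product is NaN and the comparison drops it automatically.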

