Python Time Series 01
Goal: group the data by date, count the records for each day, and export the result to a CSV file.
import pandas as pd
data = pd.read_excel("TimeSeriesCourseworkData20_21 (2).xlsx")
data.plot()
data.head()
| | Call | City Municipality |
|---|---|---|
| 0 | 2018-12-31 22:31:28 | JAKARTA TIMUR |
| 1 | 2018-12-31 23:46:09 | JAKARTA TIMUR |
| 2 | 2019-01-01 00:12:30 | JAKARTA TIMUR |
| 3 | 2019-01-01 01:16:56 | JAKARTA TIMUR |
| 4 | 2019-01-01 01:21:18 | JAKARTA BARAT |
data.shape
(22540, 2)
data.dtypes
Call                 datetime64[ns]
City Municipality            object
dtype: object
- pandas.Series.dt.hour  # hour
- pandas.Series.dt.minute  # minute
- pandas.Series.dt.second  # second
- pandas.Series.dt.year  # year
- pandas.Series.dt.month  # month
- pandas.Series.dt.day  # day of the month
- pandas.Series.dt.date  # date part (datetime.date objects)
- pandas.Series.dt.time  # time part (datetime.time objects)
- pandas.Series.dt.strftime('%Y-%m-%d %H:%M:%S')  # format as a string; adjust the format codes to suit your needs
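The accessors listed above can be tried on a small series before applying them to the full dataset. This is a minimal sketch: the two timestamps are hypothetical values modelled on the Call column, and the variable names are illustrative only.

```python
import pandas as pd

# Two hypothetical timestamps, modelled on the Call column shown above
calls = pd.Series(pd.to_datetime(["2018-12-31 22:31:28", "2019-01-01 01:16:56"]))

hours = calls.dt.hour    # integer hour of the day
dates = calls.dt.date    # datetime.date objects (the column dtype becomes object)
labels = calls.dt.strftime("%Y-%m-%d %H:%M:%S")  # formatted strings

print(hours.tolist())    # [22, 1]
print(labels.tolist())   # ['2018-12-31 22:31:28', '2019-01-01 01:16:56']
```

Note that the format codes are case-sensitive: `%Y` is the four-digit year while `%y` is the two-digit year, and `%M` is minutes while `%m` is the month.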
# Keep only the calendar date of each call in a new column
data['just_date'] = data['Call'].dt.date
data.head()
| | Call | City Municipality | just_date |
|---|---|---|---|
| 0 | 2018-12-31 22:31:28 | JAKARTA TIMUR | 2018-12-31 |
| 1 | 2018-12-31 23:46:09 | JAKARTA TIMUR | 2018-12-31 |
| 2 | 2019-01-01 00:12:30 | JAKARTA TIMUR | 2019-01-01 |
| 3 | 2019-01-01 01:16:56 | JAKARTA TIMUR | 2019-01-01 |
| 4 | 2019-01-01 01:21:18 | JAKARTA BARAT | 2019-01-01 |
# Count the rows for each date; reset_index turns the result back into a DataFrame
dd = data.groupby("just_date")["just_date"].count().reset_index(name="count")
type(dd)
pandas.core.frame.DataFrame
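To see what the grouping step produces, here is a small sketch that repeats it on the five rows displayed by data.head(). The frame `calls` and the names `daily` and `daily_size` are illustrative and not part of the original notebook.

```python
import pandas as pd

# The five timestamps shown in the head() output above
calls = pd.DataFrame({
    "Call": pd.to_datetime([
        "2018-12-31 22:31:28",
        "2018-12-31 23:46:09",
        "2019-01-01 00:12:30",
        "2019-01-01 01:16:56",
        "2019-01-01 01:21:18",
    ])
})
calls["just_date"] = calls["Call"].dt.date

# Same pattern as above: count rows per date, then name the count column
daily = calls.groupby("just_date")["just_date"].count().reset_index(name="count")

# size() counts the rows in each group and gives the same result here
daily_size = calls.groupby("just_date").size().reset_index(name="count")

print(daily)
```

With these five rows, the result has one row per date: 2 calls on 2018-12-31 and 3 calls on 2019-01-01.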
outputpath = 'result_test.csv'
dd.to_csv(outputpath,sep=',',index=False,header=True)
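As a quick check on the export, the sketch below writes a small counts table to an in-memory buffer instead of a file and reads it back. The two rows are hypothetical sample values, and `io.StringIO` stands in for `result_test.csv` only so the example leaves no file behind.

```python
import io
import pandas as pd

# Hypothetical daily counts in the same shape as dd
counts = pd.DataFrame({
    "just_date": ["2018-12-31", "2019-01-01"],
    "count": [2, 3],
})

# Write without the index column, as in the export above
buffer = io.StringIO()
counts.to_csv(buffer, sep=",", index=False, header=True)
csv_text = buffer.getvalue()

# Read it back, asking pandas to parse the date column again
restored = pd.read_csv(io.StringIO(csv_text), parse_dates=["just_date"])
print(restored.dtypes)
```

Passing index=False keeps the row labels out of the file, so the CSV contains only the just_date and count columns.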