Generating date ranges: pd.date_range() generates a DatetimeIndex of a specified length at a given frequency
pd.date_range(
start=None,
end=None,
periods=None,
freq=None,
tz=None,
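normalize=False, # normalize the start/end dates to midnight before generating the range
name=None, # name of the resulting DatetimeIndex
closed=None, # which endpoints to include: 'left', 'right', or None for both (renamed to `inclusive` in pandas 1.4 and removed in 2.0)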
**kwargs,
)
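A minimal sketch of how these parameters interact, assuming a recent pandas release; the variable names are illustrative only. Supplying `start` with `periods`, or `start` with `end`, should describe the same index when the frequency matches.

```python
import pandas as pd

# start + periods and start + end describe the same daily range here.
by_periods = pd.date_range(start="2018-03-03", periods=5, freq="D")
by_end = pd.date_range(start="2018-03-03", end="2018-03-07", freq="D")

# normalize=True snaps the start time to midnight before generating dates.
normalized = pd.date_range(
    start="2018-03-03 12:34:56", periods=3, freq="D", normalize=True
)

print(by_periods.equals(by_end))
print(normalized[0])
```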
Offsets
from pandas.tseries.offsets import Hour,Minute
hour = Hour()
hour
<Hour>
hour_4 = Hour(4)
hour_4
<4 * Hours>
Hour(2) + Minute(30)
<150 * Minutes>
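As a rough illustration of what a combined offset is good for, the sketch below adds it to a timestamp and also passes it as `freq`. The names `combined`, `start`, and `idx` are made up for this example.

```python
import pandas as pd
from pandas.tseries.offsets import Hour, Minute

# Adding two fixed-length offsets yields a single offset of 150 minutes.
combined = Hour(2) + Minute(30)

# An offset can be added directly to a timestamp...
start = pd.Timestamp("2018-03-03")
later = start + combined

# ...or passed as the freq argument instead of a string alias.
idx = pd.date_range(start, periods=3, freq=combined)

print(later)
print(idx)
```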
Frequencies
Frequency table
pd.date_range('2018-3-3',periods=20,freq='2h30min')
DatetimeIndex(['2018-03-03 00:00:00', '2018-03-03 02:30:00',
'2018-03-03 05:00:00', '2018-03-03 07:30:00',
'2018-03-03 10:00:00', '2018-03-03 12:30:00',
'2018-03-03 15:00:00', '2018-03-03 17:30:00',
'2018-03-03 20:00:00', '2018-03-03 22:30:00',
'2018-03-04 01:00:00', '2018-03-04 03:30:00',
'2018-03-04 06:00:00', '2018-03-04 08:30:00',
'2018-03-04 11:00:00', '2018-03-04 13:30:00',
'2018-03-04 16:00:00', '2018-03-04 18:30:00',
'2018-03-04 21:00:00', '2018-03-04 23:30:00'],
dtype='datetime64[ns]', freq='150T')
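A frequency string such as '2h30min' is parsed into an offset object, which is why the index above reports a 150-minute frequency. The sketch below checks that equivalence with `to_offset`; the variable names are illustrative.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset
from pandas.tseries.offsets import Minute

# Parse the string alias into an offset object.
offset = to_offset("2h30min")

# The string alias and the explicit offset should build the same index.
idx_from_string = pd.date_range("2018-03-03", periods=4, freq="2h30min")
idx_from_offset = pd.date_range("2018-03-03", periods=4, freq=Minute(150))

print(offset == Minute(150))
print(idx_from_string.equals(idx_from_offset))
```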
Week-of-month dates (here, the third Friday of each month):
wom = pd.date_range('2018-2-1','2018-12-1',freq='WOM-3FRI')
DatetimeIndex(['2018-02-16', '2018-03-16', '2018-04-20', '2018-05-18',
'2018-06-15', '2018-07-20', '2018-08-17', '2018-09-21',
'2018-10-19', '2018-11-16'],
dtype='datetime64[ns]', freq='WOM-3FRI')
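One way to sanity-check a 'WOM-3FRI' range is to confirm that every date is a Friday falling on day 15 through 21 of its month, which is where a third Friday must land. This is only a verification sketch with made-up variable names.

```python
import pandas as pd

wom = pd.date_range("2018-02-01", "2018-12-01", freq="WOM-3FRI")

# Monday is weekday 0, so Friday is weekday 4.
all_fridays = bool((wom.weekday == 4).all())

# The third occurrence of any weekday falls on day 15..21 of the month.
all_third_week = bool(((wom.day >= 15) & (wom.day <= 21)).all())

print(all_fridays, all_third_week)
```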
Shifting with the shift method
ts.shift(periods=1, freq=None, axis=0, fill_value=None)
ts = ts.head(10)
ts.shift(2) # simple shift: without freq the index is unchanged and only the values move against it
2016-01-01 NaN
2016-01-02 NaN
2016-01-03 1.270813
2016-01-04 0.172139
2016-01-05 -0.900744
2016-01-06 -0.558749
2016-01-07 0.485409
2016-01-08 -0.256562
2016-01-09 -1.623190
2016-01-10 -0.487096
Freq: D, dtype: float64
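The sketch below repeats the plain shift on a small, deterministic series (not the random data shown above) so the effect is easy to read. It also shows a negative shift and the `fill_value` parameter from the signature.

```python
import numpy as np
import pandas as pd

ts_demo = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.date_range("2016-01-01", periods=4, freq="D"),
)

# Positive periods push values later; the first two slots become NaN.
lagged = ts_demo.shift(2)

# fill_value replaces the NaN that the shift introduces.
filled = ts_demo.shift(2, fill_value=0.0)

# Negative periods pull values earlier; the last slot becomes NaN.
led = ts_demo.shift(-1)

print(lagged)
print(filled)
print(led)
```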
ts.shift(2,freq='90T') # shift with a frequency: the index changes but the values do not; the timestamps move forward by 2*90T = 180T
2016-01-01 03:00:00 1.270813
2016-01-02 03:00:00 0.172139
2016-01-03 03:00:00 -0.900744
2016-01-04 03:00:00 -0.558749
2016-01-05 03:00:00 0.485409
2016-01-06 03:00:00 -0.256562
2016-01-07 03:00:00 -1.623190
2016-01-08 03:00:00 -0.487096
2016-01-09 03:00:00 0.312584
2016-01-10 03:00:00 0.755862
Freq: D, dtype: float64
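Shifting with a frequency should be equivalent to adding a fixed time delta to the index while leaving the values alone. A small check on an illustrative series, using the 'min' alias rather than 'T' since newer pandas versions deprecate 'T':

```python
import pandas as pd

ts_demo = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2016-01-01", periods=3, freq="D"),
)

# Two steps of 90 minutes move every timestamp forward by 3 hours.
shifted = ts_demo.shift(2, freq="90min")

# The same index built by hand for comparison.
manual_index = ts_demo.index + pd.Timedelta(minutes=180)

print(shifted)
```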
ts.shift(2,'M') # moves the timestamps forward by 2*M (two month-end offsets), so every date lands on a month end
2016-02-29 1.270813
2016-02-29 0.172139
2016-02-29 -0.900744
2016-02-29 -0.558749
2016-02-29 0.485409
2016-02-29 -0.256562
2016-02-29 -1.623190
2016-02-29 -0.487096
2016-02-29 0.312584
2016-02-29 0.755862
Freq: D, dtype: float64
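With a month-end offset, the result depends on whether a date already sits on a month end. In the sketch below (illustrative data; the `MonthEnd` object is used instead of the 'M' alias, which newer pandas versions deprecate), dates before 31 January land on 29 February, while 31 January itself moves two full month ends on.

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

ts_demo = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.date_range("2016-01-29", periods=4, freq="D"),
)

# Each timestamp is moved forward by two month-end steps.
shifted = ts_demo.shift(2, freq=MonthEnd())

print(shifted)
```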
Shifting dates with offsets
from pandas.tseries.offsets import MonthEnd,Day
ts.index = ts.index + 3*Day()
ts
2016-01-04 1.270813
2016-01-05 0.172139
2016-01-06 -0.900744
2016-01-07 -0.558749
2016-01-08 0.485409
2016-01-09 -0.256562
2016-01-10 -1.623190
2016-01-11 -0.487096
2016-01-12 0.312584
2016-01-13 0.755862
Freq: D, dtype: float64
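The import above also brings in `MonthEnd`, which the example does not use. A brief sketch of how it might be applied follows, including the `rollforward` and `rollback` methods of offset objects; the dates are chosen purely for illustration.

```python
import pandas as pd
from pandas.tseries.offsets import Day, MonthEnd

# Adding a multiple of Day moves every entry of an index by whole days.
idx = pd.date_range("2016-01-01", periods=3, freq="D")
moved = idx + 3 * Day()

stamp_demo = pd.Timestamp("2016-01-17")

# Adding an anchored offset rolls the date forward to the next month end.
next_month_end = stamp_demo + MonthEnd()

# rollforward / rollback make the direction explicit.
rolled_forward = MonthEnd().rollforward(stamp_demo)
rolled_back = MonthEnd().rollback(stamp_demo)

print(moved)
print(next_month_end, rolled_forward, rolled_back)
```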
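Handling time zones is usually one of the most unpleasant parts of working with time series, so many time series users choose to work in Coordinated Universal Time (UTC), the successor to Greenwich Mean Time and the current international standard. Time zones are usually expressed as offsets from UTC; for example, New York is 4 hours behind UTC during daylight saving time and 5 hours behind the rest of the year.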
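The New York offsets mentioned above can be checked with time-zone-aware timestamps. This is a small sketch that assumes the time zone database is available to pandas; the two dates are arbitrary picks from summer and winter.

```python
import pandas as pd

# One timestamp inside daylight saving time, one outside it.
summer = pd.Timestamp("2016-07-01 12:00", tz="America/New_York")
winter = pd.Timestamp("2016-01-01 12:00", tz="America/New_York")

# utcoffset() returns a timedelta; convert it to hours for readability.
summer_hours = summer.utcoffset().total_seconds() / 3600
winter_hours = winter.utcoffset().total_seconds() / 3600

print(summer_hours, winter_hours)
```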
In Python, time zone information has traditionally come from the third-party library pytz
import pytz
pytz.common_timezones[-5:]
['US/Eastern', 'US/Hawaii', 'US/Mountain', 'US/Pacific', 'UTC']
To get a pytz time zone object:
tz = pytz.timezone('America/New_York')
tstz = ts.tz_localize('UTC')
tstz
2016-01-04 00:00:00+00:00 1.270813
2016-01-05 00:00:00+00:00 0.172139
2016-01-06 00:00:00+00:00 -0.900744
2016-01-07 00:00:00+00:00 -0.558749
2016-01-08 00:00:00+00:00 0.485409
2016-01-09 00:00:00+00:00 -0.256562
2016-01-10 00:00:00+00:00 -1.623190
2016-01-11 00:00:00+00:00 -0.487096
2016-01-12 00:00:00+00:00 0.312584
2016-01-13 00:00:00+00:00 0.755862
Freq: D, dtype: float64
newtz = tstz.tz_convert('Europe/Berlin')
newtz
2016-01-04 01:00:00+01:00 1.270813
2016-01-05 01:00:00+01:00 0.172139
2016-01-06 01:00:00+01:00 -0.900744
2016-01-07 01:00:00+01:00 -0.558749
2016-01-08 01:00:00+01:00 0.485409
2016-01-09 01:00:00+01:00 -0.256562
2016-01-10 01:00:00+01:00 -1.623190
2016-01-11 01:00:00+01:00 -0.487096
2016-01-12 01:00:00+01:00 0.312584
2016-01-13 01:00:00+01:00 0.755862
Freq: D, dtype: float64
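The two calls above do different jobs: `tz_localize` attaches a time zone to naive timestamps, while `tz_convert` re-expresses already localized timestamps in another zone. A compact sketch on illustrative data:

```python
import pandas as pd

ts_demo = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2016-01-04", periods=3, freq="D"),
)

# A freshly built index is naive: it carries no time zone.
is_naive = ts_demo.index.tz is None

# Localize first, then convert; the instants in time stay the same.
utc = ts_demo.tz_localize("UTC")
berlin = utc.tz_convert("Europe/Berlin")

# Converting back to UTC recovers the localized index.
round_trip_ok = berlin.index.tz_convert("UTC").equals(utc.index)

print(berlin)
```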
stamp = pd.Timestamp('2011-03-12 04:00')
The underlying UTC value of a timestamp does not change, whatever the time zone.
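One way to see this is through the `value` attribute, which holds the timestamp as nanoseconds since the Unix epoch in UTC. The sketch below localizes the stamp to UTC and converts it to New York time; the names `stamp_utc` and `stamp_eastern` are illustrative.

```python
import pandas as pd

stamp = pd.Timestamp("2011-03-12 04:00")

# Attach UTC, then view the same instant in New York time.
stamp_utc = stamp.tz_localize("UTC")
stamp_eastern = stamp_utc.tz_convert("America/New_York")

# The wall-clock reading differs, but the stored UTC value does not.
same_value = stamp_utc.value == stamp_eastern.value

print(stamp_utc, stamp_eastern, same_value)
```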
If two time series with different time zones are combined, the result is in UTC, because timestamps are stored in UTC internally
ts1 = ts[:7].tz_localize('Europe/London')
ts2 = ts1[2:].tz_convert('Europe/Moscow')
result = ts1 + ts2
result
2016-01-04 00:00:00+00:00 NaN
2016-01-05 00:00:00+00:00 NaN
2016-01-06 00:00:00+00:00 -1.801488
2016-01-07 00:00:00+00:00 -1.117498
2016-01-08 00:00:00+00:00 0.970818
2016-01-09 00:00:00+00:00 -0.513125
2016-01-10 00:00:00+00:00 -3.246380
Freq: D, dtype: float64
result.index
DatetimeIndex(['2016-01-04 00:00:00+00:00', '2016-01-05 00:00:00+00:00',
'2016-01-06 00:00:00+00:00', '2016-01-07 00:00:00+00:00',
'2016-01-08 00:00:00+00:00', '2016-01-09 00:00:00+00:00',
'2016-01-10 00:00:00+00:00'],
dtype='datetime64[ns, UTC]', freq='D') # the index is in UTC
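The same combination can be reproduced on deterministic data to make the alignment visible. In this sketch (illustrative values, with `.iloc` used for positional slicing), the first two entries have no partner in the second series and come out as NaN, and the rest are summed on a UTC index.

```python
import numpy as np
import pandas as pd

ts_demo = pd.Series(
    np.arange(7, dtype=float),
    index=pd.date_range("2016-01-04", periods=7, freq="D"),
)

ts1 = ts_demo.tz_localize("Europe/London")
ts2 = ts1.iloc[2:].tz_convert("Europe/Moscow")

# Alignment happens on the underlying instants, and the result index is UTC.
result = ts1 + ts2

print(result)
print(result.index.tz)
```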