Date handling in pandas

1. Extracting the date and hour from a timestamp string

    USER_ID  SHOP_ID             TIME_STA        DATE  HOUR
0  22127870     1862  2015-12-25 17:00:00  2015-12-25    17
1   3434231     1862  2016-10-05 11:00:00  2016-10-05    11

df['DATE'] = pd.to_datetime(df['TIME_STA']).dt.date
df['HOUR'] = pd.to_datetime(df['TIME_STA']).dt.hour
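A self-contained version of this step, as a minimal sketch: it rebuilds the two sample rows, parses TIME_STA once, and reuses the parsed column for both derived fields. The intermediate name `parsed` is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'USER_ID': [22127870, 3434231],
    'SHOP_ID': [1862, 1862],
    'TIME_STA': ['2015-12-25 17:00:00', '2016-10-05 11:00:00'],
})

# Parse the strings once instead of calling pd.to_datetime twice
parsed = pd.to_datetime(df['TIME_STA'])
df['DATE'] = parsed.dt.date   # datetime.date objects (object dtype)
df['HOUR'] = parsed.dt.hour   # integers 0-23
print(df)
```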


2. Converting between dates and strings (DATE back to a TIME_STA-style timestamp)

pd.to_datetime(df['DATE'])


datetime.datetime.strptime('20150626','%Y%m%d') 
Out[25]: 
datetime.datetime(2015, 6, 26, 0, 0)
(datetime.datetime.strptime('20150626','%Y%m%d') + datetime.timedelta(days=1)).date()
Out[26]: 
datetime.date(2015, 6, 27)
str((datetime.datetime.strptime('20150626','%Y%m%d') + datetime.timedelta(days=1)).date())
Out[27]: 
'2015-06-27'
str(datetime.datetime.strptime('20150626','%Y%m%d') )
Out[28]: 
'2015-06-26 00:00:00'
PAYNW_TAB.columns = [str((datetime.datetime.strptime('20150626','%Y%m%d') + datetime.timedelta(days=x)).date()) for x in range( PAYNW_TAB.shape[1])]
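The same kind of column labels can likely be produced without the list comprehension. A sketch, assuming a small stand-in DataFrame: the name `PAYNW_TAB` comes from the line above, but its contents here are made up, and `labels` is a name introduced for illustration.

```python
import pandas as pd

# Stand-in for PAYNW_TAB: 2 rows, 4 columns of made-up values
PAYNW_TAB = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]])

# One label per column: consecutive days starting at 2015-06-26,
# formatted as 'YYYY-MM-DD' strings
labels = pd.date_range('2015-06-26', periods=PAYNW_TAB.shape[1], freq='D').strftime('%Y-%m-%d')
PAYNW_TAB.columns = labels
print(list(PAYNW_TAB.columns))
```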

3. dayofyear and dayofweek

# 销售日期 (sale date), example value: 20140213

train['day'] = train['销售日期'].map(lambda x: str(x)[-2:]).astype(int)
train['sale_m'] = train['销售日期'].map(lambda x: str(x)[4:6]).astype(int) # 1234
train['week'] = train['销售日期'].map(lambda x: pd.to_datetime(str(x)).dayofweek+1)
# time, example value: 20141118 18

data['time'] = pd.to_datetime(data['time'])    # the trailing hour "18" is parsed as 18:00:00
data['dayofyear'] = data['time'].dt.dayofyear  # day of the year (1 to 366)
dindex = data[data['dayofyear'] == pd.to_datetime('2014-12-12').dayofyear].index.values  # index labels of the matching rows
data = data.drop(dindex,axis=0,inplace = False)
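A sketch of the same filtering on made-up data. The sample values and the `keep` mask are assumptions for illustration, and the explicit `format` assumes the `20141118 18` layout noted above. A boolean mask gives the same result as looking up index labels and dropping them.

```python
import pandas as pd

data = pd.DataFrame({'time': ['20141211 23', '20141212 08', '20141213 00']})

# Explicit format: date as YYYYMMDD, a space, then the hour
data['time'] = pd.to_datetime(data['time'], format='%Y%m%d %H')
data['dayofyear'] = data['time'].dt.dayofyear
data['week'] = data['time'].dt.dayofweek + 1   # Monday=1 ... Sunday=7

# Keep every row that does not fall on 2014-12-12
keep = data['dayofyear'] != pd.Timestamp('2014-12-12').dayofyear
data = data[keep]
print(data)
```

Comparing `dayofyear` matches that day in every year, so this is only safe when the data covers a single year.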

4. Days and seconds between two dates

import datetime
d1 = datetime.datetime.strptime('2015-03-05 17:41:20', '%Y-%m-%d %H:%M:%S')
d2 = datetime.datetime.strptime('2015-03-02 17:41:20', '%Y-%m-%d %H:%M:%S')

delta = d1 - d2
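The subtraction yields a `datetime.timedelta`. A short sketch of reading the day and second counts from it, using the same two timestamps:

```python
import datetime

d1 = datetime.datetime.strptime('2015-03-05 17:41:20', '%Y-%m-%d %H:%M:%S')
d2 = datetime.datetime.strptime('2015-03-02 17:41:20', '%Y-%m-%d %H:%M:%S')
delta = d1 - d2

print(delta.days)             # whole days: 3
print(delta.seconds)          # leftover seconds beyond the whole days: 0
print(delta.total_seconds())  # the full span in seconds: 259200.0
```

Note that `seconds` is only the remainder after whole days are removed; `total_seconds()` is the one to use for the complete interval.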

5. Hours between two timestamps

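df_time = df_part_1[df_part_1['time'] >= np.datetime64('2014-11-27')].copy()  # .copy() avoids SettingWithCopyWarning on the assignment below
# diff_time is assumed to be a timedelta column; // is floor division, so this yields whole hours
df_time['diff_hours'] = df_time['diff_time'].apply(lambda x: x.days * 24 + x.seconds//3600)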
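A vectorized sketch of the same whole-hour calculation. The `start` and `end` columns and their values are made up for illustration; only `diff_time` and `diff_hours` are names from the text above.

```python
import pandas as pd

df_time = pd.DataFrame({
    'start': pd.to_datetime(['2014-11-27 08:00:00', '2014-11-28 10:30:00']),
    'end':   pd.to_datetime(['2014-11-28 09:15:00', '2014-11-28 12:00:00']),
})
df_time['diff_time'] = df_time['end'] - df_time['start']   # timedelta64 column

# Whole hours, rounded down, without a Python-level apply
df_time['diff_hours'] = (df_time['diff_time'].dt.total_seconds() // 3600).astype(int)
print(df_time['diff_hours'].tolist())
```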

6. The date n days from today

import datetime

now = datetime.datetime.now()
delta = datetime.timedelta(days=3)
n_days = now + delta

print(n_days.strftime('%Y-%m-%d %H:%M:%S'))

Output: 2017-11-18 19:16:34
# 8 hours later; timedelta also accepts seconds, days, and so on

a = '2018-03-06 15:27:23'

d1 = datetime.datetime.strptime(a,'%Y-%m-%d %H:%M:%S')
delta = datetime.timedelta(hours = 8)
n_days = d1+delta
n_days.strftime('%Y-%m-%d %H:%M:%S')

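7. Generating a date index and the matching weekdays with date_range(start=' ', end=' ', freq='D') and the weekday attribute. 'D' means daily; with freq='12H' one timestamp is generated every 12 hours, e.g.: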

DatetimeIndex(['2016-10-28 00:00:00', '2016-10-28 12:00:00',
               '2016-10-29 00:00:00', '2016-10-29 12:00:00',
               '2016-10-30 00:00:00', '2016-10-30 12:00:00',
               '2016-10-31 00:00:00'],
              dtype='datetime64[ns]', freq='12H')

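timerange = pd.date_range('2016-1-1', '2016-10-31', freq='D')
weeknum = timerange.weekday   # Monday=0 ... Sunday=6
# The output below has only four values, so it appears to come from the short range 2016-10-28 to 2016-10-31 (Friday to Monday) rather than the full range above
Out[18]: 
Int64Index([4, 5, 6, 0], dtype='int64')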
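A runnable sketch of both calls over the short range 2016-10-28 to 2016-10-31. The name `half_days` is introduced here for illustration, and the 12-hour step is passed as a `pd.Timedelta` rather than an alias string, since the spelling of the hour alias has varied between pandas versions.

```python
import pandas as pd

# Every 12 hours from 2016-10-28 through 2016-10-31
half_days = pd.date_range('2016-10-28', '2016-10-31', freq=pd.Timedelta(hours=12))
print(len(half_days))

# One entry per day, with weekday numbers (Monday=0 ... Sunday=6)
timerange = pd.date_range('2016-10-28', '2016-10-31', freq='D')
weeknum = timerange.weekday
print(list(weeknum))
```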


8. Building a list of dates

import pandas as pd
import datetime
def datelist(start, end):
    start_date = datetime.date(*start)
    end_date = datetime.date(*end)

    result = []
    curr_date = start_date
    # <= includes end_date itself and also handles start == end
    while curr_date <= end_date:
        ymd="%04d%02d%02d" % (curr_date.year, curr_date.month, curr_date.day)
        result.append(int(ymd))
        curr_date += datetime.timedelta(days=1)
    return result

alltime_set=set(datelist((2016, 7, 1), (2016, 10, 31)))
alltime_set
{20160701,
 20160702,
 20160703,
 20160704,
 20160705,
 20160706,
...,
 20161031}
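For comparison, a sketch that builds such a list with `pd.date_range`, with both endpoints included. The function name `datelist_pd` and its string arguments are assumptions made for this example, not part of the code above.

```python
import pandas as pd

def datelist_pd(start, end):
    """Return every date from start to end (both inclusive) as YYYYMMDD ints."""
    days = pd.date_range(start, end, freq='D')
    return [int(s) for s in days.strftime('%Y%m%d')]

alltime = datelist_pd('2016-07-01', '2016-10-31')
print(alltime[0], alltime[-1], len(alltime))
```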



9. Dates can also be parsed directly while reading a file; see the separate post on reading CSV and pickle files.
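A minimal sketch of parsing at read time with the `parse_dates` argument of `pd.read_csv`. The CSV text is an in-memory stand-in built from the sample rows in section 1; a real file path would be passed the same way.

```python
import io
import pandas as pd

# In-memory stand-in for a CSV file on disk
csv_text = (
    "USER_ID,SHOP_ID,TIME_STA\n"
    "22127870,1862,2015-12-25 17:00:00\n"
    "3434231,1862,2016-10-05 11:00:00\n"
)

# parse_dates converts the listed columns to datetime64 while reading
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['TIME_STA'])
print(df.dtypes)
print(df['TIME_STA'].dt.hour.tolist())
```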

10. Converting strings to timestamps and extracting hour, minute, second

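# Unix timestamp in seconds (time.mktime treats the parsed value as local time); needs "import time" and "import datetime"
dd2['context_timestamp'].map(lambda x: time.mktime(time.strptime(x, '%Y-%m-%d %H:%M:%S')))
# Hour of day with the minutes as a decimal fraction, e.g. 17:30 becomes 17.5; each string is parsed only once
parsed = dd2['context_timestamp'].map(lambda x: datetime.datetime.strptime(x, '%Y-%m-%d %H:%M:%S'))
dd2['time'] = parsed.map(lambda t: t.hour + round(t.minute / 60.0, 2))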
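A vectorized sketch of the same extraction using the `.dt` accessor. The two sample timestamps and the `hour`, `minute`, and `second` column names are made up for illustration.

```python
import pandas as pd

dd2 = pd.DataFrame({'context_timestamp': ['2018-09-18 17:30:00', '2018-09-18 08:45:10']})

# Parse the whole column once, then read the components off the .dt accessor
ts = pd.to_datetime(dd2['context_timestamp'], format='%Y-%m-%d %H:%M:%S')
dd2['hour'] = ts.dt.hour
dd2['minute'] = ts.dt.minute
dd2['second'] = ts.dt.second

# Hour of day with the minutes as a decimal fraction
dd2['time'] = ts.dt.hour + (ts.dt.minute / 60).round(2)
print(dd2)
```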
