sklearn numerical features: handling time data

import pandas as pd
import numpy as np
import datetime
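from dateutil.parser import parse
# parse turns a string into a datetime. The input can be fairly free-form: English month and
# weekday names are accepted, with hyphens, commas, spaces, etc. as separators.
import pytz  # time zones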

time_stamps = ['2015-03-08 10:30:00.360000+00:00', '2017-07-13 15:45:05.755000-07:00',
               '2012-01-20 22:30:00.254000+05:30', '2016-12-25 00:30:00.000000+10:00']
df = pd.DataFrame(time_stamps, columns = ['Time'])
df
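The parse function imported above is not used in the rest of this post, so here is a minimal sketch of what the comment on it describes. The three sample strings are made up for illustration; they all describe the same moment written in different styles.

```python
from dateutil.parser import parse

# Hypothetical inputs: the same date and time written three different ways.
samples = [
    "2015-03-08 10:30:00",
    "March 8, 2015 10:30 AM",
    "8 Mar 2015 10:30",
]

# parse infers the layout of each string and returns a datetime.datetime
parsed = [parse(s) for s in samples]
for s, dt in zip(samples, parsed):
    print(f"{s!r} -> {dt.isoformat()}")
```

Ambiguous numeric dates such as 03/08/2015 are read month-first by default; parse accepts dayfirst=True when the data is day-first.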


ts_objs = np.array([pd.Timestamp(item) for item in np.array(df.Time)])
df['Ts_obj'] = ts_objs
print(ts_objs)
df

[Timestamp('2015-03-08 10:30:00.360000+0000', tz='UTC')
Timestamp('2017-07-13 15:45:05.755000-0700', tz='pytz.FixedOffset(-420)')
Timestamp('2012-01-20 22:30:00.254000+0530', tz='pytz.FixedOffset(330)')
Timestamp('2016-12-25 00:30:00+1000', tz='pytz.FixedOffset(600)')]
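Because the four timestamps carry different UTC offsets, the Ts_obj column holds plain Python objects, which is why the snippets in this post use apply. One possible alternative, sketched below under the assumption that UTC values are acceptable for your features, is to convert everything to UTC with pd.to_datetime(..., utc=True) and use the vectorized .dt accessor. The names utc_ts and parts are made up for this example. Note that this discards the local wall-clock time that the per-object approach keeps.

```python
import pandas as pd

time_stamps = ['2015-03-08 10:30:00.360000+00:00', '2017-07-13 15:45:05.755000-07:00',
               '2012-01-20 22:30:00.254000+05:30', '2016-12-25 00:30:00.000000+10:00']

# utc=True converts every value to UTC, so the result is one timezone-aware
# datetime64 Series instead of an object column of mixed-offset Timestamps.
utc_ts = pd.to_datetime(pd.Series(time_stamps), utc=True)

# With a proper datetime dtype, the .dt accessor replaces apply(lambda ...)
parts = pd.DataFrame({
    'Year': utc_ts.dt.year,
    'Month': utc_ts.dt.month,
    'Day': utc_ts.dt.day,
    'Hour': utc_ts.dt.hour,
})
print(parts)
```

The last row shows the trade-off: 00:30 on 25 December at +10:00 becomes 14:30 on 24 December in UTC, so the day and hour features differ from the local ones.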

df['Year'] = df['Ts_obj'].apply(lambda d: d.year)
df['Month'] = df['Ts_obj'].apply(lambda d: d.month)
df['Day'] = df['Ts_obj'].apply(lambda d: d.day)
df['DayOfWeek'] = df['Ts_obj'].apply(lambda d: d.dayofweek)
df['DayName'] = df['Ts_obj'].apply(lambda d: d.day_name())  # weekday_name was removed in pandas 1.0
df['DayOfYear'] = df['Ts_obj'].apply(lambda d: d.dayofyear)
df['WeekOfYear'] = df['Ts_obj'].apply(lambda d: d.weekofyear)
df['Quarter'] = df['Ts_obj'].apply(lambda d: d.quarter)

df[['Time','Year','Month','Day','Quarter','DayOfWeek','DayName','DayOfYear','WeekOfYear']]


df['Hour'] = df['Ts_obj'].apply(lambda d: d.hour)
df['Minute'] = df['Ts_obj'].apply(lambda d: d.minute)
df['Second'] = df['Ts_obj'].apply(lambda d: d.second)
df['MUsecond'] = df['Ts_obj'].apply(lambda d: d.microsecond)  # microseconds (not milliseconds)
df['UTC_offset'] = df['Ts_obj'].apply(lambda d: d.utcoffset())  # offset from UTC, as a timedelta

df[['Time','Hour','Minute','Second','MUsecond','UTC_offset']]
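The UTC_offset column above holds timedelta objects, which most estimators cannot consume directly. A small sketch of one way to turn the offset into a plain number, here hours as a float; the name offset_hours is made up for this example.

```python
import pandas as pd

time_stamps = ['2015-03-08 10:30:00.360000+00:00', '2017-07-13 15:45:05.755000-07:00',
               '2012-01-20 22:30:00.254000+05:30', '2016-12-25 00:30:00.000000+10:00']
ts_objs = [pd.Timestamp(s) for s in time_stamps]

# utcoffset() returns a timedelta; total_seconds() / 3600 expresses it in hours,
# keeping the sign (west of UTC is negative) and half-hour offsets such as +05:30.
offset_hours = [ts.utcoffset().total_seconds() / 3600 for ts in ts_objs]
print(offset_hours)
```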


  • Binning the hour of day with pd.cut()
hour_bins = [-1, 5, 11, 16, 21, 23]
bin_names = ['Late Night', 'Morning', 'Afternoon', 'Evening', 'Night']
df['TimeOfDayBin'] = pd.cut(df['Hour'], bins=hour_bins, labels=bin_names)
df[['Time','Hour','TimeOfDayBin']]
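To see exactly where the bin edges fall, the sketch below runs the same hour_bins and bin_names over a few hand-picked boundary hours (the hours Series is made up for illustration). By default pd.cut uses right-closed intervals, so the first edge of -1 is what lets hour 0 land in the first bin.

```python
import pandas as pd

# Hypothetical hours chosen to sit on both sides of every bin edge
hours = pd.Series([0, 5, 6, 11, 12, 16, 17, 21, 22, 23])

hour_bins = [-1, 5, 11, 16, 21, 23]
bin_names = ['Late Night', 'Morning', 'Afternoon', 'Evening', 'Night']

# Without labels, pd.cut returns the intervals themselves: (-1, 5], (5, 11], ...
intervals = pd.cut(hours, bins=hour_bins)
# With labels, each interval is replaced by its name
labeled = pd.cut(hours, bins=hour_bins, labels=bin_names)

print(pd.DataFrame({'Hour': hours, 'Interval': intervals, 'Label': labeled}))
```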


  • For pd.qcut(), see https://blog.csdn.net/sanjianjixiang/article/details/103014528
  • For more time-related operations, see https://blog.csdn.net/sanjianjixiang/article/details/102892864
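For a quick contrast with pd.cut() before following the link: pd.qcut() picks the bin edges from the data's quantiles, so each bin receives roughly the same number of rows. A minimal sketch on made-up hours; the labels Q1 to Q4 are also just for illustration.

```python
import pandas as pd

# Hypothetical hours, evenly spread so the quartiles are easy to follow
hours = pd.Series([0, 3, 6, 9, 12, 15, 18, 21])

# q=4 asks for quartiles; the edges are computed from the data, not supplied by hand
quartile = pd.qcut(hours, q=4, labels=['Q1', 'Q2', 'Q3', 'Q4'])

print(pd.DataFrame({'Hour': hours, 'Quartile': quartile}))
```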
