Date recognition in Python

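        The script segments a sentence with jieba, picks out the time-related words by their part-of-speech tags, matches the date components with regular expressions, and finally outputs the recognized date.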
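The extraction step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: `group_time_words` is a hypothetical helper, and the (word, tag) pairs are written by hand to stand in for the output of `jieba.posseg.cut`, where `m` marks numerals and `t` marks time words. Real jieba output may be segmented and tagged differently.

```python
def group_time_words(tagged_words, time_tags=("m", "t")):
    """Join runs of adjacent numeral/time tokens into candidate time strings.

    `tagged_words` is any iterable of (word, pos_tag) pairs; this helper and
    its name are assumptions made for the example.
    """
    candidates = []
    current = ""
    for word, tag in tagged_words:
        if tag in time_tags:
            # Keep extending the current candidate while tokens look time-like
            current += word
        elif current:
            # A non-time token closes the candidate collected so far
            candidates.append(current)
            current = ""
    if current:
        candidates.append(current)
    return candidates


# Hand-written pairs standing in for jieba.posseg.cut output (assumed tags)
tagged = [("我", "r"), ("要", "v"), ("订", "v"), ("27号", "m"),
          ("到", "v"), ("30号", "m"), ("的", "uj"), ("房间", "n")]
print(group_time_words(tagged))
```

Each candidate string would then go on to validation and regular-expression matching.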

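        It can recognize inputs such as: 今天几号? (What is today's date?), 上周一的时间是? (What was the date last Monday?), 27号到30号? (the 27th to the 30th?), 15号? (the 15th?), 我要订今天到后天的房间。 (I want to book a room from today through the day after tomorrow.), 从2019年4月28号下午3时10分27秒到2019年5月4号。 (From 3:10:27 p.m. on April 28, 2019 to May 4, 2019.), and 前天上午12点 (12 o'clock in the morning, the day before yesterday).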
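To illustrate the regular-expression step on an input like `2019年4月28号下午3时10分27秒`, here is a minimal sketch. It assumes the time string has already been isolated and uses Arabic digits; `DATE_PATTERN` and `parse_cn_datetime` are hypothetical names for this example and are not taken from the author's script.

```python
import re

# Every component is optional, so partial inputs such as "15号" also match.
DATE_PATTERN = re.compile(
    r"(?:(?P<year>\d{4})年)?"
    r"(?:(?P<month>\d{1,2})月)?"
    r"(?:(?P<day>\d{1,2})[号日])?"
    r"(?P<period>上午|下午|晚上)?"
    r"(?:(?P<hour>\d{1,2})[点时])?"
    r"(?:(?P<minute>\d{1,2})分)?"
    r"(?:(?P<second>\d{1,2})秒)?"
)


def parse_cn_datetime(text):
    """Return the numeric fields found in `text`, or None if nothing matches."""
    m = DATE_PATTERN.fullmatch(text)
    if m is None or not any(m.groupdict().values()):
        return None
    fields = {
        name: int(value)
        for name, value in m.groupdict().items()
        if value is not None and name != "period"
    }
    # Afternoon/evening hours below 12 are shifted to the 24-hour clock
    if m.group("period") in ("下午", "晚上") and fields.get("hour", 12) < 12:
        fields["hour"] += 12
    return fields


print(parse_cn_datetime("2019年4月28号下午3时10分27秒"))
print(parse_cn_datetime("15号"))
```

Relative expressions such as 今天 or 上周一 are outside the scope of this pattern and need separate date arithmetic.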

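        Recognition quality depends on jieba's segmentation. For better results, you can build or train your own segmenter to suit your needs, for example with an HMM, bidirectional maximum matching, conditional random fields, or a neural network.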
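As one illustration of these alternatives, bidirectional maximum matching can be sketched in a few lines. This is a toy version under stated assumptions: the dictionary `VOCAB` is made up for the example, the function names are hypothetical, and the tie-breaking rule (fewer words, then fewer single-character words, otherwise the backward result) is a common heuristic rather than anything prescribed by the script below.

```python
def forward_max_match(text, vocab, max_len):
    """Greedy left-to-right segmentation, longest dictionary word first."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, vocab, max_len):
    """Greedy right-to-left segmentation, longest dictionary word first."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    return words[::-1]


def bidirectional_max_match(text, vocab):
    """Run both directions and keep the result with the lower cost."""
    max_len = max(map(len, vocab), default=1)
    fwd = forward_max_match(text, vocab, max_len)
    bwd = backward_max_match(text, vocab, max_len)

    def cost(words):
        # Fewer words first, then fewer single-character words
        return (len(words), sum(len(w) == 1 for w in words))

    return fwd if cost(fwd) < cost(bwd) else bwd


# Toy dictionary, assumed for this example only
VOCAB = {"上周一", "上周", "周一", "时间", "的"}
print(bidirectional_max_match("上周一的时间", VOCAB))
```

A dictionary-based segmenter like this gives direct control over which date words stay together, at the cost of maintaining the dictionary by hand.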

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# author:SingWeek

import re
from datetime import datetime,timedelta
from dateutil.parser import parse
import jieba.posseg as psg

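# Digit lookup: Chinese numerals and ASCII digits map to their integer values
UTIL_CN_NUM={'零':0,'一':1,'二':2,'三':3,'四':4,'五':5,'六':6,'七':7,'八':8,'九':9,'0':0,'1':1,'2':2,'3':3,'4':4,'5':5,'6':6,'7':7,'8':8,'9':9}
# Unit lookup: ten, hundred, thousand, ten thousand
UTIL_CN_UNIT={'十':10,'百':100,'千':1000,'万':10000}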

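def get_lastweek(day=1):
    """Return the date of weekday `day` (1=Monday .. 7=Sunday) of last week as 'YYYY年MM月DD日'."""
    d = datetime.now()
    # Step back to the most recent Sunday before today, i.e. the end of last week
    dayscount = timedelta(days=d.isoweekday())
    dayto = d - dayscount
    # Go back a further (7 - day) days to reach the requested weekday
    sixdays = timedelta(days=7-day)
    dayfrom = dayto - sixdays
    date_from = datetime(dayfrom.year, dayfrom.month, dayfrom.day, 0, 0, 0)
    return str(date_from)[0:4]+'年'+str(date_from)[5:7]+'月'+str(date_from)[8:10]+'日'

def get_nextweek(day=1):
    """Return the date of weekday `day` (1=Monday .. 7=Sunday) of next week as 'YYYY年MM月DD日'."""
    d = datetime.now()
    # Step back to the most recent Sunday before today
    dayscount = timedelta(days=d.isoweekday())
    dayto = d - dayscount
    # Subtracting a negative offset moves forward by 7 + day days
    sixdays = timedelta(days=-7-day)
    dayfrom = dayto - sixdays
    date_from = datetime(dayfrom.year, dayfrom.month, dayfrom.day, 0, 0, 0)
    return str(date_from)[0:4]+'年'+str(date_from)[5:7]+'月'+str(date_from)[8:10]+'日'

def get_week(day=1):
    """Return the date of weekday `day` (1=Monday .. 7=Sunday) of the current week as 'YYYY年MM月DD日'."""
    d = datetime.now()
    # Step back to the most recent Sunday before today
    dayscount = timedelta(days=d.isoweekday())
    dayto = d - dayscount
    # Subtracting a negative offset moves forward by `day` days
    sixdays = timedelta(days=-day)
    dayfrom = dayto - sixdays
    date_from = datetime(dayfrom.year, dayfrom.month, dayfrom.day, 0, 0, 0)
    return str(date_from)[0:4]+'年'+str(date_from)[5:7]+'月'+str(date_from)[8:10]+'日'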

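def check_time_valid(word):
    """Reject bare short numbers and strip stray digits that follow 号/日."""
    # A token made up only of digits and at most 6 characters long is not treated as a date
    m=re.match(r"\d+$",word)
    if m:
        if len(word)<=6:
            return None
    # Replace a trailing '号<digits>' or '日<digits>' with '日'
    word1=re.sub(r'[号日]\d+$','日',word)
    if word1!=word:
        # Re-check the cleaned string
        return check_time_valid(word1)
    else:
        return word1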

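def cn2dig(src):
    """Convert a Chinese or Arabic numeral string to an int; return None if it cannot be parsed."""
    if src=="":
        return None
    # Strings that start with ASCII digits are converted directly
    m=re.match(r"\d+",src)
    if m:
        return int(m.group(0))
    rsl=0
    unit=1
    # Read from the least significant character; a unit character scales the digits to its left
    for item in src[::-1]:
        if item in UTIL_CN_UNIT.keys():
            unit=UTIL_CN_UNIT[item]
        elif item in UTIL_CN_NUM.keys():
            num=UTIL_CN_NUM[item]
            rsl+=num*unit
        else:
            return None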
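    # NOTE: the source listing is cut off after `if rsl`. The lines below are an assumed
    # completion: a leading unit with no digit before it (e.g. '十五', '十') is counted once.
    if unit>1 and rsl<unit:
        rsl+=unit
    return rsl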

 
