Python datetime - strftime vs strptime

Contents

    • Official documentation
    • Example code

References:
pythone3/library/datetime.html#datetime.timezone
Python datetime format strings: strftime()

Official documentation

strftime() and strptime() Behavior

date, datetime, and time objects all support a strftime(format) method, to create a string representing the time under the control of an explicit format string.

Conversely, the datetime.strptime() class method creates a datetime object from a string representing a date and time and a corresponding format string.

The table below provides a high-level comparison of strftime() versus strptime():

               | strftime                                               | strptime
Usage          | Convert object to a string according to a given format | Parse a string into a datetime object given a corresponding format
Type of method | Instance method                                        | Class method
Method of      | date; datetime; time                                   | datetime
Signature      | strftime(format)                                       | strptime(date_string, format)
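As a rough illustration of the comparison above, the sketch below formats a datetime with strftime() and parses the result back with strptime(). The sample value and the names FMT, moment, text, parsed and day_text are assumptions made up for this example.

```python
from datetime import date, datetime

# Hypothetical sample value and format, chosen only for illustration.
FMT = '%Y-%m-%d %H:%M:%S'
moment = datetime(2021, 1, 4, 10, 13, 40)

# strftime() is an instance method: object -> string.
text = moment.strftime(FMT)

# strptime() is a class method of datetime: string -> datetime object.
parsed = datetime.strptime(text, FMT)

# date objects support strftime() as well.
day_text = date(2021, 1, 4).strftime('%Y/%m/%d')

print(text)              # 2021-01-04 10:13:40
print(parsed == moment)  # True
print(day_text)          # 2021/01/04
```

The round trip is lossless here only because the format string covers every field that the sample value sets.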

strftime() and strptime() Format Codes

The following is a list of all the format codes that the 1989 C standard requires, and these work on all platforms with a standard C implementation.

Directive | Meaning | Example | Notes
%a | Weekday as locale’s abbreviated name. | Sun, Mon, …, Sat (en_US); So, Mo, …, Sa (de_DE) | (1)
%A | Weekday as locale’s full name. | Sunday, Monday, …, Saturday (en_US); Sonntag, Montag, …, Samstag (de_DE) | (1)
%w | Weekday as a decimal number, where 0 is Sunday and 6 is Saturday. | 0, 1, …, 6 |
%d | Day of the month as a zero-padded decimal number. | 01, 02, …, 31 | (9)
%b | Month as locale’s abbreviated name. | Jan, Feb, …, Dec (en_US); Jan, Feb, …, Dez (de_DE) | (1)
%B | Month as locale’s full name. | January, February, …, December (en_US); Januar, Februar, …, Dezember (de_DE) | (1)
%m | Month as a zero-padded decimal number. | 01, 02, …, 12 | (9)
%y | Year without century as a zero-padded decimal number. | 00, 01, …, 99 | (9)
%Y | Year with century as a decimal number. | 0001, 0002, …, 2013, 2014, …, 9998, 9999 | (2)
%H | Hour (24-hour clock) as a zero-padded decimal number. | 00, 01, …, 23 | (9)
%I | Hour (12-hour clock) as a zero-padded decimal number. | 01, 02, …, 12 | (9)
%p | Locale’s equivalent of either AM or PM. | AM, PM (en_US); am, pm (de_DE) | (1), (3)
%M | Minute as a zero-padded decimal number. | 00, 01, …, 59 | (9)
%S | Second as a zero-padded decimal number. | 00, 01, …, 59 | (4), (9)
%f | Microsecond as a decimal number, zero-padded on the left. | 000000, 000001, …, 999999 | (5)
%z | UTC offset in the form ±HHMM[SS[.ffffff]] (empty string if the object is naive). | (empty), +0000, -0400, +1030, +063415, -030712.345216 | (6)
%Z | Time zone name (empty string if the object is naive). | (empty), UTC, GMT | (6)
%j | Day of the year as a zero-padded decimal number. | 001, 002, …, 366 | (9)
%U | Week number of the year (Sunday as the first day of the week) as a zero padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0. | 00, 01, …, 53 | (7), (9)
%W | Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0. | 00, 01, …, 53 | (7), (9)
%c | Locale’s appropriate date and time representation. | Tue Aug 16 21:30:00 1988 (en_US); Di 16 Aug 21:30:00 1988 (de_DE) | (1)
%x | Locale’s appropriate date representation. | 08/16/88 (None); 08/16/1988 (en_US); 16.08.1988 (de_DE) | (1)
%X | Locale’s appropriate time representation. | 21:30:00 (en_US); 21:30:00 (de_DE) | (1)
%% | A literal '%' character. | % |
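To make the table more concrete, here is a small sketch that applies the numeric directives to one fixed timestamp. The timestamp copies the one used in the table's examples (16 August 1988, 21:30:00). The names sample, numeric and naive_tz are assumptions for this illustration, and the locale-dependent directives are left out on purpose because their output varies.

```python
from datetime import datetime

# Same moment as in the table's examples: Tuesday, 16 Aug 1988, 21:30:00.
sample = datetime(1988, 8, 16, 21, 30, 0)

# Numeric directives only, so the result does not depend on the current locale.
codes = ('%Y', '%y', '%m', '%d', '%H', '%I', '%M', '%S',
         '%f', '%j', '%w', '%U', '%W', '%%')
numeric = {code: sample.strftime(code) for code in codes}

for code, value in numeric.items():
    print(code, value)

# A naive datetime yields empty strings for %z and %Z.
naive_tz = sample.strftime('%z%Z')
print(repr(naive_tz))  # ''
```

For this date %j gives 229 (1988 is a leap year), %w gives 2 (Tuesday), and both %U and %W give 33.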

Several additional directives not required by the C89 standard are included for convenience. These parameters all correspond to ISO 8601 date values.

Directive | Meaning | Example | Notes
%G | ISO 8601 year with century representing the year that contains the greater part of the ISO week (%V). | 0001, 0002, …, 2013, 2014, …, 9998, 9999 | (8)
%u | ISO 8601 weekday as a decimal number where 1 is Monday. | 1, 2, …, 7 |
%V | ISO 8601 week as a decimal number with Monday as the first day of the week. Week 01 is the week containing Jan 4. | 01, 02, …, 53 | (8), (9)

These may not be available on all platforms when used with the strftime() method. The ISO 8601 year and ISO 8601 week directives are not interchangeable with the year and week number directives above. Calling strptime() with incomplete or ambiguous ISO 8601 directives will raise a ValueError.
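The difference between the ISO directives and the calendar ones can be sketched as follows. The example uses date.isocalendar() instead of strftime('%G…'), since the paragraph above notes that the ISO directives may be missing from the platform's strftime(). The helper parses() and the sample strings are hypothetical names introduced only for this illustration.

```python
from datetime import date, datetime

# 1 January 2021 falls in the last ISO week of 2020, so the ISO year
# differs from the calendar year for this date.
iso_year, iso_week, iso_weekday = date(2021, 1, 1).isocalendar()
print(iso_year, iso_week, iso_weekday)  # 2020 53 5

# A complete ISO triple (%G, %V and a weekday) parses without ambiguity.
monday = datetime.strptime('2021-W01-1', '%G-W%V-%u')
print(monday.date())  # 2021-01-04


def parses(date_string, fmt):
    """Hypothetical helper: True if strptime() accepts the input."""
    try:
        datetime.strptime(date_string, fmt)
        return True
    except ValueError:
        return False


print(parses('2021 01', '%G %V'))       # False: no weekday given
print(parses('2021 01 1', '%Y %V %u'))  # False: %V needs %G, not %Y
```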

The full set of format codes supported varies across platforms, because Python calls the platform C library’s strftime() function, and platform variations are common. To see the full set of format codes supported on your platform, consult the strftime(3) documentation.

New in version 3.6: %G, %u and %V were added.

Technical Detail

Broadly speaking, d.strftime(fmt) acts like the time module’s time.strftime(fmt, d.timetuple()) although not all objects support a timetuple() method.

For the datetime.strptime() class method, the default value is 1900-01-01T00:00:00.000: any components not specified in the format string will be pulled from the default value. [4]
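A minimal sketch of that default in action, using made-up input strings (time_only and year_only are names chosen for this example):

```python
from datetime import datetime

# Only the time of day is given, so year, month and day come from the default.
time_only = datetime.strptime('21:30', '%H:%M')
print(time_only)  # 1900-01-01 21:30:00

# Only the year is given, so every other field comes from the default.
year_only = datetime.strptime('2021', '%Y')
print(year_only)  # 2021-01-01 00:00:00
```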

Using datetime.strptime(date_string, format) is equivalent to:

datetime(*(time.strptime(date_string, format)[0:6]))

except when the format includes sub-second components or timezone offset information, which are supported in datetime.strptime but are discarded by time.strptime.
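The equivalence, and the exception to it, can be checked with a short sketch. The input strings and variable names below are assumptions chosen for illustration.

```python
import time
from datetime import datetime

date_string = '2021-01-04 10:13:40'
fmt = '%Y-%m-%d %H:%M:%S'

# Without sub-second or offset fields, both routes give the same result.
via_datetime = datetime.strptime(date_string, fmt)
via_time = datetime(*(time.strptime(date_string, fmt)[0:6]))
print(via_datetime == via_time)  # True

# With %f and %z, datetime.strptime() keeps the extra information...
rich_string = '2021-01-04 10:13:40.250000 +0100'
rich_fmt = '%Y-%m-%d %H:%M:%S.%f %z'
rich = datetime.strptime(rich_string, rich_fmt)
print(rich.microsecond, rich.utcoffset())  # 250000 1:00:00

# ...while the first six struct_time fields carry neither value.
lossy = datetime(*(time.strptime(rich_string, rich_fmt)[0:6]))
print(lossy.microsecond, lossy.tzinfo)  # 0 None
```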

For time objects, the format codes for year, month, and day should not be used, as time objects have no such values. If they’re used anyway, 1900 is substituted for the year, and 1 for the month and day.

For date objects, the format codes for hours, minutes, seconds, and microseconds should not be used, as date objects have no such values. If they’re used anyway, 0 is substituted for them.
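The two substitution rules above look like this in practice (a small sketch with made-up values; time_text and date_text are names chosen for the example):

```python
from datetime import date, time

# A time object has no date fields; the date codes fall back to 1900-01-01.
time_text = time(12, 30).strftime('%Y-%m-%d %H:%M')
print(time_text)  # 1900-01-01 12:30

# A date object has no time fields; the time codes fall back to zero.
date_text = date(2021, 1, 4).strftime('%Y-%m-%d %H:%M:%S')
print(date_text)  # 2021-01-04 00:00:00
```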

Because Python calls the platform C library’s strftime() function, handling of format strings containing Unicode code points that can’t be represented in the charset of the current locale is also platform-dependent. On some platforms such code points are preserved intact in the output, while on others strftime may raise UnicodeError or return an empty string instead.

Notes:

  1. Because the format depends on the current locale, care should be taken when making assumptions about the output value. Field orderings will vary (for example, “month/day/year” versus “day/month/year”), and the output may contain Unicode characters encoded using the locale’s default encoding (for example, if the current locale is ja_JP, the default encoding could be any one of eucJP, SJIS, or utf-8; use locale.getlocale() to determine the current locale’s encoding).

  2. The strptime() method can parse years in the full [1, 9999] range, but years < 1000 must be zero-filled to 4-digit width.

    Changed in version 3.2: In previous versions, strftime() method was restricted to years >= 1900.

    Changed in version 3.3: In version 3.2, strftime() method was restricted to years >= 1000.

  3. When used with the strptime() method, the %p directive only affects the output hour field if the %I directive is used to parse the hour.

  4. Unlike the time module, the datetime module does not support leap seconds.

  5. When used with the strptime() method, the %f directive accepts from one to six digits and zero pads on the right. %f is an extension to the set of format characters in the C standard (but implemented separately in datetime objects, and therefore always available).

  6. For a naive object, the %z and %Z format codes are replaced by empty strings.

    For an aware object:

    %z

    utcoffset() is transformed into a string of the form ±HHMM[SS[.ffffff]], where HH is a 2-digit string giving the number of UTC offset hours, MM is a 2-digit string giving the number of UTC offset minutes, SS is a 2-digit string giving the number of UTC offset seconds and ffffff is a 6-digit string giving the number of UTC offset microseconds. The ffffff part is omitted when the offset is a whole number of seconds and both the ffffff and the SS part is omitted when the offset is a whole number of minutes. For example, if utcoffset() returns timedelta(hours=-3, minutes=-30), %z is replaced with the string '-0330'.

    Changed in version 3.7: The UTC offset is not restricted to a whole number of minutes.

    Changed in version 3.7: When the %z directive is provided to the strptime() method, the UTC offsets can have a colon as a separator between hours, minutes and seconds. For example, '+01:00:00' will be parsed as an offset of one hour. In addition, providing 'Z' is identical to '+00:00'.

    %Z

    In strftime(), %Z is replaced by an empty string if tzname() returns None; otherwise %Z is replaced by the returned value, which must be a string.

    strptime() only accepts certain values for %Z:

    1. any value in time.tzname for your machine’s locale

    2. the hard-coded values UTC and GMT

    So someone living in Japan may have JST, UTC, and GMT as valid values, but probably not EST. It will raise ValueError for invalid values.

    Changed in version 3.2: When the %z directive is provided to the strptime() method, an aware datetime object will be produced. The tzinfo of the result will be set to a timezone instance.

  7. When used with the strptime() method, %U and %W are only used in calculations when the day of the week and the calendar year (%Y) are specified.

  8. Similar to %U and %W, %V is only used in calculations when the day of the week and the ISO year (%G) are specified in a strptime() format string. Also note that %G and %Y are not interchangeable.

  9. When used with the strptime() method, the leading zero is optional for formats %d, %m, %H, %I, %M, %S, %j, %U, %W, and %V. Format %y does require a leading zero.
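Several of the notes above are easy to verify interactively. The sketch below covers notes 3, 5, 6 and 9 with made-up inputs. It assumes Python 3.7 or later (for the colon and 'Z' forms of %z) and the default C/English locale, where the %p strings are AM and PM. All variable names are chosen for this example.

```python
from datetime import datetime, timedelta, timezone

# Note 3: %p changes the hour only when the hour is parsed with %I.
hour_with_I = datetime.strptime('09 PM', '%I %p').hour
hour_with_H = datetime.strptime('09 PM', '%H %p').hour
print(hour_with_I, hour_with_H)  # 21 9

# Note 5: %f accepts one to six digits and pads with zeros on the right.
micro = datetime.strptime('10:13:40.5', '%H:%M:%S.%f').microsecond
print(micro)  # 500000

# Note 6: %z formats an aware object's offset and parses offsets back,
# including the colon-separated form and a bare 'Z'.
offset = timezone(timedelta(hours=-3, minutes=-30))
aware = datetime(2021, 1, 4, 10, 13, tzinfo=offset)
formatted_offset = aware.strftime('%z')
colon_offset = datetime.strptime('+01:00:00', '%z').utcoffset()
zulu_offset = datetime.strptime('Z', '%z').utcoffset()
print(formatted_offset)  # -0330
print(colon_offset)      # 1:00:00
print(zulu_offset)       # 0:00:00

# Note 9: leading zeros are optional when parsing most numeric fields.
loose = datetime.strptime('2021-1-4 9:5:7', '%Y-%m-%d %H:%M:%S')
print(loose)  # 2021-01-04 09:05:07
```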

Footnotes

[1] If, that is, we ignore the effects of Relativity

[2] This matches the definition of the “proleptic Gregorian” calendar in Dershowitz and Reingold’s book Calendrical Calculations, where it’s the base calendar for all computations. See the book for algorithms for converting between proleptic Gregorian ordinals and many other calendar systems.

[3] See R. H. van Gent’s guide to the mathematics of the ISO 8601 calendar for a good explanation.

[4] Passing datetime.strptime('Feb 29', '%b %d') will fail since 1900 is not a leap year.

Example code

import logging
import re
from datetime import datetime


def extract_date(date_str):
    """
    Match and extract a date/time from a string.
    :param date_str: date/time string taken from a web page
    :return: date string formatted as %Y-%m-%d %H:%M:%S (the current date and time if extraction fails)
    """
    # [(date_str_regex, datetime_format), ...], tried in order
    date_patterns = [
        # 2020-12-22 12:12:12
        (r'\d{4}-\d{1,2}-\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}', '%Y-%m-%d %H:%M:%S'),
        # 2020-12-22 12:12
        (r'\d{4}-\d{1,2}-\d{1,2}\s+\d{1,2}:\d{1,2}', '%Y-%m-%d %H:%M'),
        # 2020-12-22
        (r'\d{4}-\d{1,2}-\d{1,2}', '%Y-%m-%d'),
        # 2020/12/22
        (r'\d{4}/\d{1,2}/\d{1,2}', '%Y/%m/%d'),
        # 12-22 12:21:22
        (r'\d{1,2}-\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}', '%m-%d %H:%M:%S'),
        # 12-22 12:21
        (r'\d{1,2}-\d{1,2}\s+\d{1,2}:\d{1,2}', '%m-%d %H:%M'),
        # 12-2116:07
        (r'\d{1,2}-\d{1,2}\d{1,2}:\d{1,2}', '%m-%d%H:%M')
    ]
    try:
        for date_regex, date_format in date_patterns:
            search_date = re.search(date_regex, date_str)
            if search_date:
                # The patterns contain no capture groups, so group(0) is the full match
                # (capture groups, if present, would be numbered from 1)
                search_date_str = search_date.group(0)
                if '%Y' not in date_format:
                    # No year in the input: prepend the current year before parsing,
                    # rather than string-replacing strptime()'s default year 1900
                    # afterwards (which also made Feb 29 fail, as 1900 is not a leap year)
                    search_date_str = '{}-{}'.format(datetime.now().year, search_date_str)
                    date_format = '%Y-' + date_format
                search_datetime = datetime.strptime(search_date_str, date_format)
                return search_datetime.strftime('%Y-%m-%d %H:%M:%S')
    except ValueError:
        logging.info("Failed to convert date format - cur_date_str=%s", date_str)

    return datetime.now().strftime('%Y-%m-%d %H:%M:%S')


def test_extract_date():
    """
    Test date/time extraction.
    Prints one formatted date/time string per input.
    """
    date_str_list = [
        '2021-01-04 10:13',
        '2021-01-04 10:13:40',
        '2021-01-01',
        '2021/01/02',
        '01-02 12:35:36',
        '01-02 12:35',
        '01-0212:35',
    ]
    for date_str in date_str_list:
        new_date_str = extract_date(date_str)
        print(new_date_str)

Output (as run in 2021; inputs without a year take the current year):

2021-01-04 10:13:00
2021-01-04 10:13:40
2021-01-01 00:00:00
2021-01-02 00:00:00
2021-01-02 12:35:36
2021-01-02 12:35:00
2021-01-02 12:35:00
