Python Basics: String Formatting (str.format)

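The previous article covered printf-style string formatting in Python. Since str.format is the formatting method that the official documentation now recommends, this article goes through it in detail, with working code along the way to make each point concrete.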

String formatting with format()

Contents

  • String formatting with format()
    • str.format(*args, **kwargs)
    • Format String Syntax
      • Replacement Fields Syntax
      • Characteristics of replacement fields
    • standard format specifier
      • Alignment (align)
      • Sign (sign)
      • The # option
      • Thousands separator (thousand separator)
      • Minimum field width (field width)
      • Precision (precision)
      • Type (type)

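What is str.format?

str.format() is a method of the string type that performs string formatting.

Since format is a function, we can look at it the way we look at any function: how it is defined, how it is called, what goes in, and what comes out.

The four points below walk through str.format() along those lines.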

str.format(*args, **kwargs)

Perform a string formatting operation. The string on which this method is called can contain literal text or replacement fields delimited by braces {}. Each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument. Returns a copy of the string where each replacement field is replaced with the string value of the corresponding argument.

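str.format() performs a string formatting operation.

  1. str is either a plain string literal or a string literal that contains one or more replacement fields;
  2. Replacement field: each pair of braces marks one replacement field;
  3. Each replacement field refers to either the numeric index of a positional argument or the name of a keyword argument;
  4. Return value: a new string in which each replacement field has been replaced by the formatted argument. If there are no replacement fields, the result has the same content as the original string; whether it is also the same object (compare the id() values printed by the code below) is an implementation detail.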
str1 = "I'm string literal"
str1_new = str1.format()
print('str1 id:{}'.format(id(str1)))
print('str1_new id:{}'.format(id(str1_new)))

# somebody want to eat something
str2 = "{} want to eat {}"
str2_new = str2.format('渔道', '苹果')
print('str2 id:{}, content:{}'.format(id(str2), str2))
print('str2_new id:{}, content:{}'.format(id(str2_new), str2_new))

str3 = "{1} want to eat {0}"
str3_new = str3.format('渔道', '苹果')
print('str3 id:{}, content:{}'.format(id(str3), str3))
print('str3_new id:{}, content:{}'.format(id(str3_new), str3_new))

dict1 = {'name':'渔道', 'fruit':'苹果'}
str4 = "{name} want to eat {fruit}"
str4_new = str4.format(name=dict1['name'], fruit=dict1['fruit'])
print('str4 id:{}, content:{}'.format(id(str4), str4))
print('str4_new id:{}, content:{}'.format(id(str4_new), str4_new))
# print("{name} want to eat {fruit}".format(fruit=dict1['fruit'], name=dict1['name']))
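
When the values already live in a dictionary, as with dict1 above, they can be unpacked into keyword arguments instead of being passed one by one. The following is a minimal sketch that reuses the data from the example above; the variable names template, unpacked and mapped are only illustrative. str.format_map is the standard-library variant that accepts the mapping directly.

```python
dict1 = {'name': '渔道', 'fruit': '苹果'}
template = "{name} want to eat {fruit}"

# ** unpacks the dictionary into keyword arguments
unpacked = template.format(**dict1)

# format_map() reads the mapping directly instead of unpacking it
mapped = template.format_map(dict1)

print(unpacked)
print(mapped)
```

Both calls should produce the same text; format_map is mainly useful when the mapping is a dict subclass with custom lookup behaviour.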

Format String Syntax

Format strings contain “replacement fields” surrounded by curly braces {}. Anything that is not contained in braces is considered literal text, which is copied unchanged to the output. If you need to include a brace character in the literal text, it can be escaped by doubling: {{ and }}.

format string = string literals + replacement fields

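A format string is made up of string literals and replacement fields, in any combination.

A replacement field is whatever is enclosed in a pair of braces;

every character outside a replacement field is treated as literal text;

to get a literal brace character in the output, escape it by doubling it: {{ or }}.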

str4 = '{{}}, {}'
print(str4.format('渔道'))

# nested 
name_width = 10
price_width = 10
nested_fmt = '{{:<{}}}{{:>{}}}'.format(name_width, price_width)
print(nested_fmt)
print(nested_fmt.format("苹果",5.98))
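
The nested example above works in two steps: it first builds a format string and then applies it. A format_spec may also contain nested replacement fields, so the widths can be supplied in the same call. This is a small sketch under the same widths as above; the keyword names nw and pw are made up for the illustration.

```python
name_width = 10
price_width = 10

# Replacement fields nested inside the format_spec are substituted first,
# so '{:<{nw}}' becomes '{:<10}' before the value is formatted
row = '{:<{nw}}{:>{pw}}'.format('apple', 5.98, nw=name_width, pw=price_width)

print(repr(row))
print(len(row))
```

The resulting row is 20 characters wide: the name left-aligned in the first 10 columns and the price right-aligned in the last 10.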

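We said earlier that a replacement field is the "content" enclosed in a pair of braces, without explaining what that content actually is. Let us now look at its precise definition.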

Replacement Fields Syntax

replacement_field ::= "{" [field_name] ["!" conversion] [":" format_spec] "}"
field_name ::= arg_name ("." attribute_name | "[" element_index "]")*
arg_name ::= [identifier | digit+]
attribute_name ::= identifier
element_index ::= digit+ | index_string
index_string ::= <any source character except "]"> +
conversion ::= "r" | "s" | "a"
format_spec ::= <described in the next section>

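As the grammar above shows, the content of a replacement field has three parts: field_name, conversion, and format_spec. All three are optional: you can use any one of them, all three, or none at all.

field_name ties the field to a positional or keyword argument; in the output, the field is replaced by the value of that argument.

conversion selects one of three functions (str(), repr(), or ascii()) to turn the value into a string before formatting. It must be preceded by an exclamation point.

format_spec is the format specifier. It must be preceded by a colon.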
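
To see the three parts side by side, here is a short sketch that uses all of them in one field and then only one of them. The variable names are illustrative.

```python
value = 'ab'

# field_name 0, conversion !r (applies repr()), format_spec >10 (right-align, width 10)
all_three = '{0!r:>10}'.format(value)

# Only a format_spec: the field name is implied and no conversion is applied
spec_only = '{:>10}'.format(value)

print(all_three)
print(spec_only)
```

In the first line the quotes added by repr() count toward the width, because the conversion runs before the format_spec is applied.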

Characteristics of replacement fields

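  1. field_name begins with an arg_name, which is either a number or an identifier. If arg_name is a number (digit+), it refers to a positional argument; if it is an identifier, it refers to a keyword argument. If the numeric arg_names in a format string are 0, 1, 2, ... in order, they can all be omitted, and the positional arguments of format are inserted in sequence.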

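    print("arg_name is number:{}".format(1))
    # With a single field, an empty arg_name refers to positional argument 0, so the 0 can be left out
    print('arg_name is number:{0}'.format(1))
    name = 'keyword'
    # An identifier arg_name is looked up among the keyword arguments
    print('arg_name is keyword:{name}'.format(name=name))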
    
    print('{0},{1},{2},{3},{4}'.format(1,2,3,4,5))
    print('{},{},{},{},{}'.format(1,2,3,4,5))
    
  2. arg_name is made up of an identifier or digits and is not enclosed in quotes, so it cannot stand for an arbitrary dictionary key such as '12' or '=='.

    dict1 = {'name':'渔道', 'fruit':'苹果'}
    print("{name},{fruit}".format(name=dict1['name'], fruit=dict1['fruit']))
    
    dict2 = {"12":'a', "==":'b'}
    print("{},{}".format(dict2['12'], dict2['==']))
    # print("{'12'},{'=='}".format(dict1)) # an arg_name cannot be an arbitrary dictionary key
    # print("{12},{==}".format(dict1)) # an arg_name cannot be an arbitrary dictionary key
    
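  3. arg_name can be followed by an index expression or an attribute expression.

    An attribute expression is '.' followed by an attribute name (attribute_name).

    An index expression has the form [element_index].

    # arg_name can be followed by any number of index or attribute expressions.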
    fruit1 = ['apple','banana','grape','pear']
    print("{0[0]}, {0[1]}, {0[2]}, {0[3]}".format(fruit1))
    print("{0[1]}, {0[3]}".format(fruit1))
    print("{0[1]}, {0[3]}, {params[0]}, {params[2]}".format(fruit1, params=fruit1))
    
    class Fruit:
        def __init__(self,name,weight,price):
            self.name = name
            self.weight = weight
            self.price = price
     
    apple = Fruit('apple', '0.23', '5.98')
    print("{0.name}'s weight is {0.weight}, {0.name}'s price is {0.price}".format(apple))
    print("{fruit.name}'s weight is {fruit.weight}, {fruit.name}'s price is {fruit.price}".format(fruit=apple))
    
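  4. conversion forces a type conversion before formatting, turning the object into a printable string. Three conversion flags are supported: !s, !r, and !a, which apply str(), repr(), and ascii() respectively.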

    str2 = "渔道"
    tuple1 = (1,2)
    dict1 = {"name":"渔道", "fruit":"苹果"}
    
    # str() returns the printable string form of an object
    print('{0!s}'.format(str2))
    print('{0!s}'.format(tuple1))
    print('{0!s}'.format(tuple))
    print('{0!s}'.format(dict))
    print('{0!s}'.format(dict1))
    
    print("")
    # repr(): a string is returned wrapped in quotes, and containers show the repr() of their items; a class is shown in angle brackets with information about the class
    print('{0!r}'.format(str2))
    print('{0!r}'.format(tuple1))
    print('{0!r}'.format(tuple))
    print('{0!r}'.format(dict))
    print('{0!r}'.format(dict1))
    
    print("")
    # ascii() works like repr(), but escapes non-ASCII characters as \x, \u or \U sequences
    print('{0!a}'.format(str2))
    print('{0!a}'.format(tuple1))
    print('{0!a}'.format(tuple))
    print('{0!a}'.format(dict))
    print('{0!a}'.format(dict1))
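
Two of the points in the list above are easy to check in code. First, although an arg_name cannot be a quoted dictionary key, the grammar's index_string allows an unquoted key inside square brackets. Second, the automatic numbering described in point 1 cannot be combined with explicit numbers in the same format string. The sketch below illustrates both; the variable names are only for demonstration.

```python
dict1 = {'name': '渔道', 'fruit': '苹果'}

# Inside [], a non-numeric index is used as a string key, written without quotes
by_key = '{0[name]},{0[fruit]}'.format(dict1)
print(by_key)

# Automatic numbering ({}) and manual numbering ({0}) cannot be mixed
try:
    '{} {0}'.format('a', 'b')
    mixed_ok = True
except ValueError as exc:
    mixed_ok = False
    print('ValueError:', exc)
```

Note that a purely numeric index such as {0[12]} is treated as an integer, so it would not find the string key '12' in a dictionary.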
    

format_spec is largely the same as the format specifier used in printf-style formatting, so we will go through it only briefly.

standard format specifier

format_spec     ::=  [[fill]align][sign][#][0][width][grouping_option][.precision][type]
fill            ::=  <any character>
align           ::=  "<" | ">" | "=" | "^"
sign            ::=  "+" | "-" | " "
width           ::=  digit+
grouping_option ::=  "_" | ","
precision       ::=  digit+
type            ::=  "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"

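A format specification is used inside a replacement field of a format string to define how an individual value is presented.

By general convention, an empty format specification gives the same result as calling str() on the value. So for ordinary printing, a bare "{}" is enough.

The rest of this section explains each part of format_spec.

If a valid align value is given, it can be preceded by a fill character. The fill can be any character and defaults to a space if omitted. With str.format(), the characters '{' and '}' cannot be used as a literal fill character.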

Alignment (align)

The four alignment options:

Option Meaning
‘<’ Forces the field to be left-aligned within the available space (this is the default for most objects).
‘>’ Forces the field to be right-aligned within the available space (this is the default for numbers).
‘=’ Forces the padding to be placed after the sign (if any) but before the digits. This is used for printing fields in the form ‘+000000120’. This alignment option is only valid for numeric types. It becomes the default when ‘0’ immediately precedes the field width.
‘^’ Forces the field to be centered within the available space.

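Note that unless a minimum field width is defined, the field width is always the same as the size of the data that fills it, so the alignment option has no effect in that case.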

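# Alignment: use <, > and ^ for left, right, and center alignment
# Numbers are right-aligned by default
# Strings are left-aligned by default

print("{:10}".format(123))  # right-aligned by default
print("{:<10}".format(123)) # left-aligned
print("{:^10}".format(123)) # centered

print("{:30}".format('helloworld')) # left-aligned by default
print("{:>30}".format('helloworld')) # right-aligned
print("{:^30}".format('helloworld')) # centered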
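
The examples above pad with spaces and do not exercise the '=' option. As a small supplementary sketch, the following shows a custom fill character and the '=' alignment, including the '+000000120' form mentioned in the table. The variable names are illustrative.

```python
# The fill character goes immediately before the alignment option
centered = '{:*^11}'.format('hello')

# '=' places the padding between the sign and the digits
signed = '{:0=+10}'.format(120)

# A 0 directly before the width implies the same '=' alignment with '0' fill
zero_padded = '{:+010}'.format(120)

print(centered)     # ***hello***
print(signed)       # +000000120
print(zero_padded)  # +000000120
```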

Sign (sign)

Sign options (valid only for numeric types):

Option Meaning
‘+’ indicates that a sign should be used for both positive as well as negative numbers.
‘-’ indicates that a sign should be used only for negative numbers (this is the default behavior).
space indicates that a leading space should be used on positive numbers, and a minus sign on negative numbers.

# Sign options
print("{},{}".format(123,-123))
print("{:+},{:+}".format(123,-123)) # +: show a sign on both positive and negative numbers
print("{:-},{:-}".format(123,-123)) # -: show a sign on negative numbers only
print("{: },{: }".format(123,-123)) # space: positive numbers get a leading space, negative numbers a minus sign

The # option

The ‘#’ option causes the “alternate form” to be used for the conversion. The alternate form is defined differently for different types. This option is only valid for integer, float, complex and Decimal types. For integers, when binary, octal, or hexadecimal output is used, this option adds the respective prefix ‘0b’, ‘0o’, or ‘0x’ to the output value. For floats, complex and Decimal the alternate form causes the result of the conversion to always contain a decimal-point character, even if no digits follow it. Normally, a decimal-point character appears in the result of these conversions only if a digit follows it. In addition, for ‘g’ and ‘G’ conversions, trailing zeros are not removed from the result.

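The '#' option selects the "alternate form" of the conversion. It is valid only for integer, float, complex and Decimal types.

For integers, when the output is binary, octal, or hexadecimal, '#' adds the prefix '0b', '0o', or '0x' in front of the digits.

For float, complex and Decimal values, '#' makes the output always contain a decimal point, even if no digits follow it.

For 'g' and 'G' conversions, trailing zeros are not removed.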

# The # option
# Commonly combined with b, o, x and X to mark the base and make the output easier to read
conversion_flag1 = '#b: {:#b}; b: {:b}'
conversion_flag2 = '#o: {:#o}; o: {:o}'
conversion_flag3 = '#x: {:#x}; x: {:x}'
conversion_flag4 = '#f: {:#f}; f: {:f}'
conversion_flag5 = '#e: {:#e}; e: {:e}'
conversion_flag6 = '#g: {:#g}; g: {:g}'
print(conversion_flag1.format(16,16))
print(conversion_flag2.format(16,16))
print(conversion_flag3.format(16,16))
print(conversion_flag4.format(16,16))
print(conversion_flag5.format(16,16))
print(conversion_flag6.format(16,16))
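
With the default precision of 6, 'f' and 'e' always print a decimal point, so '#' makes no visible difference for them in the output above. The effect shows up once the precision is 0. The sketch below is a supplementary illustration with made-up variable names.

```python
# With precision 0, 'f' normally drops the decimal point; '#' keeps it
plain_f = '{:.0f}'.format(16)
alt_f = '{:#.0f}'.format(16)

# 'g' normally strips trailing zeros; '#' keeps them
plain_g = '{:g}'.format(16)
alt_g = '{:#g}'.format(16)

# With 'X', the prefix is upper-case as well
alt_X = '{:#X}'.format(255)

print(plain_f, alt_f)  # 16 16.
print(plain_g, alt_g)  # 16 16.0000
print(alt_X)           # 0XFF
```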

Thousands separator (thousand separator)

The ‘,’ option signals the use of a comma for a thousands separator.

In other words, ',' groups the digits in threes.

# Thousands separator
print("{:,}".format(12345678))

The ‘_’ option signals the use of an underscore for a thousands separator for floating point presentation types and for integer presentation type ‘d’. For integer presentation types ‘b’, ‘o’, ‘x’, and ‘X’, underscores will be inserted every 4 digits. For other presentation types, specifying this option is an error.

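For the integer type 'd' and the floating-point types, '_' is the thousands separator.

For 'b', 'o', 'x', and 'X', '_' is inserted every 4 digits.

Specifying '_' with any other presentation type raises an error.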

# Underscore
print("{:_}".format(12345678)) # for integers and floats, '_' is the thousands separator
print("{:_f}".format(1.2345678))
print("{:_f}".format(1234567.8))
print("{:_b}".format(64))   # for 'b', 'o', 'x', 'X', '_' separates groups of 4 digits
print("{:_o}".format(6400))
print("{:_x}".format(640000))
print("{:_X}".format(640000))
# print("{:_n}".format(123456.78))
# print("{:_c}".format(123456.78))
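
The two commented-out lines above are the error cases. To observe the errors without stopping the script, one option is to wrap the call in a small helper. In this sketch, try_format is a hypothetical helper written for this article, and the built-in format(value, spec) is used because it accepts the same format_spec as a replacement field.

```python
def try_format(spec, value):
    """Return the formatted text, or the error message if the spec is rejected."""
    try:
        return format(value, spec)
    except ValueError as exc:
        return 'ValueError: {}'.format(exc)

print(try_format('_d', 12345678))  # 12_345_678
print(try_format('_x', 640000))    # 9_c400
print(try_format('_n', 12345678))  # '_' is not allowed with 'n'
print(try_format(',x', 640000))    # ',' is not allowed with 'x'
```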

Minimum field width (field width)

width is a decimal integer defining the minimum total field width, including any prefixes, separators, and other formatting characters. If not specified, then the field width will be determined by the content.

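width is a decimal integer that defines the minimum field width. It covers more than the digits themselves: any prefix, separator, or other formatting character counts toward it. If width is not specified, the field width is determined by the length of the content.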
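
A short sketch can make the counting rule concrete: the prefix, the sign, and the separators all take up part of the width, and the width is only a minimum. The variable names are illustrative.

```python
# The '0x' prefix counts toward the width of 10
with_prefix = '{:#10x}'.format(255)

# The sign and the thousands separators count as well
with_sign = '{:+12,}'.format(1234567)

# Content longer than the width is not truncated
too_long = '{:3}'.format('helloworld')

print(repr(with_prefix), len(with_prefix))
print(repr(with_sign), len(with_sign))
print(repr(too_long), len(too_long))
```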

Precision (precision)

The precision is a decimal number indicating how many digits should be displayed after the decimal point for a floating point value formatted with ‘f’ and ‘F’, or before and after the decimal point for a floating point value formatted with ‘g’ or ‘G’. For non-number types the field indicates the maximum field size - in other words, how many characters will be used from the field content. The precision is not allowed for integer values.

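For the 'f' and 'F' types, precision is the number of digits kept after the decimal point.

For the 'g' and 'G' types, precision is the total number of significant digits.

For non-numeric types such as strings, precision is the maximum number of characters taken from the value.

precision cannot be used with integer values, whatever the presentation type (binary, decimal, hexadecimal, and so on).

# Precision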
print("{:.2f}".format(1.234567))
print("{:.4g}".format(1.234456))
print("{:.2s}".format("helloworld"))
# print("{:.2d}".format(16))
# print("{:.2b}".format(16))
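
The commented-out lines above raise an error when enabled. The sketch below catches that error and shows two related behaviours: an int can still be formatted with 'f', and for strings width and precision can be combined. The variable names are illustrative.

```python
# Precision on an integer presentation type is rejected
try:
    '{:.2d}'.format(16)
    precision_error = None
except ValueError as exc:
    precision_error = str(exc)
print(precision_error)

# The 'f' type accepts an int and formats it as a float
as_float = '{:.2f}'.format(16)

# For a string: keep at most 3 characters, then pad to a width of 6
clipped = '{:6.3}'.format('helloworld')

print(as_float)
print(repr(clipped))
```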

Type (type)

String presentation types:

Type Meaning
‘s’ String format. This is the default type for strings and may be omitted.
None The same as ‘s’. In other words, a string input is displayed as a string by default.

# Strings
print("{}".format('hello'))
print("{:s}".format('world'))

Integer presentation types:

Type Meaning
‘b’ Binary format. Outputs the number in base 2.
‘c’ Character. Converts the integer to the corresponding unicode character before printing.
‘d’ Decimal Integer. Outputs the number in base 10.
‘o’ Octal format. Outputs the number in base 8.
‘x’ Hex format. Outputs the number in base 16, using lower-case letters for the digits above 9.
‘X’ Hex format. Outputs the number in base 16, using upper-case letters for the digits above 9.
‘n’ Number. This is the same as ‘d’, except that it uses the current locale setting to insert the appropriate number separator characters.
None The same as ‘d’. In other words, an integer input is displayed as a decimal number by default.

# Integers
print("{}".format(123))
print("{:b}".format(16))
print("{:c}".format(96))
print("{:d}".format(96))
print("{:o}".format(96))
print("{:x}".format(196))
print("{:X}".format(196))
print("{:n}".format(196))

Floating-point and decimal presentation types:

Type Meaning
‘e’ Exponent notation. Prints the number in scientific notation using the letter ‘e’ to indicate the exponent. The default precision is 6.
‘E’ Exponent notation. Same as ‘e’ except it uses an upper case ‘E’ as the separator character.
‘f’ Fixed-point notation. Displays the number as a fixed-point number. The default precision is 6.
‘F’ Fixed-point notation. Same as ‘f’, but converts nan to NAN and inf to INF.
‘g’ General format. For a given precision p >= 1, this rounds the number to p significant digits and then formats the result in either fixed-point format or in scientific notation, depending on its magnitude. The precise rules are as follows: suppose that the result formatted with presentation type ‘e’ and precision p-1 would have exponent exp. Then if -4 <= exp < p, the number is formatted with presentation type ‘f’ and precision p-1-exp. Otherwise, the number is formatted with presentation type ‘e’ and precision p-1. In both cases insignificant trailing zeros are removed from the significand, and the decimal point is also removed if there are no remaining digits following it, unless the ‘#’ option is used. Positive and negative infinity, positive and negative zero, and nans, are formatted as inf, -inf, 0, -0 and nan respectively, regardless of the precision. A precision of 0 is treated as equivalent to a precision of 1. The default precision is 6.
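In short: the default precision is 6. With precision p, if the exponent is at least -4 and less than p, the value is shown in fixed-point notation, rounded to p significant digits; otherwise (exponent below -4, or p and above) it is shown in exponent notation.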
‘G’ General format. Same as ‘g’ except switches to ‘E’ if the number gets too large. The representations of infinity and NaN are uppercased, too.
‘n’ Number. This is the same as ‘g’, except that it uses the current locale setting to insert the appropriate number separator characters.
‘%’ Percentage. Multiplies the number by 100 and displays in fixed (‘f’) format, followed by a percent sign.
None Similar to ‘g’, except that fixed-point notation, when used, has at least one digit past the decimal point. The default precision is as high as needed to represent the particular value. The overall effect is to match the output of str() as altered by the other format modifiers.

# Floats and decimals
print("{:e}".format(1234567.89))
print("{:E}".format(1234567.89))
print("{:f}".format(1234567.89))
print("{:F}".format(1234567.89))

print("{:g}".format(0.02))
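print("{:.1g}".format(0.002345678)) # precision given and exponent >= -4: fixed-point, with as many significant digits as the precision

print("{:g}".format(0.0000012345678)) # exponent < -4: shown in exponent notation
print("{:.3g}".format(0.0000012345678)) # exponent < -4: shown in exponent notation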

print("{:.0%}".format(0.35))
print("{:.2%}".format(0.35))
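
The rule quoted for 'g' in the table above can be followed step by step in code. The sketch below is only an illustration of that rule, not how Python implements it internally; g_branch is a hypothetical helper that reads the exponent from the 'e' form with precision p-1 and applies the -4 <= exp < p test.

```python
def g_branch(x, p):
    """Report which notation 'g' should pick for x at precision p, per the quoted rule."""
    # Exponent of x when formatted with type 'e' and precision p-1
    exp = int('{:.{}e}'.format(x, p - 1).split('e')[1])
    return 'fixed' if -4 <= exp < p else 'scientific'

for x in (123.45, 1234.5, 0.00012345, 0.000012345):
    print(x, g_branch(x, 3), '{:.3g}'.format(x))
```

With p = 3, the values 123.45 and 0.00012345 fall inside the fixed-point range (exponents 2 and -4), while 1234.5 and 0.000012345 fall outside it (exponents 3 and -5) and are printed in exponent notation.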
