In my experience, cp936 or UTF-8 is the recommended encoding for handling Chinese text. (Translator's note)
# this is the first comment
SPAM = 1 # and this is the second comment
# ... and now a third!
STRING = "# This is not a comment."
>>> tax = 12.5 / 100
>>> price = 100.50
>>> price * tax
>>> price + _
>>> round(_, 2)
>>> width = 20
>>> height = 5*9
>>> width * height
>>> 3 * 3.75 / 1.5
>>> 7.0 / 2
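Besides numbers, Python can also manipulate strings, which can be expressed in several ways. Strings are enclosed in single quotes or double quotes: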
'spam eggs'
'"Yes," he said.'
"\"Yes,\" he said."
'"Isn\'t," she said.'
String literals can span multiple lines in several ways. Continuation lines can be used, with a backslash as the last character on the line indicating that the next line is a logical continuation of the line:
hello = "This is a rather long string containing\n\
several lines of text just as you would do in C.\n\
    Note that whitespace at the beginning of the line is\
 significant."

print hello
Note that newlines would still need to be embedded in the string using \n; the newline following the trailing backslash is discarded. This example would print the following:
This is a rather long string containing
several lines of text just as you would do in C.
Note that whitespace at the beginning of the line is significant.
If we make the string literal a "raw" string, however, the \n sequences are not converted to newlines, but the backslash at the end of the line, and the newline character in the source, are both included in the string as data. Thus, the example:
hello = r"This is a rather long string containing\n\
several lines of text much as you would do in C."

print hello

would print:

This is a rather long string containing\n\
several lines of text much as you would do in C.
Or, strings can be surrounded in a pair of matching triple-quotes: """ or '''. End of lines do not need to be escaped when using triple-quotes, but they will be included in the string.
print """
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
"""

produces the following output:

Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
The interpreter prints the result of string operations in the same way as they are typed for input: inside quotes, and with quotes and other funny characters escaped by backslashes, to show the precise value. The string is enclosed in double quotes if the string contains a single quote and no double quotes, else it’s enclosed in single quotes. (The print statement, described later, can be used to write strings without quotes or escapes.)
Strings can be concatenated (glued together) with the + operator, and repeated with *:
>>> word = 'Help' + 'A'
>>> '<' + word*5 + '>'
Two string literals next to each other are automatically concatenated; the first line above could also have been written "word = 'Help' 'A'"; this only works with two literals, not with arbitrary string expressions:
>>> 'str' 'ing'             # <- This is ok
>>> 'str'.strip() + 'ing'   # <- This is ok
>>> 'str'.strip() 'ing'     # <- This is invalid
  File "<stdin>", line 1, in ?
    'str'.strip() 'ing'
SyntaxError: invalid syntax
Strings can be subscripted (indexed); like in C, the first character of a string has subscript (index) 0. There is no separate character type; a character is simply a string of size one. Like in Icon, substrings can be specified with the slice notation: two indices separated by a colon.
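As a brief illustration of the indexing and slice notation just described, here is a minimal sketch. The variable names and the sample value are assumptions chosen for illustration, and the parenthesized print calls are used so the snippet also runs on Python 3, whereas the rest of this tutorial uses the Python 2 print statement.

```python
# Sample string, chosen only for illustration.
word = 'HelpA'

first = word[0]      # index 0 is the first character
middle = word[1:3]   # characters at positions 1 and 2
head = word[:2]      # an omitted first index defaults to 0
tail = word[2:]      # an omitted second index defaults to the string's length

print(first)    # H
print(middle)   # el
print(head)     # He
print(tail)     # lpA
```

Note that head + tail always gives back the original string, which is why the end index of a slice is excluded.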
Of course, we can use Python for more complicated tasks than adding two and two together. For instance, we can write an initial sub-sequence of the Fibonacci series as follows:
>>> # Fibonacci series:
... # the sum of two elements defines the next
... a, b = 0, 1
>>> while b < 10:
...     print b
...     a, b = b, a+b
...
This example introduces several new features.
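Python does not yet provide an intelligent input line editing facility, so you have to type a tab or one or more spaces for each indented line. In practice you will prepare more complicated input for Python with a text editor; most text editors have an auto-indent facility. When a compound statement is entered interactively, it must be followed by a blank line to indicate completion (since the interpreter cannot guess when you have typed the last line). Note that every line within the same block must be indented by the same amount.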
>>> a, b = 0, 1
>>> while b < 1000:
...     print b,
...     a, b = b, a+b
...
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
Note that
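2: When index > 0 and index < len(list), obj is inserted at position index.
3: When index < 0 and abs(index) < len(list), obj is inserted in the middle, counting from the end; for example, -1 inserts obj just before the last element.
4: When index < 0 and abs(index) >= len(list), obj is inserted at the head of the list.
5: When index >= len(list), obj is inserted at the tail of the list.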
lst = [2,2,2,2,2,2]
[2, 2, 2, 2, 2, 6, 2]
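The index rules listed above can be checked with a small sketch. The helper name insert_at and the three-item sample list are assumptions made for illustration; only the built-in list.insert() method is exercised. The code is written so that it runs on Python 3 as well.

```python
def insert_at(index, obj=6):
    """Return a fresh three-item list with obj inserted at index."""
    lst = [2, 2, 2]
    lst.insert(index, obj)
    return lst

print(insert_at(1))     # [2, 6, 2, 2]   index inside the list
print(insert_at(-1))    # [2, 2, 6, 2]   before the last element
print(insert_at(-10))   # [6, 2, 2, 2]   large negative index: head
print(insert_at(10))    # [2, 2, 2, 6]   index past the end: tail
```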
>>> # Measure some strings:
... a = ['cat', 'window', 'defenestrate']
>>> for x in a:
... print x, len(x)
cat 3
window 6
defenestrate 12
It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy. The slice notation makes this particularly convenient:
>>> for x in a[:]: # make a slice copy of the entire list
... if len(x) > 6: a.insert(0, x)
>>> a
['defenestrate', 'cat', 'window', 'defenestrate']
If you do need to iterate over a sequence of numbers, the built-in function range() comes in handy. It generates lists containing arithmetic progressions:
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
The given end point is never part of the generated list; range(10) generates a list of 10 values, exactly the legal indices for items of a sequence of length 10. It is possible to let the range start at another number, or to specify a different increment (even negative; sometimes this is called the "step"):
>>> range(5, 10)
[5, 6, 7, 8, 9]
>>> range(0, 10, 3)
[0, 3, 6, 9]
>>> range(-10, -100, -30)
[-10, -40, -70]
To iterate over the indices of a sequence, combine range() and len() as follows:
>>> a = ['Mary', 'had', 'a', 'little', 'lamb']
>>> for i in range(len(a)):
...     print i, a[i]
...
0 Mary
1 had
2 a
3 little
4 lamb
The break statement, like in C, breaks out of the smallest enclosing for or while loop.
The continue statement, also borrowed from C, continues with the next iteration of the loop.
Loop statements may have an else clause; it is executed when the loop terminates through exhaustion of the list (with for) or when the condition becomes false (with while), but not when the loop is terminated by a break statement. This is exemplified by the following loop, which searches for prime numbers:
>>> for n in range(2, 10):
...     for x in range(2, n):
...         if n % x == 0:
...             print n, 'equals', x, '*', n/x
...             break
...     else:
...         # loop fell through without finding a factor
...         print n, 'is a prime number'
...
2 is a prime number
3 is a prime number
4 equals 2 * 2
5 is a prime number
6 equals 2 * 3
7 is a prime number
8 equals 2 * 4
9 equals 3 * 3
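To make the role of the loop's else clause explicit, the following sketch wraps the same search in a function. The name smallest_factor is hypothetical, and the code is written for Python 3 (print as a function), unlike the Python 2 transcript above.

```python
def smallest_factor(n):
    """Return the smallest factor of n in 2..n-1, or None if n is prime."""
    for x in range(2, n):
        if n % x == 0:
            found = x
            break
    else:
        # Runs only when the loop finished without hitting break.
        found = None
    return found

primes = [n for n in range(2, 10) if smallest_factor(n) is None]
print(primes)   # [2, 3, 5, 7]
```

For n = 2 the inner range is empty, so the loop body never runs and the else clause executes immediately.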
The pass statement does nothing. It can be used when a statement is required syntactically but the program requires no action. For example:
>>> while True:
...     pass    # Busy-wait for keyboard interrupt
...
We can create a function that writes the Fibonacci series to an arbitrary boundary:
>>> def fib(n):    # write Fibonacci series up to n
...     """Print a Fibonacci series up to n."""
...     a, b = 0, 1
...     while b < n:
...         print b,
...         a, b = b, a+b
...
>>> fib(2000)
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
>>> def fib2(n):    # return Fibonacci series up to n
...     """Return a list containing the Fibonacci series up to n."""
...     result = []
...     a, b = 0, 1
...     while b < n:
...         result.append(b)    # see below
...         a, b = b, a+b
...     return result
...
>>> f100 = fib2(100)    # call it
>>> f100                # write the result
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
i = 5

def f(arg=i):
    print arg

i = 6
f()
will print 5.
Important warning: The default value is evaluated only once. This makes a difference when the default is a mutable object such as a list, dictionary, or instances of most classes. For example, the following function accumulates the arguments passed to it on subsequent calls:
def f(a, L=[]):
    L.append(a)
    return L

print f(1)
print f(2)
print f(3)

This will print

[1]
[1, 2]
[1, 2, 3]
If you don’t want the default to be shared between subsequent calls, you can write the function like this instead:
def f(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L
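The difference between the shared default and the None idiom can be seen side by side in the sketch below. The function names append_shared and append_fresh are hypothetical, and the code uses Python 3 print calls.

```python
def append_shared(a, L=[]):
    # The default list is created once, when the def statement runs.
    L.append(a)
    return L

def append_fresh(a, L=None):
    # A new list is created on every call that omits L.
    if L is None:
        L = []
    L.append(a)
    return L

# list(...) takes a snapshot, because append_shared keeps returning the same object.
shared = [list(append_shared(i)) for i in (1, 2, 3)]
fresh = [append_fresh(i) for i in (1, 2, 3)]

print(shared)   # [[1], [1, 2], [1, 2, 3]]
print(fresh)    # [[1], [2], [3]]
```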
Functions can also be called using keyword arguments of the form “keyword = value”. For instance, the following function:
def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
    print "-- This parrot wouldn't", action,
    print "if you put", voltage, "Volts through it."
    print "-- Lovely plumage, the", type
    print "-- It's", state, "!"

could be called in any of the following ways:

parrot(1000)
parrot(action = 'VOOOOOM', voltage = 1000000)
parrot('a thousand', state = 'pushing up the daisies')
parrot('a million', 'bereft of life', 'jump')

but the following calls would all be invalid:

parrot()                     # required argument missing
parrot(voltage=5.0, 'dead')  # non-keyword argument following keyword
parrot(110, voltage=220)     # duplicate value for argument
parrot(actor='John Cleese')  # unknown keyword
In general, an argument list must have any positional arguments followed by any keyword arguments, where the keywords must be chosen from the formal parameter names. It’s not important whether a formal parameter has a default value or not. No argument may receive a value more than once – formal parameter names corresponding to positional arguments cannot be used as keywords in the same calls. Here’s an example that fails due to this restriction:
>>> def function(a):
...     pass
...
>>> function(0, a=0)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: function() got multiple values for keyword argument 'a'
def cheeseshop(kind, *arguments, **keywords):
    print("-- Do you have any", kind, '?')
    print("-- I'm sorry, we're all out of", kind)
    for arg in arguments: print(arg)
    # keys.sort() no longer works in newer Python versions; use sorted() instead
    keys = sorted(keywords.keys())
    for kw in keys: print(kw, ':', keywords[kw])

cheeseshop('Limburger', "It's very runny, sir.",
           "It's really very, VERY runny, sir.",
           client='John Cleese',
           shopkeeper='Michael Palin',
           sketch='Cheese Shop Sketch')
-- Do you have any Limburger ?
-- I'm sorry, we're all out of Limburger
It's very runny, sir.
It's really very, VERY runny, sir.
client : John Cleese
shopkeeper : Michael Palin
sketch : Cheese Shop Sketch
Finally, the least frequently used option is to specify that a function can be called with an arbitrary number of arguments. These arguments will be wrapped up in a tuple. Before the variable number of arguments, zero or more normal arguments may occur.
def fprintf(file, format, *args):
    file.write(format % args)
The reverse situation occurs when the arguments are already in a list or tuple but need to be unpacked for a function call requiring separate positional arguments. For instance, the built-in range() function expects separate start and stop arguments. If they are not available separately, write the function call with the *-operator to unpack the arguments out of a list or tuple:
>>> range(3, 6) # normal call with separate arguments
[3, 4, 5]
>>> args = [3, 6]
>>> range(*args) # call with arguments unpacked from a list
[3, 4, 5]
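The same unpacking works for any function, and a dictionary can be unpacked into keyword arguments with ** in the same way that *arguments and **keywords collect them in a definition. The sketch below is written for Python 3, where range() returns a lazy object, hence the list() call; the helper describe is hypothetical.

```python
def describe(start, stop, step=1):
    """Hypothetical helper that just reports what it received."""
    return (start, stop, step)

args = [3, 6]
kwargs = {'step': 2}

print(list(range(*args)))          # [3, 4, 5]
print(describe(*args))             # (3, 6, 1)
print(describe(*args, **kwargs))   # (3, 6, 2)
```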
By popular demand, a few features commonly found in functional programming languages and Lisp have been added to Python. With the lambda keyword, small anonymous functions can be created. Here's a function that returns the sum of its two arguments: "lambda a, b: a+b". Lambda forms can be used wherever function objects are required. They are syntactically restricted to a single expression. Semantically, they are just syntactic sugar for a normal function definition. Like nested function definitions, lambda forms can reference variables from the containing scope:
>>> def make_incrementor(n):
...     return lambda x: x + n
...
>>> f = make_incrementor(42)
>>> f(0)
42
>>> f(1)
43
There are emerging conventions about the content and formatting of documentation strings.
The first line should always be a short, concise summary of the object's purpose. For brevity, it should not explicitly state the object's name or type, since these are available by other means (except if the name happens to be a verb describing a function's operation). This line should begin with a capital letter and end with a period.
If there are more lines in the documentation string, the second line should be blank, visually separating the summary from the rest of the description. The following lines should be one or more paragraphs describing the object's calling conventions, its side effects, etc.
The Python parser does not strip indentation from multi-line string literals, so tools that process documentation have to strip indentation if desired. This is done using the following convention. The first non-blank line after the first line of the string determines the amount of indentation for the entire documentation string. (We can't use the first line since it is generally adjacent to the string's opening quotes so its indentation is not apparent in the string literal.) Whitespace "equivalent" to this indentation is then stripped from the start of all lines of the string. Lines that are indented less should not occur, but if they occur all their leading whitespace should be stripped. Equivalence of whitespace should be tested after expansion of tabs (to 8 spaces, normally).
Here is an example of a multi-line docstring:
>>> def my_function():
... """Do nothing, but document it.
... No, really, it doesn't do anything.
... """
... pass
>>> print my_function.__doc__
Do nothing, but document it.
No, really, it doesn't do anything.
This chapter describes some things you’ve learned about already in more detail, and adds some new things as well.
The list data type has some more methods. Here are all of the methods of list objects:
append(x)
    Add an item to the end of the list; equivalent to a[len(a):] = [x].
extend(L)
    Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L.
insert(i, x)
    Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x).
remove(x)
    Remove the first item from the list whose value is x. It is an error if there is no such item.
pop([i])
    Remove the item at the given position in the list, and return it. If no index is specified, a.pop() removes and returns the last item in the list. (The square brackets around the i in the method signature denote that the parameter is optional, not that you should type square brackets at that position. You will see this notation frequently in the Python Library Reference.)
index(x)
    Return the index in the list of the first item whose value is x. It is an error if there is no such item.
count(x)
    Return the number of times x appears in the list.
sort()
    Sort the items of the list, in place.
reverse()
    Reverse the elements of the list, in place.
An example that uses most of the list methods:
>>> a = [66.6, 333, 333, 1, 1234.5]
>>> print a.count(333), a.count(66.6), a.count('x')
2 1 0
>>> a.insert(2, -1)
>>> a.append(333)
>>> a
[66.6, 333, -1, 333, 1, 1234.5, 333]
>>> a.remove(333)
>>> a
[66.6, -1, 333, 1, 1234.5, 333]
>>> a.reverse()
>>> a
[333, 1234.5, 1, 333, -1, 66.6]
>>> a.sort()
>>> a
[-1, 1, 66.6, 333, 333, 1234.5]
The list methods make it very easy to use a list as a stack, where the last element added is the first element retrieved ("last-in, first-out"). To add an item to the top of the stack, use append(). To retrieve an item from the top of the stack, use pop() without an explicit index. For example:
>>> stack = [3, 4, 5]
>>> stack.append(6)
>>> stack.append(7)
>>> stack
[3, 4, 5, 6, 7]
>>> stack.pop()
7
>>> stack
[3, 4, 5, 6]
>>> stack.pop()
6
>>> stack.pop()
5
>>> stack
[3, 4]
You can also use a list conveniently as a queue, where the first element added is the first element retrieved ("first-in, first-out"). To add an item to the back of the queue, use append(). To retrieve an item from the front of the queue, use pop() with 0 as the index. For example:
>>> queue = ["Eric", "John", "Michael"]
>>> queue.append("Terry")     # Terry arrives
>>> queue.append("Graham")    # Graham arrives
>>> queue.pop(0)
'Eric'
>>> queue.pop(0)
'John'
>>> queue
['Michael', 'Terry', 'Graham']
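The stack and queue patterns can be compared in one short sketch, written for Python 3. The last part swaps in collections.deque from the standard library as an alternative queue; that type is not mentioned in the text above and is shown here only as an assumption about what a reader might reach for, since list.pop(0) has to shift every remaining item.

```python
from collections import deque

stack = [3, 4, 5]
stack.append(6)
top = stack.pop()        # last in, first out

queue = ["Eric", "John", "Michael"]
queue.append("Terry")
first = queue.pop(0)     # first in, first out; shifts the remaining items

# Alternative (not from the text above): a deque pops from the left directly.
dq = deque(["Eric", "John", "Michael"])
dq.append("Terry")
first_dq = dq.popleft()

print(top, first, first_dq)
```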
There are three built-in functions that are very useful when used with lists: filter(), map(), and reduce().
"filter(function, sequence)" returns a sequence (of the same type, if possible) consisting of those items from the sequence for which function(item) is true. For example, to compute some primes:
>>> def f(x): return x % 2 != 0 and x % 3 != 0
...
>>> filter(f, range(2, 25))
[5, 7, 11, 13, 17, 19, 23]
"map(function, sequence)" calls function(item) for each of the sequence's items and returns a list of the return values. For example, to compute some cubes:
>>> def cube(x): return x*x*x
...
>>> map(cube, range(1, 11))
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
More than one sequence may be passed; the function must then have as many arguments as there are sequences and is called with the corresponding item from each sequence (or None if some sequence is shorter than another). For example:
>>> seq = range(8)
>>> def add(x, y): return x+y
...
>>> map(add, seq, seq)
[0, 2, 4, 6, 8, 10, 12, 14]
"reduce(func, sequence)" returns a single value constructed by calling the binary function func on the first two items of the sequence, then on the result and the next item, and so on. For example, to compute the sum of the numbers 1 through 10:
>>> def add(x,y): return x+y
...
>>> reduce(add, range(1, 11))
55
If there’s only one item in the sequence, its value is returned; if the sequence is empty, an exception is raised.
A third argument can be passed to indicate the starting value. In this case the starting value is returned for an empty sequence, and the function is first applied to the starting value and the first sequence item, then to the result and the next item, and so on. For example,
>>> def sum(seq):
...     def add(x,y): return x+y
...     return reduce(add, seq, 0)
...
>>> sum(range(1, 11))
55
Don’t use this example’s definition of sum(): since summing numbers is such a common need, a built-in function sum(sequence) is already provided, and works exactly like this. New in version 2.3.
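For readers on Python 3, the three functions behave slightly differently from the Python 2 transcripts above: filter() and map() return lazy iterators, and reduce() has moved to the functools module. The following sketch repeats the examples under those assumptions; the function name not_div_by_2_or_3 is hypothetical.

```python
from functools import reduce   # in Python 3, reduce lives in functools

def not_div_by_2_or_3(x):
    return x % 2 != 0 and x % 3 != 0

def cube(x):
    return x * x * x

def add(x, y):
    return x + y

# filter() and map() return iterators in Python 3, so wrap them in list().
filtered = list(filter(not_div_by_2_or_3, range(2, 25)))
cubes = list(map(cube, range(1, 11)))
total = reduce(add, range(1, 11))
empty_total = reduce(add, [], 0)   # the starting value is returned for an empty sequence

print(filtered)
print(cubes)
print(total, empty_total)
```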
List comprehensions provide a concise way to create lists without resorting to use of map(), filter() and/or lambda. The resulting list definition tends often to be clearer than lists built using those constructs. Each list comprehension consists of an expression followed by a for clause, then zero or more for or if clauses. The result will be a list resulting from evaluating the expression in the context of the for and if clauses which follow it. If the expression would evaluate to a tuple, it must be parenthesized.
>>> freshfruit = ['  banana', '  loganberry ', 'passion fruit  ']
>>> [weapon.strip() for weapon in freshfruit]
['banana', 'loganberry', 'passion fruit']
>>> vec = [2, 4, 6]
>>> [3*x for x in vec]
[6, 12, 18]
>>> [3*x for x in vec if x > 3]
[12, 18]
>>> [3*x for x in vec if x < 2]
[]
>>> [[x,x**2] for x in vec]
[[2, 4], [4, 16], [6, 36]]
>>> [x, x**2 for x in vec]    # error - parens required for tuples
  File "<stdin>", line 1, in ?
    [x, x**2 for x in vec]
SyntaxError: invalid syntax
>>> [(x, x**2) for x in vec]
[(2, 4), (4, 16), (6, 36)]
>>> vec1 = [2, 4, 6]
>>> vec2 = [4, 3, -9]
>>> [x*y for x in vec1 for y in vec2]
[8, 6, -18, 16, 12, -36, 24, 18, -54]
>>> [x+y for x in vec1 for y in vec2]
[6, 5, -7, 8, 7, -5, 10, 9, -3]
>>> [vec1[i]*vec2[i] for i in range(len(vec1))]
[8, 12, -54]
List comprehensions are much more flexible than map() and can be applied to functions with more than one argument and to nested functions:
>>> [str(round(355/113.0, i)) for i in range(1,6)]
['3.1', '3.14', '3.142', '3.1416', '3.14159']
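One way to read a list comprehension is as shorthand for nested loops that append to a list. The sketch below shows that correspondence; the variable names are assumptions, and the code runs unchanged on Python 3.

```python
vec = [2, 4, 6]

# List comprehension: expression, for clause, optional if clause.
tripled_big = [3 * x for x in vec if x > 3]

# The same result built with an explicit loop.
loop_result = []
for x in vec:
    if x > 3:
        loop_result.append(3 * x)

# Two for clauses nest left to right, like nested loops.
products = [x * y for x in [2, 4, 6] for y in [4, 3, -9]]

print(tripled_big)   # [12, 18]
print(products)      # [8, 6, -18, 16, 12, -36, 24, 18, -54]
```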
There is a way to remove an item from a list given its index instead of its value: the del statement. This can also be used to remove slices from a list (which we did earlier by assignment of an empty list to the slice). For example:
>>> a = [-1, 1, 66.6, 333, 333, 1234.5]
>>> del a[0]
>>> a
[1, 66.6, 333, 333, 1234.5]
>>> del a[2:4]
>>> a
[1, 66.6, 1234.5]
del can also be used to delete entire variables:
>>> del a
Referencing the name a hereafter is an error (at least until another value is assigned to it). We’ll find other uses for del later.
We saw that lists and strings have many common properties, such as indexing and slicing operations. They are two examples of sequence data types. Since Python is an evolving language, other sequence data types may be added. There is also another standard sequence data type: the tuple.
>>> t = 12345, 54321, 'hello!'
>>> t
(12345, 54321, 'hello!')
>>> # Tuples may be nested:
... u = t, (1, 2, 3, 4, 5)
>>> u
((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
As you see, on output tuples are always enclosed in parentheses, so that nested tuples are interpreted correctly; they may be input with or without surrounding parentheses, although often parentheses are necessary anyway (if the tuple is part of a larger expression).
Tuples have many uses. For example: (x, y) coordinate pairs, employee records from a database, etc. Tuples, like strings, are immutable: it is not possible to assign to the individual items of a tuple (you can simulate much of the same effect with slicing and concatenation, though). It is also possible to create tuples which contain mutable objects, such as lists.
A special problem is the construction of tuples containing 0 or 1 items: the syntax has some extra quirks to accommodate these. Empty tuples are constructed by an empty pair of parentheses; a tuple with one item is constructed by following a value with a comma (it is not sufficient to enclose a single value in parentheses). Ugly, but effective. For example:
>>> empty = ()
>>> singleton = 'hello',    # <-- note trailing comma
The statement t = 12345, 54321, 'hello!' is an example of tuple packing: the values 12345, 54321 and 'hello!' are packed together in a tuple. The reverse operation is also possible:
>>> x, y, z = t
This is called, appropriately enough, sequence unpacking. Sequence unpacking requires that the list of variables on the left have the same number of elements as the length of the sequence. Note that multiple assignment is really just a combination of tuple packing and sequence unpacking!
There is a small bit of asymmetry here: packing multiple values always creates a tuple, and unpacking works for any sequence.
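The packing and unpacking rules above, including the asymmetry between them, can be sketched as follows. The variable names are illustrative assumptions, and the code runs on Python 3 as well.

```python
t = 12345, 54321, 'hello!'   # tuple packing
x, y, z = t                  # sequence unpacking

a, b = 0, 1
a, b = b, a + b              # the right side is packed first, then unpacked

first, second = [10, 20]     # unpacking works for any sequence, e.g. a list

singleton = 'hello',         # the trailing comma makes a one-item tuple

print(x, y, z)
print(a, b)
print(len(singleton))
```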
Another useful data type built into Python is the dictionary. Dictionaries are sometimes found in other languages as "associative memories" or "associative arrays". Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be any immutable type; strings and numbers can always be keys. Tuples can be used as keys if they contain only strings, numbers, or tuples; if a tuple contains any mutable object either directly or indirectly, it cannot be used as a key. You can't use lists as keys, since lists can be modified in place using their append() and extend() methods, as well as slice and indexed assignments.
It is best to think of a dictionary as an unordered set of key: value pairs, with the requirement that the keys are unique (within one dictionary). A pair of braces creates an empty dictionary: {}. Placing a comma-separated list of key:value pairs within the braces adds initial key:value pairs to the dictionary; this is also the way dictionaries are written on output.
The main operations on a dictionary are storing a value with some key and extracting the value given the key. It is also possible to delete a key:value pair with del. If you store using a key that is already in use, the old value associated with that key is forgotten. It is an error to extract a value using a non-existent key.
The keys() method of a dictionary object returns a list of all the keys used in the dictionary, in random order (if you want it sorted, just apply the sort() method to the list of keys). To check whether a single key is in the dictionary, use the has_key() method of the dictionary.
Here is a small example using a dictionary:
>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
>>> tel
{'sape': 4139, 'guido': 4127, 'jack': 4098}
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
{'guido': 4127, 'irv': 4127, 'jack': 4098}
>>> tel.keys()
['guido', 'irv', 'jack']
The dict() constructor builds dictionaries directly from lists of key-value pairs stored as tuples. When the pairs form a pattern, list comprehensions can compactly specify the key-value list.
>>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
{'sape': 4139, 'jack': 4098, 'guido': 4127}
>>> dict([(x, x**2) for x in range(2,7,2)]) # use a list comprehension
{2: 4, 4: 16, 6: 36}
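The dictionary operations above carry over to Python 3 with two differences worth noting: keys() returns a view rather than a list, and has_key() no longer exists, so the in operator is used instead. A minimal sketch under those assumptions:

```python
tel = {'jack': 4098, 'sape': 4139}
tel['guido'] = 4127            # store a value under a new key
del tel['sape']                # delete a key:value pair
tel['irv'] = 4127

names = sorted(tel.keys())     # keys() is a view in Python 3; sorted() returns a list
has_guido = 'guido' in tel     # Python 3 replacement for tel.has_key('guido')
squares = dict([(x, x**2) for x in range(2, 7, 2)])

print(names)       # ['guido', 'irv', 'jack']
print(has_guido)   # True
print(squares)     # {2: 4, 4: 16, 6: 36}
```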
When looping through dictionaries, the key and corresponding value can be retrieved at the same time using the iteritems() method.
>>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
>>> for k, v in knights.iteritems():
...     print k, v
...
gallahad the pure
robin the brave
When looping through a sequence, the position index and corresponding value can be retrieved at the same time using the enumerate() function.
>>> for i, v in enumerate(['tic', 'tac', 'toe']):
...     print i, v
...
0 tic
1 tac
2 toe
To loop over two or more sequences at the same time, the entries can be paired with the zip() function.
>>> questions = ['name', 'quest', 'favorite color']
>>> answers = ['lancelot', 'the holy grail', 'blue']
>>> for q, a in zip(questions, answers):
...     print 'What is your %s? It is %s.' % (q, a)
...
What is your name? It is lancelot.
What is your quest? It is the holy grail.
What is your favorite color? It is blue.
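The three looping techniques can be collected into one Python 3 sketch. Note the assumption that the code targets Python 3, where the dictionary method is items() rather than iteritems(); the result names pairs, indexed and lines are illustrative.

```python
knights = {'gallahad': 'the pure', 'robin': 'the brave'}
pairs = sorted(knights.items())   # items() replaces iteritems() in Python 3

indexed = list(enumerate(['tic', 'tac', 'toe']))

questions = ['name', 'quest', 'favorite color']
answers = ['lancelot', 'the holy grail', 'blue']
lines = ['What is your %s? It is %s.' % (q, a)
         for q, a in zip(questions, answers)]

for line in lines:
    print(line)
```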
The conditions used in while and if statements above can contain other operators besides comparisons.
The comparison operators in and not in check whether a value occurs (does not occur) in a sequence. The operators is and is not compare whether two objects are really the same object; this only matters for mutable objects like lists. All comparison operators have the same priority, which is lower than that of all numerical operators.
Comparisons can be chained. For example, a < b == c tests whether a is less than b and moreover b equals c.
Comparisons may be combined by the Boolean operators and and or, and the outcome of a comparison (or of any other Boolean expression) may be negated with not. These all have lower priorities than comparison operators again; between them, not has the highest priority, and or the lowest, so that A and not B or C is equivalent to (A and (not B)) or C. Of course, parentheses can be used to express the desired composition.
The Boolean operators and and or are so-called short-circuit operators: their arguments are evaluated from left to right, and evaluation stops as soon as the outcome is determined. For example, if A and C are true but B is false, A and B and C does not evaluate the expression C. In general, the return value of a short-circuit operator, when used as a general value and not as a Boolean, is the last evaluated argument.
It is possible to assign the result of a comparison or other Boolean expression to a variable. For example,
>>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
>>> non_null = string1 or string2 or string3
>>> non_null
'Trondheim'
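Short-circuit evaluation can be made visible by recording which operands actually get evaluated. In the sketch below, check is a hypothetical helper introduced only for this illustration; the code runs on Python 3.

```python
calls = []

def check(label, value):
    """Hypothetical helper: record that it was evaluated, then return value."""
    calls.append(label)
    return value

# B is false, so C is never evaluated and the value of B is returned.
result = check('A', 1) and check('B', 0) and check('C', 2)

string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
non_null = string1 or string2 or string3   # first true operand wins

print(result, calls, non_null)
```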
Note that in Python, unlike C, assignment cannot occur inside expressions. C programmers may grumble about this, but it avoids a common class of problems encountered in C programs: typing = in an expression when == was intended.
Sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted. If two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively. If all items of two sequences compare equal, the sequences are considered equal. If one sequence is an initial sub-sequence of the other, the shorter sequence is the smaller (lesser) one. Lexicographical ordering for strings uses the ASCII ordering for individual characters. Some examples of comparisons between sequences with the same types:
(1, 2, 3)              < (1, 2, 4)
[1, 2, 3]              < [1, 2, 4]
'ABC' < 'C' < 'Pascal' < 'Python'
(1, 2, 3, 4)           < (1, 2, 4)
(1, 2)                 < (1, 2, -1)
(1, 2, 3)             == (1.0, 2.0, 3.0)
(1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4)
Note that comparing objects of different types is legal. The outcome is deterministic but arbitrary: the types are ordered by their name. Thus, a list is always smaller than a string, a string is always smaller than a tuple, etc. Mixed numeric types are compared according to their numeric value, so 0 equals 0.0, etc.
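The same-type comparisons listed above can be verified directly. The sketch collects them in a list named comparisons (an illustrative name) and runs on Python 3. Ordering comparisons between unrelated types, as described in the preceding note for Python 2, raise TypeError in Python 3, so they are deliberately left out.

```python
comparisons = [
    (1, 2, 3) < (1, 2, 4),
    [1, 2, 3] < [1, 2, 4],
    'ABC' < 'C' < 'Pascal' < 'Python',
    (1, 2, 3, 4) < (1, 2, 4),                        # decided at the third item
    (1, 2) < (1, 2, -1),                             # an initial sub-sequence is smaller
    (1, 2, 3) == (1.0, 2.0, 3.0),                    # numbers compare by value
    (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4),  # nested tuples compare recursively
]

print(all(comparisons))   # True
```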
If you quit from the Python interpreter and enter it again, the definitions you have made (functions and variables) are lost. Therefore, if you want to write a somewhat longer program, you are better off using a text editor to prepare the input for the interpreter and running it with that file as input instead. This is known as creating a script. As your program gets longer, you may want to split it into several files for easier maintenance. You may also want to use a handy function that you’ve written in several programs without copying its definition into each program.
To support this, Python has a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. Such a file is called a module; definitions from a module can be imported into other modules or into the main module (the collection of variables that you have access to in a script executed at the top level and in calculator mode).
A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended. Within a module, the module’s name (as a string) is available as the value of the global variable __name__. For instance, use your favorite text editor to create a file called fibo.py in the current directory with the following contents:
def fib(n):    # write Fibonacci series up to n
    a, b = 0, 1
    while b < n:
        print b,
        a, b = b, a+b

def fib2(n):   # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a+b
    return result
Now enter the Python interpreter and import this module with the following command:
>>> import fibo
This does not enter the names of the functions defined in fibo directly in the current symbol table; it only enters the module name fibo there. Using the module name you can access the functions:
>>> fibo.fib(1000)
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
>>> fibo.fib2(100)
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
If you intend to use a function often you can assign it to a local name:
>>> fib = fibo.fib
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377
Each module has its own private symbol table, which is used as the global symbol table by all functions defined in the module. Thus, the author of a module can use global variables in the module without worrying about accidental clashes with a user’s global variables.
On the other hand, if you know what you are doing you can touch a module’s global variables with the same notation used to refer to its functions, modname.itemname.
Modules can import other modules. It is customary but not required to place all import statements at the beginning of a module (or script, for that matter). The imported module names are placed in the importing module’s global symbol table.
There is a variant of the import statement that imports names from a module directly into the importing module’s symbol table. For example:
>>> from fibo import fib, fib2
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377
This does not introduce the module name from which the imports are taken in the local symbol table (so in the example, fibo is not defined).
There is even a variant to import all names that a module defines:
>>> from fibo import *
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377
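The two import forms can be compared side by side in a script. This is a minimal sketch for Python 3, not the tutorial's own session: it writes a cut-down fibo.py (only fib2) into a temporary directory, which is an assumption made so the example is self-contained, and then imports it both ways.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical setup: a reduced fibo.py written to a temporary directory.
module_source = '''
def fib2(n):
    """Return the Fibonacci series up to n as a list."""
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a + b
    return result
'''

module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'fibo.py'), 'w') as handle:
    handle.write(module_source)

sys.path.insert(0, module_dir)    # make the directory importable
importlib.invalidate_caches()

import fibo                       # binds only the module name "fibo"
from fibo import fib2             # binds the function name directly

series_via_module = fibo.fib2(100)
series_via_name = fib2(100)
print(series_via_module)
print(fibo.__name__)
```

With the first form every use is qualified by the module name; with the second, fib2 is available unqualified but fibo itself would not be defined if the plain import were omitted.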
When a module named spam is imported, the interpreter searches for a file named spam.py in the current directory, and then in the list of directories specified by the environment variable PYTHONPATH. This has the same syntax as the shell variable PATH, that is, a list of directory names. When PYTHONPATH is not set, or when the file is not found there, the search continues in an installation-dependent default path; on Unix, this is usually .:/usr/local/lib/python.
Actually, modules are searched in the list of directories given by the variable sys.path which is initialized from the directory containing the input script (or the current directory), PYTHONPATH and the installation-dependent default. This allows Python programs that know what they’re doing to modify or replace the module search path. Note that because the directory containing the script being run is on the search path, it is important that the script not have the same name as a standard module, or Python will attempt to load the script as a module when that module is imported. This will generally be an error. See the section on Standard Modules for more information.
As an important speed-up of the start-up time for short programs that use a lot of standard modules, if a file called spam.pyc exists in the directory where spam.py is found, this is assumed to contain an already “byte-compiled” version of the module spam. The modification time of the version of spam.py used to create spam.pyc is recorded in spam.pyc, and the .pyc file is ignored if these don’t match.
Normally, you don’t need to do anything to create the spam.pyc file. Whenever spam.py is successfully compiled, an attempt is made to write the compiled version to spam.pyc. It is not an error if this attempt fails; if for any reason the file is not written completely, the resulting spam.pyc file will be recognized as invalid and thus ignored later. The contents of the spam.pyc file are platform independent, so a Python module directory can be shared by machines of different architectures.
When the Python interpreter is invoked with the -O flag, optimized code is generated and stored in .pyo files. The optimizer currently doesn’t help much; it only removes assert statements. When -O is used, all bytecode is optimized; .pyc files are ignored and .py files are compiled to optimized bytecode.
Passing two -O flags to the Python interpreter (-OO) will cause the bytecode compiler to perform optimizations that could in some rare cases result in malfunctioning programs. Currently only doc strings are removed from the bytecode, resulting in more compact .pyo files. Since some programs may rely on having these available, you should only use this option if you know what you’re doing.
A program doesn’t run any faster when it is read from a .pyc or .pyo file than when it is read from a .py file; the only thing that’s faster about .pyc or .pyo files is the speed with which they are loaded.
When a script is run by giving its name on the command line, the bytecode for the script is never written to a .pyc or .pyo file. Thus, the startup time of a script may be reduced by moving most of its code to a module and having a small bootstrap script that imports that module. It is also possible to name a .pyc or .pyo file directly on the command line.
It is possible to have a file called spam.pyc (or spam.pyo when -O is used) without a file spam.py for the same module. This can be used to distribute a library of Python code in a form that is moderately hard to reverse engineer.
The module compileall can create .pyc files (or .pyo files when -O is used) for all modules in a directory.
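As a rough illustration of compileall, the sketch below byte-compiles a one-line spam.py in a temporary directory. It is written for Python 3, which differs from the behavior described above in two ways worth stating: compiled files normally go into a __pycache__ subdirectory, and .pyo files no longer exist. The legacy=True argument is used here only to get spam.pyc next to spam.py, as in the text; the file contents are invented for the example.

```python
import compileall
import os
import tempfile

work_dir = tempfile.mkdtemp()
with open(os.path.join(work_dir, 'spam.py'), 'w') as handle:
    handle.write('VALUE = 42\n')      # placeholder module body

# legacy=True asks for spam.pyc beside spam.py; without it, Python 3
# writes the compiled file into a __pycache__ subdirectory instead.
succeeded = bool(compileall.compile_dir(work_dir, quiet=1, legacy=True))
pyc_exists = os.path.exists(os.path.join(work_dir, 'spam.pyc'))
print(succeeded, pyc_exists)
```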
Python comes with a library of standard modules, described in a separate document, the Python Library Reference (“Library Reference” hereafter). Some modules are built into the interpreter; these provide access to operations that are not part of the core of the language but are nevertheless built in, either for efficiency or to provide access to operating system primitives such as system calls. The set of such modules is a configuration option which also depends on the underlying platform. For example, the amoeba module is only provided on systems that somehow support Amoeba primitives. One particular module deserves some attention: sys, which is built into every Python interpreter. The variables sys.ps1 and sys.ps2 define the strings used as primary and secondary prompts:
>>> import sys
>>> sys.ps1
'>>> '
>>> sys.ps2
'... '
>>> sys.ps1 = 'C> '
C> print 'Yuck!'
These two variables are only defined if the interpreter is in interactive mode.
The variable sys.path is a list of strings that determine the interpreter’s search path for modules. It is initialized to a default path taken from the environment variable PYTHONPATH, or from a built-in default if PYTHONPATH is not set. You can modify it using standard list operations:
>>> import sys
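Because sys.path is an ordinary list, the usual list methods apply. A minimal sketch for Python 3 follows; the directory name is a placeholder invented for the example and does not need to exist.

```python
import sys

extra_dir = '/hypothetical/modules'    # placeholder path, not from the tutorial
path_before = list(sys.path)           # keep a copy so the change can be undone

sys.path.append(extra_dir)             # appended entries are searched last
appended_last = sys.path[-1] == extra_dir

sys.path.remove(extra_dir)             # restore the original search path
restored = sys.path == path_before
print(appended_last, restored)
```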
The built-in function dir() is used to find out which names a module defines. It returns a sorted list of strings:
>>> import fibo, sys
>>> dir(fibo)
['__name__', 'fib', 'fib2']
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__name__', '__stderr__',
 '__stdin__', '__stdout__', '_getframe', 'api_version', 'argv',
 'builtin_module_names', 'byteorder', 'callstats', 'copyright',
 'displayhook', 'exc_clear', 'exc_info', 'exc_type', 'excepthook',
 'exec_prefix', 'executable', 'exit', 'getdefaultencoding', 'getdlopenflags',
 'getrecursionlimit', 'getrefcount', 'hexversion', 'maxint', 'maxunicode',
 'meta_path', 'modules', 'path', 'path_hooks', 'path_importer_cache',
 'platform', 'prefix', 'ps1', 'ps2', 'setcheckinterval', 'setdlopenflags',
 'setprofile', 'setrecursionlimit', 'settrace', 'stderr', 'stdin', 'stdout',
 'version', 'version_info', 'warnoptions']
Without arguments, dir() lists the names you have defined currently:
>>> a = [1, 2, 3, 4, 5]
>>> import fibo, sys
>>> fib = fibo.fib
>>> dir()
['__name__', 'a', 'fib', 'fibo', 'sys']
Note that it lists all types of names: variables, modules, functions, etc.
dir() does not list the names of built-in functions and variables. If you want a list of those, they are defined in the standard module __builtin__:
>>> import __builtin__
>>> dir(__builtin__)
['ArithmeticError', 'AssertionError', 'AttributeError',
 'DeprecationWarning', 'EOFError', 'Ellipsis', 'EnvironmentError',
 'Exception', 'False', 'FloatingPointError', 'IOError', 'ImportError',
 'IndentationError', 'IndexError', 'KeyError', 'KeyboardInterrupt',
 'LookupError', 'MemoryError', 'NameError', 'None', 'NotImplemented',
 'NotImplementedError', 'OSError', 'OverflowError', 'OverflowWarning',
 'PendingDeprecationWarning', 'ReferenceError',
 'RuntimeError', 'RuntimeWarning', 'StandardError', 'StopIteration',
 'SyntaxError', 'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError',
 'True', 'TypeError', 'UnboundLocalError', 'UnicodeError', 'UserWarning',
 'ValueError', 'Warning', 'ZeroDivisionError', '__debug__', '__doc__',
 '__import__', '__name__', 'abs', 'apply', 'bool', 'buffer',
 'callable', 'chr', 'classmethod', 'cmp', 'coerce', 'compile', 'complex',
 'copyright', 'credits', 'delattr', 'dict', 'dir', 'divmod',
 'enumerate', 'eval', 'execfile', 'exit', 'file', 'filter', 'float',
 'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex', 'id',
 'input', 'int', 'intern', 'isinstance', 'issubclass', 'iter',
 'len', 'license', 'list', 'locals', 'long', 'map', 'max', 'min',
 'object', 'oct', 'open', 'ord', 'pow', 'property', 'quit',
 'range', 'raw_input', 'reduce', 'reload', 'repr', 'round',
 'setattr', 'slice', 'staticmethod', 'str', 'string', 'sum', 'super',
 'tuple', 'type', 'unichr', 'unicode', 'vars', 'xrange', 'zip']
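The same inspection can be scripted. The sketch below is for Python 3, where the module holding the built-in names is called builtins rather than __builtin__; math is used as the example module only because it is always available.

```python
import builtins
import math

math_names = dir(math)            # names defined by the math module
builtin_names = dir(builtins)     # built-in functions, exceptions and constants

has_sqrt = 'sqrt' in math_names
has_len = 'len' in builtin_names
is_sorted = math_names == sorted(math_names)   # dir() returns a sorted list
print(has_sqrt, has_len, is_sorted)
```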
Packages are a way of structuring Python’s module namespace by using “dotted module names”. For example, the module name A.B designates a submodule named “B” in a package named “A”. Just like the use of modules saves the authors of different modules from having to worry about each other’s global variable names, the use of dotted module names saves the authors of multi-module packages like NumPy or the Python Imaging Library from having to worry about each other’s module names.
Suppose you want to design a collection of modules (a “package”) for the uniform handling of sound files and sound data. There are many different sound file formats (usually recognized by their extension, for example: .wav, .aiff, .au), so you may need to create and maintain a growing collection of modules for the conversion between the various file formats. There are also many different operations you might want to perform on sound data (such as mixing, adding echo, applying an equalizer function, creating an artificial stereo effect), so in addition you will be writing a never-ending stream of modules to perform these operations. Here’s a possible structure for your package (expressed in terms of a hierarchical filesystem):
Sound/                          Top-level package
      __init__.py               Initialize the sound package
      Formats/                  Subpackage for file format conversions
      Effects/                  Subpackage for sound effects
      Filters/                  Subpackage for filters
When importing the package, Python searches through the directories on sys.path looking for the package subdirectory.
The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as “string”, from unintentionally hiding valid modules that occur later on the module search path. In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable, described later.
Users of the package can import individual modules from the package, for example:
import Sound.Effects.echo
This loads the submodule Sound.Effects.echo. It must be referenced with its full name.
Sound.Effects.echo.echofilter(input, output, delay=0.7, atten=4)
An alternative way of importing the submodule is:
from Sound.Effects import echo
This also loads the submodule echo, and makes it available without its package prefix, so it can be used as follows:
echo.echofilter(input, output, delay=0.7, atten=4)
Yet another variation is to import the desired function or variable directly:
from Sound.Effects.echo import echofilter
Again, this loads the submodule echo, but this makes its function echofilter() directly available:
echofilter(input, output, delay=0.7, atten=4)
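The three import styles above can be tried against a throwaway package. This is a minimal sketch for Python 3: it builds a Sound/Effects package in a temporary directory, and the body of echofilter is a stand-in invented for the example (it just returns its arguments), since the tutorial does not define it.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
effects_dir = os.path.join(root, 'Sound', 'Effects')
os.makedirs(effects_dir)

# Empty __init__.py files mark both directories as packages.
for package_dir in (os.path.join(root, 'Sound'), effects_dir):
    open(os.path.join(package_dir, '__init__.py'), 'w').close()

# Hypothetical echofilter body, standing in for real signal processing.
with open(os.path.join(effects_dir, 'echo.py'), 'w') as handle:
    handle.write(
        'def echofilter(data, delay=0.7, atten=4):\n'
        '    return (data, delay, atten)\n'
    )

# Drop any Sound package imported earlier in this session, then expose root.
for loaded in [name for name in sys.modules
               if name == 'Sound' or name.startswith('Sound.')]:
    del sys.modules[loaded]
sys.path.insert(0, root)
importlib.invalidate_caches()

import Sound.Effects.echo                    # must be used with its full name
from Sound.Effects import echo               # usable without the package prefix
from Sound.Effects.echo import echofilter    # the function itself is bound

full_name_result = Sound.Effects.echo.echofilter('samples')
short_name_result = echo.echofilter('samples')
direct_result = echofilter('samples')
print(full_name_result)
```

All three calls reach the same function; the forms differ only in which name ends up bound in the importing namespace.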
Note that when using from package import item, the item can be either a submodule (or subpackage) of the package, or some other name defined in the package, like a function, class or variable. The import statement first tests whether the item is defined in the package; if not, it assumes it is a module and attempts to load it. If it fails to find it, an ImportError exception is raised.
Contrarily, when using syntax like import item.subitem.subsubitem, each item except for the last must be a package; the last item can be a module or a package but can’t be a class or function or variable defined in the previous item.
Now what happens when the user writes from Sound.Effects import *? Ideally, one would hope that this somehow goes out to the filesystem, finds which submodules are present in the package, and imports them all. Unfortunately, this operation does not work very well on Mac and Windows platforms, where the filesystem does not always have accurate information about the case of a filename! On these platforms, there is no guaranteed way to know whether a file ECHO.PY should be imported as a module echo, Echo or ECHO. (For example, Windows 95 has the annoying practice of showing all file names with a capitalized first letter.) The DOS 8+3 filename restriction adds another interesting problem for long module names.
The only solution is for the package author to provide an explicit index of the package. The import statement uses the following convention: if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered. It is up to the package author to keep this list up-to-date when a new version of the package is released. Package authors may also decide not to support it, if they don’t see a use for importing * from their package. For example, the file Sound/Effects/__init__.py could contain the following code:
__all__ = ["echo", "surround", "reverse"]
This would mean that from Sound.Effects import * would import the three named submodules of the Sound.Effects package.
If __all__ is not defined, the statement from Sound.Effects import * does not import all submodules from the package Sound.Effects into the current namespace; it only ensures that the package Sound.Effects has been imported (possibly running its initialization code, __init__.py) and then imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py. It also includes any submodules of the package that were explicitly loaded by previous import statements. Consider this code:
import Sound.Effects.echo
import Sound.Effects.surround
from Sound.Effects import *
In this example, the echo and surround modules are imported in the current namespace because they are defined in the Sound.Effects package when the from...import statement is executed. (This also works when __all__ is defined.)
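The effect of __all__ on from package import * can be observed with a small experiment. The sketch below, for Python 3, builds a temporary Sound/Effects package with three submodules but deliberately lists only two of them in __all__ (a variation on the three-name list shown earlier, chosen so the difference is visible). The module contents are placeholders.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
effects_dir = os.path.join(root, 'Sound', 'Effects')
os.makedirs(effects_dir)

def write_file(path, text):
    with open(path, 'w') as handle:
        handle.write(text)

write_file(os.path.join(root, 'Sound', '__init__.py'), '')
# Only echo and surround are listed, so reverse is left out of "import *".
write_file(os.path.join(effects_dir, '__init__.py'),
           '__all__ = ["echo", "surround"]\n')
for module_name in ('echo', 'surround', 'reverse'):
    write_file(os.path.join(effects_dir, module_name + '.py'),
               'NAME = %r\n' % module_name)

# Drop any Sound package imported earlier in this session, then expose root.
for loaded in [name for name in sys.modules
               if name == 'Sound' or name.startswith('Sound.')]:
    del sys.modules[loaded]
sys.path.insert(0, root)
importlib.invalidate_caches()

# Run the star-import in a fresh namespace so its effect can be inspected.
namespace = {}
exec('from Sound.Effects import *', namespace)
imported = sorted(name for name in namespace if not name.startswith('__'))
print(imported)
```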
Note that in general the practice of importing * from a module or package is frowned upon, since it often causes poorly readable code. However, it is okay to use it to save typing in interactive sessions, and certain modules are designed to export only names that follow certain patterns.
Remember, there is nothing wrong with using from Package import specific_submodule! In fact, this is the recommended notation unless the importing module needs to use submodules with the same name from different packages.
The submodules often need to refer to each other. For example, the surround module might use the echo module. In fact, such references are so common that the import statement first looks in the containing package before looking in the standard module search path. Thus, the surround module can simply use import echo or from echo import echofilter. If the imported module is not found in the current package (the package of which the current module is a submodule), the import statement looks for a top-level module with the given name.
When packages are structured into subpackages (as with the Sound package in the example), there’s no shortcut to refer to submodules of sibling packages - the full name of the subpackage must be used. For example, if the module Sound.Filters.vocoder needs to use the echo module in the Sound.Effects package, it can use from Sound.Effects import echo.
Packages support one more special attribute, __path__. This is initialized to be a list containing the name of the directory holding the package’s __init__.py before the code in that file is executed. This variable can be modified; doing so affects future searches for modules and subpackages contained in the package.
While this feature is not often needed, it can be used to extend the set of modules found in a package.
In fact function definitions are also “statements” that are “executed”; the execution enters the function name in the module’s global symbol table.
There are several ways to present the output of a program; data can be printed in a human-readable form, or written to a file for future use. This chapter will discuss some of the possibilities.
So far we’ve encountered two ways of writing values: expression statements and the print statement. (A third way is using the write() method of file objects; the standard output file can be referenced as sys.stdout. See the Library Reference for more information on this.)
Often you’ll want more control over the formatting of your output than simply printing space-separated values. There are two ways to format your output; the first way is to do all the string handling yourself; using string slicing and concatenation operations you can create any lay-out you can imagine. The standard module string contains some useful operations for padding strings to a given column width; these will be discussed shortly. The second way is to use the % operator with a string as the left argument. The % operator interprets the left argument much like a sprintf()-style format string to be applied to the right argument, and returns the string resulting from this formatting operation.
One question remains, of course: how do you convert values to strings? Luckily, Python has ways to convert any value to a string: pass it to the repr() or str() functions. Reverse quotes (``) are equivalent to repr(), but their use is discouraged.
The str() function is meant to return representations of values which are fairly human-readable, while repr() is meant to generate representations which can be read by the interpreter (or will force a SyntaxError if there is not equivalent syntax). For objects which don’t have a particular representation for human consumption, str() will return the same value as repr(). Many values, such as numbers or structures like lists and dictionaries, have the same representation using either function. Strings and floating point numbers, in particular, have two distinct representations.
Some examples:
>>> s = 'Hello, world.'
>>> str(s)
'Hello, world.'
>>> repr(s)
"'Hello, world.'"
>>> x = 10 * 3.25
>>> y = 200 * 200
>>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
>>> print s
The value of x is 32.5, and y is 40000...
>>> hello = 'hello, world\n'
>>> hellos = repr(hello)
>>> print hellos
'hello, world\n'
>>> repr((x, y, ('spam', 'eggs')))
"(32.5, 40000, ('spam', 'eggs'))"
>>> `x, y, ('spam', 'eggs')`
"(32.5, 40000, ('spam', 'eggs'))"
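The difference between the two conversions is easiest to see on a string that contains a newline. A minimal sketch for Python 3 follows (the reverse-quote syntax no longer exists there, so only str() and repr() are shown); the variable names are illustrative.

```python
text = 'hello, world\n'

as_str = str(text)            # unchanged: the human-readable form
as_repr = repr(text)          # adds quotes and escapes the newline
round_trip = eval(as_repr)    # repr() output can be read back by the interpreter

same_for_numbers = str(42) == repr(42)   # integers have a single representation
print(as_repr)
```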
Here are two ways to write a table of squares and cubes:
>>> for x in range(1, 11):
...     print repr(x).rjust(2), repr(x*x).rjust(3),
...     # Note trailing comma on previous line
...     print repr(x*x*x).rjust(4)
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000
>>> for x in range(1,11):
...     print '%2d %3d %4d' % (x, x*x, x*x*x)
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000
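(Note that one space between each column was added by the way print works: it always adds a space between its arguments.)
This example demonstrates the rjust() method of string objects, which right-justifies a string in a field of a given width by padding it with spaces on the left. There are similar methods ljust() and center(). These methods do not write anything; they just return a new string. If the input string is too long, they don’t truncate it, but return it unchanged; this will mess up your column layout, but that is usually better than the alternative, which would be showing a wrong value. (If you really want truncation you can always add a slice operation, as in x.ljust(n)[:n].)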
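The padding and truncation behavior described above can be sketched as follows. The code is for Python 3 and rebuilds the table of squares and cubes with rjust(); rows and too_long are names chosen for the example.

```python
rows = []
for x in range(1, 11):
    # Join the padded columns with single spaces, as print did in the table.
    rows.append(' '.join([str(x).rjust(2),
                          str(x * x).rjust(3),
                          str(x ** 3).rjust(4)]))
print('\n'.join(rows))

too_long = 'abcdefgh'
unchanged = too_long.rjust(4)         # longer than the field: returned as is
truncated = too_long.ljust(4)[:4]     # explicit slice when truncation is wanted
centered = 'ab'.center(6)
```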
There is another method, zfill(), which pads a numeric string on the left with zeros. It understands about plus and minus signs:
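A short illustration of zfill(), written for Python 3; the sample strings are chosen for the example.

```python
padded = '12'.zfill(5)                  # zeros are added on the left
negative = '-3.14'.zfill(7)             # the sign stays in front of the zeros
positive = '+7'.zfill(4)
already_wide = '3.14159265359'.zfill(5) # wider than the field: unchanged
print(padded, negative, positive, already_wide)
```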
Using the % operator looks like this:
>>> import math
>>> print 'The value of PI is approximately %5.3f.' % math.pi
The value of PI is approximately 3.142.
If there is more than one format in the string, you need to pass a tuple as right operand, as in this example:
>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
>>> for name, phone in table.items():
... print '%-10s ==> %10d' % (name, phone)
Jack       ==>       4098
Dcab       ==>       7678
Sjoerd     ==>       4127
Most formats work exactly as in C and require that you pass the proper type; however, if you don’t you get an exception, not a core dump. The %s format is more relaxed: if the corresponding argument is not a string object, it is converted to string using the str() built-in function. Using * to pass the width or precision in as a separate (integer) argument is supported. The C formats %n and %p are not supported.
If you have a really long format string that you don’t want to split up, it would be nice if you could reference the variables to be formatted by name instead of by position. This can be done by using the form %(name)format, as shown here:
>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> print 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d; Dcab: %(Dcab)d' % table
Jack: 4098; Sjoerd: 4127; Dcab: 8637678
This is particularly useful in combination with the new built-in vars() function, which returns a dictionary containing all local variables.
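As a sketch of that combination, the hypothetical function below formats its own local variables by name. It is written for Python 3, where the % operator on strings still works as described; describe, name and phone are names made up for the example.

```python
def describe(name, phone):
    # vars() with no argument returns the local variables as a dictionary,
    # so the format string can refer to "name" and "phone" directly.
    return '%(name)-10s ==> %(phone)10d' % vars()

line = describe('Jack', 4098)
print(line)
```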
>>> f = open('/tmp/workfile', 'w')
>>> print f
The first argument is a string containing the filename. The second argument is another string containing a few characters describing the way in which the file will be used. mode can be 'r' when the file will only be read, 'w' for only writing (an existing file with the same name will be erased), and 'a' opens the file for appending; any data written to the file is automatically added to the end. 'r+' opens the file for both reading and writing. The mode argument is optional; 'r' will be assumed if it’s omitted.
On Windows and the Macintosh, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This behind-the-scenes modification to file data is fine for ASCII text files, but it’ll corrupt binary data like that in JPEGs or .EXE files. Be very careful to use binary mode when reading and writing such files. (Note that the precise semantics of text mode on the Macintosh depends on the underlying C library being used.)
The rest of the examples in this section will assume that a file object called f has already been created.
To read a file’s contents, call f.read(size), which reads some quantity of data and returns it as a string. size is an optional numeric argument. When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory. Otherwise, at most size bytes are read and returned. If the end of the file has been reached, f.read() will return an empty string ("").
>>> f.read()
'This is the entire file.\n'
>>> f.read()
f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn’t end in a newline. This makes the return value unambiguous; if f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by '\n', a string containing only a single newline.
>>> f.readline()
'This is the first line of the file.\n'
>>> f.readline()
'Second line of the file\n'
f.readlines() returns a list containing all the lines of data in the file. If given an optional parameter sizehint, it reads that many bytes from the file and enough more to complete a line, and returns the lines from that. This is often used to allow efficient reading of a large file by lines, but without having to load the entire file in memory. Only complete lines will be returned.
f.write(string) writes the contents of string to the file, returning None.
>>> f.write('This is a test\n')
To write something other than a string, it needs to be converted to a string first:
>>> s = str(value)
f.tell() returns an integer giving the file object’s current position in the file, measured in bytes from the beginning of the file. To change the file object’s position, use “f.seek(offset, from_what)”. The position is computed from adding offset to a reference point; the reference point is selected by the from_what argument. A from_what value of 0 measures from the beginning of the file, 1 uses the current file position, and 2 uses the end of the file as the reference point. from_what can be omitted and defaults to 0, using the beginning of the file as the reference point.
>>> f = open('/tmp/workfile', 'r+')
>>> f.write('0123456789abcdef')
>>> f.seek(5) # Go to the 6th byte in the file
>>> f.read(1)
>>> f.seek(-3, 2) # Go to the 3rd byte before the end
>>> f.read(1)
When you’re done with a file, call f.close() to close it and free up any system resources taken up by the open file. After calling f.close(), attempts to use the file object will automatically fail.
>>> f.close()
>>> f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: I/O operation on closed file
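The reading, seeking and closing steps above can be put together in one script. This is a minimal sketch for Python 3 using a temporary file rather than /tmp/workfile. The file is opened in binary mode because Python 3 text files do not allow seeking relative to the end of the file, so the values read back are bytes objects rather than strings.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')   # throwaway file

with open(path, 'wb') as f:
    f.write(b'0123456789abcdef')

f = open(path, 'rb')
f.seek(5)                     # go to the 6th byte in the file
sixth = f.read(1)
f.seek(-3, 2)                 # go to the 3rd byte before the end
third_from_end = f.read(1)
position = f.tell()           # offset from the beginning, in bytes
rest = f.read()               # everything up to the end of the file
at_eof = f.read()             # at end of file: an empty result
f.close()

try:
    f.read()                  # any use after close() fails
    closed_error = None
except ValueError as error:
    closed_error = error
print(sixth, third_from_end, position, rest, at_eof)
```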
File objects have some additional methods, such as isatty() and truncate() which are less frequently used; consult the Library Reference for a complete guide to file objects.
Strings can easily be written to and read from a file. Numbers take a bit more effort, since the read() method only returns strings, which will have to be passed to a function like int(), which takes a string like ‘123’ and returns its numeric value 123. However, when you want to save more complex data types like lists, dictionaries, or class instances, things get a lot more complicated.
Rather than have users be constantly writing and debugging code to save complicated data types, Python provides a standard module called pickle. This is an amazing module that can take almost any Python object (even some forms of Python code!), and convert it to a string representation; this process is called pickling. Reconstructing the object from the string representation is called unpickling. Between pickling and unpickling, the string representing the object may have been stored in a file or data, or sent over a network connection to some distant machine.
If you have an object x, and a file object f that’s been opened for writing, the simplest way to pickle the object takes only one line of code:
pickle.dump(x, f)
To unpickle the object again, if f is a file object which has been opened for reading:
x = pickle.load(f)
(There are other variants of this, used when pickling many objects or when you don’t want to write the pickled data to a file; consult the complete documentation for pickle in the Python Library Reference.)
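A round trip through pickle might look like the sketch below, written for Python 3. An in-memory io.BytesIO buffer stands in for the file object f (an assumption made to keep the example self-contained), and the sample dictionary is invented.

```python
import io
import pickle

original = {'name': 'spam', 'values': [1, 2, 3], 'nested': ('a', 2.5)}

buffer = io.BytesIO()             # stands in for a file opened for writing
pickle.dump(original, buffer)     # pickling

buffer.seek(0)                    # rewind, as if reopening the file for reading
restored = pickle.load(buffer)    # unpickling
print(restored == original)
```

The restored object is equal to the original but is a separate copy, which is what makes pickled data safe to hand to another program or a later run.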
pickle is the standard way to make Python objects which can be stored and reused by other programs or by a future invocation of the same program; the technical term for this is a persistent object. Because pickle is so widely used, many authors who write Python extensions take care to ensure that new data types such as matrices can be properly pickled and unpickled.
Until now error messages haven’t been more than mentioned, but if you have tried out the examples you have probably seen some. There are (at least) two distinguishable kinds of errors: syntax errors and exceptions.
Syntax errors, also known as parsing errors, are perhaps the most common kind of complaint you get while you are still learning Python:
>>> while True print 'Hello world'
  File "<stdin>", line 1, in ?
    while True print 'Hello world'
                   ^
SyntaxError: invalid syntax
The parser repeats the offending line and displays a little "arrow" pointing at the earliest point in the line where the error was detected. The error is caused by (or at least detected at) the token preceding the arrow: in the example, the error is detected at the keyword print, since a colon (":") is missing before it. File name and line number are printed so you know where to look in case the input came from a script.
Even if a statement or expression is syntactically correct, it may cause an error when an attempt is made to execute it. Errors detected during execution are called exceptions and are not unconditionally fatal: you will soon learn how to handle them in Python programs. Most exceptions are not handled by programs, however, and result in error messages as shown here:
>>> 10 * (1/0)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ZeroDivisionError: integer division or modulo by zero
>>> 4 + spam*3
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'spam' is not defined
>>> '2' + 2
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot concatenate 'str' and 'int' objects
The last line of the error message indicates what happened. Exceptions come in different types, and the type is printed as part of the message: the types in the example are ZeroDivisionError, NameError and TypeError. The string printed as the exception type is the name of the built-in exception that occurred. This is true for all built-in exceptions, but need not be true for user-defined exceptions (although it is a useful convention). Standard exception names are built-in identifiers (not reserved keywords).
The rest of the line provides detail whose interpretation depends on the exception type.
The Python Library Reference lists the built-in exceptions and their meanings.
It is possible to write programs that handle selected exceptions. Look at the following example, which asks the user for input until a valid integer has been entered, but allows the user to interrupt the program (using Control-C or whatever the operating system supports); note that a user-generated interruption is signalled by raising the KeyboardInterrupt exception.
>>> while True:
...     try:
...         x = int(raw_input("Please enter a number: "))
...         break
...     except ValueError:
...         print "Oops! That was no valid number.  Try again..."
...
The try statement works as follows.
First, the try clause (the statement(s) between the try and except keywords) is executed.
If no exception occurs, the except clause is skipped and execution of the try statement is finished.
If an exception occurs during execution of the try clause, the rest of the clause is skipped. Then if its type matches the exception named after the except keyword, the except clause is executed, and then execution continues after the try statement.
If an exception occurs which does not match the exception named in the except clause, it is passed on to outer try statements; if no handler is found, it is an unhandled exception and execution stops with a message as shown above.
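The steps above can be traced in a small sketch. The helper to_int() and its default argument are names made up for this illustration, and print is written with parentheses so the snippet also runs on newer Python versions.

```python
def to_int(text, default=None):
    """Return int(text), or default if text is not a valid integer."""
    try:
        value = int(text)       # may raise ValueError
    except ValueError:
        # Reached only when int() raised ValueError; the rest of the
        # try clause was skipped.
        return default
    return value                # reached only when no exception occurred

print(to_int('42'))
print(to_int('forty-two', -1))

try:
    to_int(None)                # int(None) raises TypeError, not ValueError
except TypeError:
    # The handler inside to_int() did not match, so the exception was
    # passed on to this outer try statement.
    print('TypeError passed through to the outer try statement')
```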
A try statement may have more than one except clause, to specify handlers for different exceptions. At most one handler will be executed. Handlers only handle exceptions that occur in the corresponding try clause, not in other handlers of the same try statement. An except clause may name multiple exceptions as a parenthesized list, for example:
... except (RuntimeError, TypeError, NameError):
...     pass
The last except clause may omit the exception name(s), to serve as a wildcard. Use this with extreme caution, since it is easy to mask a real programming error in this way! It can also be used to print an error message and then re-raise the exception (allowing a caller to handle the exception as well):
import sys

try:
    f = open('myfile.txt')
    s = f.readline()
    i = int(s.strip())
except IOError, (errno, strerror):
    print "I/O error(%s): %s" % (errno, strerror)
except ValueError:
    print "Could not convert data to an integer."
except:
    print "Unexpected error:", sys.exc_info()[0]
    raise
The try … except statement has an optional else clause, which, when present, must follow all except clauses. It is useful for code that must be executed if the try clause does not raise an exception. For example:
for arg in sys.argv[1:]:
    try:
        f = open(arg, 'r')
    except IOError:
        print 'cannot open', arg
    else:
        print arg, 'has', len(f.readlines()), 'lines'
        f.close()
The use of the else clause is better than adding additional code to the try clause because it avoids accidentally catching an exception that wasn’t raised by the code being protected by the try … except statement.
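To make the difference concrete, here is a small sketch in which only the dictionary lookup is protected. The function lookup_and_process() and the helper broken() are hypothetical names used only for this illustration.

```python
def lookup_and_process(table, key, process):
    try:
        value = table[key]          # only this line is protected
    except KeyError:
        return 'missing'
    else:
        # A KeyError raised inside process() is NOT caught above.
        return process(value)

def broken(value):
    raise KeyError('raised by process, not by the lookup')

table = {'a': 1}
print(lookup_and_process(table, 'a', lambda v: v + 1))
print(lookup_and_process(table, 'b', lambda v: v + 1))

try:
    lookup_and_process(table, 'a', broken)
except KeyError:
    print('KeyError from process() was not masked')
```

Had the call to process() been placed inside the try clause, the last case would have returned 'missing' and hidden the real error.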
When an exception occurs, it may have an associated value, also known as the exception’s argument. The presence and type of the argument depend on the exception type.
The except clause may specify a variable after the exception name (or list). The variable is bound to an exception instance with the arguments stored in instance.args. For convenience, the exception instance defines __getitem__ and __str__ so the arguments can be accessed or printed directly without having to reference .args.
>>> try:
...     raise Exception('spam', 'eggs')
... except Exception, inst:
...     print type(inst)     # the exception instance
...     print inst.args      # arguments stored in .args
...     print inst           # __str__ allows args to be printed directly
...     x, y = inst          # __getitem__ allows args to be unpacked directly
...     print 'x =', x
...     print 'y =', y
...
<type 'instance'>
('spam', 'eggs')
('spam', 'eggs')
x = spam
y = eggs
If an exception has an argument, it is printed as the last part ("detail") of the message for unhandled exceptions.
Exception handlers don’t just handle exceptions if they occur immediately in the try clause, but also if they occur inside functions that are called (even indirectly) in the try clause. For example:
>>> def this_fails():
...     x = 1/0
...
>>> try:
...     this_fails()
... except ZeroDivisionError, detail:
...     print 'Handling run-time error:', detail
...
Handling run-time error: integer division or modulo by zero
The raise statement allows the programmer to force a specified exception to occur. For example:
>>> raise NameError, 'HiThere'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: HiThere
The first argument to raise names the exception to be raised. The optional second argument specifies the exception’s argument.
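If you need to detect that an exception was raised but do not intend to handle it, a simpler form of the raise statement lets you re-raise the exception: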
>>> try:
...     raise NameError, 'HiThere'
... except NameError:
...     print 'An exception flew by!'
...     raise
...
An exception flew by!
Traceback (most recent call last):
  File "<stdin>", line 2, in ?
NameError: HiThere
Programs may name their own exceptions by creating a new exception class. Exceptions should typically be derived from the Exception class, either directly or indirectly. For example:
>>> class MyError(Exception):
...     def __init__(self, value):
...         self.value = value
...     def __str__(self):
...         return repr(self.value)
...
>>> try:
...     raise MyError(2*2)
... except MyError, e:
...     print 'My exception occurred, value:', e.value
...
My exception occurred, value: 4
>>> raise MyError, 'oops!'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
__main__.MyError: 'oops!'
Exception classes can be defined which do anything any other class can do, but are usually kept simple, often only offering a number of attributes that allow information about the error to be extracted by handlers for the exception. When creating a module which can raise several distinct errors, a common practice is to create a base class for exceptions defined by that module, and subclass that to create specific exception classes for different error conditions:
class Error(Exception):
    """Base class for exceptions in this module."""
    pass

class InputError(Error):
    """Exception raised for errors in the input.

    Attributes:
        expression -- input expression in which the error occurred
        message -- explanation of the error
    """

    def __init__(self, expression, message):
        self.expression = expression
        self.message = message

class TransitionError(Error):
    """Raised when an operation attempts a state transition that's not
    allowed.

    Attributes:
        previous -- state at beginning of transition
        next -- attempted new state
        message -- explanation of why the specific transition is not allowed
    """

    def __init__(self, previous, next, message):
        self.previous = previous
        self.next = next
        self.message = message
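One benefit of such a hierarchy is that a caller can handle one specific error and still catch everything else the module raises through the base class. The sketch below repeats a condensed copy of the classes so that it runs on its own. The function describe() and the sample arguments are invented for this illustration, and the handlers use the "except ... as" spelling accepted by Python 2.6 and later rather than the comma form shown in this chapter.

```python
class Error(Exception):
    """Base class for exceptions in this (hypothetical) module."""

class InputError(Error):
    def __init__(self, expression, message):
        self.expression = expression
        self.message = message

class TransitionError(Error):
    def __init__(self, previous, next, message):
        self.previous = previous
        self.next = next
        self.message = message

def describe(exc_factory):
    """Raise the exception built by exc_factory and report how it was caught."""
    try:
        raise exc_factory()
    except InputError as e:
        # Most specific handler first.
        return 'input error in %r: %s' % (e.expression, e.message)
    except Error as e:
        # Any other exception derived from the module's base class.
        return 'other module error: %s' % e.message

print(describe(lambda: InputError('1 +', 'unexpected end of input')))
print(describe(lambda: TransitionError('idle', 'done', 'must run first')))
```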
Most exceptions are defined with names that end in "Error", similar to the naming of the standard exceptions.
Many standard modules define their own exceptions to report errors that may occur in functions they define. More information on classes is presented in chapter 9, "Classes".
The try statement has another optional clause which is intended to define clean-up actions that must be executed under all circumstances. For example:
>>> try:
...     raise KeyboardInterrupt
... finally:
...     print 'Goodbye, world!'
...
Goodbye, world!
Traceback (most recent call last):
  File "<stdin>", line 2, in ?
KeyboardInterrupt
A finally clause is executed whether or not an exception has occurred in the try clause. When an exception has occurred, it is re-raised after the finally clause is executed. The finally clause is also executed "on the way out" when the try statement is left via a break or return statement.
The code in the finally clause is useful for releasing external resources (such as files or network connections), regardless of whether or not the use of the resource was successful.
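The following sketch shows a clean-up action running both when the try clause returns normally and when it raises. The Resource class is a hypothetical stand-in for a file or network connection, and use() is a name chosen only for this example.

```python
class Resource:
    """Hypothetical stand-in for a file or network connection."""
    def __init__(self):
        self.is_open = True
    def close(self):
        self.is_open = False

def use(resource, fail):
    try:
        if fail:
            raise RuntimeError('operation failed')
        return 'ok'
    finally:
        resource.close()    # runs on return and on exception alike

r1 = Resource()
print(use(r1, False))
print(r1.is_open)           # closed on the way out of the return

r2 = Resource()
try:
    use(r2, True)
except RuntimeError:
    print('error propagated after clean-up')
print(r2.is_open)           # closed before the exception was re-raised
```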
A try statement must either have one or more except clauses or one finally clause, but not both.
In C++ terminology, all class members (including the data members) are public, and all member functions are virtual. There are no special constructors or destructors. As in Modula-3, there are no shorthands for referencing the object’s members from its methods: the method function is declared with an explicit first argument representing the object, which is provided implicitly by the call. As in Smalltalk, classes themselves are objects, albeit in the wider sense of the word: in Python, all data types are objects. This provides semantics for importing and renaming. Unlike C++ and Modula-3, built-in types can be used as base classes for extension by the user. Also, like in C++ but unlike in Modula-3, most built-in operators with special syntax (arithmetic operators, subscripting etc.) can be redefined for class instances.
Lacking universally accepted terminology to talk about classes, I will make occasional use of Smalltalk and C++ terms. (I would use Modula-3 terms, since its object-oriented semantics are closer to those of Python than C++, but I expect that few readers have heard of it.)
I also have to warn you that there’s a terminological pitfall for object-oriented readers: the word "object" in Python does not necessarily mean a class instance. Like C++ and Modula-3, and unlike Smalltalk, not all types in Python are classes: the basic built-in types like integers and lists are not, and even somewhat more exotic types like files aren’t. However, all Python types share a little bit of common semantics that is best described by using the word object.
Objects have individuality, and multiple names (in multiple scopes) can be bound to the same object. This is known as aliasing in other languages. This is usually not appreciated on a first glance at Python, and can be safely ignored when dealing with immutable basic types (numbers, strings, tuples). However, aliasing has an (intended!) effect on the semantics of Python code involving mutable objects such as lists, dictionaries, and most types representing entities outside the program (files, windows, etc.). This is usually used to the benefit of the program, since aliases behave like pointers in some respects. For example, passing an object is cheap since only a pointer is passed by the implementation; and if a function modifies an object passed as an argument, the caller will see the change – this eliminates the need for two different argument passing mechanisms as in Pascal.
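A short sketch of the effect of aliasing, using made-up names: a list reached through two names is changed by a function, while rebinding a name that refers to an immutable number leaves the other name alone.

```python
def append_zero(seq):
    seq.append(0)      # modifies the object the caller passed in

a = [1, 2, 3]
b = a                  # b is a second name for the same list object
append_zero(b)
print(a)               # the change is visible through both names
print(a is b)

n = 10
m = n
m = m + 1              # rebinds m to a new object; n is unaffected
print(n)
```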
Before introducing classes, I first have to tell you something about Python’s scope rules. Class definitions play some neat tricks with namespaces, and you need to know how scopes and namespaces work to fully understand what’s going on. Incidentally, knowledge about this subject is useful for any advanced Python programmer.
Let’s begin with some definitions.
By the way, I use the word attribute for any name following a dot – for example, in the expression z.real, real is an attribute of the object z. Strictly speaking, references to names in modules are attribute references: in the expression modname.funcname, modname is a module object and funcname is an attribute of it. In this case there happens to be a straightforward mapping between the module’s attributes and the global names defined in the module: they share the same namespace!
Attributes may be read-only or writable. In the latter case, assignment to attributes is possible. Module attributes are writable: you can write “modname.the_answer = 42”. Writable attributes may also be deleted with the del statement. For example, “del modname.the_answer” will remove the attribute the_answer from the object named by modname.
Namespaces are created at different moments and have different lifetimes. The namespace containing the built-in names is created when the Python interpreter starts up, and is never deleted. The global namespace for a module is created when the module definition is read in; normally, module namespaces also last until the interpreter quits. The statements executed by the top-level invocation of the interpreter, either read from a script file or interactively, are considered part of a module called __main__, so they have their own global namespace. (The built-in names actually also live in a module; this is called __builtin__.)
The local namespace for a function is created when the function is called, and deleted when the function returns or raises an exception that is not handled within the function. (Actually, forgetting would be a better way to describe what actually happens.) Of course, recursive invocations each have their own local namespace.
A scope is a textual region of a Python program where a namespace is directly accessible. "Directly accessible" here means that an unqualified reference to a name attempts to find the name in the namespace.
Although scopes are determined statically, they are used dynamically. At any time during execution, there are at least three nested scopes whose namespaces are directly accessible: the innermost scope, which is searched first, contains the local names; the namespaces of any enclosing functions, which are searched starting with the nearest enclosing scope; the middle scope, searched next, contains the current module’s global names; and the outermost scope (searched last) is the namespace containing built-in names.
If a name is declared global, then all references and assignments go directly to the middle scope containing the module’s global names. Otherwise, all variables found outside of the innermost scope are read-only.
Usually, the local scope references the local names of the (textually) current function. Outside of functions, the local scope references the same namespace as the global scope: the module’s namespace. Class definitions place yet another namespace in the local scope.
It is important to realize that scopes are determined textually: the global scope of a function defined in a module is that module’s namespace, no matter from where or by what alias the function is called. On the other hand, the actual search for names is done dynamically, at run time – however, the language definition is evolving towards static name resolution, at "compile" time, so don’t rely on dynamic name resolution! (In fact, local variables are already determined statically.)
A special quirk of Python is that assignments always go into the innermost scope. Assignments do not copy data – they just bind names to objects. The same is true for deletions: the statement “del x” removes the binding of x from the namespace referenced by the local scope. In fact, all operations that introduce new names use the local scope: in particular, import statements and function definitions bind the module or function name in the local scope. (The global statement can be used to indicate that particular variables live in the global scope.)
Classes introduce a little bit of new syntax, three new object types, and some new semantics.
class ClassName:
Class definitions, like function definitions (def statements) must be executed before they have any effect. (You could conceivably place a class definition in a branch of an if statement, or inside a function.)
In practice, the statements inside a class definition will usually be function definitions, but other statements are allowed, and sometimes useful – we’ll come back to this later. The function definitions inside a class normally have a peculiar form of argument list, dictated by the calling conventions for methods – again, this is explained later.
When a class definition is entered, a new namespace is created, and used as the local scope – thus, all assignments to local variables go into this new namespace. In particular, function definitions bind the name of the new function here.
When a class definition is left normally (via the end), a class object is created. This is basically a wrapper around the contents of the namespace created by the class definition; we’ll learn more about class objects in the next section. The original local scope (the one in effect just before the class definition was entered) is reinstated, and the class object is bound here to the class name given in the class definition header (ClassName in the example).
Attribute references use the standard syntax used for all attribute references in Python: obj.name. Valid attribute names are all the names that were in the class’s namespace when the class object was created. So, if the class definition looked like this:
class MyClass:
    "A simple example class"
    i = 12345
    def f(self):
        return 'hello world'
then MyClass.i and MyClass.f are valid attribute references, returning an integer and a method object, respectively. Class attributes can also be assigned to, so you can change the value of MyClass.i by assignment. __doc__ is also a valid attribute, returning the docstring belonging to the class: "A simple example class".
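A brief sketch of these attribute references, repeating the class so that the snippet runs on its own. The new value assigned to MyClass.i is arbitrary.

```python
class MyClass:
    "A simple example class"
    i = 12345
    def f(self):
        return 'hello world'

print(MyClass.i)
print(MyClass.__doc__)

MyClass.i = 54321          # class attributes can be assigned to
print(MyClass.i)
print(MyClass().i)         # instances see the new value as well
```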
Class instantiation uses function notation. Just pretend that the class object is a parameterless function that returns a new instance of the class. For example (assuming the above class):
x = MyClass()
creates a new instance of the class and assigns this object to the local variable x.
The instantiation operation ("calling" a class object) creates an empty object. Many classes like to create objects in a known initial state. Therefore a class may define a special method named __init__(), like this:
    def __init__(self):
        self.data = []
When a class defines an __init__() method, class instantiation automatically invokes __init__() for the newly-created class instance. So in this example, a new, initialized instance can be obtained by:
x = MyClass()
Of course, the __init__() method may have arguments for greater flexibility. In that case, arguments given to the class instantiation operator are passed on to __init__(). For example,
>>> class Complex:
...     def __init__(self, realpart, imagpart):
...         self.r = realpart
...         self.i = imagpart
...
>>> x = Complex(3.0, -4.5)
>>> x.r, x.i
(3.0, -4.5)
The first kind of attribute reference understood by instance objects I’ll call data attributes. These correspond to "instance variables" in Smalltalk, and to "data members" in C++. Data attributes need not be declared; like local variables, they spring into existence when they are first assigned to. For example, if x is the instance of MyClass created above, the following piece of code will print the value 16, without leaving a trace:
x.counter = 1
while x.counter < 10:
    x.counter = x.counter * 2
print x.counter
del x.counter
The second kind of attribute reference understood by instance objects is the method. A method is a function that "belongs to" an object. (In Python, the term method is not unique to class instances: other object types can have methods as well. For example, list objects have methods called append, insert, remove, sort, and so on. However, below, we’ll use the term method exclusively to mean methods of class instance objects, unless explicitly stated otherwise.)
Valid method names of an instance object depend on its class. By definition, all attributes of a class that are (user-defined) function objects define corresponding methods of its instances. So in our example, x.f is a valid method reference, since MyClass.f is a function, but x.i is not, since MyClass.i is not. But x.f is not the same thing as MyClass.f – it is a method object, not a function object.
In our example, calling x.f() will return the string 'hello world'. However, it is not necessary to call a method right away: x.f is a method object, and can be stored away and called at a later time. For example:
xf = x.f
while True:
    print xf()
will continue to print “hello world” until the end of time.
What exactly happens when a method is called? You may have noticed that x.f() was called without an argument above, even though the function definition for f specified an argument. What happened to the argument? Surely Python raises an exception when a function that requires an argument is called without any – even if the argument isn’t actually used…
Actually, you may have guessed the answer: the special thing about methods is that the object is passed as the first argument of the function. In our example, the call x.f() is exactly equivalent to MyClass.f(x). In general, calling a method with a list of n arguments is equivalent to calling the corresponding function with an argument list that is created by inserting the method’s object before the first argument.
If you still don’t understand how methods work, a look at the implementation can perhaps clarify matters. When an instance attribute is referenced that isn’t a data attribute, its class is searched. If the name denotes a valid class attribute that is a function object, a method object is created by packing (pointers to) the instance object and the function object just found together in an abstract object: this is the method object. When the method object is called with an argument list, it is unpacked again, a new argument list is constructed from the instance object and the original argument list, and the function object is called with this new argument list.
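The equivalence between calling a method and calling the underlying function with the instance inserted as first argument can be checked directly. MyClass is repeated here so the sketch is self-contained; the Greeter class and its arguments are hypothetical.

```python
class MyClass:
    "A simple example class"
    i = 12345
    def f(self):
        return 'hello world'

x = MyClass()
print(x.f() == MyClass.f(x))   # the instance is passed as first argument

xf = x.f                       # a method object: instance plus function
print(xf())                    # can be called later without naming x

class Greeter:
    def greet(self, name, punctuation):
        return 'hello ' + name + punctuation

g = Greeter()
# Calling with n arguments inserts the instance before the first one.
print(g.greet('world', '!') == Greeter.greet(g, 'world', '!'))
```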
Data attributes override method attributes with the same name; to avoid accidental name conflicts, which may cause hard-to-find bugs in large programs, it is wise to use some kind of convention that minimizes the chance of conflicts. Possible conventions include capitalizing method names, prefixing data attribute names with a small unique string (perhaps just an underscore), or using verbs for methods and nouns for data attributes.
Data attributes may be referenced by methods as well as by ordinary users ("clients") of an object. In other words, classes are not usable to implement pure abstract data types. In fact, nothing in Python makes it possible to enforce data hiding – it is all based upon convention. (On the other hand, the Python implementation, written in C, can completely hide implementation details and control access to an object if necessary; this can be used by extensions to Python written in C.)
Clients should use data attributes with care – clients may mess up invariants maintained by the methods by stamping on their data attributes. Note that clients may add data attributes of their own to an instance object without affecting the validity of the methods, as long as name conflicts are avoided – again, a naming convention can save a lot of headaches here.
There is no shorthand for referencing data attributes (or other methods!) from within methods. I find that this actually increases the readability of methods: there is no chance of confusing local variables and instance variables when glancing through a method.
Conventionally, the first argument of methods is often called self. This is nothing more than a convention: the name self has absolutely no special meaning to Python. (Note, however, that by not following the convention your code may be less readable by other Python programmers, and it is also conceivable that a class browser program be written which relies upon such a convention.)
Any function object that is a class attribute defines a method for instances of that class. It is not necessary that the function definition is textually enclosed in the class definition: assigning a function object to a local variable in the class is also ok. For example:
# Function defined outside the class
def f1(self, x, y):
    return min(x, x+y)

class C:
    f = f1
    def g(self):
        return 'hello world'
    h = g
Now f, g and h are all attributes of class C that refer to function objects, and consequently they are all methods of instances of C – h being exactly equivalent to g. Note that this practice usually only serves to confuse the reader of a program.
Methods may call other methods by using method attributes of the self argument:
class Bag:
    def __init__(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)
Methods may reference global names in the same way as ordinary functions. The global scope associated with a method is the module containing the class definition. (The class itself is never used as a global scope!) While one rarely encounters a good reason for using global data in a method, there are many legitimate uses of the global scope: for one thing, functions and modules imported into the global scope can be used by methods, as well as functions and classes defined in it. Usually, the class containing the method is itself defined in this global scope, and in the next section we’ll find some good reasons why a method would want to reference its own class!
Of course, a language feature would not be worthy of the name "class" without supporting inheritance. The syntax for a derived class definition looks as follows:
class DerivedClassName(BaseClassName):
The name BaseClassName must be defined in a scope containing the derived class definition. Instead of a base class name, an expression is also allowed. This is useful when the base class is defined in another module:
class DerivedClassName(modname.BaseClassName):
Execution of a derived class definition proceeds the same as for a base class. When the class object is constructed, the base class is remembered. This is used for resolving attribute references: if a requested attribute is not found in the class, it is searched in the base class. This rule is applied recursively if the base class itself is derived from some other class.
There’s nothing special about instantiation of derived classes: DerivedClassName() creates a new instance of the class. Method references are resolved as follows: the corresponding class attribute is searched, descending down the chain of base classes if necessary, and the method reference is valid if this yields a function object.
Derived classes may override methods of their base classes. Because methods have no special privileges when calling other methods of the same object, a method of a base class that calls another method defined in the same base class, may in fact end up calling a method of a derived class that overrides it. (For C++ programmers: all methods in Python are effectively virtual.)
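A small sketch of this behaviour, with class and method names invented for the purpose: describe() is defined only in the base class, yet it ends up calling the name() method of whichever class the instance actually belongs to.

```python
class Base:
    def describe(self):
        return 'I am ' + self.name()   # looked up on the actual instance
    def name(self):
        return 'Base'

class Derived(Base):
    def name(self):                    # overrides Base.name
        return 'Derived'

print(Base().describe())
print(Derived().describe())
```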
An overriding method in a derived class may in fact want to extend rather than simply replace the base class method of the same name. There is a simple way to call the base class method directly: just call “BaseClassName.methodname(self, arguments)”. This is occasionally useful to clients as well. (Note that this only works if the base class is defined or imported directly in the global scope.)
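As an illustration of extending rather than replacing a method, the overriding format() below calls the base class version explicitly and adds to its result. Logger, TimestampLogger and the fixed time stamp are hypothetical.

```python
class Logger:
    def format(self, text):
        return 'log: ' + text

class TimestampLogger(Logger):
    def format(self, text):
        # Extend, rather than replace, the inherited behaviour by
        # calling the base class method directly.
        base = Logger.format(self, text)
        return '[12:00] ' + base

print(Logger().format('started'))
print(TimestampLogger().format('started'))
```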
class DerivedClassName(Base1, Base2, Base3):
The only rule necessary to explain the semantics is the resolution rule used for class attribute references. This is depth-first, left-to-right. Thus, if an attribute is not found in DerivedClassName, it is searched in Base1, then (recursively) in the base classes of Base1, and only if it is not found there, it is searched in Base2, and so on.
(To some people breadth first – searching Base2 and Base3 before the base classes of Base1 – looks more natural. However, this would require you to know whether a particular attribute of Base1 is actually defined in Base1 or in one of its base classes before you can figure out the consequences of a name conflict with an attribute of Base2. The depth-first rule makes no differences between direct and inherited attributes of Base1.)
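The depth-first, left-to-right rule can be observed with a small hierarchy of made-up classes. The attribute origin is defined both in a base class of Base1 and in Base2; the lookup finds the one reached through Base1 first. The example deliberately avoids a common base class shared by Base1 and Base2, a case for which later Python versions use a different ordering with new-style classes.

```python
class Root:
    origin = 'Root'

class Base1(Root):
    pass

class Base2:
    origin = 'Base2'

class Derived(Base1, Base2):
    pass

# Not found in Derived or Base1, so Root (the base class of Base1) is
# searched before Base2.
print(Derived.origin)
```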
It is clear that indiscriminate use of multiple inheritance is a maintenance nightmare, given the reliance in Python on conventions to avoid accidental name conflicts. A well-known problem with multiple inheritance is a class derived from two classes that happen to have a common base class. While it is easy enough to figure out what happens in this case (the instance will have a single copy of "instance variables" or data attributes used by the common base class), it is not clear that these semantics are in any way useful.
There is limited support for class-private identifiers. Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is now textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped. This mangling is done without regard of the syntactic position of the identifier, so it can be used to define class-private instance and class variables, methods, as well as globals, and even to store instance variables private to this class on instances of other classes. Truncation may occur when the mangled name would be longer than 255 characters. Outside classes, or when the class name consists of only underscores, no mangling occurs.
Name mangling is intended to give classes an easy way to define "private" instance variables and methods, without having to worry about instance variables defined by derived classes, or mucking with instance variables by code outside the class. Note that the mangling rules are designed mostly to avoid accidents; it still is possible for a determined soul to access or modify a variable that is considered private. This can even be useful in special circumstances, such as in the debugger, and that’s one reason why this loophole is not closed. (Buglet: derivation of a class with the same name as the base class makes use of private variables of the base class possible.)
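The sketch below shows mangling at work, using hypothetical Account and Savings classes. Each class gets its own __balance variable on the same instance, the mangled names remain reachable from outside, and a name passed as a string to hasattr() is not mangled.

```python
class Account:
    def __init__(self):
        self.__balance = 0             # stored as _Account__balance
    def deposit(self, amount):
        self.__balance = self.__balance + amount
    def balance(self):
        return self.__balance

class Savings(Account):
    def __init__(self):
        Account.__init__(self)
        self.__balance = 'unrelated'   # stored as _Savings__balance

s = Savings()
s.deposit(50)
print(s.balance())               # Account's variable was not clobbered
print(s._Account__balance)       # a determined soul can still get in
print(s._Savings__balance)
print(hasattr(s, '__balance'))   # the string is looked up unmangled
```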
Notice that code passed to exec, eval() or execfile() does not consider the classname of the invoking class to be the current class; this is similar to the effect of the global statement, the effect of which is likewise restricted to code that is byte-compiled together. The same restriction applies to getattr(), setattr() and delattr(), as well as when referencing __dict__ directly.
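The mangling rules can be observed directly. The following is a minimal sketch; the class names Account and Savings and the attribute __balance are made up for illustration and do not come from the tutorial.

```python
class Account:
    def __init__(self):
        self.__balance = 100      # stored on the instance as _Account__balance

    def balance(self):
        return self.__balance     # mangled the same way inside the class body


class Savings(Account):
    def __init__(self):
        Account.__init__(self)
        self.__balance = 5        # stored as _Savings__balance, so no clash


acct = Savings()
print(acct.balance())             # the value set by Account
print(sorted(vars(acct)))         # both mangled names live side by side
print(acct._Account__balance)     # still reachable by a determined caller
```

The derived class can reuse the name __balance without disturbing the base class, which is the accident the mangling is meant to prevent.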
class Employee:
    pass

john = Employee() # Create an empty employee record
john.name = 'John Doe'
john.dept = 'computer lab'
john.salary = 1000
A piece of Python code that expects a particular abstract data type can often be passed a class that emulates the methods of that data type instead. For instance, if you have a function that formats some data from a file object, you can define a class with methods read() and readline() that gets the data from a string buffer instead, and pass it as an argument.
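As a rough illustration of this idea, the sketch below defines a small class with read() and readline() methods backed by a string. The names StringReader and first_line_upper are hypothetical, chosen only for this example.

```python
class StringReader:
    """Hypothetical file-like object that reads from a string buffer."""

    def __init__(self, text):
        self.lines = text.splitlines(True)   # keep the line endings
        self.pos = 0

    def readline(self):
        if self.pos >= len(self.lines):
            return ''                        # empty string signals end of data
        line = self.lines[self.pos]
        self.pos += 1
        return line

    def read(self):
        rest = ''.join(self.lines[self.pos:])
        self.pos = len(self.lines)
        return rest


def first_line_upper(source):
    """Works with anything that provides readline()."""
    return source.readline().strip().upper()


reader = StringReader('spam\neggs\n')
print(first_line_upper(reader))
print(repr(reader.read()))
```

The function never checks the type of its argument; it only relies on the readline() method being present, so a real file object would work just as well.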
Instance method objects have attributes, too: m.im_self is the object of which the method is an instance, and m.im_func is the function object corresponding to the method.
User-defined exceptions are identified by classes as well. Using this mechanism it is possible to create extensible hierarchies of exceptions.
There are two new valid (semantic) forms for the raise statement:
raise Class, instance
raise instance
In the first form, instance must be an instance of Class or of a class derived from it. The second form is a shorthand for:
raise instance.__class__, instance
A class in an except clause is compatible with an exception if it is the same class or a base class thereof (but not the other way around – an except clause listing a derived class is not compatible with a base class). For example, the following code will print B, C, D in that order:
class B:
    pass
class C(B):
    pass
class D(C):
    pass

for c in [B, C, D]:
    try:
        raise c()
    except D:
        print "D"
    except C:
        print "C"
    except B:
        print "B"
Note that if the except clauses were reversed (with “except B” first), it would have printed B, B, B – the first matching except clause is triggered.
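The effect of clause order can be checked side by side. This is a sketch written for Python 3, where exception classes must derive from Exception; that base class and the helper names derived_first and base_first are assumptions added for the example.

```python
class B(Exception):
    pass

class C(B):
    pass

class D(C):
    pass


def derived_first(exc_class):
    """Most specific clause first: each class reaches its own handler."""
    try:
        raise exc_class()
    except D:
        return 'D'
    except C:
        return 'C'
    except B:
        return 'B'


def base_first(exc_class):
    """Base class first: it matches every instance of B, C and D."""
    try:
        raise exc_class()
    except B:
        return 'B'
    except C:       # never reached, B already matched
        return 'C'
    except D:       # never reached
        return 'D'


print([derived_first(c) for c in (B, C, D)])
print([base_first(c) for c in (B, C, D)])
```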
When an error message is printed for an unhandled exception which is a class, the class name is printed, then a colon and a space, and finally the instance converted to a string using the built-in function str().
By now, you’ve probably noticed that most container objects can be looped over using a for statement:
for element in [1, 2, 3]:
    print element
for element in (1, 2, 3):
    print element
for key in {'one':1, 'two':2}:
    print key
for char in "123":
    print char
for line in open("myfile.txt"):
    print line
This style of access is clear, concise, and convenient. The use of iterators pervades and unifies Python. Behind the scenes, the for statement calls iter() on the container object. The function returns an iterator object that defines the method next() which accesses elements in the container one at a time. When there are no more elements, next() raises a StopIteration exception which tells the for loop to terminate. This example shows how it all works:
>>> s = 'abc'
>>> it = iter(s)
>>> it.next(), it.next(), it.next()
('a', 'b', 'c')
>>> it.next()
Traceback (most recent call last):
  ...
StopIteration
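What the for statement does behind the scenes can be spelled out by hand. The sketch below uses the built-in next() function (available in Python 2.6 and later, and in Python 3) instead of calling the next() method directly; the function name manual_for is made up for the example.

```python
def manual_for(container):
    """Collect the items of container the way a for loop would visit them."""
    seen = []
    it = iter(container)          # ask the container for an iterator
    while True:
        try:
            item = next(it)       # fetch one element at a time
        except StopIteration:     # no more elements: leave the loop
            break
        seen.append(item)
    return seen


print(manual_for('abc'))
print(manual_for((1, 2, 3)))
```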
Having seen the mechanics behind the iterator protocol, it is easy to add iterator behavior to your classes. Define an __iter__() method which returns an object with a next() method. If the class defines next(), then __iter__() can just return self:
class Reverse:
    "Iterator for looping over a sequence backwards"
    def __init__(self, data):
        self.data = data
        self.index = len(data)
    def __iter__(self):
        return self
    def next(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

for char in Reverse('spam'):
    print char
Generators are a simple and powerful tool for creating iterators. They are written like regular functions but use the yield statement whenever they want to return data. Each time next() is called, the generator resumes where it left off (it remembers all the data values and which statement was last executed). An example shows that generators can be trivially easy to create:
def reverse(data):
    for index in range(len(data)-1, -1, -1):
        yield data[index]

for char in reverse('golf'):
    print char
Anything that can be done with generators can also be done with class based iterators as described in the previous section. What makes generators so compact is that the __iter__() and next() methods are created automatically.
Another key feature is that the local variables and execution state are automatically saved between calls. This makes the function easier to write and much clearer than an approach using instance variables like self.index and self.data.
In addition to automatic method creation and saving program state, when generators terminate, they automatically raise StopIteration. In combination, these features make it easy to create iterators with no more effort than writing a regular function.
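These three properties (automatic methods, saved state, automatic StopIteration) can be seen in a few lines. The generator name countdown is made up for this sketch.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    n = start
    while n > 0:
        yield n       # execution pauses here and resumes on the next request
        n -= 1        # the local variable n is preserved between calls


gen = countdown(3)
print(iter(gen) is gen)     # a generator object is its own iterator
print(list(gen))            # list() stops when StopIteration is raised
print(list(gen))            # already exhausted, so nothing is left
```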
Except for one thing. Module objects have a secret read-only attribute called __dict__ which returns the dictionary used to implement the module’s namespace; the name __dict__ is an attribute but not a global name. Obviously, using this violates the abstraction of namespace implementation, and should be restricted to things like post-mortem debuggers.
The os module provides dozens of functions for interacting with the operating system:
import os
os.system('time 0:02')
os.getcwd() # Return the current working directory
Be sure to use the “import os” style instead of “from os import *”. This will keep os.open() from shadowing the builtin open() function which operates much differently.
The builtin dir() and help() functions are useful as interactive aids for working with large modules like os:
import os
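Because dir() returns an ordinary list of names, it can be filtered like any other list, which helps with a module as large as os. A small sketch; the variable name cwd_names is made up, and the exact names found vary between Python versions.

```python
import os

# dir() returns a plain list of attribute names, so it can be searched.
cwd_names = [name for name in dir(os) if 'cwd' in name]
print(cwd_names)

# help(os.getcwd) would show the documentation for one of the matches.
print(callable(os.getcwd))
```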
For daily file and directory management tasks, the shutil module provides a higher level interface that is easier to use:
import shutil
shutil.copyfile('data.db', 'archive.db')
shutil.move('/build/executables', 'installdir')
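The two calls can be tried safely inside a scratch directory. This sketch creates its own temporary files with tempfile; the directory layout and the sub-directory name old are assumptions made for the example.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()                 # scratch directory for the demo
src = os.path.join(workdir, 'data.db')
with open(src, 'w') as f:
    f.write('records')

copy = os.path.join(workdir, 'archive.db')
shutil.copyfile(src, copy)                   # copies the file contents

os.mkdir(os.path.join(workdir, 'old'))
moved = os.path.join(workdir, 'old', 'archive.db')
shutil.move(copy, moved)                     # relocates the copy

with open(moved) as f:
    moved_text = f.read()
remaining = sorted(os.listdir(workdir))
print(remaining)
print(moved_text)

shutil.rmtree(workdir)                       # clean up the scratch directory
```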
The glob module provides a function for making file lists from directory wildcard searches:
>>> import glob
>>> glob.glob('*.py')
['primes.py', 'random.py', 'quote.py']
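A self-contained variant that does not depend on the current directory is sketched below. It creates a few empty files in a temporary directory first; the file names are illustrative, and the result is sorted because glob does not promise any particular order.

```python
import glob
import os
import shutil
import tempfile

demo_dir = tempfile.mkdtemp()                # scratch directory for the demo
for name in ('primes.py', 'quote.py', 'notes.txt'):
    open(os.path.join(demo_dir, name), 'w').close()

pattern = os.path.join(demo_dir, '*.py')
matches = sorted(os.path.basename(p) for p in glob.glob(pattern))
print(matches)                               # notes.txt is filtered out

shutil.rmtree(demo_dir)                      # clean up
```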
Common utility scripts often need to process command line arguments. These arguments are stored in the sys module’s argv attribute as a list. For instance the following output results from running “python demo.py one two three” at the command line:
import sys
print sys.argv
['demo.py', 'one', 'two', 'three']
The getopt module processes sys.argv using the conventions of the Unix getopt() function. More powerful and flexible command line processing is provided by the optparse module.
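A short getopt sketch follows. To keep it repeatable it parses a hand-written argument list rather than the real sys.argv; the option names -v and --output are hypothetical.

```python
import getopt

# Hypothetical command line: demo.py -v --output=report.txt one two
argv = ['demo.py', '-v', '--output=report.txt', 'one', 'two']

# 'v' is a plain flag, 'o:' takes a value; the long option 'output=' also takes one.
opts, args = getopt.getopt(argv[1:], 'vo:', ['output='])
print(opts)     # list of (option, value) pairs
print(args)     # whatever is left after the options
```

In a real script the first argument to getopt.getopt() would be sys.argv[1:].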
The sys module also has attributes for stdin, stdout, and stderr. The latter is useful for emitting warnings and error messages to make them visible even when stdout has been redirected:
sys.stderr.write('Warning, log file not found starting a new one')
Warning, log file not found starting a new one
The most direct way to terminate a script is to use “sys.exit()”.
The re module provides regular expression tools for advanced string processing. For complex matching and manipulation, regular expressions offer succinct, optimized solutions:
>>> import re
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'
When only simple capabilities are needed, string methods are preferred because they are easier to read and debug:
>>> 'tea for too'.replace('too', 'two')
'tea for two'
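The trade-off can be seen by doing the same fixed-string replacement both ways. The sample sentence and variable names here are illustrative only.

```python
import re

text = 'tea for too'
by_method = text.replace('too', 'two')    # plain string method
by_regex = re.sub(r'too', 'two', text)    # same result, more machinery
print(by_method == by_regex)

# A regular expression becomes worthwhile once the pattern is not a fixed string,
# for example "every word that starts with t".
masked = re.sub(r'\bt\w*', 'X', text)
print(masked)
```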
The math module gives access to the underlying C library functions for floating point math:
import math
math.cos(math.pi / 4.0)
math.log(1024, 2)
The random module provides tools for making random selections:
import random
random.choice(['apple', 'pear', 'banana'])
random.sample(xrange(100), 10) # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
random.random() # random float
random.randrange(6) # random integer chosen from range(6)
There are a number of modules for accessing the internet and processing internet protocols. Two of the simplest are urllib2 for retrieving data from URLs and smtplib for sending mail:
import urllib2
for line in urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl'):
    if 'EST' in line: # look for Eastern Standard Time
        print line
Nov. 25, 09:43:32 PM EST
import smtplib
server = smtplib.SMTP('localhost')
server.sendmail('[email protected]', '[email protected]',
"""To: [email protected]
From: [email protected]

Beware the Ides of March.
""")
server.quit()
The datetime module supplies classes for manipulating dates and times in both simple and complex ways. While date and time arithmetic is supported, the focus of the implementation is on efficient member extraction for output formatting and manipulation. The module also supports objects that are time zone aware.
>>> from datetime import date
>>> now = date.today()
>>> now
datetime.date(2003, 12, 2)
>>> now.strftime("%m-%d-%y or %d%b %Y is a %A on the %d day of %B")
'12-02-03 or 02Dec 2003 is a Tuesday on the 02 day of December'
>>> birthday = date(1964, 7, 31)
>>> age = now - birthday
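The date arithmetic mentioned above can be made repeatable by using a fixed date in place of date.today(). In this sketch the variable name today and the 30-day offset are assumptions; the two dates are the ones from the example.

```python
from datetime import date, timedelta

birthday = date(1964, 7, 31)
today = date(2003, 12, 2)            # fixed date so the result is repeatable

age = today - birthday               # subtracting two dates gives a timedelta
print(age.days)

later = today + timedelta(days=30)   # adding a timedelta gives a new date
print(later.strftime('%Y-%m-%d'))
```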
Common data archiving and compression formats are directly supported by modules including: zlib, gzip, bz2, zipfile, and tarfile.
>>> import zlib
>>> s = 'witch which has which witches wrist watch'
>>> t = zlib.compress(s)
>>> zlib.decompress(t)
'witch which has which witches wrist watch'
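A round-trip check makes the behaviour explicit. This sketch is written for Python 3, where zlib works on bytes rather than text strings, hence the b prefix; the variable name restored is made up.

```python
import zlib

s = b'witch which has which witches wrist watch'   # bytes, as Python 3 requires
t = zlib.compress(s)
restored = zlib.decompress(t)

print(len(s), len(t))        # original and compressed sizes
print(restored == s)         # decompression restores the exact input
```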
Some Python users develop a deep interest in knowing the relative performance between different approaches to the same problem. Python provides a measurement tool that answers those questions immediately.
For example, it may be tempting to use the tuple packing and unpacking feature instead of the traditional approach to swapping arguments. The timeit module quickly demonstrates that the traditional approach is faster:
from timeit import Timer
Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
Timer('a,b = b,a', 'a=1; b=2').timeit()
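The same comparison can be run as a small script. The number argument (set here to 100000, an arbitrary choice) keeps the run short. No timings are shown because they depend on the machine and the interpreter version, and the ranking of the two approaches may differ as well.

```python
from timeit import Timer

# number= limits the repetitions; the default of one million takes longer.
swap_temp = Timer('t=a; a=b; b=t', 'a=1; b=2').timeit(number=100000)
swap_tuple = Timer('a,b = b,a', 'a=1; b=2').timeit(number=100000)

print('temporary variable: %.4f s' % swap_temp)
print('tuple unpacking:    %.4f s' % swap_tuple)
```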
In contrast to timeit’s fine level of granularity, the profile and pstats modules provide tools for identifying time critical sections in larger blocks of code.
One approach for developing high quality software is to write tests for each function as it is developed and to run those tests frequently during the development process.
The doctest module provides a tool for scanning a module and validating tests embedded in a program’s docstrings. Test construction is as simple as cutting-and-pasting a typical call along with its results into the docstring. This improves the documentation by providing the user with an example and it allows the doctest module to make sure the code remains true to the documentation:
def average(values):
    """Computes the arithmetic mean of a list of numbers.

    >>> print average([20, 30, 70])
    40.0
    """
    return sum(values, 0.0) / len(values)

import doctest
doctest.testmod() # automatically validate the embedded tests
The unittest module is not as effortless as the doctest module, but it allows a more comprehensive set of tests to be maintained in a separate file:
import unittest

class TestStatisticalFunctions(unittest.TestCase):
    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
        self.assertRaises(ZeroDivisionError, average, [])
        self.assertRaises(TypeError, average, 20, 30, 70)

unittest.main() # Calling from the command line invokes all tests
The xmlrpclib and SimpleXMLRPCServer modules make implementing remote procedure calls into an almost trivial task. Despite the names, no direct knowledge or handling of XML is needed.
The email package is a library for managing email messages, including MIME and other RFC 2822-based message documents. Unlike smtplib and poplib which actually send and receive messages, the email package has a complete toolset for building or decoding complex message structures (including attachments) and for implementing internet encoding and header protocols.
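Building and decoding a message without sending it might look like the sketch below. It uses MIMEText and message_from_string from the email package; the addresses and the subject line are placeholders invented for the example.

```python
from email import message_from_string
from email.mime.text import MIMEText

msg = MIMEText('Beware the Ides of March.')
msg['Subject'] = 'A warning'
msg['From'] = 'sender@example.org'        # placeholder address
msg['To'] = 'recipient@example.org'       # placeholder address

wire_text = msg.as_string()               # headers plus body as one string
parsed = message_from_string(wire_text)   # decode it again

print(parsed['Subject'])
print(parsed.get_content_type())
print(parsed.get_payload().strip())
```

Nothing is transmitted here; handing wire_text to smtplib would be a separate step.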
The xml.dom and xml.sax packages provide robust support for parsing this popular data interchange format. Likewise, the csv module supports direct reads and writes in a common database format. Together, these modules and packages greatly simplify data interchange between Python applications and other tools.
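A csv round trip can be sketched with an in-memory buffer standing in for a file. This version assumes Python 3, where the csv module works on text; the column names and the sample row are illustrative.

```python
import csv
import io

buffer = io.StringIO()                    # in-memory stand-in for a file
writer = csv.writer(buffer)
writer.writerow(['name', 'dept', 'salary'])
writer.writerow(['John Doe', 'computer lab', 1000])

buffer.seek(0)                            # rewind before reading back
rows = list(csv.reader(buffer))
print(rows)                               # every field comes back as a string
```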
Internationalization is supported by a number of modules including gettext, locale, and the codecs package.