Difficulty level: ♦♢♢♢♢
YOUR FIRST PYTHON PROGRAM
❝ Don’t bury your burden in saintly silence. You have a problem? Great. Rejoice, dive in, and investigate. ❞
— Ven. Henepola Gunaratana
DIVING IN
Books about programming usually start with a bunch of boring chapters about fundamentals and eventually work up to building something useful. Let’s skip all that. Here is a complete, working Python program. It probably makes absolutely no sense to you. Don’t worry about that, because you’re going to dissect it line by line. But read through it first and see what, if anything, you can make of it.
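Many books start with the fundamentals, such as syntax, or open with an example already used in the real world. I am not going to do that. Below is a complete Python program. It may mean nothing to you yet, but read through it first and see what you can work out. I will then explain it line by line, so don't worry.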
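SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
            1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}
The lines above declare a dictionary named SUFFIXES with two entries. The keys are 1000 and 1024, and each value is a list of unit suffixes. A C programmer can think of it as a small lookup table; it is not complicated.
Next comes a function declaration in which a_kilobyte_is_1024_bytes defaults to True. The triple-quoted text is a docstring: a multi-line string that documents the function, playing much the same role as a /* ... */ block comment. Single-line comments start with the character #.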
def approximate_size(size, a_kilobyte_is_1024_bytes=True):
    '''Convert a file size to human-readable form.
    Keyword arguments:
    size -- file size in bytes
    a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024
                                if False, use multiples of 1000
    Returns: string
    '''
    if size < 0:
        raise ValueError('number must be non-negative')
The lines above raise an exception. Exceptions are simple: in this example, a size below 0 tells the caller that something is wrong. raise is a keyword and ValueError is an exception type. Take that on trust for now, since it is covered later.
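    multiple = 1024 if a_kilobyte_is_1024_bytes else 1000
The line above resembles C's conditional operator, x = cond ? y : z: if cond is true, x is y, otherwise x is z. Here, multiple is 1024 if a_kilobyte_is_1024_bytes is true and 1000 otherwise. Anyone familiar with C will find the analogy straightforward.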
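    for suffix in SUFFIXES[multiple]:
        size /= multiple
        if size < multiple:
            return '{0:.1f} {1}'.format(size, suffix)
For reference, here is SUFFIXES again:
SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
            1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}
The loop above is the trickiest part, and the key is '{0:.1f} {1}'.format(size, suffix). In that format string, {0} and {1} refer to the first and second arguments of format(), so size fills {0} (shown as a float with one decimal place, per the .1f) and suffix fills {1}. The passage quoted below, taken from an article on the web, covers string formatting in more detail.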
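Changes to string formatting
Many Python programmers find the built-in % operator for string formatting too limited, because:
It is a binary operator, so it takes at most two operands.
Apart from the format string, all other arguments must be packed into a single tuple or dictionary.
This style of formatting is somewhat inflexible, so Python 3 introduced a new way to format strings (Python 3 keeps the % operator and the string.Template module). String objects now have a format() method that accepts positional and keyword arguments, both of which are passed to replacement fields. Replacement fields are marked inside the string by curly braces ({}). The element inside a replacement field is simply called a field. Here is a simple example:
>>> "I love {0}, {1}, and {2}".format("eggs", "bacon", "sausage")
'I love eggs, bacon, and sausage'
The positional arguments eggs, bacon, and sausage are passed to format() and fill the fields {0}, {1}, and {2}. The next example shows formatting with keyword arguments:
>>> "I love {a}, {b}, and {c}".format(a="eggs", b="bacon", c="sausage")
'I love eggs, bacon, and sausage'
Here is another example that combines positional and keyword arguments:
>>> "I love {0}, {1}, and {param}".format("eggs", "bacon", param="sausage")
'I love eggs, bacon, and sausage'
Remember that placing a non-keyword argument after a keyword argument is a syntax error. To escape a curly brace, double it:
>>> "{{0}}".format("can't see me")
'{0}'
The positional argument can't see me is not output, because there is no field for it to fill. Note that this does not raise an error.
The built-in format() function formats a single value. For example:
>>> print(format(10.0, "7.3g"))
     10
In other words, g stands for general format. The number before the decimal point in the format specification sets the minimum field width, and the number after it sets the precision.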
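    raise ValueError('number too large')
if __name__ == '__main__':
    print(approximate_size(1000000000000, False))
    print(approximate_size(1000000000000))
__name__ is a variable that Python sets automatically. When a .py file is run directly, its __name__ is '__main__'. This makes the block a convenient place for initialization or quick tests.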
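The two expressions singled out in the annotations above, the conditional expression and the format string, can be tried on their own. This is a minimal sketch; the names use_binary and label are illustrative and do not appear in humansize.py.

```python
# Conditional expression: <value if true> if <condition> else <value if false>
use_binary = False
multiple = 1024 if use_binary else 1000

# '{0:.1f}' formats positional argument 0 as a float with one decimal place;
# '{1}' inserts positional argument 1 unchanged.
label = '{0:.1f} {1}'.format(931.3225746154785, 'GiB')

print(multiple)  # 1000
print(label)     # 931.3 GiB
```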
Now let’s run this program on the command line. On Windows, it will look something like this:
c:/home/diveintopython3/examples> c:/python31/python.exe humansize.py
1.0 TB
931.3 GiB
On Mac OS X or Linux, it would look something like this:
you@localhost:~/diveintopython3/examples$ python3 humansize.py
1.0 TB
931.3 GiB
What just happened? You executed your first Python program. You called the Python interpreter on the command line, and you passed the name of the script you wanted Python to execute. The script defines a single function, the approximate_size() function, which takes an exact file size in bytes and calculates a “pretty” (but approximate) size. (You’ve probably seen this in Windows Explorer, or the Mac OS X Finder, or Nautilus or Dolphin or Thunar on Linux. If you display a folder of documents as a multi-column list, it will display a table with the document icon, the document name, the size, type, last-modified date, and so on. If the folder contains a 1093-byte file named TODO, your file manager won’t display TODO 1093 bytes; it’ll say something like TODO 1 KB instead. That’s what the approximate_size() function does.)
Above, you ran your first Python program by calling the Python interpreter on the command line with the name of the script to execute. The script defines a single function, which converts a byte count into an approximate size. Like Windows Explorer, the Mac OS X Finder, or Nautilus, Dolphin, and Thunar on Linux, it shows an approximate value rather than the exact number of bytes.
Look at the bottom of the script, and you’ll see two calls to print(approximate_size(arguments)). These are function calls — first calling the approximate_size() function and passing a number of arguments, then taking the return value and passing it straight on to the print() function. The print() function is built-in; you’ll never see an explicit declaration of it. You can just use it, anytime, anywhere. (There are lots of built-in functions, and lots more functions that are separated into modules. Patience, grasshopper.)
So why does running the script on the command line give you the same output every time? We’ll get to that. First, let’s look at that approximate_size() function.
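The last two lines of the example call print(approximate_size(arguments)), which prints the return value. print() is a built-in function, so you can use it without declaring it. Python has a large number of modules containing many more functions; we will get to them in due course.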
⁂
DECLARING FUNCTIONS
Python has functions like most other languages, but it does not have separate header files like C++ or interface/implementation sections like Pascal. When you need a function, just declare it, like this:
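Like most languages, Python has functions, but it has no header files as C++ does and no interface/implementation sections as Pascal does. In Python, you simply declare the function you need.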
def approximate_size(size, a_kilobyte_is_1024_bytes=True):
The keyword def starts the function declaration, followed by the function name, followed by the arguments in parentheses. Multiple arguments are separated with commas.
A declaration starts with def, followed by the function name and then the arguments in parentheses, separated by commas.
Also note that the function doesn’t define a return datatype. Python functions do not specify the datatype of their return value; they don’t even specify whether or not they return a value. (In fact, every Python function returns a value; if the function ever executes a return statement, it will return that value, otherwise it will return None, the Python null value.)
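Note that the function does not define a return datatype. I found this awkward at first, but since any type can be returned it works well: it promotes code reuse and feels a bit like templates. Python does not specify the datatype of the return value, and it does not require a function to return anything. A single function can therefore return a value of one type on one path, a value of another type on another path, and nothing at all on a third. If the function executes a return statement, it returns that value; otherwise it returns None, Python's null value.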
☞In some languages, functions (that return a value) start with function, and subroutines (that do not return a value) start with sub. There are no subroutines in Python. Everything is a function, all functions return a value (even if it’s None), and all functions start with def.
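In some languages, functions that return a value start with function and those that do not start with sub. Every Python function returns a value, even if it is None, and every function declaration starts with def.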
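The point about None can be checked directly. The sketch below uses a hypothetical function, describe(), that returns a string on one path and falls off the end on the other.

```python
def describe(size):
    """Return a label for non-negative sizes; fall off the end otherwise."""
    if size >= 0:
        return 'ok'
    # No return statement runs for negative sizes, so Python returns None.

positive = describe(10)
negative = describe(-1)

print(positive)          # ok
print(negative is None)  # True
```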
The approximate_size() function takes the two arguments — size and a_kilobyte_is_1024_bytes — but neither argument specifies a datatype. In Python, variables are never explicitly typed. Python figures out what type a variable is and keeps track of it internally.
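The approximate_size() function takes two arguments, size and a_kilobyte_is_1024_bytes, but neither declares a datatype. In Python, variables are never explicitly typed; Python works out the type of each value and tracks it internally.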
☞In Java and other statically-typed languages, you must specify the datatype of the function return value and each function argument. In Python, you never explicitly specify the datatype of anything. Based on what value you assign, Python keeps track of the datatype internally.
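In Java and other languages that require type declarations, you must specify the datatype of the return value and of each argument. In Python, you do not specify datatypes explicitly; Python tracks the type based on the value you assign.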
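To see the internal bookkeeping at work, the built-in type() function reports the type of whatever value a name currently refers to. A short sketch, with an arbitrary variable name:

```python
value = 1
first_type = type(value).__name__    # the name currently refers to an int

value = 'one'
second_type = type(value).__name__   # the same name now refers to a str

value = 1.0
third_type = type(value).__name__    # and now to a float

print(first_type, second_type, third_type)  # int str float
```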
OPTIONAL AND NAMED ARGUMENTS
Python allows function arguments to have default values; if the function is called without the argument, the argument gets its default value. Furthermore, arguments can be specified in any order by using named arguments.
Python supports default values for function arguments: if the caller omits the argument, it takes the default.
Let’s take another look at that approximate_size() function declaration:
def approximate_size(size, a_kilobyte_is_1024_bytes=True):
The second argument, a_kilobyte_is_1024_bytes, specifies a default value of True. This means the argument is optional; you can call the function without it, and Python will act as if you had called it with True as a second parameter.
The second argument specifies a default value of True, which makes it optional: you can call the function without it, and Python supplies True.
Now look at the bottom of the script:
if __name__ == '__main__':
    print(approximate_size(1000000000000, False))  ①
    print(approximate_size(1000000000000))         ②
① This calls the approximate_size() function with two arguments. Within the approximate_size() function, a_kilobyte_is_1024_bytes will be False, since you explicitly passed False as the second argument.
Because you passed a second argument explicitly, a_kilobyte_is_1024_bytes is False.
② This calls the approximate_size() function with only one argument. But that’s OK, because the second argument is optional! Since the caller doesn’t specify, the second argument defaults to True, as defined by the function declaration.
Because only one argument is passed, a_kilobyte_is_1024_bytes takes its default value, True.
You can also pass values into a function by name.
>>> from humansize import approximate_size  # import approximate_size from the humansize module
>>> approximate_size(4000, a_kilobyte_is_1024_bytes=False) ①
'4.0 KB'
>>> approximate_size(size=4000, a_kilobyte_is_1024_bytes=False) ②
'4.0 KB'
>>> approximate_size(a_kilobyte_is_1024_bytes=False, size=4000) ③
'4.0 KB'
>>> approximate_size(a_kilobyte_is_1024_bytes=False, 4000) ④
File "
SyntaxError: non-keyword arg after keyword arg
>>> approximate_size(size=4000, False) ⑤
File "
SyntaxError: non-keyword arg after keyword arg
① This calls the approximate_size() function with 4000 for the first argument (size) and False for the argument named a_kilobyte_is_1024_bytes. (That happens to be the second argument, but doesn’t matter, as you’ll see in a minute.)
The first argument is passed by position and the second as name=value, in declaration order. This works.
② This calls the approximate_size() function with 4000 for the argument named size and False for the argument named a_kilobyte_is_1024_bytes. (These named arguments happen to be in the same order as the arguments are listed in the function declaration, but that doesn’t matter either.)
Both arguments are passed as name=value, in declaration order. This works.
③ This calls the approximate_size() function with False for the argument named a_kilobyte_is_1024_bytes and 4000 for the argument named size. (See? I told you the order didn’t matter.)
Both arguments are passed as name=value, in a different order from the declaration. This works too.
④ This call fails, because you have a named argument followed by an unnamed (positional) argument, and that never works. Reading the argument list from left to right, once you have a single named argument, the rest of the arguments must also be named.
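A named argument comes first and a positional value follows it. This fails. Python reads the argument list from left to right, and once it meets a name=value argument, every argument after it must also be written as name=value.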
⑤ This call fails too, for the same reason as the previous call. Is that surprising? After all, you passed 4000 for the argument named size, then “obviously” that False value was meant for the a_kilobyte_is_1024_bytes argument. But Python doesn’t work that way. As soon as you have a named argument, all arguments to the right of that need to be named arguments, too.
This fails for the same reason as the previous call: once an argument is written as name=value, all arguments to its right must be written that way as well.
⁂
WRITING READABLE CODE
I won’t bore you with a long finger-wagging speech about the importance of documenting your code. Just know that code is written once but read many times, and the most important audience for your code is yourself, six months after writing it (i.e. after you’ve forgotten everything but need to fix something). Python makes it easy to write readable code, so take advantage of it. You’ll thank me in six months.
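I will skip the lecture on why this matters. You clearly read code more often than you write it, and the most important reader is you. If you can still follow your own code easily six months after writing it, that is the level of readability I mean.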
DOCUMENTATION STRINGS
You can document a Python function by giving it a documentation string (docstring for short). In this program, the approximate_size() function has a docstring:
In Python, functions are documented with a docstring. The example below is taken from the program at the start of this chapter.
def approximate_size(size, a_kilobyte_is_1024_bytes=True):
    '''Convert a file size to human-readable form.
    Keyword arguments:
    size -- file size in bytes
    a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024
                                if False, use multiples of 1000
    Returns: string
    '''
Every function deserves a decent docstring.
Triple quotes signify a multi-line string. Everything between the start and end quotes is part of a single string, including carriage returns, leading white space, and other quote characters. You can use them anywhere, but you’ll see them most often used when defining a docstring.
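Every function should be documented; otherwise how would anyone use it without rereading the code each time? Use triple quotes (single or double quote characters, as long as there are three of them) to write the docstring, which can contain line breaks, spaces, and other characters. Triple-quoted strings can be used anywhere, but you will see them most often in docstrings.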
☞Triple quotes are also an easy way to define a string with both single and double quotes, like qq/.../ in Perl 5.
Triple quotes can also be used to define ordinary strings.
Everything between the triple quotes is the function’s docstring, which documents what the function does. A docstring, if it exists, must be the first thing defined in a function (that is, on the next line after the function declaration). You don’t technically need to give your function a docstring, but you always should. I know you’ve heard this in every programming class you’ve ever taken, but Python gives you an added incentive: the docstring is available at runtime as an attribute of the function.
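Everything inside the triple quotes is the docstring, which records what the function does and how to use it. If you write a docstring, it must be the first thing in the function, on the line after the declaration. It is not mandatory, but you should always write one. Python gives you an extra incentive: the text becomes an attribute of the function, available at runtime through __doc__.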
☞Many Python IDEs use the docstring to provide context-sensitive documentation, so that when you type a function name, its docstring appears as a tooltip. This can be incredibly helpful, but it’s only as good as the docstrings you write.
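Many Python IDEs use the docstring to help developers: type a function name and the IDE shows the docstring as a hint. This is very useful, provided what you wrote is actually useful.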
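The runtime availability of a docstring is easy to confirm. In this sketch, greet() and undocumented() are made-up functions; the second has no docstring, so its __doc__ attribute is None.

```python
def greet(name):
    '''Return a greeting for name.'''
    return 'Hello, ' + name

def undocumented(name):
    return name

print(greet.__doc__)         # Return a greeting for name.
print(undocumented.__doc__)  # None
```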
⁂
THE IMPORT SEARCH PATH
(The import hassle is what put me off writing Java back then; let's see how neatly Python handles it.)
Before this goes any further, I want to briefly mention the library search path. Python looks in several places when you try to import a module. Specifically, it looks in all the directories defined in sys.path. This is just a list, and you can easily view it or modify it with standard list methods. (You’ll learn more about lists in Native Datatypes.)
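First, a word about the search path used when importing a library. Python searches the directories listed in sys.path, which you can easily view or modify. sys.path is a list, a datatype Python supports natively.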
>>> import sys ①
>>> sys.path ②
['', '/usr/lib/python31.zip', '/usr/lib/python3.1',
'/usr/lib/python3.1/plat-linux2@EXTRAMACHDEPPATH@',
'/usr/lib/python3.1/lib-dynload', '/usr/lib/python3.1/dist-packages',
'/usr/local/lib/python3.1/dist-packages']
>>> sys ③
>>> sys.path.insert(0, '/home/mark/diveintopython3/examples') ④
>>> sys.path ⑤
['/home/mark/diveintopython3/examples', '', '/usr/lib/python31.zip',
'/usr/lib/python3.1', '/usr/lib/python3.1/plat-linux2@EXTRAMACHDEPPATH@',
'/usr/lib/python3.1/lib-dynload', '/usr/lib/python3.1/dist-packages',
'/usr/local/lib/python3.1/dist-packages']
① Importing the sys module makes all of its functions and attributes available.
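Importing the sys module makes its functions and attributes available, accessed as sys.name. The module prefix acts as a namespace: if several modules define functions with the same name, the prefix tells them apart.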
② sys.path is a list of directory names that constitute the current search path. (Yours will look different, depending on your operating system, what version of Python you’re running, and where it was originally installed.) Python will look through these directories (in this order) for a .py file whose name matches what you’re trying to import.
sys.path lists the directories to search. When you import a module, Python searches these directories in the order listed. Your list may differ from mine.
③ Actually, I lied; the truth is more complicated than that, because not all modules are stored as .py files. Some, like the sys module, are built-in modules; they are actually baked right into Python itself. Built-in modules behave just like regular modules, but their Python source code is not available, because they are not written in Python! (The sys module is written in C.)
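In fact, I misled you in ②. The truth is more complicated, because not all modules are .py files. The sys module is one exception: it is built into Python and written in C, so its Python source is not available. Even so, you use it exactly like any other module.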
④ You can add a new directory to Python’s search path at runtime by adding the directory name to sys.path, and then Python will look in that directory as well, whenever you try to import a module. The effect lasts as long as Python is running.
You can also add a new search directory to sys.path at runtime, that is, from code while the program is running. The change stays in effect for as long as Python is running.
⑤ By using sys.path.insert(0, new_path), you inserted a new directory as the first item of the sys.path list, and therefore at the beginning of Python’s search path. This is almost always what you want. In case of naming conflicts (for example, if Python ships with version 2 of a particular library but you want to use version 3), this ensures that your modules will be found and used instead of the modules that came with Python.
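sys.path.insert(0, new_path) puts new_path in the first position of the list, because the first argument, 0, is the index. Inserting at the front avoids naming conflicts. Suppose directories b and c each contain an a.py and b is already on the search path: inserting c at index 0 ensures that import a finds the a.py in c rather than the one in b.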
⁂
EVERYTHING IS AN OBJECT
In case you missed it, I just said that Python functions have attributes, and that those attributes are available at runtime. A function, like everything else in Python, is an object.
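As I just said, Python functions have attributes that are available at runtime. A function in Python is an object too. From a C point of view, an object is little more than a struct.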
Run the interactive Python shell and follow along:
Type the example in yourself as you go; reading along without writing any code will not do much good.
>>> import humansize ①
>>> print(humansize.approximate_size(4096, True)) ②
4.0 KiB
>>> print(humansize.approximate_size.__doc__) ③
Convert a file size to human-readable form.
Keyword arguments:
size -- file size in bytes
a_kilobyte_is_1024_bytes -- if True (default), use multiples of 1024
if False, use multiples of 1000
Returns: string
① The first line imports the humansize program as a module — a chunk of code that you can use interactively, or from a larger Python program. Once you import a module, you can reference any of its public functions, classes, or attributes. Modules can do this to access functionality in other modules, and you can do it in the Python interactive shell too. This is an important concept, and you’ll see a lot more of it throughout this book.
After importing the humansize module, you can use its public functions, classes, and attributes.
② When you want to use functions defined in imported modules, you need to include the module name. So you can’t just say approximate_size; it must be humansize.approximate_size. If you’ve used classes in Java, this should feel vaguely familiar.
As in Java, you call the function with the module name in front: humansize.approximate_size, where humansize is the module name.
③ Instead of calling the function as you would expect to, you asked for one of the function’s attributes, __doc__.
This reads the function's __doc__ attribute instead of calling the function.
☞import in Python is like require in Perl. Once you import a Python module, you access its functions with module.function; once you require a Perl module, you access its functions with module::function.
import in Python is similar to require in Perl. Python uses module.function; Perl uses module::function.
WHAT’S AN OBJECT?
Everything in Python is an object, and everything can have attributes and methods. All functions have a built-in attribute __doc__, which returns the docstring defined in the function’s source code. The sys module is an object which has (among other things) an attribute called path. And so forth.
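Remember that everything in Python is an object, and everything can have attributes and methods. Every function has a built-in __doc__ attribute containing the docstring from its definition. Modules have attributes too; the sys module has path.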
Still, this doesn’t answer the more fundamental question: what is an object? Different programming languages define “object” in different ways. In some, it means that all objects must have attributes and methods; in others, it means that all objects are subclassable. In Python, the definition is looser. Some objects have neither attributes nor methods, but they could. Not all objects are subclassable. But everything is an object in the sense that it can be assigned to a variable or passed as an argument to a function.
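So what is an object? Different programming languages define it differently. Some say objects must have attributes and methods; others say objects must be subclassable. Python's definition is looser: some objects have neither attributes nor methods yet are still objects (I cannot think of an example), and not all objects are subclassable. Everything in Python is an object in the sense that it can be assigned to a variable or passed as an argument to a function.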
You may have heard the term “first-class object” in other programming contexts. In Python, functions are first-class objects. You can pass a function as an argument to another function. Modules are first-class objects. You can pass an entire module as an argument to a function. Classes are first-class objects, and individual instances of a class are also first-class objects.
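You may have heard of first-class objects. In Python, functions are first-class objects: you can pass a function as an argument to another function (in C you would use a function pointer). Modules are first-class objects too, so you can pass an entire module to a function. Classes are first-class objects, and so is every instance of a class.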
This is important, so I’m going to repeat it in case you missed it the first few times: everything in Python is an object. Strings are objects. Lists are objects. Functions are objects. Classes are objects. Class instances are objects. Even modules are objects.
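Remember that everything in Python is an object. Strings, lists, and functions are objects. Classes are objects. Class instances are objects. Modules are objects too.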
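Here is a small sketch of what "first-class" means in practice: functions and modules are assigned to names and passed as arguments like any other value. The helper names apply() and module_name() are invented for this illustration.

```python
import math

def apply(func, value):
    """Call func on value; func arrives as an ordinary argument."""
    return func(value)

def module_name(module):
    """Return the __name__ attribute of the module passed in."""
    return module.__name__

handler = len                    # a function assigned to a variable
print(apply(handler, 'python'))  # 6
print(apply(str.upper, 'kib'))   # KIB
print(module_name(math))         # math
```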
⁂
INDENTING CODE
Python functions have no explicit begin or end, and no curly braces to mark where the function code starts and stops. The only delimiter is a colon (:) and the indentation of the code itself.
Python delimits a function using only a colon and the indentation of the code.
def approximate_size(size, a_kilobyte_is_1024_bytes=True):  ①
    if size < 0:                                            ②
        raise ValueError('number must be non-negative')     ③
                                                            ④
    multiple = 1024 if a_kilobyte_is_1024_bytes else 1000
    for suffix in SUFFIXES[multiple]:                       ⑤
        size /= multiple
        if size < multiple:
            return '{0:.1f} {1}'.format(size, suffix)
    raise ValueError('number too large')
① Code blocks are defined by their indentation. By “code block,” I mean functions, if statements, for loops, while loops, and so forth. Indenting starts a block and unindenting ends it. There are no explicit braces, brackets, or keywords. This means that whitespace is significant, and must be consistent. In this example, the function code is indented four spaces. It doesn’t need to be four spaces, it just needs to be consistent. The first line that is not indented marks the end of the function.
② In Python, an if statement is followed by a code block. If the if expression evaluates to true, the indented block is executed, otherwise it falls to the else block (if any). Note the lack of parentheses around the expression.
Code blocks are defined by their indentation. The amount of indentation only has to be consistent; it does not have to be four spaces.
③ This line is inside the if code block. This raise statement will raise an exception (of type ValueError), but only if size < 0.
An exception is raised when size < 0.
④ This is not the end of the function. Completely blank lines don’t count. They can make the code more readable, but they don’t count as code block delimiters. The function continues on the next line.
A blank line does not end the function definition; it only makes the code easier to read.
⑤ The for loop also marks the start of a code block. Code blocks can contain multiple lines, as long as they are all indented the same amount. This for loop has three lines of code in it. There is no other special syntax for multi-line code blocks. Just indent and get on with your life.
Everything indented more deeply than the for line is the body of the loop.
After some initial protests and several snide analogies to Fortran, you will make peace with this and start seeing its benefits. One major benefit is that all Python programs look similar, since indentation is a language requirement and not a matter of style. This makes it easier to read and understand other people’s Python code.
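At first the indentation rule may feel uncomfortable. Its main benefit is that all Python programs share a broadly consistent look, so differences in style do not get in the way of reading other people's code.
☞Python uses line breaks to separate statements and a colon and indentation to separate code blocks. C++ and Java use semicolons to separate statements and curly braces to separate code blocks.
Python separates statements with line breaks and delimits code blocks with a colon and indentation (the content other languages wrap in {}). C++ and Java separate statements with semicolons and delimit code blocks with curly braces.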
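Because indentation is part of the syntax, inconsistent indentation is rejected before any code runs. The sketch below feeds two made-up source strings to the built-in compile() function to show this without crashing the example itself.

```python
good_source = (
    "def f():\n"
    "    x = 1\n"
    "    return x\n"
)
bad_source = (
    "def f():\n"
    "        x = 1\n"
    "    return x\n"  # dedents to a level that matches no enclosing block
)

compile(good_source, '<example>', 'exec')  # consistent indentation: accepted

try:
    compile(bad_source, '<example>', 'exec')
except IndentationError as error:
    caught = type(error).__name__
else:
    caught = None

print(caught)  # IndentationError
```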
⁂
EXCEPTIONS
Exceptions are everywhere in Python. Virtually every module in the standard Python library uses them, and Python itself will raise them in a lot of different circumstances. You’ll see them repeatedly throughout this book.
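Exceptions are everywhere in Python. Virtually every module in the standard library uses them, and Python raises them in many situations. You will see them often in this book.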
What is an exception? Usually it’s an error, an indication that something went wrong. (Not all exceptions are errors, but never mind that for now.) Some programming languages encourage the use of error return codes, which you check. Python encourages the use of exceptions, which you handle.
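What is an exception? Loosely speaking, it is an error, a sign that something went wrong. Not every exception is an error, but that definition will do for now. Some languages encourage error return codes, which you have to check. Python encourages exceptions, which you have to handle.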
When an error occurs in the Python Shell, it prints out some details about the exception and how it happened, and that’s that. This is called an unhandled exception. When the exception was raised, there was no code to explicitly notice it and deal with it, so it bubbled its way back up to the top level of the Python Shell, which spits out some debugging information and calls it a day. In the shell, that's no big deal, but if that happened while your actual Python program was running, the entire program would come to a screeching halt if nothing handles the exception. Maybe that’s what you want, maybe it isn’t.
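When the Python Shell hits an error, it prints details about the exception and what triggered it. This is called an unhandled exception: no code was in place to deal with it, so it travelled back up to the shell, which printed debugging information. In the shell that does not matter, since the session carries on, but in a real Python program an unhandled exception brings the whole program to a halt.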
☞Unlike Java, Python functions don’t declare which exceptions they might raise. It’s up to you to determine what possible exceptions you need to catch.
Unlike Java, Python functions do not declare the exceptions they might raise; you decide which exceptions to catch.
An exception doesn’t need to result in a complete program crash, though. Exceptions can be handled. Sometimes an exception is really because you have a bug in your code (like accessing a variable that doesn’t exist), but sometimes an exception is something you can anticipate. If you’re opening a file, it might not exist. If you’re importing a module, it might not be installed. If you’re connecting to a database, it might be unavailable, or you might not have the correct security credentials to access it. If you know a line of code may raise an exception, you should handle the exception using a try...except block.
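An exception does not have to crash the whole program; it can be handled. Sometimes an exception comes from a bug, such as accessing a variable that does not exist. At other times it can be anticipated: the file you want to open does not exist, the module you want to import is not installed, or the database you want to reach is unavailable or rejects your credentials. If you know a line of code may raise an exception, handle it with try...except.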
☞Python uses try...except blocks to handle exceptions, and the raise statement to generate them. Java and C++ use try...catch blocks to handle exceptions, and the throw statement to generate them.
Python handles exceptions with try...except and raises them with raise. Java handles them with try...catch and raises them with throw.
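As a minimal illustration of try...except, the sketch below anticipates the ValueError that the built-in int() raises for text that is not a valid integer. The function parse_size() is hypothetical and is not part of humansize.py.

```python
def parse_size(text):
    """Return text as an int, or None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        # int() raised ValueError; handle it instead of letting it propagate.
        return None

print(parse_size('4096'))   # 4096
print(parse_size('4 KiB'))  # None
```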
The approximate_size() function raises exceptions in two different cases: if the given size is larger than the function is designed to handle, or if it’s less than zero.
approximate_size() raises an exception when size is too large or less than 0.
if size < 0:
    raise ValueError('number must be non-negative')
The syntax for raising an exception is simple enough. Use the raise statement, followed by the exception name, and an optional human-readable string for debugging purposes. The syntax is reminiscent of calling a function. (In reality, exceptions are implemented as classes, and this raise statement is actually creating an instance of the ValueError class and passing the string 'number must be non-negative' to its initialization method. But we’re getting ahead of ourselves!)
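Raising an exception in Python is easy. Exceptions are in fact classes: the raise statement creates an instance of the ValueError class and passes 'number must be non-negative' to its initialization method. The details come later in the book.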
☞You don’t need to handle an exception in the function that raises it. If one function doesn’t handle it, the exception is passed to the calling function, then that function’s calling function, and so on “up the stack.” If the exception is never handled, your program will crash, Python will print a “traceback” to standard error, and that’s the end of that. Again, maybe that’s what you want; it depends on what your program does.
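An exception does not have to be handled in the function that raises it. If that function does not handle it, the exception passes to the calling function, and if that one does not handle it either, it keeps going up. If nothing handles it, the program crashes and Python prints a traceback.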
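The trip "up the stack" can be sketched with two hypothetical functions: check_size() raises, load() does nothing about it, and the caller at the top finally handles the exception.

```python
def check_size(size):
    if size < 0:
        raise ValueError('number must be non-negative')
    return size

def load(size):
    # No try...except here, so an exception from check_size() passes
    # straight through this function to whoever called load().
    return check_size(size)

try:
    load(-1)
except ValueError as error:
    message = str(error)

print(message)  # number must be non-negative
```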
CATCHING IMPORT ERRORS
One of Python’s built-in exceptions is ImportError, which is raised when you try to import a module and fail. This can happen for a variety of reasons, but the simplest case is when the module doesn’t exist in your import search path. You can use this to include optional features in your program. For example, the chardet library provides character encoding auto-detection. Perhaps your program wants to use this library if it exists, but continue gracefully if the user hasn’t installed it. You can do this with a try..except block.
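ImportError is one of Python's built-in exceptions, raised when importing a module fails. There are many possible causes; the simplest is that the module is not on the search path. You can use this to make features optional. For example, the chardet library provides automatic detection of character encodings. If you want the program to keep running even when that module cannot be imported, use try..except.
try:
    import chardet
except ImportError:
    chardet = None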
Later, you can check for the presence of the chardet module with a simple if statement:
if chardet:
    pass  # do something
else:
    pass  # continue anyway
Another common use of the ImportError exception is when two modules implement a common API, but one is more desirable than the other. (Maybe it’s faster, or it uses less memory.) You can try to import one module but fall back to a different module if the first import fails. For example, the XML chapter talks about two modules that implement a common API, called the ElementTree API. The first, lxml, is a third-party module that you need to download and install yourself. The second, xml.etree.ElementTree, is slower but is part of the Python 3 standard library.
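Another common use of ImportError is when two modules implement the same API but differ in speed or memory use: you try to import one and fall back to the other if that fails. The XML chapter has an example. lxml is a third-party library; xml.etree.ElementTree is slower but is part of the Python 3 standard library.
try:
    from lxml import etree
except ImportError:
    import xml.etree.ElementTree as etree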
By the end of this try..except block, you have imported some module and named it etree. Since both modules implement a common API, the rest of your code doesn’t need to keep checking which module got imported. And since the module that did get imported is always called etree, the rest of your code doesn’t need to be littered with if statements to call differently-named modules.
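By the end of this block a module has been imported under the name etree. Even if importing lxml fails, xml.etree.ElementTree is imported instead, and because both are bound to the same name, the code that uses the module does not need to change. This only works because the features you call behave the same way in both modules, and it makes the code easier to extend.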
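The same fallback idea can be written as a reusable helper. This is only a sketch of one possible approach: first_available() is a hypothetical function built on the standard library's importlib.import_module(), and the first module name is assumed not to be installed so that the fallback path is exercised.

```python
import importlib

def first_available(*names):
    """Return the first module in names that imports successfully, else None."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None

# 'no_such_module_for_this_example' is a made-up name assumed to be missing.
etree = first_available('no_such_module_for_this_example',
                        'xml.etree.ElementTree')
print(etree.__name__)  # xml.etree.ElementTree
```

The plain try..except form shown earlier is easier to read for a single pair of modules; a helper like this only pays off when the list of candidates grows.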
⁂
UNBOUND VARIABLES
Take another look at this line of code from the approximate_size() function:
multiple = 1024 if a_kilobyte_is_1024_bytes else 1000
You never declare the variable multiple, you just assign a value to it. That’s OK, because Python lets you do that. What Python will not let you do is reference a variable that has never been assigned a value. Trying to do so will raise a NameError exception.
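You can assign to multiple without declaring it first. Python allows that, but referencing a variable that has never been assigned raises a NameError exception.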
>>> x
Traceback (most recent call last):
File "
NameError: name 'x' is not defined
>>> x = 1
>>> x
1
You will thank Python for this one day.
This is a real convenience; a lot of programming time otherwise goes into declaring variables.
⁂
EVERYTHING IS CASE-SENSITIVE
All names in Python are case-sensitive: variable names, function names, class names, module names, exception names. If you can get it, set it, call it, construct it, import it, or raise it, it’s case-sensitive.
Variable names, function names, class names, module names, and exception names are all case-sensitive.
>>> an_integer = 1
>>> an_integer
1
>>> AN_INTEGER
Traceback (most recent call last):
File "
NameError: name 'AN_INTEGER' is not defined
>>> An_Integer
Traceback (most recent call last):
File "
NameError: name 'An_Integer' is not defined
>>> an_inteGer
Traceback (most recent call last):
File "
NameError: name 'an_inteGer' is not defined
And so on.
⁂
RUNNING SCRIPTS
Everything in Python is an object.
Python modules are objects and have several useful attributes. You can use this to easily test your modules as you write them, by including a special block of code that executes when you run the Python file on the command line. Take the last few lines of humansize.py:
Everything in Python is an object, and modules are no exception: they have several handy attributes, which make it easy to test a module.
if __name__ == '__main__':
    print(approximate_size(1000000000000, False))
    print(approximate_size(1000000000000))
☞Like C, Python uses == for comparison and = for assignment. Unlike C, Python does not allow = inside an expression, so there’s no chance of accidentally assigning the value you thought you were comparing.
As in C, Python uses == to compare and = to assign. Unlike C, Python rejects = inside an expression, so a mistake like if (a = b) cannot slip through unnoticed, which removes one more source of bugs.
So what makes this if statement special? Well, modules are objects, and all modules have a built-in attribute __name__. A module’s __name__ depends on how you’re using the module. If you import the module, then __name__ is the module’s filename, without a directory path or file extension.
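Every module has a built-in attribute __name__, whose value depends on how the module is used. If the module is imported, __name__ is the module's filename without path or extension.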
>>> import humansize
>>> humansize.__name__
'humansize'
But you can also run the module directly as a standalone program, in which case __name__ will be a special default value, __main__. Python will evaluate this if statement, find a true expression, and execute the if code block. In this case, to print two values.
If the module is run directly, __name__ is __main__.
c:/home/diveintopython3> c:/python31/python.exe humansize.py
1.0 TB
931.3 GiB
And that’s your first Python program!
⁂
FURTHER READING
• PEP 257: Docstring Conventions explains what distinguishes a good docstring from a great docstring.
• Python Tutorial: Documentation Strings also touches on the subject.
• PEP 8: Style Guide for Python Code discusses good indentation style.
• Python Reference Manual explains what it means to say that everything in Python is an object, because some people are pedants and like to discuss that sort of thing at great length.
© 2001–9 Mark Pilgrim