Python Tutorial (Chinese-English Bilingual Edition), Part 2

Continued from Python Tutorial (Chinese-English Bilingual Edition), Part 1


CHAPTER
SIX


MODULES

If you quit from the Python interpreter and enter it again, the definitions you have made (functions and variables) are lost. Therefore, if you want to write a somewhat longer program, you are better off using a text editor to prepare the input for the interpreter and running it with that file as input instead. This is known as creating a script. As your program gets longer, you may want to split it into several files for easier maintenance. You may also want to use a handy function that you’ve written in several programs without copying its definition into each program.

To support this, Python has a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. Such a file is called a module; definitions from a module can be imported into other modules or into the main module (the collection of variables that you have access to in a script executed at the top level and in calculator mode).

A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended. Within a module, the module’s name (as a string) is available as the value of the global variable __name__. For instance, use your favorite text editor to create a file called fibo.py in the current directory with the following contents:

# Fibonacci numbers module
def fib(n): # write Fibonacci series up to n
    a, b = 0, 1
    while b < n:
        print(b, end=' ')
        a, b = b, a+b
    print()
def fib2(n): # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a+b
    return result

Now enter the Python interpreter and import this module with the following command:

>>> import fibo

This does not enter the names of the functions defined in fibo directly in the current symbol table; it only enters the module name fibo there. Using the module name you can access the functions:

>>> fibo.fib(1000)
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
>>> fibo.fib2(100)
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
>>> fibo.__name__
'fibo'

If you intend to use a function often you can assign it to a local name:

>>> fib = fibo.fib
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377

6.1 More on Modules

A module can contain executable statements as well as function definitions. These statements are intended to initialize the module. They are executed only the first time the module name is encountered in an import statement. [1] (They are also run if the file is executed as a script.)

Each module has its own private symbol table, which is used as the global symbol table by all functions defined in the module. Thus, the author of a module can use global variables in the module without worrying about accidental clashes with a user’s global variables. On the other hand, if you know what you are doing you can touch a module’s global variables with the same notation used to refer to its functions, modname.itemname.
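
The separation between a module's globals and the importer's globals can be illustrated with a minimal sketch. It builds a throwaway module in memory with types.ModuleType rather than a file on disk; the module name demo and the names counter and bump are hypothetical and chosen only for this illustration.

```python
import types

counter = 100  # a global that belongs to the importing (user) code

# Build a small module in memory; "demo" is a hypothetical module name.
demo = types.ModuleType("demo")
exec(
    "counter = 0\n"
    "def bump():\n"
    "    global counter  # refers to demo's symbol table, not the caller's\n"
    "    counter += 1\n"
    "    return counter\n",
    demo.__dict__,
)

demo.bump()
demo.bump()
print(demo.counter)  # 2: the module's own global changed
print(counter)       # 100: the user's global is untouched

demo.counter = 10    # modname.itemname also works for assignment
print(demo.bump())   # 11
```

Both variables are called counter, yet they never collide, because bump() resolves global names in the symbol table of the module that defines it.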

Modules can import other modules. It is customary but not required to place all import statements at the beginning of a module (or script, for that matter). The imported module names are placed in the importing module’s global symbol table. There is a variant of the import statement that imports names from a module directly into the importing module’s symbol table. For example:

>>> from fibo import fib, fib2
>>> fib(500)
1 1 2 3 5 8 13 21 34 55 89 144 233 377

6.1.1 Executing modules as scripts

When you run a Python module with

python fibo.py <arguments>

the code in the module will be executed, just as if you imported it, but with the __name__ set to “__main__”. That means that by adding this code at the end of your module:

if __name__ == "__main__":
    import sys
    fib(int(sys.argv[1]))

you can make the file usable as a script as well as an importable module, because the code that parses the command line only runs if the module is executed as the “main” file:

$ python fibo.py 50
1 1 2 3 5 8 13 21 34
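
The difference between importing a file and running it as a script can also be observed from within Python. The following sketch is an illustration under stated assumptions, not part of the tutorial: it writes a hypothetical file demo_mod.py to a temporary directory, imports it with importlib, and then runs it with runpy.run_path, which approximates what python demo_mod.py does.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# Hypothetical module source: it records the value of __name__ it sees.
SOURCE = '''
seen_name = __name__
ran_as_script = False
if __name__ == "__main__":
    ran_as_script = True
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo_mod.py"
    path.write_text(SOURCE)

    # Import the file as a module: __name__ is the module name.
    spec = importlib.util.spec_from_file_location("demo_mod", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # Execute the same file as the "main" file: __name__ is "__main__".
    script_globals = runpy.run_path(str(path), run_name="__main__")

print(module.seen_name, module.ran_as_script)
# demo_mod False
print(script_globals["seen_name"], script_globals["ran_as_script"])
# __main__ True
```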

If the module is imported, the code is not run:

>>> import fibo
>>>

This is often used either to provide a convenient user interface to a module, or for testing purposes (running the module as a script executes a test suite).

6.1.2 The Module Search Path

When a module named spam is imported, the interpreter first searches for a built-in module with that name. If not found, it then searches for a file named spam.py in a list of directories given by the variable sys.path. sys.path is initialized from these locations:

  • The directory containing the input script (or the current directory when no file is specified).
  • PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
  • The installation-dependent default.

Note: On file systems which support symlinks, the directory containing the input script is calculated after the symlink is followed. In other words the directory containing the symlink is not added to the module search path.


After initialization, Python programs can modify sys.path. The directory containing the script being run is placed at the beginning of the search path, ahead of the standard library path. This means that scripts in that directory will be loaded instead of modules of the same name in the library directory. This is an error unless the replacement is intended. See section Standard Modules for more information.
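
To check where a given name would be loaded from, and so to spot accidental shadowing, a small helper along these lines may be useful. The function describe and the module name no_such_module_hopefully are hypothetical; sys.builtin_module_names and importlib.util.find_spec are standard library facilities.

```python
import importlib.util
import sys


def describe(name):
    """Return a short description of where top-level module `name` would come from."""
    if name in sys.builtin_module_names:
        return "built-in"
    spec = importlib.util.find_spec(name)  # searches sys.path without importing
    if spec is None:
        return "not found"
    return spec.origin or "namespace package"


for name in ("sys", "json", "no_such_module_hopefully"):
    print(name, "->", describe(name))
```

If describe("json") reported a file in the script's own directory rather than in the standard library, a local json.py would be shadowing the library module.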

6.1.3 “Compiled” Python files

To speed up loading modules, Python caches the compiled version of each module in the __pycache__ directory under the name module.version.pyc, where the version encodes the format of the compiled file; it generally contains the Python version number. For example, in CPython release 3.3 the compiled version of spam.py would be cached as __pycache__/spam.cpython-33.pyc. This naming convention allows compiled modules from different releases and different versions of Python to coexist.
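
The cache file name for the running interpreter can be computed rather than guessed. This short sketch uses importlib.util.cache_from_source; the printed value depends on the interpreter and version in use, so no fixed output is shown.

```python
import importlib.util
import sys

# The tag that encodes the implementation and version, e.g. "cpython-33" on CPython 3.3.
print(sys.implementation.cache_tag)

# Where the compiled form of spam.py would be cached by this interpreter.
cached = importlib.util.cache_from_source("spam.py")
print(cached)
```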

Python checks the modification date of the source against the compiled version to see if it’s out of date and needs to be recompiled. This is a completely automatic process. Also, the compiled modules are platform-independent, so the same library can be shared among systems with different architectures.

Python does not check the cache in two circumstances. First, it always recompiles and does not store the result for the module that’s loaded directly from the command line. Second, it does not check the cache if there is no source module. To support a non-source (compiled only) distribution, the compiled module must be in the source directory, and there must not be a source module.

Some tips for experts:

  • You can use the -O or -OO switches on the Python command to reduce the size of a compiled module. The -O switch removes assert statements, the -OO switch removes both assert statements and __doc__ strings. Since some programs may rely on having these available, you should only use this option if you know what you’re doing. “Optimized” modules have an opt-tag and are usually smaller. Future releases may change the effects of optimization.
  • A program doesn’t run any faster when it is read from a .pyc file than when it is read from a .py file; the only thing that’s faster about .pyc files is the speed with which they are loaded.
  • The module compileall can create .pyc files for all modules in a directory.
  • There is more detail on this process, including a flow chart of the decisions, in PEP 3147.
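
As a rough illustration of the compileall tip, the sketch below compiles every module in a temporary directory and lists the resulting cache files. The file spam.py and its contents are made up for the example, and the sketch assumes the default cache location (no PYTHONPYCACHEPREFIX override).

```python
import compileall
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A hypothetical one-line module to compile.
    (Path(tmp) / "spam.py").write_text("VALUE = 42\n")

    # Compile all .py files under tmp; quiet=1 prints errors only.
    ok = compileall.compile_dir(tmp, quiet=1)

    # Collect the names of the generated .pyc files before the directory is removed.
    cached = sorted(p.name for p in Path(tmp, "__pycache__").glob("*.pyc"))

print(bool(ok), cached)
```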

6.2 Standard Modules

Python comes with a library of standard modules, described in a separate document, the Python Library Reference (“Library Reference” hereafter). Some modules are built into the interpreter; these provide access to operations that are not part of the core of the language but are nevertheless built in, either for efficiency or to provide access to operating system primitives such as system calls. The set of such modules is a configuration option which also depends on the underlying platform. For example, the winreg module is only provided on Windows systems. One particular module deserves some attention: sys, which is built into every Python interpreter. The variables sys.ps1 and sys.ps2 define the strings used as primary and secondary prompts:

>>> import sys
>>> sys.ps1
'>>> '
>>> sys.ps2
'... '
>>> sys.ps1 = 'C> '
C> print('Yuck!')
Yuck!
C>

These two variables are only defined if the interpreter is in interactive mode.

The variable sys.path is a list of strings that determines the interpreter’s search path for modules. It is initialized to a default path taken from the environment variable PYTHONPATH, or from a built-in default if PYTHONPATH is not set. You can modify it using standard list operations:

>>> import sys
>>> sys.path.append('/ufs/guido/lib/python')

6.3 The dir() Function

The built-in function dir() is used to find out which names a module defines. It returns a sorted list of strings:

>>> import fibo, sys
>>> dir(fibo)
['__name__', 'fib', 'fib2']
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__loader__', '__name__',
'__package__', '__stderr__', '__stdin__', '__stdout__',
'_clear_type_cache', '_current_frames', '_debugmallocstats', '_getframe',
'_home', '_mercurial', '_xoptions', 'abiflags', 'api_version', 'argv',
'base_exec_prefix', 'base_prefix', 'builtin_module_names', 'byteorder',
'call_tracing', 'callstats', 'copyright', 'displayhook',
'dont_write_bytecode', 'exc_info', 'excepthook', 'exec_prefix',
'executable', 'exit', 'flags', 'float_info', 'float_repr_style',
'getcheckinterval', 'getdefaultencoding', 'getdlopenflags',
'getfilesystemencoding', 'getobjects', 'getprofile', 'getrecursionlimit',
'getrefcount', 'getsizeof', 'getswitchinterval', 'gettotalrefcount',
'gettrace', 'hash_info', 'hexversion', 'implementation', 'int_info',
'intern', 'maxsize', 'maxunicode', 'meta_path', 'modules', 'path',
'path_hooks', 'path_importer_cache', 'platform', 'prefix', 'ps1',
'setcheckinterval', 'setdlopenflags', 'setprofile', 'setrecursionlimit',
'setswitchinterval', 'settrace', 'stderr', 'stdin', 'stdout',
'thread_info', 'version', 'version_info', 'warnoptions']

Without arguments, dir() lists the names you have defined currently:

>>> a = [1, 2, 3, 4, 5]
>>> import fibo
>>> fib = fibo.fib
>>> dir()
['__builtins__', '__name__', 'a', 'fib', 'fibo', 'sys']

Note that it lists all types of names: variables, modules, functions, etc.
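
Because dir() returns a plain sorted list of strings, it is easy to filter. The sketch below, which assumes an in-memory stand-in for the fibo module (built with types.ModuleType so that no file is needed), keeps only the names that do not start with an underscore.

```python
import types

# In-memory stand-in for fibo.py; the function bodies are placeholders.
fibo = types.ModuleType("fibo")
exec("def fib(n): ...\ndef fib2(n): ...\n", fibo.__dict__)

names = dir(fibo)
public = [name for name in names if not name.startswith("_")]

print(names == sorted(names))  # True: dir() returns the names in sorted order
print(public)                  # ['fib', 'fib2']
```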

dir() does not list the names of built-in functions and variables. If you want a list of those, they are defined in the standard module builtins:

>>> import builtins
>>> dir(builtins)
['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException',
'BlockingIOError', 'BrokenPipeError', 'BufferError', 'BytesWarning',
'ChildProcessError', 'ConnectionAbortedError', 'ConnectionError',
'ConnectionRefusedError', 'ConnectionResetError', 'DeprecationWarning',
'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'False',
'FileExistsError', 'FileNotFoundError', 'FloatingPointError',
'FutureWarning', 'GeneratorExit', 'IOError', 'ImportError',
'ImportWarning', 'IndentationError', 'IndexError', 'InterruptedError',
'IsADirectoryError', 'KeyError', 'KeyboardInterrupt', 'LookupError',
'MemoryError', 'NameError', 'None', 'NotADirectoryError', 'NotImplemented',
'NotImplementedError', 'OSError', 'OverflowError',
'PendingDeprecationWarning', 'PermissionError', 'ProcessLookupError',
'ReferenceError', 'ResourceWarning', 'RuntimeError', 'RuntimeWarning',
'StopIteration', 'SyntaxError', 'SyntaxWarning', 'SystemError',
'SystemExit', 'TabError', 'TimeoutError', 'True', 'TypeError',
'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError',
'UnicodeError', 'UnicodeTranslateError', 'UnicodeWarning', 'UserWarning',
'ValueError', 'Warning', 'ZeroDivisionError', '_', '__build_class__',
'__debug__', '__doc__', '__import__', '__name__', '__package__', 'abs',
'all', 'any', 'ascii', 'bin', 'bool', 'bytearray', 'bytes', 'callable',
'chr', 'classmethod', 'compile', 'complex', 'copyright', 'credits',
'delattr', 'dict', 'dir', 'divmod', 'enumerate', 'eval', 'exec', 'exit',
'filter', 'float', 'format', 'frozenset', 'getattr', 'globals', 'hasattr',
'hash', 'help', 'hex', 'id', 'input', 'int', 'isinstance', 'issubclass',
'iter', 'len', 'license', 'list', 'locals', 'map', 'max', 'memoryview',
'min', 'next', 'object', 'oct', 'open', 'ord', 'pow', 'print', 'property',
'quit', 'range', 'repr', 'reversed', 'round', 'set', 'setattr', 'slice',
'sorted', 'staticmethod', 'str', 'sum', 'super', 'tuple', 'type', 'vars',
'zip']

6.4 Packages

Packages are a way of structuring Python’s module namespace by using “dotted module names”. For example, the module name A.B designates a submodule named B in a package named A. Just like the use of modules saves the authors of different modules from having to worry about each other’s global variable names, the use of dotted module names saves the authors of multi-module packages like NumPy or Pillow from having to worry about each other’s module names.

Suppose you want to design a collection of modules (a “package”) for the uniform handling of sound files and sound data. There are many different sound file formats (usually recognized by their extension, for example: .wav, .aiff, .au), so you may need to create and maintain a growing collection of modules for the conversion between the various file formats. There are also many different operations you might want to perform on sound data (such as mixing, adding echo, applying an equalizer function, creating an artificial stereo effect), so in addition you will be writing a never-ending stream of modules to perform these operations. Here’s a possible structure for your package (expressed in terms of a hierarchical filesystem):

sound/                          Top-level package
    __init__.py                 Initialize the sound package
    formats/                    Subpackage for file format conversions
            __init__.py
            wavread.py
            wavwrite.py
            aiffread.py
            aiffwrite.py
            auread.py
            auwrite.py
            ...
    effects/                    Subpackage for sound effects
            __init__.py
            echo.py
            surround.py
            reverse.py
            ...
    filters/                    Subpackage for filters
            __init__.py
            equalizer.py
            vocoder.py
            karaoke.py
            ...

When importing the package, Python searches through the directories on sys.path looking for the package subdirectory.

The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path. In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable, described later.

Users of the package can import individual modules from the package, for example:

import sound.effects.echo

This loads the submodule sound.effects.echo. It must be referenced with its full name.

sound.effects.echo.echofilter(input, output, delay=0.7, atten=4)

An alternative way of importing the submodule is:

from sound.effects import echo

This also loads the submodule echo, and makes it available without its package prefix, so it can be used as follows:

echo.echofilter(input, output, delay=0.7, atten=4)

Yet another variation is to import the desired function or variable directly:

from sound.effects.echo import echofilter

Again, this loads the submodule echo, but this makes its function echofilter() directly available:

echofilter(input, output, delay=0.7, atten=4)

Note that when using from package import item, the item can be either a submodule (or subpackage) of the package, or some other name defined in the package, like a function, class or variable. The import statement first tests whether the item is defined in the package; if not, it assumes it is a module and attempts to load it. If it fails to find it, an ImportError exception is raised.

Contrarily, when using syntax like import item.subitem.subsubitem, each item except for the last must be a package; the last item can be a module or a package but can’t be a class or function or variable defined in the previous item.
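
These rules can be tried out without installing anything. The sketch below builds a tiny package on disk in a temporary directory; the package name demo_sound and the simplified echofilter signature are assumptions made for the example, standing in for the sound package described above.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Lay out demo_sound/effects/echo.py with the required __init__.py files.
    effects = Path(tmp) / "demo_sound" / "effects"
    effects.mkdir(parents=True)
    (effects.parent / "__init__.py").write_text("")
    (effects / "__init__.py").write_text("")
    (effects / "echo.py").write_text(
        "def echofilter(delay=0.7):\n    return f'echo {delay}'\n"
    )

    sys.path.insert(0, tmp)
    try:
        from demo_sound.effects import echo             # item is a submodule
        from demo_sound.effects.echo import echofilter  # item is a function
        try:
            # With plain "import", the last item must be a module or package.
            import demo_sound.effects.echo.echofilter
        except ImportError as exc:
            error = type(exc).__name__
    finally:
        sys.path.remove(tmp)

print(echo.echofilter())  # echo 0.7
print(echofilter(1))      # echo 1
print(error)              # ModuleNotFoundError (a subclass of ImportError)
```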

6.4.1 Importing * From a Package

Now what happens when the user writes from sound.effects import *? Ideally, one would hope that this somehow goes out to the filesystem, finds which submodules are present in the package, and imports them all. This could take a long time and importing sub-modules might have unwanted side-effects that should only happen when the sub-module is explicitly imported.

The only solution is for the package author to provide an explicit index of the package. The import statement uses the following convention: if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered. It is up to the package author to keep this list up-to-date when a new version of the package is released. Package authors may also decide not to support it, if they don’t see a use for importing * from their package. For example, the file sound/effects/__init__.py could contain the following code:

__all__ = ["echo", "surround", "reverse"]

This would mean that from sound.effects import * would import the three named submodules of the sound.effects package.
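
The effect of __all__ can be checked with a small experiment. In this sketch the package name demo_fx is hypothetical, and its __all__ deliberately omits one submodule; the star import is run through exec() into a separate dictionary so that the imported names are easy to inspect.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_fx"
    pkg.mkdir()
    # __all__ lists only two of the three submodules.
    (pkg / "__init__.py").write_text('__all__ = ["echo", "surround"]\n')
    for name in ("echo", "surround", "reverse"):
        (pkg / f"{name}.py").write_text(f"NAME = {name!r}\n")

    sys.path.insert(0, tmp)
    try:
        namespace = {}
        exec("from demo_fx import *", namespace)
    finally:
        sys.path.remove(tmp)

imported = sorted(n for n in namespace if not n.startswith("__"))
print(imported)                          # ['echo', 'surround']
print("demo_fx.reverse" in sys.modules)  # False: reverse was never loaded
```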

If __all__ is not defined, the statement from sound.effects import * does not import all submodules from the package sound.effects into the current namespace; it only ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py. It also includes any submodules of the package that were explicitly loaded by previous import statements. Consider this code:

import sound.effects.echo
import sound.effects.surround
from sound.effects import *

In this example, the echo and surround modules are imported in the current namespace because they are defined in the sound.effects package when the from…import statement is executed. (This also works when __all__ is defined.)

Although certain modules are designed to export only names that follow certain patterns when you use import *, it is still considered bad practice in production code.

Remember, there is nothing wrong with using from Package import specific_submodule! In fact, this is the recommended notation unless the importing module needs to use submodules with the same name from different packages.

6.4.2 Intra-package References

6.4.2 包内引用

When packages are structured into subpackages (as with the sound package in the example), you can use absolute imports to refer to submodules of sibling packages. For example, if the module sound.filters.vocoder needs to use the echo module in the sound.effects package, it can use from sound.effects import echo.

You can also write relative imports, with the from module import name form of import statement. These imports use leading dots to indicate the current and parent packages involved in the relative import. From the surround module for example, you might use:

from . import echo
from .. import formats
from ..filters import equalizer

Note that relative imports are based on the name of the current module. Since the name of the main module is always "__main__", modules intended for use as the main module of a Python application must always use absolute imports.
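
How the leading dots resolve can be seen by building a miniature version of the layout on disk. The package name demo_rel and the NAME constants are assumptions for this sketch; the three import lines written into surround.py mirror the forms listed above.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "demo_rel"
    for sub in ("effects", "filters"):
        (root / sub).mkdir(parents=True)
        (root / sub / "__init__.py").write_text("")
    (root / "__init__.py").write_text("")
    (root / "effects" / "echo.py").write_text("NAME = 'echo'\n")
    (root / "filters" / "equalizer.py").write_text("NAME = 'equalizer'\n")
    # surround.py uses only relative imports.
    (root / "effects" / "surround.py").write_text(
        "from . import echo\n"
        "from .. import filters\n"
        "from ..filters import equalizer\n"
    )

    sys.path.insert(0, tmp)
    try:
        from demo_rel.effects import surround
    finally:
        sys.path.remove(tmp)

print(surround.__package__)         # demo_rel.effects
print(surround.echo.__name__)       # demo_rel.effects.echo
print(surround.filters.__name__)    # demo_rel.filters
print(surround.equalizer.__name__)  # demo_rel.filters.equalizer
```

The dots are resolved against the importing module's package (demo_rel.effects here), which is why the same file cannot rely on them when it is run directly as the main module.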

6.4.3 Packages in Multiple Directories

Packages support one more special attribute, __path__. This is initialized to be a list containing the name of the directory holding the package’s __init__.py before the code in that file is executed. This variable can be modified; doing so affects future searches for modules and subpackages contained in the package.

While this feature is not often needed, it can be used to extend the set of modules found in a package.
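
One way to picture extending a package through __path__ is the following sketch. All directory and module names (site_a, site_b, demo_multi, demo_multi_extra, core, plugin) are invented for the illustration; the point is only that a directory appended to __path__ is searched for later submodule imports.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    main_dir = Path(tmp) / "site_a" / "demo_multi"         # the package itself
    extra_dir = Path(tmp) / "site_b" / "demo_multi_extra"  # extra modules elsewhere
    main_dir.mkdir(parents=True)
    extra_dir.mkdir(parents=True)
    (main_dir / "__init__.py").write_text("")
    (main_dir / "core.py").write_text("NAME = 'core'\n")
    (extra_dir / "plugin.py").write_text("NAME = 'plugin'\n")

    sys.path.insert(0, str(main_dir.parent))
    try:
        import demo_multi
        initial_path = list(demo_multi.__path__)  # one entry: the package directory

        # Extend the search path used for submodules of demo_multi.
        demo_multi.__path__.append(str(extra_dir))
        from demo_multi import core, plugin
    finally:
        sys.path.remove(str(main_dir.parent))

print(len(initial_path))          # 1
print(core.NAME, plugin.NAME)     # core plugin
print(plugin.__name__)            # demo_multi.plugin
```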


CHAPTER
SEVEN


INPUT AND OUTPUT

There are several ways to present the output of a program; data can be printed in a human-readable form, or written to a file for future use. This chapter will discuss some of the possibilities.

7.1 Fancier Output Formatting

So far we’ve encountered two ways of writing values: expression statements and the print() function. (A third way is using the write() method of file objects; the standard output file can be referenced as sys.stdout. See the Library Reference for more information on this.)

Often you’ll want more control over the formatting of your output than simply printing space-separated values. There are two ways to format your output; the first way is to do all the string handling yourself; using string slicing and concatenation operations you can create any layout you can imagine. The string type has some methods that perform useful operations for padding strings to a given column width; these will be discussed shortly. The second way is to use formatted string literals, or the str.format() method.
通常, 你会希望对输出格式有更多的控制, 而不只是简单地打印以空格分隔的值. 有两种方式可以格式化输出: 第一种是自己完成所有的字符串处理; 使用字符串切片和连接操作, 你可以创建任何你能想象的布局. 字符串类型有一些方法, 可以把字符串填充到指定的列宽; 这些将在稍后讨论. 第二种方式是使用格式化字符串字面量, 或 str.format() 方法.
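The paragraph above mentions formatted string literals without showing one. A minimal sketch follows, assuming Python 3.6 or later (where f-strings are available); the variable names are illustrative only. The format specifiers after ':' and the '!r' conversion work the same way as in str.format().

```python
import math

name = 'Sjoerd'
phone = 4127
contents = 'eels'

# An f-string evaluates the expressions inside {} in place.
row = f'{name:10} ==> {phone:10d}'
pi_text = f'The value of PI is approximately {math.pi:.3f}.'
quoted = f'My hovercraft is full of {contents!r}.'

print(row)
print(pi_text)
print(quoted)
```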

The string module contains a Template class which offers yet another way to substitute values into strings.
string 模块包含一个 Template 类, 它提供了另一种把值替换到字符串中的方式.
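A short sketch of string.Template, added here for illustration. Placeholders are written as $name; substitute() requires every placeholder to be supplied, while safe_substitute() leaves missing ones untouched. The placeholder names are made up for this example.

```python
from string import Template

greeting = Template('$who likes $what')
full = greeting.substitute(who='tim', what='kung pao')

# safe_substitute() does not raise KeyError for a missing placeholder.
partial = Template('$who owes $amount').safe_substitute(who='tim')

print(full)
print(partial)
```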

One question remains, of course: how do you convert values to strings? Luckily, Python has ways to convert any value to a string: pass it to the repr() or str() functions.
当然, 还有一个问题: 怎样把值转换为字符串? 幸运的是, Python 可以把任何值转换为字符串: 把它传给 repr() 或 str() 函数即可.

The str() function is meant to return representations of values which are fairly human-readable, while repr() is meant to generate representations which can be read by the interpreter (or will force a SyntaxError if there is no equivalent syntax). For objects which don’t have a particular representation for human consumption, str() will return the same value as repr(). Many values, such as numbers or structures like lists and dictionaries, have the same representation using either function. Strings, in particular, have two distinct representations.
str() 函数用于返回便于人阅读的值的表示形式, 而 repr() 用于生成可供解释器读取的表示形式 (如果没有等价的语法, 则会强制引发 SyntaxError). 对于没有专门供人阅读的表示形式的对象, str() 返回与 repr() 相同的值. 许多值, 比如数字, 或列表和字典这样的结构, 用这两个函数得到的表示形式相同. 字符串则比较特别, 它有两种不同的表示形式.

Some examples:

>>> s = 'Hello, world.'
>>> str(s)
'Hello, world.'
>>> repr(s)
"'Hello, world.'"
>>> str(1/7)
'0.14285714285714285'
>>> x = 10 * 3.25
>>> y = 200 * 200
>>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
>>> print(s)
The value of x is 32.5, and y is 40000...
>>> # The repr() of a string adds string quotes and backslashes:
... hello = 'hello, world\n'
>>> hellos = repr(hello)
>>> print(hellos)
'hello, world\n'
>>> # The argument to repr() may be any Python object:
... repr((x, y, ('spam', 'eggs')))
"(32.5, 40000, ('spam', 'eggs'))"

一些例子:

>>> s = 'Hello, world.'
>>> str(s)
'Hello, world.'
>>> repr(s)
"'Hello, world.'"
>>> str(1/7)
'0.14285714285714285'
>>> x = 10 * 3.25
>>> y = 200 * 200
>>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
>>> print(s)
The value of x is 32.5, and y is 40000...
>>> # The repr() of a string adds string quotes and backslashes:
... hello = 'hello, world\n'   # Python 单引号下 \n 也解析
>>> hellos = repr(hello)
>>> print(hellos)
'hello, world\n'
>>> # The argument to repr() may be any Python object:
... repr((x, y, ('spam', 'eggs')))
"(32.5, 40000, ('spam', 'eggs'))"

Here are two ways to write a table of squares and cubes:

>>> for x in range(1, 11):
...     print(repr(x).rjust(2), repr(x*x).rjust(3), end=' ')
...     # Note use of 'end' on previous line
...     print(repr(x*x*x).rjust(4))
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000
>>> for x in range(1, 11):
...     print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x))
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000

(Note that in the first example, one space between each column was added by the way print() works: by default it adds spaces between its arguments.)

下面是输出平方和立方表的两种方式:

>>> for x in range(1, 11):
...     print(repr(x).rjust(2), repr(x*x).rjust(3), end=' ')
...     # Note use of 'end' on previous line
...     print(repr(x*x*x).rjust(4))
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000
>>> for x in range(1, 11):
...     print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x))
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000

(注意, 在第一个例子中, 各列之间的空格是由 print() 的工作方式添加的: 默认情况下, 它会在各参数之间添加空格.)

This example demonstrates the str.rjust() method of string objects, which right-justifies a string in a field of a given width by padding it with spaces on the left. There are similar methods str.ljust() and str.center(). These methods do not write anything, they just return a new string. If the input string is too long, they don’t truncate it, but return it unchanged; this will mess up your column layout but that’s usually better than the alternative, which would be lying about a value. (If you really want truncation you can always add a slice operation, as in x.ljust(n)[:n].)
这个例子演示了字符串对象的 str.rjust() 方法, 它通过在左侧填充空格, 使字符串在给定宽度的字段中右对齐. 类似的方法还有 str.ljust() 和 str.center(). 这些方法不会输出任何内容, 只是返回一个新的字符串. 如果输入的字符串太长, 它们不会截断它, 而是原样返回; 这会打乱你的列布局, 但通常好过另一种选择, 即显示一个错误的值. (如果你确实需要截断, 可以加上切片操作, 如 x.ljust(n)[:n].)
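The behaviour described above can be checked with a few lines. This is a small illustrative sketch; the sample strings are arbitrary.

```python
word = 'spam'

right = word.rjust(8)    # padded with spaces on the left
left = word.ljust(8)     # padded with spaces on the right
middle = word.center(8)  # padded on both sides

# A string longer than the field is returned unchanged, not truncated.
long_value = 'extraordinary'
unchanged = long_value.ljust(5)

# Truncation has to be requested explicitly with a slice.
truncated = long_value.ljust(5)[:5]

print(repr(right), repr(left), repr(middle))
print(repr(unchanged), repr(truncated))
```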

There is another method, str.zfill(), which pads a numeric string on the left with zeros. It understands about plus and minus signs:

>>> '12'.zfill(5)
'00012'
>>> '-3.14'.zfill(7)
'-003.14'
>>> '3.14159265359'.zfill(5)
'3.14159265359'

还有另一个方法 str.zfill(), 它在数字字符串的左侧填充零. 它能识别正负号:

>>> '12'.zfill(5)
'00012'
>>> '-3.14'.zfill(7)
'-003.14'
>>> '3.14159265359'.zfill(5)
'3.14159265359'

Basic usage of the str.format() method looks like this:

>>> print('We are the {} who say "{}!"'.format('knights', 'Ni'))
We are the knights who say "Ni!"

方法 str.format() 的基本用法如下:

>>> print('We are the {} who say "{}!"'.format('knights', 'Ni'))
We are the knights who say "Ni!"

The brackets and characters within them (called format fields) are replaced with the objects passed into the str.format() method. A number in the brackets can be used to refer to the position of the object passed into the str.format() method.

>>> print('{0} and {1}'.format('spam', 'eggs'))
spam and eggs
>>> print('{1} and {0}'.format('spam', 'eggs'))
eggs and spam

大括号及其中的字符 (称为格式字段) 会被传递给 str.format() 方法的对象替换. 大括号中的数字可以用来引用传递给 str.format() 方法的对象的位置.

If keyword arguments are used in the str.format() method, their values are referred to by using the name of the argument.

>>> print('This {food} is {adjective}.'.format(
...       food='spam', adjective='absolutely horrible'))
This spam is absolutely horrible.

如果在 str.format() 方法中使用关键字参数, 它们的值通过参数名来引用.

>>> print('This {food} is {adjective}.'.format(
...    food='spam', adjective='absolutely horrible'))   
This spam is absolutely horrible.

译注: format() 的参数可以随意换行, 行首对齐只是为了便于阅读; 不要在字符串的引号内部断行. 如需拆分长字符串, 应在上一行用引号结束, 下一行另起一个字符串, Python 会自动把相邻的字符串字面量合并为一个.

Positional and keyword arguments can be arbitrarily combined:

>>> print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
                                                       other='Georg'))
The story of Bill, Manfred, and Georg.

位置参数和关键字参数可以任意组合:

>>> print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
                                                       other='Georg'))
The story of Bill, Manfred, and Georg.

‘!a’ (apply ascii()), ‘!s’ (apply str()) and ‘!r’ (apply repr()) can be used to convert the value before it is formatted:

>>> contents = 'eels'
>>> print('My hovercraft is full of {}.'.format(contents))
My hovercraft is full of eels.
>>> print('My hovercraft is full of {!r}.'.format(contents))
My hovercraft is full of 'eels'.

‘!a’ (应用函数(下同) ascii()), ‘!s’ (str()) 和 ‘!r’ (repr()) 可以在格式化前用来转化值:

>>> contents = 'eels'
>>> print('My hovercraft is full of {}.'.format(contents))
My hovercraft is full of eels.
>>> print('My hovercraft is full of {!r}.'.format(contents))
My hovercraft is full of 'eels'.

An optional ‘:’ and format specifier can follow the field name. This allows greater control over how the value is formatted. The following example rounds Pi to three places after the decimal.

>>> import math
>>> print('The value of PI is approximately {0:.3f}.'.format(math.pi))
The value of PI is approximately 3.142.

字段名后面可以跟一个可选的 ':' 和格式说明符. 这样可以更好地控制值的格式化方式. 下面的例子把 Pi 四舍五入到小数点后三位.

>>> import math
>>> print('The value of PI is approximately {0:.3f}.'.format(math.pi))
The value of PI is approximately 3.142.

Passing an integer after the ‘:’ will cause that field to be a minimum number of characters wide. This is useful for making tables pretty.

>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
>>> for name, phone in table.items():
...     print('{0:10} ==> {1:10d}'.format(name, phone))
...
Sjoerd     ==>       4127
Jack       ==>       4098
Dcab       ==>       7678

在 ':' 之后传入一个整数, 可以让该字段至少具有这么多字符的宽度. 这在制作漂亮的表格时很有用.

>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
>>> for name, phone in table.items():
...     print('{0:10} ==> {1:10d}'.format(name, phone))
...
Sjoerd     ==>       4127
Jack       ==>       4098
Dcab       ==>       7678

译注: d, f 等都是格式标识符, 和 C 类似

If you have a really long format string that you don’t want to split up, it would be nice if you could reference the variables to be formatted by name instead of by position. This can be done by simply passing the dict and using square brackets ‘[]’ to access the keys

>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
...       'Dcab: {0[Dcab]:d}'.format(table))
Jack: 4098; Sjoerd: 4127; Dcab: 8637678

如果你有一个很长的格式字符串, 又不想把它拆开, 那么最好能按名称而不是按位置来引用要格式化的变量. 只需传入字典, 并用方括号 '[]' 访问它的键即可:

>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
...       'Dcab: {0[Dcab]:d}'.format(table))
Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the ‘**’ notation

>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table))
Jack: 4098; Sjoerd: 4127; Dcab: 8637678

也可以使用 '**' 表示法, 把 table 作为关键字参数传入来实现:

>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table))
Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in function vars(), which returns a dictionary containing all local variables.
这与内置函数 vars() 结合使用时尤其有用, 它返回包含所有局部变量的字典.
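A minimal sketch of combining '**' with vars(). Called without arguments inside a function, vars() returns that function's local variables as a dictionary, so the field names in the format string can refer to them directly. The function and variable names are illustrative.

```python
def describe():
    name = 'Jack'
    phone = 4098
    # vars() here is the dict {'name': 'Jack', 'phone': 4098}.
    return '{name}: {phone:d}'.format(**vars())

summary = describe()
print(summary)
```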

For a complete overview of string formatting with str.format(), see Format String Syntax.
要进一步了解字符串格式化方法 str.format(), 参见格式字符串语法.

7.1.1 Old string formatting

7.1.1 旧字符串格式化

The % operator can also be used for string formatting. It interprets the left argument much like a sprintf()-style format string to be applied to the right argument, and returns the string resulting from this formatting operation. For example:

>>> import math
>>> print('The value of PI is approximately %5.3f.' % math.pi)
The value of PI is approximately 3.142.

% 操作符也可以用于字符串格式化. 它把左参数解释为类似 sprintf() 风格的格式字符串, 应用到右参数上, 并返回由该格式化操作得到的字符串. 例如:

>>> import math
>>> print('The value of PI is approximately %5.3f.' % math.pi)
The value of PI is approximately 3.142.

More information can be found in the printf-style String Formatting section.
更多信息参见 "printf 风格的字符串格式化" 一节.
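Two further uses of the % operator, sketched for illustration: a tuple on the right supplies several values, and a dictionary lets the format string refer to values by name. The sample data is made up.

```python
# A tuple supplies one value per conversion specifier, in order.
by_position = '%s has %d items' % ('cart', 3)

# A mapping allows %(key)s style lookups; 05d pads with zeros to width 5.
by_name = '%(name)s: %(phone)05d' % {'name': 'Jack', 'phone': 4098}

print(by_position)
print(by_name)
```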

7.2 Reading and Writing Files

7.2 读和写文件

open() returns a file object, and is most commonly used with two arguments: open(filename, mode).

>>> f = open('workfile', 'w')

函数 open() 返回一个文件对象, 它通常使用两个参数: open(filename, mode).

>>> f = open('workfile', 'w')

The first argument is a string containing the filename. The second argument is another string containing a few characters describing the way in which the file will be used. mode can be ‘r’ when the file will only be read, ‘w’ for only writing (an existing file with the same name will be erased), and ‘a’ opens the file for appending; any data written to the file is automatically added to the end. ‘r+’ opens the file for both reading and writing. The mode argument is optional; ‘r’ will be assumed if it’s omitted.
第一个参数是包含文件名的字符串. 第二个参数是另一个字符串, 其中包含几个字符, 用来描述文件的使用方式. 只读取文件时 mode 可以是 'r'; 只写入时是 'w' (已存在的同名文件会被清除); 'a' 表示打开文件用于追加, 写入文件的任何数据都会自动添加到末尾. 'r+' 表示打开文件用于读写. mode 参数是可选的; 如果省略, 则默认为 'r'.

Normally, files are opened in text mode, that means, you read and write strings from and to the file, which are encoded in a specific encoding. If encoding is not specified, the default is platform dependent (see open()). ‘b’ appended to the mode opens the file in binary mode: now the data is read and written in the form of bytes objects. This mode should be used for all files that don’t contain text.
通常, 文件以文本模式打开, 也就是说, 你从文件中读取和向文件写入的是字符串, 它们以特定的编码方式进行编码. 如果没有指定编码, 默认值取决于平台 (参见 open()). 在模式后追加 'b' 表示以二进制模式打开文件: 这时数据以 bytes 对象的形式读写. 所有不包含文本的文件都应该使用这种模式.

In text mode, the default when reading is to convert platform-specific line endings (\n on Unix, \r\n on Windows) to just \n. When writing in text mode, the default is to convert occurrences of \n back to platform-specific line endings. This behind-the-scenes modification to file data is fine for text files, but will corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files.
在文本模式下, 读取时默认会把平台特定的行结束符 (Unix 上是 \n, Windows 上是 \r\n) 转换为 \n. 在文本模式下写入时, 默认会把 \n 转换回平台特定的行结束符. 这种对文件数据的幕后修改对文本文件没有问题, 但会破坏 JPEG 或 EXE 这类文件中的二进制数据. 读写此类文件时, 请务必使用二进制模式.
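The difference between the two modes can be seen by writing some bytes and reading them back both ways. This is a sketch under stated assumptions: it uses a temporary directory and a made-up file name, and relies on the default newline handling of open() in text mode.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.bin')  # hypothetical file name

    # Binary mode: the bytes are stored exactly as given.
    with open(path, 'wb') as f:
        f.write(b'line one\r\nline two\n')

    # Binary mode returns the bytes untouched.
    with open(path, 'rb') as f:
        raw = f.read()

    # Text mode decodes to str and translates line endings to '\n'.
    with open(path, 'r', encoding='ascii') as f:
        text = f.read()

print(raw)
print(repr(text))
```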

It is good practice to use the with keyword when dealing with file objects. The advantage is that the file is properly closed after its suite finishes, even if an exception is raised at some point. Using with is also much shorter than writing equivalent try-finally blocks:

>>> with open('workfile') as f:
...     read_data = f.read()
>>> f.closed
True

在处理文件对象时, 最好使用 with 关键字. 这样做的好处是, 文件在其代码块结束后会被正确关闭, 即使中途引发了异常. 使用 with 也比编写等价的 try-finally 块简短得多.

If you’re not using the with keyword, then you should call f.close() to close the file and immediately free up any system resources used by it. If you don’t explicitly close a file, Python’s garbage collector will eventually destroy the object and close the open file for you, but the file may stay open for a while. Another risk is that different Python implementations will do this clean-up at different times.
如果你没有使用 with 关键字, 就应该调用 f.close() 来关闭文件, 并立即释放它占用的系统资源. 如果你没有显式地关闭文件, Python 的垃圾收集器最终会销毁该对象并为你关闭打开的文件, 但文件可能会保持打开状态一段时间. 另一个风险是, 不同的 Python 实现会在不同的时间进行这种清理.
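For comparison, here is roughly what the try-finally form looks like when with is not used. It is a sketch only: the file is created in a temporary directory so the example is self-contained, and the file name is arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')
    with open(path, 'w') as out:
        out.write('some data\n')

    # Without "with", close() must be called explicitly; the finally
    # clause makes sure that happens even if read() raises.
    f = open(path)
    try:
        read_data = f.read()
    finally:
        f.close()

    closed_after = f.closed

print(read_data, closed_after)
```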

After a file object is closed, either by a with statement or by calling f.close(), attempts to use the file object will automatically fail.

>>> f.close()
>>> f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file.

当文件对象通过 with 语句或调用 f.close() 被关闭后, 再尝试使用该文件对象将自动失败.

>>> f.close()
>>> f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file.

7.2.1 Methods of File Objects

7.2.1 File 对象的方法

The rest of the examples in this section will assume that a file object called f has already been created.
本节中的其余示例将假定已创建名为f的文件对象.

To read a file’s contents, call f.read(size), which reads some quantity of data and returns it as a string (in text mode) or bytes object (in binary mode). size is an optional numeric argument. When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory. Otherwise, at most size characters (in text mode) or size bytes (in binary mode) are read and returned. If the end of the file has been reached, f.read() will return an empty string ('').

>>> f.read()
'This is the entire file.\n'
>>> f.read()
''

要读取文件内容, 请调用 f.read(size), 它读取一定数量的数据, 并以字符串 (文本模式) 或 bytes 对象 (二进制模式) 的形式返回. size 是一个可选的数字参数. 当 size 省略或为负数时, 将读取并返回文件的全部内容; 如果文件大小是你机器内存的两倍, 那就是你自己的问题了. 否则, 最多读取并返回 size 个字符 (文本模式) 或 size 个字节 (二进制模式). 如果已经到达文件末尾, f.read() 将返回一个空字符串 ('').

>>> f.read()
'This is the entire file.\n'
>>> f.read()
''
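When a file may be too large to read in one call, passing a size to read() lets you process it piece by piece; the empty string marks the end of the file. The sketch below assumes a small temporary file and an arbitrary chunk size of 8 characters.

```python
import os
import tempfile

content = 'This is the entire file.\n'
chunks = []

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')
    with open(path, 'w') as f:
        f.write(content)

    with open(path) as f:
        while True:
            chunk = f.read(8)   # at most 8 characters per call
            if chunk == '':     # empty string: end of file reached
                break
            chunks.append(chunk)

print(chunks)
```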

f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn’t end in a newline. This makes the return value unambiguous; if f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by ‘\n’, a string containing only a single newline.

>>> f.readline()
'This is the first line of the file.\n'
>>> f.readline()
'Second line of the file\n'
>>> f.readline()
''

f.readline() 从文件中读取一行; 换行符 (\n) 保留在字符串的末尾, 只有当文件不以换行符结尾时, 文件的最后一行才会省略它. 这使得返回值没有歧义: 如果 f.readline() 返回空字符串, 就表示已经到达文件末尾, 而空行则用 '\n' 表示, 即只包含一个换行符的字符串.

>>> f.readline()
'This is the first line of the file.\n'
>>> f.readline()
'Second line of the file\n'
>>> f.readline()
''

For reading lines from a file, you can loop over the file object. This is memory efficient, fast, and leads to simple code:

>>> for line in f:
...     print(line, end='')
...
This is the first line of the file.
Second line of the file

要从文件中逐行读取, 可以直接对文件对象进行循环. 这种方式节省内存, 速度快, 代码也简单:

>>> for line in f:
...     print(line, end='')             # 文件自带换行, end参数用空字符串即可
...
This is the first line of the file.
Second line of the file

If you want to read all the lines of a file in a list you can also use list(f) or f.readlines().
如果你想读取文件所有的行到一个列表里, 你可以使用 list(f) 或 f.readlines().
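A brief sketch showing that list(f) and f.readlines() give the same result; each element keeps its trailing newline. The temporary file and its contents are stand-ins for the file assumed by the examples above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')
    with open(path, 'w') as f:
        f.write('This is the first line of the file.\n')
        f.write('Second line of the file\n')

    with open(path) as f:
        via_list = list(f)

    with open(path) as f:
        via_readlines = f.readlines()

print(via_list)
```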

f.write(string) writes the contents of string to the file, returning the number of characters written.

>>> f.write('This is a test\n')
15

f.write(string) 写入字符串内容到文件, 返回已写入的字符数.

>>> f.write('This is a test\n')
15

译注: 写入的数据会先进入缓冲区; 调用 f.flush() 或 f.close() 之后 (或缓冲区写满时), 数据才保证被写入磁盘.

Other types of objects need to be converted – either to a string (in text mode) or a bytes object (in binary mode) – before writing them:

>>> value = ('the answer', 42)
>>> s = str(value) # convert the tuple to string
>>> f.write(s)
18

其他对象类型在写之前需要转换为字符串(文本模式)或者字节对象(二进制模式):

>>> value = ('the answer', 42)
>>> s = str(value) # convert the tuple to string
>>> f.write(s)
18

f.tell() returns an integer giving the file object’s current position in the file represented as number of bytes from the beginning of the file when in binary mode and an opaque number when in text mode.
f.tell() 返回一个整数, 给出文件对象在文件中的当前位置: 在二进制模式下, 它是从文件开头算起的字节数; 在文本模式下, 它是一个不透明的数字.

To change the file object’s position, use f.seek(offset, from_what). The position is computed from adding offset to a reference point; the reference point is selected by the from_what argument. A from_what value of 0 measures from the beginning of the file, 1 uses the current file position, and 2 uses the end of the file as the reference point. from_what can be omitted and defaults to 0, using the beginning of the file as the reference point.
要改变文件对象的位置, 可以使用 f.seek(offset, from_what). 位置由参考点加上 offset 计算得出; 参考点由 from_what 参数选择. from_what 值为 0 表示从文件开头算起, 1 表示使用当前文件位置, 2 表示以文件末尾为参考点. from_what 可以省略, 默认值为 0, 即以文件开头为参考点.

>>> f = open('workfile', 'rb+')
>>> f.write(b'0123456789abcdef')
16
>>> f.seek(5) # Go to the 6th byte in the file
5
>>> f.read(1)
b'5'
>>> f.seek(-3, 2) # Go to the 3rd byte before the end
13
>>> f.read(1)
b'd'

In text files (those opened without a b in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking to the very file end with seek(0, 2)) and the only valid offset values are those returned from the f.tell(), or zero. Any other offset value produces undefined behaviour.
在文本文件 (打开时模式字符串中没有 b 的文件) 中, 只允许相对于文件开头进行定位 (用 seek(0, 2) 定位到文件末尾是个例外), 而且有效的偏移量只能是 f.tell() 返回的值或零. 其他任何偏移量都会产生未定义的行为.
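In text mode the safe pattern is therefore to remember a position with tell() and later hand exactly that value back to seek(). A minimal sketch, using a temporary file with made-up contents:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('first\nsecond\n')

    with open(path, encoding='utf-8') as f:
        f.readline()              # consume the first line
        bookmark = f.tell()       # opaque number; do not do arithmetic on it
        second = f.readline()

        f.seek(bookmark)          # return to the remembered position
        again = f.readline()

        f.seek(0, 2)              # the one allowed seek relative to the end
        at_end = f.read()

print(repr(second), repr(again), repr(at_end))
```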

File objects have some additional methods, such as isatty() and truncate() which are less frequently used; consult the Library Reference for a complete guide to file objects.
文件对象还有些附加方法, 比如 isatty() 和 truncate(), 这些都很少用到; 参考 库手册 了解全面的文件对象指南.

7.2.2 Saving structured data with json

7.2.2 使用 json 存储结构化数据

Strings can easily be written to and read from a file. Numbers take a bit more effort, since the read() method only returns strings, which will have to be passed to a function like int(), which takes a string like ‘123’ and returns its numeric value 123. When you want to save more complex data types like nested lists and dictionaries, parsing and serializing by hand becomes complicated.
字符串可以很容易地写入文件和从文件中读取. 数字要多费一点工夫, 因为 read() 方法只返回字符串, 必须把它传给 int() 这样的函数, 该函数接受 '123' 这样的字符串并返回其数值 123. 当你想保存嵌套列表和字典这类更复杂的数据类型时, 手动解析和序列化会变得很复杂.

Rather than having users constantly writing and debugging code to save complicated data types to files, Python allows you to use the popular data interchange format called JSON (JavaScript Object Notation). The standard module called json can take Python data hierarchies, and convert them to string representations; this process is called serializing. Reconstructing the data from the string representation is called deserializing. Between serializing and deserializing, the string representing the object may have been stored in a file or data, or sent over a network connection to some distant machine.
Python 允许你使用流行的数据交换格式 JSON (JavaScript Object Notation), 而不必让用户不断编写和调试代码来把复杂的数据类型保存到文件中. 标准模块 json 可以接受 Python 数据层次结构, 并把它们转换为字符串表示形式; 这个过程称为序列化. 从字符串表示形式重建数据的过程称为反序列化. 在序列化和反序列化之间, 表示对象的字符串可以存储在文件或数据中, 也可以通过网络连接发送到远程机器.

Note: The JSON format is commonly used by modern applications to allow for data exchange. Many programmers are already familiar with it, which makes it a good choice for interoperability.
注意: JSON 格式经常被现代应用程序用于数据交换. 许多程序员已经熟悉它, 这使它成为实现互操作性的良好选择.

If you have an object x, you can view its JSON string representation with a simple line of code:

>>> import json
>>> x = [1, 'simple', 'list']
>>> json.dumps(x)
'[1, "simple", "list"]'

如果你有一个对象 x, 你可以用简单的一行代码查看其 JSON 字符串表示形式:

>>> import json
>>> x = [1, 'simple', 'list']
>>> json.dumps(x)
'[1, "simple", "list"]'

Another variant of the dumps() function, called dump(), simply serializes the object to a text file. So if f is a text file object opened for writing, we can do this:

json.dump(x, f)

函数 dumps 的另一变种, 称为 dump(), 简单地序列化对象到一个文本文件. 所以, 如果 f 是一个为写而打开的文本文件对象, 我们可以如此做:

json.dump(x, f)

To decode the object again, if f is a text file object which has been opened for reading:

x = json.load(f)

要再次解码对象, 如果 f 是已打开用于读取的文本文件对象:

x = json.load(f)
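Putting dump() and load() together gives a complete round trip through a file. The sketch below uses a temporary directory and invented sample data; the file name is an assumption made for the example.

```python
import json
import os
import tempfile

data = {'name': 'Jack', 'phones': [4098, 4139], 'active': True}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.json')

    # Serialize straight into a text file opened for writing.
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(data, f)

    # Deserialize from a text file opened for reading.
    with open(path, encoding='utf-8') as f:
        restored = json.load(f)

print(restored)
```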

This simple serialization technique can handle lists and dictionaries, but serializing arbitrary class instances in JSON requires a bit of extra effort. The reference for the json module contains an explanation of this.
这种简单的序列化技术可以处理列表和字典, 但要把任意类的实例序列化为 JSON, 则需要一点额外的工夫. json 模块的参考文档对此有详细说明.
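One form the "extra effort" can take is the default parameter of json.dumps(), which is called for objects the encoder does not know how to serialize. This is a sketch of one possible approach, not the only one; the Point class and the encode_point helper are hypothetical names invented for the example.

```python
import json

class Point:
    """A hypothetical class used only for this illustration."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

def encode_point(obj):
    # Called by json.dumps() for objects it cannot serialize itself.
    if isinstance(obj, Point):
        return {'x': obj.x, 'y': obj.y}
    raise TypeError('cannot serialize ' + type(obj).__name__)

text = json.dumps(Point(1, 2), default=encode_point)
print(text)
```

Decoding back into a Point would need a matching step on the reading side, since the JSON text itself only records a plain object with two keys.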

See also:
参见

pickle - the pickle module
pickle - pickle 模块

Contrary to JSON, pickle is a protocol which allows the serialization of arbitrarily complex Python objects. As such, it is specific to Python and cannot be used to communicate with applications written in other languages. It is also insecure by default: deserializing pickle data coming from an untrusted source can execute arbitrary code, if the data was crafted by a skilled attacker.
不同于 JSON, pickle 是一个协议, 它允许任意复杂 Python 对象的序列化. 因此, 它特定用于 Python, 而不能用在与其他语言编写的程序间通信. 它默认情况下也是不安全的: 如果数据由熟练的攻击者精心设计, 反序列化来自一个不受信任源的 pickle 数据可以执行任意代码.
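To make the contrast concrete, the sketch below pickles a structure containing a set, a complex number and a bytes object, none of which JSON can represent directly. The sample data is arbitrary. Loading is safe here only because the payload was produced by the same program.

```python
import pickle

original = {'tags': {'spam', 'eggs'}, 'point': complex(1, 2), 'raw': b'\x00\x01'}

payload = pickle.dumps(original)   # a bytes object, not text
restored = pickle.loads(payload)   # never do this with untrusted data

print(type(payload).__name__, restored == original)
```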


CHAPTER
EIGHT


ERRORS AND EXCEPTIONS

错误和异常

Until now error messages haven’t been more than mentioned, but if you have tried out the examples you have probably seen some. There are (at least) two distinguishable kinds of errors: syntax errors and exceptions.
到目前为止, 错误信息只是被简单提及, 但如果你已经试过那些例子, 你可能已经见过一些了. 错误 (至少) 有两种可区分的类型: 语法错误和异常.

8.1 Syntax Errors

8.1 语法错误

Syntax errors, also known as parsing errors, are perhaps the most common kind of complaint you get while you are still learning Python:

>>> while True print('Hello world')
  File "<stdin>", line 1
    while True print('Hello world')
                   ^
SyntaxError: invalid syntax

语法错误, 也称为解析错误, 也许是你在学习 Python 的过程中最常遇到的一类报错:

>>> while True print('Hello world')
  File "<stdin>", line 1
    while True print('Hello world')
                   ^
SyntaxError: invalid syntax

The parser repeats the offending line and displays a little ‘arrow’ pointing at the earliest point in the line where the error was detected. The error is caused by (or at least detected at) the token preceding the arrow: in the example, the error is detected at the function print(), since a colon (':') is missing before it. File name and line number are printed so you know where to look in case the input came from a script.
解析器会重复显示出错的行, 并用一个小 '箭头' 指向该行中最早检测到错误的位置. 错误是由箭头前面的标记引起的 (或至少是在那里被检测到的): 在这个例子中, 错误在函数 print() 处被检测到, 因为它前面缺少一个冒号 (':'). 文件名和行号也会打印出来, 这样当输入来自脚本时, 你就知道该去哪里查找.

8.2 Exceptions

8.2 异常

Even if a statement or expression is syntactically correct, it may cause an error when an attempt is made to execute it. Errors detected during execution are called exceptions and are not unconditionally fatal: you will soon learn how to handle them in Python programs. Most exceptions are not handled by programs, however, and result in error messages as shown here:

>>> 10 * (1/0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
>>> 4 + spam*3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'spam' is not defined
>>> '2' + 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't convert 'int' object to str implicitly

即使语句或表达式在语法上是正确的, 尝试执行时也可能引发错误. 执行期间检测到的错误称为异常, 它们并不一定是致命的: 你很快就会学到如何在 Python 程序中处理它们. 不过, 大多数异常不会被程序处理, 而是产生如下所示的错误信息:

>>> 10 * (1/0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
>>> 4 + spam*3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'spam' is not defined
>>> '2' + 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't convert 'int' object to str implicitly

The last line of the error message indicates what happened. Exceptions come in different types, and the type is printed as part of the message: the types in the example are ZeroDivisionError, NameError and TypeError. The string printed as the exception type is the name of the built-in exception that occurred. This is true for all built-in exceptions, but need not be true for user-defined exceptions (although it is a useful convention). Standard exception names are built-in identifiers (not reserved keywords).
错误信息的最后一行指出发生了什么. 异常有不同的类型, 类型会作为信息的一部分打印出来: 例子中的类型是 ZeroDivisionError, NameError 和 TypeError. 作为异常类型打印的字符串就是所发生的内置异常的名称. 所有内置异常都是如此, 但用户自定义的异常不一定如此 (尽管这是一个有用的约定). 标准异常名称是内置标识符 (不是保留关键字).

The rest of the line provides detail based on the type of exception and what caused it.
这一行后一部分提供基于异常类型的详情和造成的原因.

The preceding part of the error message shows the context where the exception happened, in the form of a stack traceback. In general it contains a stack traceback listing source lines; however, it will not display lines read from standard input.
错误信息的前面部分以堆栈回溯的形式展示了异常发生处的上下文. 一般来说, 它包含一个列出源代码行的堆栈回溯; 不过, 它不会显示从标准输入读取的行.

Built-in Exceptions lists the built-in exceptions and their meanings.
内置异常列出内置的异常及其含义.

8.3 Handling Exceptions

8.3 处理异常

It is possible to write programs that handle selected exceptions. Look at the following example, which asks the user for input until a valid integer has been entered, but allows the user to interrupt the program (using Control-C or whatever the operating system supports); note that a user-generated interruption is signalled by raising the KeyboardInterrupt exception.

>>> while True:
...     try:
...         x = int(input("Please enter a number: "))
...         break
...     except ValueError:
...         print("Oops! That was no valid number. Try again...")
...

可以编写程序来处理选定的异常. 请看下面的例子, 它会一直要求用户输入, 直到输入一个有效的整数为止, 但允许用户中断程序 (使用 Control-C 或操作系统支持的其他方式); 注意, 用户产生的中断是通过引发 KeyboardInterrupt 异常来表示的.

>>> while True:
...     try:
...         x = int(input("Please enter a number: "))
...         break
...     except ValueError:
...         print("Oops! That was no valid number. Try again...")
...

The try statement works as follows.

  • First, the try clause (the statement(s) between the try and except keywords) is executed.
  • If no exception occurs, the except clause is skipped and execution of the try statement is finished.
  • If an exception occurs during execution of the try clause, the rest of the clause is skipped. Then if its type matches the exception named after the except keyword, the except clause is executed, and then execution continues after the try statement.
  • If an exception occurs which does not match the exception named in the except clause, it is passed on to outer try statements; if no handler is found, it is an unhandled exception and execution stops with a message as shown above.
try 语句的工作方式如下.

  • 首先, 执行 try 子句 (关键字 try 和 except 之间的语句).
  • 如果没有发生异常, 则跳过 except 子句, try 语句执行结束.
  • 如果在执行 try 子句的过程中发生了异常, 则跳过该子句的剩余部分. 然后, 如果异常的类型与 except 关键字后面指定的异常匹配, 则执行 except 子句, 之后继续执行 try 语句后面的代码.
  • 如果发生的异常与 except 子句中指定的异常不匹配, 它会被传递给外层的 try 语句; 如果找不到处理程序, 它就是一个未处理异常, 执行将停止并显示如上所示的信息.
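The steps listed above can be traced in a small function. This is an illustrative sketch; to_int and its default parameter are names invented for the example, not part of any library.

```python
def to_int(text, default=None):
    try:
        value = int(text)      # the try clause
    except ValueError:         # runs only if int() raised ValueError
        return default
    return value               # reached when no exception occurred

parsed = to_int('42')                        # no exception: except is skipped
fallback = to_int('forty-two', default=0)    # ValueError: handler runs

# int(None) raises TypeError, which does not match "except ValueError",
# so it is passed on to the outer try statement here.
try:
    to_int(None)
except TypeError as err:
    outer_caught = type(err).__name__

print(parsed, fallback, outer_caught)
```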

A try statement may have more than one except clause, to specify handlers for different exceptions. At most one handler will be executed. Handlers only handle exceptions that occur in the corresponding try clause, not in other handlers of the same try statement. An except clause may name multiple exceptions as a parenthesized tuple, for example:

... except (RuntimeError, TypeError, NameError):
...     pass

try 语句可以有超过一个 except 子句, 为不同的异常指定处理程序. 至多只有一种处理程序将被执行. 处理程序仅处理相应 try 子句中发生的异常, 而不处理同一try语句的其他处理程序中的异常. except 子句可以将多个异常命名为带括号的元组, 例如:

... except (RuntimeError, TypeError, NameError):
...     pass

A class in an except clause is compatible with an exception if it is the same class or a base class thereof (but not the other way around — an except clause listing a derived class is not compatible with a base class). For example, the following code will print B, C, D in that order:

class B(Exception):
    pass
class C(B):
    pass
class D(C):
    pass
for cls in [B, C, D]:
    try:
        raise cls()
    except D:
        print("D")
    except C:
        print("C")
    except B:
        print("B")

如果 except 子句中的类与异常是同一个类, 或是它的基类, 则该类与这个异常兼容 (反过来则不成立: 列出派生类的 except 子句与基类不兼容). 例如, 下面的代码将依次打印 B, C, D:

class B(Exception):
    pass
class C(B):
    pass
class D(C):
    pass
for cls in [B, C, D]:
    try:
        raise cls()
    except D:
        print("D")
    except C:
        print("C")
    except B:
        print("B")

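译注: 循环依次抛出 B, C, D 的实例. except 子句按书写顺序匹配: except D 只匹配 D; except C 匹配 C 及其子类 D; except B 匹配 B, C, D. 因为最具体的 D 排在最前面, 所以每个异常都由与自身同名的子句捕获.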

Note that if the except clauses were reversed (with except B first), it would have printed B, B, B — the first matching except clause is triggered.
注意, 如果 except 子句是反向的(即 except B 在第一个), 它会打印B, B, B --首先匹配上的 except 子句会被触发.
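Both orderings can be compared side by side. The sketch below reuses the classes B, C and D from the example above and collects the names into lists instead of printing them; the list names are arbitrary.

```python
class B(Exception):
    pass

class C(B):
    pass

class D(C):
    pass

# Most specific class first: each exception reaches its own clause.
specific_first = []
for cls in [B, C, D]:
    try:
        raise cls()
    except D:
        specific_first.append('D')
    except C:
        specific_first.append('C')
    except B:
        specific_first.append('B')

# Base class first: "except B" matches B and all of its subclasses,
# so the later clauses are never reached.
general_first = []
for cls in [B, C, D]:
    try:
        raise cls()
    except B:
        general_first.append('B')
    except C:
        general_first.append('C')
    except D:
        general_first.append('D')

print(specific_first, general_first)
```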

The last except clause may omit the exception name(s), to serve as a wildcard. Use this with extreme caution, since it is easy to mask a real programming error in this way! It can also be used to print an error message and then re-raise the exception (allowing a caller to handle the exception as well):

import sys

try:
    f = open('myfile.txt')
    s = f.readline()
    i = int(s.strip())
except OSError as err:
    print("OS error: {0}".format(err))
except ValueError:
    print("Could not convert data to an integer.")
except:
    print("Unexpected error:", sys.exc_info()[0])
    raise

最后一个 except 子句可以省略异常名, 用作通配符. 请务必极其谨慎地使用, 因为这样很容易掩盖真正的编程错误! 它也可以用来打印错误信息, 然后重新抛出异常 (让调用者也能处理该异常):

import sys

try:
    f = open('myfile.txt')
    s = f.readline()
    i = int(s.strip())
except OSError as err:
    print("OS error: {0}".format(err))
except ValueError:
    print("Could not convert data to an integer.")
except:
    print("Unexpected error:", sys.exc_info()[0])
    raise

The try … except statement has an optional else clause, which, when present, must follow all except clauses. It is useful for code that must be executed if the try clause does not raise an exception. For example:

import sys

for arg in sys.argv[1:]:
    try:
        f = open(arg, 'r')
    except OSError:
        print('cannot open', arg)
    else:
        print(arg, 'has', len(f.readlines()), 'lines')
        f.close()

try … except 语句有一个可选的 else 子句, 如果使用, 它必须放在所有 except 子句之后. 它适用于那些在 try 子句没有引发异常时必须执行的代码. 例如:

import sys

for arg in sys.argv[1:]:
    try:
        f = open(arg, 'r')
    except OSError:
        print('cannot open', arg)
    else:
        print(arg, 'has', len(f.readlines()), 'lines')
        f.close()

The use of the else clause is better than adding additional code to the try clause because it avoids accidentally catching an exception that wasn’t raised by the code being protected by the try … except statement.
使用 else 子句比在 try 子句中添加额外代码更好, 因为这样可以避免意外捕获并非由 try … except 语句所保护的代码引发的异常.

When an exception occurs, it may have an associated value, also known as the exception’s argument. The presence and type of the argument depend on the exception type.

The except clause may specify a variable after the exception name. The variable is bound to an exception instance with the arguments stored in instance.args. For convenience, the exception instance defines __str__() so the arguments can be printed directly without having to reference .args. One may also instantiate an exception first before raising it and add any attributes to it as desired.

>>> try:
...     raise Exception('spam', 'eggs')
... except Exception as inst:
...     print(type(inst))   # the exception instance
...     print(inst.args)    # arguments stored in .args
...     print(inst)         # __str__ allows args to be printed directly,
...                         # but may be overridden in exception subclasses
...     x, y = inst.args    # unpack args
...     print('x =', x)
...     print('y =', y)
...
<class 'Exception'>
('spam', 'eggs')
('spam', 'eggs')
x = spam
y = eggs
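The last sentence of the paragraph above (instantiating an exception before raising it and attaching attributes) could look roughly like the following sketch. The function `fetch` and the attribute name `path` are assumptions made for this example, not part of any library.

```python
def fetch(path):
    """Hypothetical helper that always fails, to illustrate the pattern."""
    err = RuntimeError("cannot fetch resource")  # instantiate first
    err.path = path                              # attach an extra attribute
    raise err                                    # then raise the instance

try:
    fetch("/tmp/example")
except RuntimeError as inst:
    message = str(inst)   # __str__ gives the single argument
    where = inst.path     # the attribute added before raising
```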

If an exception has arguments, they are printed as the last part ('detail') of the message for unhandled exceptions.

Exception handlers don’t just handle exceptions if they occur immediately in the try clause, but also if they occur inside functions that are called (even indirectly) in the try clause. For example:

>>> def this_fails():
...     x = 1/0
...
>>> try:
...     this_fails()
... except ZeroDivisionError as err:
...     print('Handling run-time error:', err)
...
Handling run-time error: division by zero

8.4 Raising Exceptions

The raise statement allows the programmer to force a specified exception to occur. For example:

>>> raise NameError('HiThere')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: HiThere

The sole argument to raise indicates the exception to be raised. This must be either an exception instance or an exception class (a class that derives from Exception). If an exception class is passed, it will be implicitly instantiated by calling its constructor with no arguments:

raise ValueError # shorthand for 'raise ValueError()'
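To make the implicit instantiation visible, here is a small sketch that raises once with the class and once with an instance, then inspects what the handler receives. The helper name `catch` is hypothetical.

```python
def catch(thing):
    """Raise `thing` and return the exception instance the handler sees."""
    try:
        raise thing
    except ValueError as err:
        return err

from_class = catch(ValueError)                  # class: instantiated with no arguments
from_instance = catch(ValueError("bad value"))  # instance: raised as given

# Both are ValueError instances; only the stored arguments differ.
class_args = from_class.args         # ()
instance_args = from_instance.args   # ('bad value',)
```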

If you need to determine whether an exception was raised but don’t intend to handle it, a simpler form of the raise statement allows you to re-raise the exception:

>>> try:
...     raise NameError('HiThere')
... except NameError:
...     print('An exception flew by!')
...     raise
...
An exception flew by!
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
NameError: HiThere
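In a program rather than at the prompt, the same bare `raise` is often used to note an error at one level and let a caller deal with it. The sketch below assumes a hypothetical `convert` function and records events in a list instead of printing, so the order of execution can be inspected.

```python
events = []

def convert(text):
    """Hypothetical helper: note the failure, then pass it on."""
    try:
        return int(text)
    except ValueError:
        events.append("convert saw a ValueError")
        raise   # re-raise the exception currently being handled

try:
    convert("abc")
except ValueError:
    events.append("caller handled it")
```

After this runs, `events` holds the two entries in the order they were appended: first the note made inside `convert`, then the one made by the caller.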

8.5 User-defined Exceptions

Programs may name their own exceptions by creating a new exception class (see Classes for more about Python classes). Exceptions should typically be derived from the Exception class, either directly or indirectly.

Exception classes can be defined which do anything any other class can do, but are usually kept simple, often only offering a number of attributes that allow information about the error to be extracted by handlers for the exception. When creating a module that can raise several distinct errors, a common practice is to create a base class for exceptions defined by that module, and subclass that to create specific exception classes for different error conditions:

class Error(Exception):
    """Base class for exceptions in this module."""
    pass

class InputError(Error):
    """Exception raised for errors in the input.
    
    Attributes:
        expression -- input expression in which the error occurred
        message -- explanation of the error
    """

    def __init__(self, expression, message):
        self.expression = expression
        self.message = message

class TransitionError(Error):
    """Raised when an operation attempts a state transition that's not
    allowed.

    Attributes:
        previous -- state at beginning of transition
        next -- attempted new state
        message -- explanation of why the specific transition is not allowed
    """

    def __init__(self, previous, next, message):
        self.previous = previous
        self.next = next
        self.message = message
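One benefit of the base-class pattern is that a caller can catch every error of the module with a single handler. The following self-contained sketch repeats a trimmed version of the classes above; the functions `parse_digits` and `describe` are hypothetical, and the call to `super().__init__()` is an addition not present in the original classes (it also fills in `.args`).

```python
class Error(Exception):
    """Base class for exceptions in this module."""

class InputError(Error):
    """Raised for errors in the input."""

    def __init__(self, expression, message):
        super().__init__(expression, message)  # addition: populate .args too
        self.expression = expression
        self.message = message

def parse_digits(text):
    """Hypothetical function that rejects anything but digits."""
    if not text.isdigit():
        raise InputError(text, "expected only digits")
    return int(text)

def describe(text):
    try:
        return str(parse_digits(text))
    except Error as err:
        # Catches InputError and any other subclass of Error. This sketch
        # assumes every subclass sets a `message` attribute.
        return "{}: {}".format(type(err).__name__, err.message)
```

Here `describe("12")` returns `'12'`, while `describe("x1")` returns `'InputError: expected only digits'`.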

Most exceptions are defined with names that end in “Error”, similar to the naming of the standard exceptions.

Many standard modules define their own exceptions to report errors that may occur in functions they define. More information on classes is presented in chapter Classes.

8.6 Defining Clean-up Actions

The try statement has another optional clause, finally, which is intended to define clean-up actions that must be executed under all circumstances. For example:

>>> try:
...     raise KeyboardInterrupt
... finally:
...     print('Goodbye, world!')
...
Goodbye, world!
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
KeyboardInterrupt

A finally clause is always executed before leaving the try statement, whether an exception has occurred or not. When an exception has occurred in the try clause and has not been handled by an except clause (or it has occurred in an except or else clause), it is re-raised after the finally clause has been executed. The finally clause is also executed “on the way out” when any other clause of the try statement is left via a break, continue or return statement. A more complicated example:

>>> def divide(x, y):
...     try:
...         result = x / y
...     except ZeroDivisionError:
...         print("division by zero!")
...     else:
...         print("result is", result)
...     finally:
...         print("executing finally clause")
...
>>> divide(2, 1)
result is 2.0
executing finally clause
>>> divide(2, 0)
division by zero!
executing finally clause
>>> divide("2", "1")
executing finally clause
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in divide
TypeError: unsupported operand type(s) for /: 'str' and 'str'
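The `divide` example does not exercise the "on the way out" case. A short sketch of it, using a hypothetical function `first_even` and a list to record each pass through the finally clause:

```python
log = []

def first_even(numbers):
    """Return the first even number, logging every number examined."""
    for n in numbers:
        try:
            if n % 2 == 0:
                return n       # leaving via return still runs finally first
        finally:
            log.append(n)      # executed on every iteration, including the last
    return None

result = first_even([3, 5, 8, 9])
# result is 8 and log is [3, 5, 8]; 9 is never examined
```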

8.7 Predefined Clean-up Actions

Some objects define standard clean-up actions to be undertaken when the object is no longer needed, regardless of whether or not the operation using the object succeeded or failed. Look at the following example, which tries to open a file and print its contents to the screen.

for line in open("myfile.txt"):
    print(line, end="")

The problem with this code is that it leaves the file open for an indeterminate amount of time after this part of the code has finished executing. This is not an issue in simple scripts, but can be a problem for larger applications. The with statement allows objects like files to be used in a way that ensures they are always cleaned up promptly and correctly.

with open("myfile.txt") as f:
    for line in f:
        print(line, end="")

After the statement is executed, the file f is always closed, even if a problem was encountered while processing the lines. Objects which, like files, provide predefined clean-up actions will indicate this in their documentation.
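The claim that the file is closed even when processing fails can be checked with a small sketch. It writes a scratch file with the standard `tempfile` module (so it does not depend on a real `myfile.txt`), deliberately raises an error while reading, and then inspects `f.closed`.

```python
import os
import tempfile

# Create a scratch file so the example is self-contained
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("first\nsecond\n")

try:
    with open(path) as f:
        for line in f:
            # Simulate a problem while processing the first line
            raise RuntimeError("problem while processing " + line.strip())
except RuntimeError:
    pass

closed_after_error = f.closed   # True: the with statement closed the file
os.remove(path)                 # clean up the scratch file
```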


  1. In fact function definitions are also ‘statements’ that are ‘executed’; the execution of a module-level function definition enters the function name in the module’s global symbol table.
