Python has several built-in modules and functions for handling files. These functions are spread out over several modules such as os
, os.path
, shutil
, and pathlib
, to name a few. This article gathers in one place many of the functions you need to know in order to perform the most common operations on files in Python.
In this tutorial, you’ll learn how to:
Open multiple files using the fileinput module

Free Bonus: 5 Thoughts On Python Mastery, a free course for Python developers that shows you the roadmap and the mindset you’ll need to take your Python skills to the next level.
Reading and writing data to files using Python is pretty straightforward. To do this, you must first open files in the appropriate mode. Here’s an example of how to open a text file and read its contents:
with open('data.txt', 'r') as f:
    data = f.read()
open()
takes a filename and a mode as its arguments. r
opens the file in read only mode. To write data to a file, pass in w
as an argument instead:
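A minimal sketch of the write case, using a placeholder string:

with open('data.txt', 'w') as f:
    f.write('some data')  # 'w' mode truncates the file before writing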
In the examples above, open()
opens files for reading or writing and returns a file handle (f
in this case) that provides methods that can be used to read or write data to the file. Read Working With File I/O in Python for more information on how to read and write to files.
Suppose your current working directory has a subdirectory called my_directory
that has the following contents:
.
├── file1.py
├── file2.csv
├── file3.txt
├── sub_dir
│   ├── bar.py
│   └── foo.py
├── sub_dir_b
│   └── file4.txt
└── sub_dir_c
    ├── config.py
    └── file5.txt
The built-in os
module has a number of useful functions that can be used to list directory contents and filter the results. To get a list of all the files and folders in a particular directory in the filesystem, use os.listdir()
in legacy versions of Python or os.scandir()
in Python 3.x. os.scandir()
is the preferred method to use if you also want to get file and directory properties such as file size and modification date.
In versions of Python prior to Python 3, os.listdir()
is the method to use to get a directory listing:
>>> import os
>>> entries = os.listdir('my_directory/')
os.listdir()
returns a Python list containing the names of the files and subdirectories in the directory given by the path argument:
>>> os.listdir('my_directory/')
['sub_dir_c', 'file1.py', 'sub_dir_b', 'file3.txt', 'file2.csv', 'sub_dir']
A directory listing like that isn’t easy to read. Printing out the output of a call to os.listdir()
using a loop helps clean things up:
>>> entries = os.listdir('my_directory/')
>>> for entry in entries:
...     print(entry)
...
sub_dir_c
file1.py
sub_dir_b
file3.txt
file2.csv
sub_dir
In modern versions of Python, an alternative to os.listdir()
is to use os.scandir()
and pathlib.Path()
.
os.scandir()
was introduced in Python 3.5 and is documented in PEP 471. os.scandir()
returns an iterator as opposed to a list when called:
>>> import os
>>> entries = os.scandir('my_directory/')
>>> entries
<posix.ScandirIterator object at 0x...>
The ScandirIterator
points to all the entries in the current directory. You can loop over the contents of the iterator and print out the filenames:
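A minimal sketch of that loop, written with the with statement described below and assuming the my_directory/ layout shown earlier:

import os

with os.scandir('my_directory/') as entries:
    for entry in entries:
        print(entry.name)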
Here, os.scandir()
is used in conjunction with the with
statement because it supports the context manager protocol. Using a context manager closes the iterator and frees up acquired resources automatically after the iterator has been exhausted. The result is a print out of the filenames in my_directory/
just like you saw in the os.listdir()
example:
sub_dir_c
file1.py
sub_dir_b
file3.txt
file2.csv
sub_dir
Another way to get a directory listing is to use the pathlib
module:
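A minimal sketch of the pathlib version (variable names are illustrative):

from pathlib import Path

entries = Path('my_directory/')
for entry in entries.iterdir():
    print(entry.name)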
The objects returned by Path
are either PosixPath
or WindowsPath
objects depending on the OS.
pathlib.Path()
objects have an .iterdir()
method for creating an iterator of all files and folders in a directory. Each entry yielded by .iterdir()
contains information about the file or directory such as its name and file attributes. pathlib
was first introduced in Python 3.4 and is a great addition to Python that provides an object oriented interface to the filesystem.
In the example above, you call pathlib.Path()
and pass a path argument to it. Next is the call to .iterdir()
to get a list of all files and directories in my_directory
.
pathlib
offers a set of classes featuring most of the common operations on paths in an easy, object-oriented way. Using pathlib
is as efficient as, if not more efficient than, using the functions in os
. Another benefit of using pathlib
over os
is that it reduces the number of imports you need to make to manipulate filesystem paths. For more information, read Python 3’s pathlib Module: Taming the File System.
Running the code above produces the following:
sub_dir_c
file1.py
sub_dir_b
file3.txt
file2.csv
sub_dir
Using pathlib.Path()
or os.scandir()
instead of os.listdir()
is the preferred way of getting a directory listing, especially when you’re working with code that needs the file type and file attribute information. pathlib.Path()
offers much of the file and path handling functionality found in os
and shutil
, and it’s methods are more efficient than some found in these modules. We will discuss how to get file properties shortly.
Here are the directory-listing functions again:
Function | Description |
---|---|
os.listdir() | Returns a list of all files and folders in a directory |
os.scandir() | Returns an iterator of all the objects in a directory including file attribute information |
pathlib.Path.iterdir() | Returns an iterator of all the objects in a directory including file attribute information |
These functions return a list of everything in the directory, including subdirectories. This might not always be the behavior you want. The next section will show you how to filter the results from a directory listing.
This section will show you how to print out the names of files in a directory using os.listdir()
, os.scandir()
, and pathlib.Path()
. To filter out directories and only list files from a directory listing produced by os.listdir()
, use os.path
:
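A minimal sketch of that approach (the basepath name is illustrative):

import os

# List all files in a directory using os.listdir
basepath = 'my_directory/'
for entry in os.listdir(basepath):
    if os.path.isfile(os.path.join(basepath, entry)):
        print(entry)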
Here, the call to os.listdir()
returns a list of everything in the specified path, and then that list is filtered by os.path.isfile()
to only print out files and not directories. This produces the following output:
file1.py
file3.txt
file2.csv
An easier way to list files in a directory is to use os.scandir()
or pathlib.Path()
:
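A minimal sketch with os.scandir(), matching the description that follows:

import os

basepath = 'my_directory/'
with os.scandir(basepath) as entries:
    for entry in entries:
        if entry.is_file():
            print(entry.name)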
Using os.scandir()
has the advantage of looking cleaner and being easier to understand than using os.listdir()
, even though it is one line of code longer. Calling entry.is_file()
on each item in the ScandirIterator
returns True
if the object is a file. Printing out the names of all files in the directory gives you the following output:
file1.py
file3.txt
file2.csv
Here’s how to list files in a directory using pathlib.Path()
:
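A minimal sketch of the pathlib version (variable names are illustrative):

from pathlib import Path

basepath = Path('my_directory/')
for entry in basepath.iterdir():
    if entry.is_file():
        print(entry.name)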
Here, you call .is_file()
on each entry yielded by .iterdir()
. The output produced is the same:
file1.py
file3.txt
file2.csv
The code above can be made more concise if you combine the for
loop and the if
statement into a single generator expression. Dan Bader has an excellent article on generator expressions and list comprehensions.
The modified version looks like this:
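A minimal sketch of that generator-expression form (variable names are illustrative):

from pathlib import Path

basepath = Path('my_directory/')
files_in_basepath = (entry for entry in basepath.iterdir() if entry.is_file())
for item in files_in_basepath:
    print(item.name)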
This produces exactly the same output as the example before it. This section showed that filtering files or directories using os.scandir()
and pathlib.Path()
feels more intuitive and looks cleaner than using os.listdir()
in conjunction with os.path
.
To list subdirectories instead of files, use one of the methods below. Here’s how to use os.listdir()
and os.path()
:
import os

# List all subdirectories using os.listdir
basepath = 'my_directory/'
for entry in os.listdir(basepath):
    if os.path.isdir(os.path.join(basepath, entry)):
        print(entry)
Manipulating filesystem paths this way can quickly become cumbersome when you have multiple calls to os.path.join()
. Running this on my computer produces the following output:
Here’s how to use os.scandir()
:
import os

# List all subdirectories using scandir()
basepath = 'my_directory/'
with os.scandir(basepath) as entries:
    for entry in entries:
        if entry.is_dir():
            print(entry.name)
As in the file listing example, here you call .is_dir()
on each entry returned by os.scandir()
. If the entry is a directory, .is_dir()
returns True
, and the directory’s name is printed out. The output is the same as above:
Here’s how to use pathlib.Path()
:
from pathlib import Path

# List all subdirectories using pathlib
basepath = Path('my_directory/')
for entry in basepath.iterdir():
    if entry.is_dir():
        print(entry.name)
Calling .is_dir()
on each entry of the basepath
iterator checks if an entry is a file or a directory. If the entry is a directory, its name is printed out to the screen, and the output produced is the same as the one from the previous example:
Python makes retrieving file attributes such as file size and modified times easy. This is done through os.stat()
, os.scandir()
, or pathlib.Path()
.
os.scandir()
and pathlib.Path()
retrieve a directory listing with file attributes combined. This can be potentially more efficient than using os.listdir()
to list files and then getting file attribute information for each file.
The examples below show how to get the time the files in my_directory/
were last modified. The output is in seconds:
>>> import os
>>> with os.scandir('my_directory/') as dir_contents:
... for entry in dir_contents:
... info = entry.stat()
... print(info.st_mtime)
...
1539032199.0052035
1539032469.6324475
1538998552.2402923
1540233322.4009316
1537192240.0497339
1540266380.3434134
os.scandir()
returns a ScandirIterator
object. Each entry in a ScandirIterator
object has a .stat()
method that retrieves information about the file or directory it points to. .stat()
provides information such as file size and the time of last modification. In the example above, the code prints out the st_mtime
attribute, which is the time the content of the file was last modified.
The pathlib
module has corresponding methods for retrieving file information that give the same results:
>>> from pathlib import Path
>>> current_dir = Path('my_directory')
>>> for path in current_dir.iterdir():
... info = path.stat()
... print(info.st_mtime)
...
1539032199.0052035
1539032469.6324475
1538998552.2402923
1540233322.4009316
1537192240.0497339
1540266380.3434134
In the example above, the code loops through the object returned by .iterdir()
and retrieves file attributes through a .stat()
call for each file in the directory list. The st_mtime
attribute returns a float value that represents seconds since the epoch. To convert the values returned by st_mtime
for display purposes, you could write a helper function to convert the seconds into a datetime
object:
from datetime import datetime
from os import scandir

def convert_date(timestamp):
    d = datetime.utcfromtimestamp(timestamp)
    formated_date = d.strftime('%d %b %Y')
    return formated_date

def get_files():
    dir_entries = scandir('my_directory/')
    for entry in dir_entries:
        if entry.is_file():
            info = entry.stat()
            print(f'{entry.name}\t Last Modified: {convert_date(info.st_mtime)}')
This will first get a list of files in my_directory
and their attributes and then call convert_date()
to convert each file’s last modified time into a human readable form. convert_date()
makes use of .strftime()
to convert the time in seconds into a string.
The arguments passed to .strftime()
are the following:
%d : the day of the month
%b : the month, in abbreviated form
%Y : the year

Together, these directives produce output that looks like this:
>>> get_files()
file1.py Last modified: 04 Oct 2018
file3.txt Last modified: 17 Sep 2018
file2.txt Last modified: 17 Sep 2018
The syntax for converting dates and times into strings can be quite confusing. To read more about it, check out the official documentation on it. Another handy reference that is easy to remember is http://strftime.org/ .
Sooner or later, the programs you write will have to create directories in order to store data in them. os
and pathlib
include functions for creating directories. We’ll consider these:
Function | Description |
---|---|
os.mkdir() | Creates a single subdirectory |
pathlib.Path.mkdir() | Creates single or multiple directories |
os.makedirs() | Creates multiple directories, including intermediate directories |
To create a single directory, pass a path to the directory as a parameter to os.mkdir()
:
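A minimal sketch (the directory name is illustrative):

import os

os.mkdir('example_directory/')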
If a directory already exists, os.mkdir()
raises FileExistsError
. Alternatively, you can create a directory using pathlib
:
from pathlib import Path

p = Path('example_directory/')
p.mkdir()
If the path already exists, mkdir()
raises a FileExistsError
:
>>> p.mkdir()
Traceback (most recent call last):
  File '<stdin>', line 1, in <module>
  File '/usr/lib/python3.5/pathlib.py', line 1214, in mkdir
    self._accessor.mkdir(self, mode)
  File '/usr/lib/python3.5/pathlib.py', line 371, in wrapped
    return strfunc(str(pathobj), *args)
FileExistsError: [Errno 17] File exists: '.'
To avoid errors like this, catch the error when it happens and let your user know:
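A minimal sketch of catching the error (what you print for the user is up to you):

from pathlib import Path

p = Path('example_directory')
try:
    p.mkdir()
except FileExistsError as exc:
    print(exc)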
Alternatively, you can ignore the FileExistsError
by passing the exist_ok=True
argument to .mkdir()
:
from pathlib import Path

p = Path('example_directory')
p.mkdir(exist_ok=True)
This will not raise an error if the directory already exists.
os.makedirs()
is similar to os.mkdir()
. The difference between the two is that not only can os.makedirs()
create individual directories, it can also be used to create directory trees. In other words, it can create any necessary intermediate folders in order to ensure a full path exists.
os.makedirs()
is similar to running mkdir -p
in Bash. For example, to create a group of directories like 2018/10/05
, all you have to do is the following:
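A minimal sketch:

import os

os.makedirs('2018/10/05')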
This will create a nested directory structure that contains the folders 2018, 10, and 05:
.
└── 2018
    └── 10
        └── 05
.makedirs()
creates directories with default permissions. If you need to create directories with different permissions call .makedirs()
and pass in the mode you would like the directories to be created in:
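A minimal sketch; the mode value mirrors the permissions described in the next paragraph:

import os

os.makedirs('2018/10/05', mode=0o770)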
This creates the 2018/10/05
directory structure and gives the owner and group users read, write, and execute permissions. The default mode is 0o777
, and the file permission bits of existing parent directories are not changed. For more details on file permissions, and how the mode is applied, see the docs.
Run tree
to confirm that the right permissions were applied:
$ tree -p -i .
.
[drwxrwx---]  2018
[drwxrwx---]  10
[drwxrwx---]  05
This prints out a directory tree of the current directory. tree
is normally used to list contents of directories in a tree-like format. Passing the -p
and -i
arguments to it prints out the directory names and their file permission information in a vertical list. -p
prints out the file permissions, and -i
makes tree
produce a vertical list without indentation lines.
As you can see, all of the directories have 770
permissions. An alternative way to create directories is to use .mkdir()
from pathlib.Path
:
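A minimal sketch:

from pathlib import Path

p = Path('2018/10/05')
p.mkdir(parents=True)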
Passing parents=True
to Path.mkdir()
makes it create the directory 05
and any parent directories necessary to make the path valid.
By default, os.makedirs()
and Path.mkdir()
raise an OSError
if the target directory already exists. This behavior can be overridden (as of Python 3.2) by passing exist_ok=True
as a keyword argument when calling each function.
Running the code above produces a directory structure like the one below in one go:
.
└── 2018
    └── 10
        └── 05
I prefer using pathlib
when creating directories because I can use the same function to create single or nested directories.
After getting a list of files in a directory using one of the methods above, you will most probably want to search for files that match a particular pattern.
These are the methods and functions available to you:
endswith() and startswith() string methods
fnmatch.fnmatch()
glob.glob()
pathlib.Path.glob()
Each of these is discussed below. The examples in this section will be performed on a directory called some_directory
that has the following structure:
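The layout can be reconstructed from the shell commands shown below; it would look like this:

.
├── admin.py
├── data_01.txt
├── data_01_backup.txt
├── data_02.txt
├── data_02_backup.txt
├── data_03.txt
├── data_03_backup.txt
├── sub_dir
│   ├── file1.py
│   └── file2.py
└── tests.py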
If you’re following along using a Bash shell, you can create the above directory structure using the following commands:
$ mkdir some_directory
$ cd some_directory/
$ mkdir sub_dir
$ touch sub_dir/file1.py sub_dir/file2.py
$ touch data_{01..03}.txt data_{01..03}_backup.txt admin.py tests.py
This will create the some_directory/
directory, change into it, and then create sub_dir
. The next line creates file1.py
and file2.py
in sub_dir
, and the last line creates all the other files using expansion. To learn more about shell expansion, visit this site.
Python has several built-in methods for modifying and manipulating strings. Two of these methods, .startswith()
and .endswith()
, are useful when you’re searching for patterns in filenames. To do this, first get a directory listing and then iterate over it:
>>> import os
>>> # Get .txt files
>>> for f_name in os.listdir('some_directory'):
...     if f_name.endswith('.txt'):
...         print(f_name)
The code above finds all the files in some_directory/
, iterates over them and uses .endswith()
to print out the filenames that have the .txt
file extension. Running this on my computer produces the following output:
Simple Filename Pattern Matching Using fnmatch

String methods are limited in their matching abilities. fnmatch
has more advanced functions and methods for pattern matching. We will consider fnmatch.fnmatch()
, a function that supports the use of wildcards such as *
and ?
to match filenames. For example, in order to find all .txt
files in a directory using fnmatch
, you would do the following:
>>> import os
>>> import fnmatch
>>> for file_name in os.listdir('some_directory/'):
...     if fnmatch.fnmatch(file_name, '*.txt'):
...         print(file_name)
This iterates over the list of files in some_directory
and uses .fnmatch()
to perform a wildcard search for files that have the .txt
extension.
Let’s suppose you want to find .txt
files that meet certain criteria. For example, you could be only interested in finding .txt
files that contain the word data
, a number between a set of underscores, and the word backup
in their filename. Something similar to data_01_backup
, data_02_backup
, or data_03_backup
.
Using fnmatch.fnmatch()
, you could do it this way:
>>> for filename in os.listdir('.'):
... if fnmatch.fnmatch(filename, 'data_*_backup.txt'):
... print(filename)
Here, you print only the names of files that match the data_*_backup.txt
pattern. The asterisk in the pattern will match any character, so running this will find all text files whose filenames start with the word data
and end in backup.txt
, as you can see from the output below:
data_03_backup.txt
data_02_backup.txt
data_01_backup.txt
Filename Pattern Matching Using glob

Another useful module for pattern matching is glob.
.glob()
in the glob
module works just like fnmatch.fnmatch()
, but unlike fnmatch.fnmatch()
, it treats files beginning with a period (.
) as special.
UNIX and related systems translate name patterns with wildcards like ?
and *
into a list of files. This is called globbing.
For example, typing mv *.py python_files/
in a UNIX shell moves (mv
) all files with the .py
extension from the current directory to the directory python_files
. The *
character is a wildcard that means “any number of characters,” and *.py
is the glob pattern. This shell capability is not available in the Windows Operating System. The glob
module adds this capability in Python, which enables Windows programs to use this feature.
Here’s an example of how to use glob
to search for all Python (.py
) source files in the current directory:
>>> import glob
>>> glob.glob('*.py')
['admin.py', 'tests.py']
glob.glob('*.py')
searches for all files that have the .py
extension in the current directory and returns them as a list. glob
also supports shell-style wildcards to match patterns:
>>> import glob
>>> for name in glob.glob('*[0-9]*.txt'):
...     print(name)
This finds all text (.txt
) files that contain digits in the filename:
glob
makes it easy to search for files recursively in subdirectories too:
>>> import glob
>>> for file in glob.iglob('**/*.py', recursive=True):
... print(file)
This example makes use of glob.iglob()
to search for .py
files in the current directory and subdirectories. Passing recursive=True
as an argument to .iglob()
makes it search for .py
files in the current directory and any subdirectories. The difference between glob.iglob()
and glob.glob()
is that .iglob()
returns an iterator instead of a list.
Running the program above produces the following:
admin.py
tests.py
sub_dir/file1.py
sub_dir/file2.py
pathlib
contains similar methods for making flexible file listings. The example below shows how you can use .Path.glob()
to list file types that start with the letter p
:
>>> from pathlib import Path
>>> p = Path('.')
>>> for name in p.glob('*.p*'):
...     print(name)
admin.py
scraper.py
docs.pdf
Calling p.glob('*.p*')
returns a generator object that points to all files in the current directory that start with the letter p
in their file extension.
Path.glob()
is similar to glob.glob()
discussed above. As you can see, pathlib
combines many of the best features of the os
, os.path
, and glob
modules into one single module, which makes it a joy to use.
To recap, here is a table of the functions we have covered in this section:
Function | Description |
---|---|
startswith() | Tests if a string starts with a specified pattern and returns True or False |
endswith() | Tests if a string ends with a specified pattern and returns True or False |
fnmatch.fnmatch(filename, pattern) | Tests if the filename matches the pattern and returns True or False |
glob.glob() | Returns a list of filenames that match a pattern |
pathlib.Path.glob() | Finds patterns in path names and returns a generator object |
A common programming task is walking a directory tree and processing files in the tree. Let’s explore how the built-in Python function os.walk()
can be used to do this. os.walk()
is used to generate the filenames in a directory tree by walking the tree either top-down or bottom-up. For the purposes of this section, we’ll be manipulating the following directory tree:
The following is an example that shows you how to list all files and directories in a directory tree using os.walk()
.
os.walk()
defaults to traversing directories in a top-down manner:
import os

# Walking a directory tree and printing the names of the directories and files
for dirpath, dirnames, files in os.walk('.'):
    print(f'Found directory: {dirpath}')
    for file_name in files:
        print(file_name)
os.walk()
returns three values on each iteration of the loop:
The name of the current folder
A list of folders in the current folder
A list of files in the current folder
On each iteration, it prints out the names of the subdirectories and files it finds:
To traverse the directory tree in a bottom-up manner, pass in a topdown=False
keyword argument to os.walk()
:
for dirpath, dirnames, files in os.walk('.', topdown=False):
    print(f'Found directory: {dirpath}')
    for file_name in files:
        print(file_name)
Passing the topdown=False
argument will make os.walk()
print out the files it finds in the subdirectories first:
As you can see, the program started by listing the contents of the subdirectories before listing the contents of the root directory. This is very useful in situations where you want to recursively delete files and directories. You will learn how to do this in the sections below. By default, os.walk
does not walk down into symbolic links that resolve to directories. This behavior can be overridden by calling it with a followlinks=True
argument.
Python provides a handy module for creating temporary files and directories called tempfile
.
tempfile
can be used to open and store data temporarily in a file or directory while your program is running. tempfile
handles the deletion of the temporary files when your program is done with them.
Here’s how to create a temporary file:
from tempfile import TemporaryFile

# Create a temporary file and write some data to it
fp = TemporaryFile('w+t')
fp.write('Hello universe!')

# Go back to the beginning and read data from file
fp.seek(0)
data = fp.read()

# Close the file, after which it will be removed
fp.close()
The first step is to import TemporaryFile
from the tempfile
module. Next, create a file like object using the TemporaryFile()
method by calling it and passing the mode you want to open the file in. This will create and open a file that can be used as a temporary storage area.
In the example above, the mode is 'w+t'
, which makes tempfile
create a temporary text file in write mode. There is no need to give the temporary file a filename since it will be destroyed after the script is done running.
After writing to the file, you can read from it and close it when you’re done processing it. Once the file is closed, it will be deleted from the filesystem. If you need to name the temporary files produced using tempfile
, use tempfile.NamedTemporaryFile()
.
The temporary files and directories created using tempfile
are stored in a special system directory for storing temporary files. Python searches a standard list of directories to find one that the user can create files in.
On Windows, the directories are C:\TEMP
, C:\TMP
, \TEMP
, and \TMP
, in that order. On all other platforms, the directories are /tmp
, /var/tmp
, and /usr/tmp
, in that order. As a last resort, tempfile
will save temporary files and directories in the current directory.
.TemporaryFile()
is also a context manager so it can be used in conjunction with the with
statement. Using a context manager takes care of closing and deleting the file automatically after it has been read:
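A minimal sketch of the context-manager form:

from tempfile import TemporaryFile

with TemporaryFile('w+t') as fp:
    fp.write('Hello universe!')
    fp.seek(0)
    fp.read()
# The file is now closed and removed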
This creates a temporary file and reads data from it. As soon as the file’s contents are read, the temporary file is closed and deleted from the file system.
tempfile
can also be used to create temporary directories. Let’s look at how you can do this using tempfile.TemporaryDirectory()
:
>>> import tempfile
>>> with tempfile.TemporaryDirectory() as tmpdir:
... print('Created temporary directory ', tmpdir)
... os.path.exists(tmpdir)
...
Created temporary directory /tmp/tmpoxbkrm6c
True
>>> # Directory contents have been removed
...
>>> tmpdir
'/tmp/tmpoxbkrm6c'
>>> os.path.exists(tmpdir)
False
Calling tempfile.TemporaryDirectory()
creates a temporary directory in the file system and returns an object representing this directory. In the example above, the directory is created using a context manager, and the name of the directory is stored in tmpdir
. The third line prints out the name of the temporary directory, and os.path.exists(tmpdir)
confirms if the directory was actually created in the file system.
After the context manager goes out of context, the temporary directory is deleted and a call to os.path.exists(tmpdir)
returns False
, which means that the directory was successfully deleted.
You can delete single files, directories, and entire directory trees using the methods found in the os
, shutil
, and pathlib
modules. The following sections describe how to delete files and directories that you no longer need.
To delete a single file, use pathlib.Path.unlink(), os.remove(), or os.unlink().
os.remove()
and os.unlink()
are semantically identical. To delete a file using os.remove()
, do the following:
import os

data_file = 'C:\\Users\\vuyisile\\Desktop\\Test\\data.txt'
os.remove(data_file)
Deleting a file using os.unlink()
is similar to how you do it using os.remove()
:
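A minimal sketch:

import os

data_file = 'home/data.txt'
os.unlink(data_file)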
Calling .unlink()
or .remove()
on a file deletes the file from the filesystem. These two functions will throw an OSError
if the path passed to them points to a directory instead of a file. To avoid this, you can either check that what you’re trying to delete is actually a file and only delete it if it is, or you can use exception handling to handle the OSError
:
import os

data_file = 'home/data.txt'

# If the file exists, delete it
if os.path.isfile(data_file):
    os.remove(data_file)
else:
    print(f'Error: {data_file} not a valid filename')
os.path.isfile()
checks whether data_file
is actually a file. If it is, it is deleted by the call to os.remove()
. If data_file
points to a folder, an error message is printed to the console.
The following example shows how to use exception handling to handle errors when deleting files:
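A minimal sketch matching the description that follows:

import os

data_file = 'home/data.txt'

# Use exception handling
try:
    os.remove(data_file)
except OSError as e:
    print(f'Error: {data_file} : {e.strerror}')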
The code above attempts to delete the file first before checking its type. If data_file
isn’t actually a file, the OSError
that is thrown is handled in the except
clause, and an error message is printed to the console. The error message that gets printed out is formatted using Python f-strings.
Finally, you can also use pathlib.Path.unlink()
to delete files:
from pathlib import Path

data_file = Path('home/data.txt')

try:
    data_file.unlink()
except IsADirectoryError as e:
    print(f'Error: {data_file} : {e.strerror}')
This creates a Path
object called data_file
that points to a file. Calling .unlink()
on data_file
will delete home/data.txt
. If data_file
points to a directory, an IsADirectoryError
is raised. It is worth noting that the Python program above has the same permissions as the user running it. If the user does not have permission to delete the file, a PermissionError
is raised.
The standard library offers the following functions for deleting directories:
os.rmdir()
pathlib.Path.rmdir()
shutil.rmtree()
To delete a single directory or folder, use os.rmdir()
or pathlib.rmdir()
. These two functions only work if the directory you’re trying to delete is empty. If the directory isn’t empty, an OSError
is raised. Here is how to delete a folder:
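A minimal sketch matching the description that follows (the trash_dir path is illustrative):

import os

trash_dir = 'my_documents/bad_dir'

try:
    os.rmdir(trash_dir)
except OSError as e:
    print(f'Error: {trash_dir} : {e.strerror}')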
Here, the trash_dir
directory is deleted by passing its path to os.rmdir()
. If the directory isn’t empty, an error message is printed to the screen:
Traceback (most recent call last):
  File '<stdin>', line 1, in <module>
OSError: [Errno 39] Directory not empty: 'my_documents/bad_dir'
Alternatively, you can use pathlib
to delete directories:
from pathlib import Path

trash_dir = Path('my_documents/bad_dir')

try:
    trash_dir.rmdir()
except OSError as e:
    print(f'Error: {trash_dir} : {e.strerror}')
Here, you create a Path
object that points to the directory to be deleted. Calling .rmdir()
on the Path
object will delete it if it is empty.
To delete non-empty directories and entire directory trees, Python offers shutil.rmtree()
:
为了删除非空目录和整个目录树,Python提供了shutil.rmtree()
:
Everything in trash_dir
is deleted when shutil.rmtree()
is called on it. There may be cases where you want to delete empty folders recursively. You can do this using one of the methods discussed above in conjunction with os.walk()
:
import os

for dirpath, dirnames, files in os.walk('.', topdown=False):
    try:
        os.rmdir(dirpath)
    except OSError as ex:
        pass
This walks down the directory tree and tries to delete each directory it finds. If the directory isn’t empty, an OSError
is raised and that directory is skipped. The table below lists the functions covered in this section:
Function | Description |
---|---|
os.remove() | Deletes a file and does not delete directories |
os.unlink() | Is identical to os.remove() and deletes a single file |
pathlib.Path.unlink() | Deletes a file and cannot delete directories |
os.rmdir() | Deletes an empty directory |
pathlib.Path.rmdir() | Deletes an empty directory |
shutil.rmtree() | Deletes entire directory tree and can be used to delete non-empty directories |
Python ships with the shutil
module. shutil
is short for shell utilities. It provides a number of high-level operations on files to support copying, archiving, and removal of files and directories. In this section, you’ll learn how to move and copy files and directories.
shutil
offers a couple of functions for copying files. The most commonly used functions are shutil.copy()
and shutil.copy2()
. To copy a file from one location to another using shutil.copy()
, do the following:
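A minimal sketch (src and dst are placeholder paths):

import shutil

src = 'path/to/file.txt'
dst = 'path/to/dest_dir'
shutil.copy(src, dst)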
shutil.copy()
is comparable to the cp
command in UNIX based systems. shutil.copy(src, dst)
will copy the file src
to the location specified in dst
. If dst
is a file, the contents of that file are replaced with the contents of src
. If dst
is a directory, then src
will be copied into that directory. shutil.copy()
only copies the file’s contents and the file’s permissions. Other metadata like the file’s creation and modification times are not preserved.
To preserve all file metadata when copying, use shutil.copy2()
:
import shutil

src = 'path/to/file.txt'
dst = 'path/to/dest_dir'
shutil.copy2(src, dst)
Using .copy2()
preserves details about the file such as last access time, permission bits, last modification time, and flags.
While shutil.copy()
only copies a single file, shutil.copytree()
will copy an entire directory and everything contained in it. shutil.copytree(src, dest)
takes two arguments: a source directory and the destination directory where files and folders will be copied to.
Here’s an example of how to copy the contents of one folder to a different location:
>>> import shutil
>>> shutil.copytree('data_1', 'data1_backup')
'data1_backup'
In this example, .copytree()
copies the contents of data_1
to a new location data1_backup
and returns the destination directory. The destination directory must not already exist. It will be created as well as missing parent directories. shutil.copytree()
is a good way to back up your files.
To move a file or directory to another location, use shutil.move(src, dst)
.
src
is the file or directory to be moved and dst
is the destination:
>>> import shutil
>>> shutil.move('dir_1/', 'backup/')
'backup'
shutil.move('dir_1/', 'backup/')
moves dir_1/
into backup/
if backup/
exists. If backup/
does not exist, dir_1/
will be renamed to backup
.
Python includes os.rename(src, dst)
for renaming files and directories:
>>> os.rename('first.zip', 'first_01.zip')
The line above will rename first.zip
to first_01.zip
. If the destination path points to a directory, it will raise an OSError
.
Another way to rename files or directories is to use rename()
from the pathlib
module:
>>> from pathlib import Path
>>> data_file = Path('data_01.txt')
>>> data_file.rename('data.txt')
To rename files using pathlib
, you first create a pathlib.Path()
object that contains a path to the file you want to replace. The next step is to call rename()
on the path object and pass a new filename for the file or directory you’re renaming.
Archives are a convenient way to package several files into one. The two most common archive types are ZIP and TAR. The Python programs you write can create, read, and extract data from archives. You will learn how to read and write to both archive formats in this section.
The zipfile
module is a low level module that is part of the Python Standard Library. zipfile
has functions that make it easy to open and extract ZIP files. To read the contents of a ZIP file, the first thing to do is to create a ZipFile
object. ZipFile
objects are similar to file objects created using open()
. ZipFile
is also a context manager and therefore supports the with
statement:
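A minimal sketch of opening an archive in read mode (the archive name matches the example discussed next):

import zipfile

with zipfile.ZipFile('data.zip', 'r') as zipobj:
    # work with the archive here; it is closed automatically on exit
    pass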
Here, you create a ZipFile
object, passing in the name of the ZIP file to open in read mode. After opening a ZIP file, information about the archive can be accessed through functions provided by the zipfile
module. The data.zip
archive in the example above was created from a directory named data
that contains a total of 5 files and 1 subdirectory:
.
├── file1.py
├── file2.py
├── file3.py
└── sub_dir
    ├── bar.py
    └── foo.py

1 directory, 5 files
To get a list of files in the archive, call namelist()
on the ZipFile
object:
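A minimal sketch (the zipobj name is illustrative):

import zipfile

with zipfile.ZipFile('data.zip', 'r') as zipobj:
    print(zipobj.namelist())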
This produces a list:
['file1.py', 'file2.py', 'file3.py', 'sub_dir/', 'sub_dir/bar.py', 'sub_dir/foo.py']
.namelist()
returns a list of names of the files and directories in the archive. To retrieve information about the files in the archive, use .getinfo()
:
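A minimal sketch that retrieves the size of sub_dir/bar.py from the archive (the bar_info name is reused in the examples below):

import zipfile

with zipfile.ZipFile('data.zip', 'r') as zipobj:
    bar_info = zipobj.getinfo('sub_dir/bar.py')
    print(bar_info.file_size)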
Here’s the output:
15277
.getinfo()
returns a ZipInfo
object that stores information about a single member of the archive. To get information about a file in the archive, you pass its path as an argument to .getinfo()
. Using getinfo()
, you’re able to retrieve information about archive members such as the date the files were last modified, their compressed sizes, and their full filenames. Accessing .file_size
retrieves the file’s original size in bytes.
The following example shows how to retrieve more details about archived files in a Python REPL. Assume that the zipfile
module has been imported and bar_info
is the same object you created in previous examples:
>>> bar_info.date_time
(2018, 10, 7, 23, 30, 10)
>>> bar_info.compress_size
2856
>>> bar_info.filename
'sub_dir/bar.py'
bar_info
contains details about bar.py
such as its size when compressed and its full path.
The first line shows how to retrieve a file’s last modified date. The next line shows how to get the size of the file after compression. The last line shows the full path of bar.py
in the archive.
ZipFile
supports the context manager protocol, which is why you’re able to use it with the with
statement. Doing this automatically closes the ZipFile
object after you’re done with it. Trying to open or extract files from a closed ZipFile
object will result in an error.
The zipfile
module allows you to extract one or more files from ZIP archives through .extract()
and .extractall()
.
These methods extract files to the current directory by default. They both take an optional path
parameter that allows you to specify a different directory to extract files to. If the directory does not exist, it is automatically created. To extract files from the archive, do the following:
>>> import zipfile
>>> import os
>>> os.listdir('.')
['data.zip']
>>> data_zip = zipfile.ZipFile('data.zip', 'r')
>>> # Extract a single file to current directory
...
>>> data_zip.extract('file1.py')
'/home/terra/test/dir1/zip_extract/file1.py'
>>> os.listdir('.')
['file1.py', 'data.zip']
>>> # Extract all files into a different directory
...
>>> data_zip.extractall(path='extract_dir/')
>>> os.listdir('.')
['file1.py', 'extract_dir', 'data.zip']
>>> os.listdir('extract_dir')
['file1.py', 'file3.py', 'file2.py', 'sub_dir']
>>> data_zip.close()
The third line of code is a call to os.listdir()
, which shows that the current directory has only one file, data.zip
.
Next, you open data.zip
in read mode and call .extract()
to extract file1.py
from it. .extract()
returns the full file path of the extracted file. Since there’s no path specified, .extract()
extracts file1.py
to the current directory.
The next line prints a directory listing showing that the current directory now includes the extracted file in addition to the original archive. The line after that shows how to extract the entire archive into the extract_dir
directory. .extractall()
creates the extract_dir
and extracts the contents of data.zip
into it. The last line closes the ZIP archive.
zipfile
supports extracting password protected ZIPs. To extract password protected ZIP files, pass in the password to the .extract()
or .extractall()
method as an argument:
>>> import zipfile
>>> with zipfile.ZipFile('secret.zip', 'r') as pwd_zip:
...     # Extract from a password protected archive
...     pwd_zip.extractall(path='extract_dir', pwd='Quish3@o')
This opens the secret.zip
archive in read mode. A password is supplied to .extractall()
, and the archive contents are extracted to extract_dir
. The archive is closed automatically after the extraction is complete thanks to the with
statement.
To create a new ZIP archive, you open a ZipFile
object in write mode (w
) and add the files you want to archive:
>>> import zipfile
>>> file_list = ['file1.py', 'sub_dir/', 'sub_dir/bar.py', 'sub_dir/foo.py']
>>> with zipfile.ZipFile('new.zip', 'w') as new_zip:
...     for name in file_list:
...         new_zip.write(name)
In the example, new_zip
is opened in write mode and each file in file_list
is added to the archive. When the with
statement suite is finished, new_zip
is closed. Opening a ZIP file in write mode erases the contents of the archive and creates a new archive.
To add files to an existing archive, open a ZipFile object in append mode and then add the files:
>>> # Open a ZipFile object in append mode
...
>>> with zipfile.ZipFile('new.zip', 'a') as new_zip:
...     new_zip.write('data.txt')
...     new_zip.write('latin.txt')
Here, you open the new.zip archive you created in the previous example in append mode. Opening the ZipFile object in append mode allows you to add new files to the ZIP file without deleting its current contents. After adding files to the ZIP file, the with statement goes out of context and closes the ZIP file.
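To confirm what ended up in the archive, a quick check with .namelist() works. A minimal sketch; the exact order of the returned names may differ on your system:

>>> import zipfile
>>> with zipfile.ZipFile('new.zip', 'r') as new_zip:
...     # Print the names of all members, including the appended files
...     print(new_zip.namelist())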
TAR files are file archives that, unlike ZIP, are not compressed by default. They can be compressed using gzip, bzip2, and lzma compression methods. The TarFile class allows reading and writing of TAR archives.
Do this to read from an archive:
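A minimal sketch, assuming an example.tar archive exists in the current directory:

>>> import tarfile
>>> with tarfile.open('example.tar', 'r') as tar_file:
...     # List the names of the archive members
...     print(tar_file.getnames())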
tarfile objects open like most file-like objects. The module has an open() function that takes a mode that determines how the file is to be opened.
Use the 'r', 'w', or 'a' modes to open an uncompressed TAR file for reading, writing, and appending, respectively. To open compressed TAR files, pass in a mode argument to tarfile.open() that is in the form filemode[:compression]. The table below lists the possible modes TAR files can be opened in:
Mode | Action |
---|---|
r | Opens archive for reading with transparent compression |
r:gz | Opens archive for reading with gzip compression |
r:bz2 | Opens archive for reading with bzip2 compression |
r:xz | Opens archive for reading with lzma compression |
w | Opens archive for uncompressed writing |
w:gz | Opens archive for gzip compressed writing |
w:xz | Opens archive for lzma compressed writing |
a | Opens archive for appending with no compression |
.open() defaults to 'r' mode. To read an uncompressed TAR file and retrieve the names of the files in it, use .getnames():
>>> import tarfile
>>> tar = tarfile.open('example.tar', mode='r')
>>> tar.getnames()
['CONTRIBUTING.rst', 'README.md', 'app.py']
This returns a list with the names of the archive contents.
Note: For the purposes of showing you how to use different tarfile object methods, the TAR file in the examples is opened and closed manually in an interactive REPL session.
Interacting with the TAR file this way allows you to see the output of running each command. Normally, you would want to use a context manager to open file-like objects.
The metadata of each entry in the archive can be accessed using special attributes:
>>> import time
>>> for entry in tar.getmembers():
...     print(entry.name)
...     print(' Modified:', time.ctime(entry.mtime))
...     print(' Size    :', entry.size, 'bytes')
...     print()
CONTRIBUTING.rst
Modified: Sat Nov 1 09:09:51 2018
Size : 402 bytes
README.md
Modified: Sat Nov 3 07:29:40 2018
Size : 5426 bytes
app.py
Modified: Sat Nov 3 07:29:13 2018
Size : 6218 bytes
In this example, you loop through the list of files returned by .getmembers() and print out each file's attributes. The objects returned by .getmembers() have attributes that can be accessed programmatically, such as the name, size, and last modified time of each of the files in the archive. After reading or writing to the archive, it must be closed to free up system resources.
In this section, you’ll learn how to extract files from TAR archives using the following methods:
.extract()
.extractfile()
.extractall()
To extract a single file from a TAR archive, use .extract(), passing in the filename:
>>> tar.extract('README.md')
>>> os.listdir('.')
['README.md', 'example.tar']
The README.md file is extracted from the archive to the file system. Calling os.listdir() confirms that the README.md file was successfully extracted into the current directory. To unpack or extract everything from the archive, use .extractall():
>>> tar.extractall(path="extracted/")
.extractall() has an optional path argument to specify where extracted files should go. Here, the archive is unpacked into the extracted directory. The following commands show that the archive was successfully extracted:
$ ls
example.tar  extracted  README.md

$ tree
.
├── example.tar
├── extracted
│   ├── app.py
│   ├── CONTRIBUTING.rst
│   └── README.md
└── README.md

1 directory, 5 files

$ ls extracted/
app.py  CONTRIBUTING.rst  README.md
To extract a file object for reading or writing, use .extractfile(), which takes a filename or TarInfo object to extract as an argument. .extractfile() returns a file-like object that can be read and used:
>>> f = tar.extractfile('app.py')
>>> f.read()
>>> tar.close()
Opened archives should always be closed after they have been read or written to. To close an archive, call .close() on the archive file handle, or use the with statement when creating tarfile objects to automatically close the archive when you're done. This frees up system resources and writes any changes you made to the archive to the filesystem.
Here's how to create a new TAR archive:
>>> import tarfile
>>> file_list = ['app.py', 'config.py', 'CONTRIBUTORS.md', 'tests.py']
>>> with tarfile.open('packages.tar', mode='w') as tar:
...     for file in file_list:
...         tar.add(file)

>>> # Read the contents of the newly created archive
>>> with tarfile.open('packages.tar', mode='r') as t:
...     for member in t.getmembers():
...         print(member.name)
app.py
config.py
CONTRIBUTORS.md
tests.py
First, you make a list of files to be added to the archive so that you don’t have to add each file manually.
The next line uses the with context manager to open a new archive called packages.tar in write mode. Opening an archive in write mode ('w') enables you to write new files to the archive. Any existing files in the archive are deleted and a new archive is created.
After the archive is created and populated, the with context manager automatically closes it and saves it to the filesystem. The last three lines open the archive you just created and print out the names of the files contained in it.
To add new files to an existing archive, open the archive in append mode ('a'):
>>> with tarfile.open('packages.tar', mode='a') as tar:
...     tar.add('foo.bar')

>>> with tarfile.open('packages.tar', mode='r') as tar:
...     for member in tar.getmembers():
...         print(member.name)
app.py
config.py
CONTRIBUTORS.md
tests.py
foo.bar
Opening an archive in append mode allows you to add new files to it without deleting the ones already in it.
tarfile can also read and write TAR archives compressed using gzip, bzip2, and lzma compression. To read or write to a compressed archive, use tarfile.open(), passing in the appropriate mode for the compression type.
For example, to read or write data to a TAR archive compressed using gzip, use the 'r:gz' or 'w:gz' modes respectively:
>>> files = ['app.py', 'config.py', 'tests.py']
>>> with tarfile.open('packages.tar.gz', mode='w:gz') as tar:
...     for file in files:
...         tar.add(file)

>>> with tarfile.open('packages.tar.gz', mode='r:gz') as t:
...     for member in t.getmembers():
...         print(member.name)
app.py
config.py
tests.py
The 'w:gz' mode opens the archive for gzip compressed writing and 'r:gz' opens the archive for gzip compressed reading. Opening compressed archives in append mode is not possible. To add files to a compressed archive, you have to create a new archive.
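Since append mode isn't available for compressed archives, the practical workaround is to rebuild the archive with the extra file included. A sketch, assuming the three files from the example above plus a hypothetical new_module.py all exist in the current directory:

>>> # Recreate the compressed archive with the extra file included
>>> with tarfile.open('packages.tar.gz', mode='w:gz') as tar:
...     for name in ['app.py', 'config.py', 'tests.py', 'new_module.py']:
...         tar.add(name)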
The Python Standard Library also supports creating TAR and ZIP archives using the high-level methods in the shutil module. The archiving utilities in shutil allow you to create, read, and extract ZIP and TAR archives. These utilities rely on the lower level tarfile and zipfile modules.
Working With Archives Using shutil.make_archive()
shutil.make_archive() takes at least two arguments: the name of the archive and an archive format.
By default, it compresses all the files in the current directory into the archive format specified in the format argument. You can pass in an optional root_dir argument to compress files in a different directory. .make_archive() supports the zip, tar, bztar, and gztar archive formats.
This is how to create a TAR archive using shutil:
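A minimal sketch, assuming a data/ directory exists where the code is run; the base name backup is just an example:

import shutil

# Archive everything under data/ into backup.tar in the current directory
shutil.make_archive('backup', 'tar', 'data/')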
This copies everything in data/, creates an archive called backup.tar in the filesystem, and returns its name. To extract the archive, call .unpack_archive():
shutil.unpack_archive('backup.tar', 'extract_dir/')
Calling .unpack_archive() and passing in an archive name and destination directory extracts the contents of backup.tar into extract_dir/. ZIP archives can be created and extracted in the same way.
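For example, a sketch using the zip format with the same assumed data/ directory:

import shutil

# Create a ZIP archive of data/ and then unpack it again
shutil.make_archive('backup', 'zip', 'data/')
shutil.unpack_archive('backup.zip', 'extract_dir/')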
Python supports reading data from multiple input streams or from a list of files through the fileinput module. This module allows you to loop over the contents of one or more text files quickly and easily. Here's the typical way fileinput is used:
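A minimal sketch of the usual pattern; the print() call stands in for whatever per-line processing you need:

import fileinput

for line in fileinput.input():
    print(line, end='')  # Replace with your own processing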
fileinput gets its input from command line arguments passed to sys.argv by default.
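If you'd rather not rely on command line arguments, you can also pass file names to fileinput.input() directly. A sketch with hypothetical file names:

import fileinput

# Loop over the given files in order instead of reading sys.argv
for line in fileinput.input(files=('bacon.txt', 'cupcake.txt')):
    print(line, end='')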
Using fileinput to Loop Over Multiple Files
Let's use fileinput to build a crude version of the common UNIX utility cat. The cat utility reads files sequentially, writing them to standard output. When given more than one file in its command line arguments, cat will concatenate the text files and display the result in the terminal:
# File: fileinput-example.py

import fileinput
import sys

files = fileinput.input()
for line in files:
    if fileinput.isfirstline():
        print(f'\n--- Reading {fileinput.filename()} ---')
    print(' -> ' + line, end='')
print()
Running this on two text files in my current directory produces the following output:
fileinput allows you to retrieve more information about each line, such as whether or not it is the first line (.isfirstline()), the line number (.lineno()), and the filename (.filename()). You can read more about it in the fileinput module documentation.
You now know how to use Python to perform the most common operations on files and groups of files. You’ve learned about the different built-in modules used to read, find, and manipulate them.
You’re now equipped to use Python to:
Read multiple files at once using the fileinput module

Translated from: https://www.pybloggers.com/2019/01/working-with-files-in-python/