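glob is a module in Python's standard library, so it is available as soon as Python is installed. The glob module is mainly used to find directories and files, and it matches names in a path using three wildcards: *, ?, and [].
*: matches zero or more characters
?: matches exactly one character
[]: matches one character in the specified range, e.g. [0-9] matches a digit
The official documentation describes the module as "Unix style pathname pattern expansion".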
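To see how the three wildcards behave without touching the filesystem, the sketch below applies them to a made-up list of names using fnmatch.fnmatchcase from the standard library, which uses the same shell-style pattern syntax. The file names and the match_names helper are illustrative assumptions, not part of the glob module.

```python
import fnmatch

# Hypothetical file names used only for illustration.
names = ["1.gif", "2.txt", "card.gif", "a10.txt"]


def match_names(pattern, candidates=names):
    """Return the candidates that match a shell-style pattern."""
    return [n for n in candidates if fnmatch.fnmatchcase(n, pattern)]


star = match_names("*.gif")       # '*' matches zero or more characters
question = match_names("?.gif")   # '?' matches exactly one character
digits = match_names("[0-9].*")   # '[0-9]' matches a single digit

print(star)      # ['1.gif', 'card.gif']
print(question)  # ['1.gif']
print(digits)    # ['1.gif', '2.txt']
```

Unlike plain fnmatch, glob also treats names that start with a dot as a special case: they are not matched by * or ?.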
>>> import glob
>>> dir(glob)
['__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__',
'__name__', '__package__', '__spec__', '_glob0', '_glob1', '_glob2', '_iglob',
'_ishidden', '_isrecursive', '_iterdir', '_rlistdir', 'escape', 'fnmatch',
'glob', 'glob0', 'glob1', 'has_magic', 'iglob', 'magic_check',
'magic_check_bytes', 'os', 're']
>>>
The two most commonly used functions in the glob module are glob.glob() and glob.iglob(), which are described in detail below.
def glob(pathname, *, recursive=False):
    """Return a list of paths matching a pathname pattern.

    The pattern may contain simple shell-style wildcards a la
    fnmatch. However, unlike fnmatch, filenames starting with a
    dot are special cases that are not matched by '*' and '?'
    patterns.

    If recursive is true, the pattern '**' will match any files and
    zero or more directories and subdirectories.
    """
    return list(iglob(pathname, recursive=recursive))


def iglob(pathname, *, recursive=False):
    """Return an iterator which yields the paths matching a pathname pattern.

    The pattern may contain simple shell-style wildcards a la
    fnmatch. However, unlike fnmatch, filenames starting with a
    dot are special cases that are not matched by '*' and '?'
    patterns.

    If recursive is true, the pattern '**' will match any files and
    zero or more directories and subdirectories.
    """
    it = _iglob(pathname, recursive, False)
    if recursive and _isrecursive(pathname):
        s = next(it)  # skip empty string
        assert not s
    return it
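def glob(pathname, *, recursive=False):
Parameters:
pathname: the path pattern to match
recursive: if True, the pattern '**' matches any files and zero or more directories and subdirectories; defaults to False
Returns: a list of the matched paths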
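The docstring points out that names starting with a dot are not matched by '*' and '?'. The following sketch illustrates this in a temporary directory; the two file names are made up for the example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one regular file and one hidden (dot) file.
    for name in ("visible.txt", ".hidden.txt"):
        open(os.path.join(tmp, name), "w").close()

    # '*' skips names that begin with a dot.
    star = [os.path.basename(p) for p in glob.glob(os.path.join(tmp, "*"))]

    # A pattern that itself starts with a dot does match hidden files.
    dot_star = [os.path.basename(p) for p in glob.glob(os.path.join(tmp, ".*"))]

print(star)      # ['visible.txt']
print(dot_star)  # ['.hidden.txt']
```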
1. First, the directory structure used for testing:
test_dir/
├── a1.txt
├── a2.txt
├── a3.py
├── sub_dir1
│   ├── b1.txt
│   ├── b2.py
│   └── b3.py
└── sub_dir2
    ├── c1.txt
    ├── c2.py
    └── c3.txt
2. Match all directories and files directly under ./test_dir/ with the pattern './test_dir/*', and return them as a list of paths
>>> path_list2 = glob.glob('./test_dir/*')
>>> path_list2
['./test_dir/a3.py', './test_dir/a2.txt', './test_dir/sub_dir1', './test_dir/sub_dir2', './test_dir/a1.txt']
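As the output above suggests, the results come back in no particular order, and '*' returns directories and files alike. A minimal sketch of sorting the results and separating the two kinds follows; make_test_dir is a hypothetical helper that recreates the test layout in a temporary directory.

```python
import glob
import os
import tempfile


def make_test_dir(root):
    """Hypothetical helper: recreate the article's test_dir under `root`."""
    layout = {
        "": ["a1.txt", "a2.txt", "a3.py"],
        "sub_dir1": ["b1.txt", "b2.py", "b3.py"],
        "sub_dir2": ["c1.txt", "c2.py", "c3.txt"],
    }
    base = os.path.join(root, "test_dir")
    for sub, files in layout.items():
        folder = os.path.join(base, sub)
        os.makedirs(folder, exist_ok=True)
        for name in files:
            open(os.path.join(folder, name), "w").close()
    return base


with tempfile.TemporaryDirectory() as tmp:
    base = make_test_dir(tmp)
    # Sort explicitly when a stable order is needed.
    entries = sorted(glob.glob(os.path.join(base, "*")))
    names = [os.path.basename(p) for p in entries]
    dirs = [os.path.basename(p) for p in entries if os.path.isdir(p)]
    files = [os.path.basename(p) for p in entries if os.path.isfile(p)]

print(names)  # ['a1.txt', 'a2.txt', 'a3.py', 'sub_dir1', 'sub_dir2']
print(dirs)   # ['sub_dir1', 'sub_dir2']
print(files)  # ['a1.txt', 'a2.txt', 'a3.py']
```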
3. Match all .py files under ./test_dir/ (non-recursively)
>>> path_list3 = glob.glob('./test_dir/*.py')
>>> path_list3
['./test_dir/a3.py']
>>> path_list4 = glob.glob('./test_dir/*/*.py')
>>> path_list4
['./test_dir/sub_dir1/b2.py', './test_dir/sub_dir1/b3.py', './test_dir/sub_dir2/c2.py']
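The two calls above differ because a single '*' never crosses a path separator: each '*' component matches names at exactly one directory level. The sketch below makes this explicit with a small, made-up layout (top.py, pkg/mid.py, pkg/inner/deep.py) that is not part of the article's test directory.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout with one .py file at each of three depths.
    os.makedirs(os.path.join(tmp, "pkg", "inner"))
    for rel in ("top.py",
                os.path.join("pkg", "mid.py"),
                os.path.join("pkg", "inner", "deep.py")):
        open(os.path.join(tmp, rel), "w").close()

    def rel_matches(*parts):
        """Glob under tmp and return the matches relative to tmp."""
        found = glob.glob(os.path.join(tmp, *parts))
        return sorted(os.path.relpath(p, tmp) for p in found)

    depth0 = rel_matches("*.py")            # files directly in tmp
    depth1 = rel_matches("*", "*.py")       # exactly one level down
    depth2 = rel_matches("*", "*", "*.py")  # exactly two levels down

print(depth0)
print(depth1)
print(depth2)
```

Each pattern returns only the file at its own depth, which is why covering every level at once requires the recursive '**' pattern.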
4. Recursively match all directories and files under ./test_dir/ with the pattern './test_dir/**', and return them as a list of paths
>>> path_list5 = glob.glob('./test_dir/**', recursive=True)
>>> path_list5
['./test_dir/', './test_dir/a3.py', './test_dir/a2.txt', './test_dir/sub_dir1', './test_dir/sub_dir1/b2.py', './test_dir/sub_dir1/b3.py', './test_dir/sub_dir1/b1.txt', './test_dir/sub_dir2', './test_dir/sub_dir2/c3.txt', './test_dir/sub_dir2/c1.txt', './test_dir/sub_dir2/c2.py', './test_dir/a1.txt']
>>> path_list6 = glob.glob('./test_dir/**/*.py', recursive=True)
>>> path_list6
['./test_dir/a3.py', './test_dir/sub_dir1/b2.py', './test_dir/sub_dir1/b3.py', './test_dir/sub_dir2/c2.py']
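The recursive '**/*.py' pattern can be reproduced as follows. This is a sketch under the assumption that the test layout is rebuilt in a temporary directory by the hypothetical make_test_dir helper; the results are made relative and sorted so that they are easy to compare.

```python
import glob
import os
import tempfile


def make_test_dir(root):
    """Hypothetical helper: recreate the article's test_dir under `root`."""
    layout = {
        "": ["a1.txt", "a2.txt", "a3.py"],
        "sub_dir1": ["b1.txt", "b2.py", "b3.py"],
        "sub_dir2": ["c1.txt", "c2.py", "c3.txt"],
    }
    base = os.path.join(root, "test_dir")
    for sub, files in layout.items():
        folder = os.path.join(base, sub)
        os.makedirs(folder, exist_ok=True)
        for name in files:
            open(os.path.join(folder, name), "w").close()
    return base


with tempfile.TemporaryDirectory() as tmp:
    base = make_test_dir(tmp)
    # '**' matches zero or more directory levels, so a3.py at the top
    # level is found together with the files in the subdirectories.
    pattern = os.path.join(base, "**", "*.py")
    py_files = sorted(
        os.path.relpath(p, base) for p in glob.glob(pattern, recursive=True)
    )

for path in py_files:
    print(path)
```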
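Note: to recurse under a path, the pattern must contain '**' (two asterisks) after it; passing recursive=True on its own does not recurse: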
>>> path_list = glob.glob('./test_dir/', recursive=True)
>>> path_list
['./test_dir/']
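To make the note concrete, the sketch below compares three calls on the same layout: no wildcard with recursive=True, '**' without recursive=True, and '**' with recursive=True. The make_test_dir helper is a hypothetical stand-in for creating the test directory.

```python
import glob
import os
import tempfile


def make_test_dir(root):
    """Hypothetical helper: recreate the article's test_dir under `root`."""
    layout = {
        "": ["a1.txt", "a2.txt", "a3.py"],
        "sub_dir1": ["b1.txt", "b2.py", "b3.py"],
        "sub_dir2": ["c1.txt", "c2.py", "c3.txt"],
    }
    base = os.path.join(root, "test_dir")
    for sub, files in layout.items():
        folder = os.path.join(base, sub)
        os.makedirs(folder, exist_ok=True)
        for name in files:
            open(os.path.join(folder, name), "w").close()
    return base


with tempfile.TemporaryDirectory() as tmp:
    base = make_test_dir(tmp)
    double_star = os.path.join(base, "**")

    # No wildcard: only the directory itself is returned.
    no_wildcard = glob.glob(base + os.sep, recursive=True)
    # '**' without recursive=True behaves like a single '*'.
    not_recursive = glob.glob(double_star)
    # '**' with recursive=True walks the whole tree.
    recursive = glob.glob(double_star, recursive=True)

print(len(no_wildcard), len(not_recursive), len(recursive))  # 1 5 12
```

Both conditions are needed together: the '**' in the pattern and recursive=True in the call.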
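glob.iglob() takes the same parameters as glob.glob():
def iglob(pathname, *, recursive=False):
Parameters:
pathname: the path pattern to match
recursive: if True, the pattern '**' matches any files and zero or more directories and subdirectories; defaults to False
Returns: an iterator; iterating over it yields the same results as calling glob() with the same arguments
First, the directory structure used for testing:
test_dir/
├── a1.txt
├── a2.txt
├── a3.py
├── sub_dir1
│   ├── b1.txt
│   ├── b2.py
│   └── b3.py
└── sub_dir2
    ├── c1.txt
    ├── c2.py
    └── c3.txt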
A normal glob.glob() call returns a list of paths:
>>> path_list4 = glob.glob('./test_dir/*/*.py')
>>> path_list4
['./test_dir/sub_dir1/b2.py', './test_dir/sub_dir1/b3.py', './test_dir/sub_dir2/c2.py']
Now, using glob.iglob():
>>> file_path_iter = glob.iglob('./test_dir/*')
>>> print(type(file_path_iter))
<class 'generator'>
>>> for file_path in file_path_iter:
... print(file_path)
...
./test_dir/a3.py
./test_dir/a2.txt
./test_dir/sub_dir1
./test_dir/sub_dir2
./test_dir/a1.txt
>>>
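Because iglob() returns an iterator rather than a list, matches can be consumed a few at a time, and each match is produced only once. The sketch below shows this with itertools.islice; the file names file0.txt to file4.txt are made up for the example.

```python
import glob
import itertools
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create five hypothetical files to match against.
    for i in range(5):
        open(os.path.join(tmp, f"file{i}.txt"), "w").close()
    pattern = os.path.join(tmp, "*.txt")

    path_iter = glob.iglob(pattern)
    is_list = isinstance(path_iter, list)          # not a list
    is_iterator = iter(path_iter) is path_iter     # a one-shot iterator

    # Take only the first two matches, then drain the rest.
    first_two = list(itertools.islice(path_iter, 2))
    remaining = list(path_iter)

    # glob() with the same pattern returns the same paths as a list.
    as_list = glob.glob(pattern)

print(len(first_two), len(remaining), len(as_list))  # 2 3 5
```

This makes iglob() a reasonable choice when a directory may contain many entries and only some of them are needed.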
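Examples of *, ?, and [] (the outputs assume a current directory that contains 1.gif, 2.txt, card.gif, and a subdirectory sub holding 3.txt):
>>> import glob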
>>> glob.glob('./[0-9].*')
['./1.gif', './2.txt']
>>> glob.glob('*.gif')
['1.gif', 'card.gif']
>>> glob.glob('?.gif')
['1.gif']
>>> glob.glob('**/*.txt', recursive=True)
['2.txt', 'sub/3.txt']
>>> glob.glob('./**/', recursive=True)
['./', './sub/']
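The last call above returns only directories because its pattern ends with a path separator. A minimal sketch that reproduces this in a temporary directory, assuming the same layout as these examples (1.gif, 2.txt, card.gif, and sub/3.txt):

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for rel in ("1.gif", "2.txt", "card.gif", os.path.join("sub", "3.txt")):
        open(os.path.join(tmp, rel), "w").close()

    # os.path.join(tmp, "**", "") builds a pattern ending in a separator.
    dirs_only = glob.glob(os.path.join(tmp, "**", ""), recursive=True)

    all_dirs = all(os.path.isdir(p) for p in dirs_only)
    trailing_sep = all(p.endswith(os.sep) for p in dirs_only)
    rel_dirs = sorted(os.path.relpath(p, tmp) for p in dirs_only)

print(rel_dirs)  # ['.', 'sub']
```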