Original source: https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files
This is a translation I made along the way while studying; I am sharing it here for everyone.
7.2. Reading and Writing Files
open() returns a file object. It is most commonly called with two arguments, filename and mode: open(filename, mode).
f = open('workfile', 'w')
The first argument, filename, is a str containing the path to the file.
The second argument, mode, describes how the file will be used. The mode argument is optional; if it is omitted, it defaults to mode='r', that is, read-only.
mode='r' means the file will only be read.
mode='w' is for writing only (an existing file with the same name will be erased).
mode='a' opens the file for appending; any data written to the file is automatically added to the end.
mode='r+' opens the file for both reading and writing.
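The four modes listed above can be compared side by side. The following is a minimal sketch, assuming a scratch file in a temporary directory (the path and the sample strings are made up for illustration and are not part of the tutorial):

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not part of the tutorial).
path = os.path.join(tempfile.mkdtemp(), "workfile")

# 'w' creates the file, or erases it if it already exists.
with open(path, "w", encoding="utf-8") as f:
    f.write("first\n")

# Opening with 'w' again discards the previous contents.
with open(path, "w", encoding="utf-8") as f:
    f.write("second\n")

# 'a' keeps the existing data and adds new data at the end.
with open(path, "a", encoding="utf-8") as f:
    f.write("third\n")

# 'r+' reads and writes without erasing; writing starts at position 0,
# so this overwrites the first character in place.
with open(path, "r+", encoding="utf-8") as f:
    f.write("S")

# 'r' is the default mode, so it can be omitted.
with open(path, encoding="utf-8") as f:
    contents = f.read()

print(repr(contents))  # 'Second\nthird\n'
```

Note that "first" never appears in the result: the second 'w' open erased it before anything new was written.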
Normally, files are opened in text mode: you read and write strings from and to the file, and those strings are encoded in a specific encoding. If encoding is not specified, the default is platform dependent (see open()).
'b' appended to the mode opens the file in binary mode: the data is then read and written in the form of bytes objects. **This mode should be used for all files that don't contain text.**
In text mode, the default when reading is to convert platform-specific line endings (\n on Unix, \r\n on Windows) to just \n.
When writing in text mode, the default is to convert occurrences of \n back to platform-specific line endings.
This behind-the-scenes modification to file data is fine for text files, but will corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files.
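To make the difference concrete, here is a small sketch that writes some bytes containing \r\n in binary mode and then reads them back in both modes. The file name, the payload, and the explicit encoding="utf-8" are assumptions chosen for the example:

```python
import os
import tempfile

# Hypothetical scratch file; the name is illustrative only.
path = os.path.join(tempfile.mkdtemp(), "data.bin")

payload = b"line one\r\nline two\r\n"  # bytes that contain Windows line endings

# Binary mode writes the bytes exactly as given.
with open(path, "wb") as f:
    f.write(payload)

# Binary mode also reads them back unchanged.
with open(path, "rb") as f:
    raw = f.read()

# Text mode decodes the bytes and, by default, turns \r\n into \n while reading.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

print(raw == payload)  # True
print(repr(text))      # 'line one\nline two\n'
```

The text-mode result is two characters shorter than the data on disk, which is harmless for text but is exactly the kind of silent change that would damage an image or executable.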
It is good practice to use the with keyword when dealing with file objects. The advantage is that the file is properly closed after its suite finishes, even if an exception is raised at some point. Using with is also much shorter than writing equivalent try-finally blocks.
>>> with open('workfile') as f:
...     read_data = f.read()
...
>>> f.closed
True
If you're not using the with keyword, you must call **f.close()** to close the file object and immediately free up any system resources used by it. If you don't explicitly close a file, Python's garbage collector will eventually destroy the object and close the open file for you, but the file may stay open for a while. Another risk is that different Python implementations will do this clean-up at different times.
After a file object is closed, either by a with statement or by calling f.close(), attempts to use the file object will automatically fail (they raise ValueError).
In short, a with statement spares you from having to remember f.close(), so opening files with with open(...) is strongly recommended.
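The two ways of closing a file, and what happens when a closed file is used, can be sketched as follows. The scratch path and the variable names are assumptions made for this example:

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "workfile")
with open(path, "w", encoding="utf-8") as f:
    f.write("This is the entire file.\n")

# Without `with`: close the file in a finally clause so it is
# released even if read() raises an exception.
f = open(path, encoding="utf-8")
try:
    data_manual = f.read()
finally:
    f.close()

# With `with`: the same guarantee in fewer lines.
with open(path, encoding="utf-8") as f:
    data_with = f.read()

# The file object still exists after the block, but it is closed,
# so any further I/O on it raises ValueError.
try:
    f.read()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(f.closed)        # True
print(error_message)
```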
7.2.1. Methods of File Objects
The rest of the examples in this section assume that a file object called f has already been created.
To read a file's contents, call f.read(size), which reads some quantity of data and returns it as a string (in text mode) or a bytes object (in binary mode).
size is an optional numeric argument.
When size is omitted or negative, the entire contents of the file will be read and returned;
it's your problem if the file is twice as large as your machine's memory. Otherwise, at most size characters (in text mode) or size bytes (in binary mode) are read and returned.
If the end of the file has been reached, f.read() will return an empty string ('').
>>> f.read()
'This is the entire file.\n'
>>> f.read()
''
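Because f.read(size) returns an empty string at the end of the file, it can drive a simple loop that never holds more than one piece in memory. The helper name read_in_chunks, the chunk size, and the scratch file below are hypothetical choices for this sketch:

```python
import os
import tempfile


def read_in_chunks(path, size=8):
    """Yield successive pieces of a text file, at most `size` characters each."""
    with open(path, encoding="utf-8") as f:
        while True:
            chunk = f.read(size)
            if chunk == "":  # an empty string signals the end of the file
                break
            yield chunk


# Hypothetical scratch file holding the sample text from the tutorial.
path = os.path.join(tempfile.mkdtemp(), "workfile")
original = "This is the entire file.\n"
with open(path, "w", encoding="utf-8") as f:
    f.write(original)

chunks = list(read_in_chunks(path, size=10))
print(chunks)
```

For a file too large to fit in memory, each chunk would be processed inside the loop rather than collected into a list.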
f.readline() reads a single line from the file; a newline character (\n) is left at the end of the string, and is only omitted on the last line of the file if the file doesn't end in a newline. This makes the return value unambiguous: if f.readline() returns an empty string, the end of the file has been reached, while a blank line is represented by '\n', a string containing only a single newline.
>>> f.readline()
'This is the first line of the file.\n'
>>> f.readline()
'Second line of the file\n'
>>> f.readline()
''
For reading lines from a file, you can **loop over the file object.** This is memory efficient, fast, and leads to simple code:
>>> for line in f:
...     print(line, end='')
...
This is the first line of the file.
Second line of the file
If you want to read all the lines of a file in a list you can also use list(f) or f.readlines().
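The sketch below compares the ways of collecting lines. To keep it self-contained it uses io.StringIO as a stand-in for an open text file, which is an assumption of this example; a file returned by open() behaves the same way for these calls:

```python
import io

# io.StringIO stands in for an open text file here (assumption for the demo).
f = io.StringIO("This is the first line of the file.\nSecond line of the file\n")

lines_a = f.readlines()   # every line, each still ending in '\n'

f.seek(0)                 # rewind before reading again
lines_b = list(f)         # same result as readlines()

f.seek(0)
stripped = [line.rstrip("\n") for line in f]  # loop and drop the newlines

print(lines_a == lines_b)  # True
print(stripped)
```

Looping directly over f handles one line at a time, whereas list(f) and f.readlines() hold every line in memory at once.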
f.write(string) writes the contents of string to the file, returning the number of characters written.
>>> f.write('This is a test\n')
15
Other types of objects need to be converted, either to a string (in text mode) or a bytes object (in binary mode), before writing them:
>>> value = ('the answer', 42)
>>> s = str(value) # convert the tuple to string
>>> f.write(s)
18
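For binary mode the conversion has to end in a bytes object. The following sketch uses io.StringIO and io.BytesIO as stand-ins for files opened in text and binary mode, and UTF-8 as the encoding; both are assumptions of the example:

```python
import io

value = ('the answer', 42)

# Text mode stand-in: write() takes a str and returns the number of characters.
text_file = io.StringIO()
n_chars = text_file.write(str(value))

# Binary mode stand-in: write() takes bytes and returns the number of bytes.
binary_file = io.BytesIO()
n_bytes = binary_file.write(str(value).encode("utf-8"))

# Characters and bytes differ as soon as the text is not plain ASCII.
word = "café"
n_word_bytes = binary_file.write(word.encode("utf-8"))

# Passing a str to a binary file is an error rather than an implicit conversion.
try:
    binary_file.write("not bytes")
    rejected = False
except TypeError:
    rejected = True

print(n_chars, n_bytes)         # 18 18
print(len(word), n_word_bytes)  # 4 5
print(rejected)                 # True
```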
f.tell() returns an integer giving the file object's current position in the file, represented as the number of bytes from the beginning of the file when in binary mode and as an opaque number when in text mode.
To change the file object's position, use f.seek(offset, from_what).
The position is computed by adding offset to a reference point;
the reference point is selected by the from_what argument.
A from_what value of 0 measures from the beginning of the file,
1 uses the current file position,
and 2 uses the end of the file as the reference point.
from_what can be omitted and defaults to 0, using the beginning of the file as the reference point.
>>> f = open('workfile', 'rb+')
>>> f.write(b'0123456789abcdef')
16
>>> f.seek(5) # Go to the 6th byte in the file
5
>>> f.read(1)
b'5'
>>> f.seek(-3, 2) # Go to the 3rd byte before the end
13
>>> f.read(1)
b'd'
In text files (those opened without a b in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking to the very file end with seek(0, 2)), and the only valid offset values are those returned from f.tell(), or zero. Any other offset value produces undefined behaviour.
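In practice this means treating the value from f.tell() as a bookmark that is only ever handed back to f.seek(). A minimal sketch, assuming a scratch file with two made-up lines:

```python
import os
import tempfile

# Hypothetical scratch file with two lines of sample text.
path = os.path.join(tempfile.mkdtemp(), "workfile")
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\nsecond line\n")

with open(path, encoding="utf-8") as f:
    f.readline()                 # consume the first line
    bookmark = f.tell()          # opaque value: do not do arithmetic on it
    second = f.readline()

    f.seek(bookmark)             # valid: an offset returned by tell()
    second_again = f.readline()

    f.seek(0)                    # valid: zero, the beginning of the file
    first = f.readline()

    f.seek(0, 2)                 # valid: the one allowed seek to the file end
    at_end = f.read()            # nothing left to read

print(repr(second_again))  # 'second line\n'
print(repr(at_end))        # ''
```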
File objects have some additional methods, such as isatty() and truncate(), which are less frequently used; consult the Library Reference for a complete guide to file objects.
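As a brief illustration of those two methods, the sketch below shortens a file with truncate() and checks isatty(). The scratch path and the 'wb+' mode are choices made for this example, not something the tutorial prescribes:

```python
import os
import tempfile

# Hypothetical scratch file, opened for binary reading and writing.
path = os.path.join(tempfile.mkdtemp(), "workfile")

with open(path, "wb+") as f:
    f.write(b"0123456789abcdef")

    new_size = f.truncate(10)   # keep only the first 10 bytes
    f.seek(0)
    remaining = f.read()

    is_terminal = f.isatty()    # False for an ordinary file on disk

print(new_size)     # 10
print(remaining)    # b'0123456789'
print(is_terminal)  # False
```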
Additional content:
Reference Website: https://www.cnblogs.com/ymjyqsx/p/6554817.html
Author: levi, edited 2019-02-24
Python introduced the with statement to call close() for us automatically:
with open('/path/to/file', 'r') as f:
    print(f.read())
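This is equivalent to the try ... finally form mentioned earlier, but the code is more concise and there is no need to call f.close().
Calling read() reads the entire contents of the file at once; if the file is 10 GB, memory will be exhausted. To be safe, you can call read(size) repeatedly, reading at most size characters (or size bytes in binary mode) each time. In addition, **readline()** reads one line at a time, and **readlines()** reads everything at once and returns the lines as a list. Choose among them according to your needs:
If the file is small, a single read() is the most convenient;
if the file size is unknown, calling read(size) repeatedly is safer;
if it is a configuration file, readlines() is the most convenient: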
for line in f.readlines():
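    print(line.strip())  # strip surrounding whitespace, including the trailing '\n'
When reading or writing several files, the with statements can be written in either of the following two ways: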
with open('/home/xbwang/Desktop/output_measures.txt','r') as f:
    with open('/home/xbwang/Desktop/output_measures2.txt','r') as f1:
        with open('/home/xbwang/Desktop/output_output_bk.txt','r') as f2:
            ........
            ........
            ........

with open('/home/xbwang/Desktop/output_measures.txt','r') as f:
    ........
with open('/home/xbwang/Desktop/output_measures2.txt','r') as f1:
    ........
with open('/home/xbwang/Desktop/output_output_bk.txt','r') as f2:
    ........
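A third option, not shown in the referenced post, is to open several files in a single with statement by separating the open() calls with commas. This avoids the deep nesting of the first form while still keeping all files open together. The file names and contents below are placeholders invented for the sketch:

```python
import os
import tempfile

# Hypothetical input files created only so the example can run.
tmp_dir = tempfile.mkdtemp()
path_a = os.path.join(tmp_dir, "a.txt")
path_b = os.path.join(tmp_dir, "b.txt")
for path, text in ((path_a, "alpha\n"), (path_b, "beta\n")):
    with open(path, "w", encoding="utf-8") as out:
        out.write(text)

# One with statement, two files; both are closed when the block ends.
with open(path_a, encoding="utf-8") as f, open(path_b, encoding="utf-8") as f1:
    combined = f.read() + f1.read()

print(repr(combined))       # 'alpha\nbeta\n'
print(f.closed, f1.closed)  # True True
```

The nested and comma-separated forms keep every file open at the same time, which is what is needed when lines from one file are processed alongside another. The sequential form closes each file before opening the next.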