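I. Reading files:
The read() method:
read() reads the entire file and returns it as a single string. If the file is larger than the available memory, this raises an exception (typically MemoryError), so it is generally not recommended for large files.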
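f = open('test.txt', 'r')  # open a file handle; the handle is also an iterable object
try:
    content = f.read()  # the result is a str
    print(type(content))
    print(content)
finally:
    f.close()
# The same thing using with, which closes the file automatically:
with open('test.txt', 'r') as f:
    content = f.read()  # the result is a str
    print(type(content))
    print(content)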
Output:
Night gathers, and now my watch begins. It shall not end until my death. I shall take no wife,hold no lands, father no children. I shall wear no crowns and win no glory. I shall live and die at my post. I am the sword in the darkness. I am the watcher on the walls. I am the fire that burns against the cold, the light that brings the dawn, the horn that wakes the sleepers, the shield that guards the realms of men. I pledge my life and honor to the Night’s Watch, for this night and all the nights to come.
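Since read() loads everything at once, one possible safeguard is to check the file size before reading. The following is only a minimal sketch: the helper name read_small_file, the max_bytes limit, and the demo file are assumptions made for illustration, not part of the original notes.

```python
import os
import tempfile

def read_small_file(path, max_bytes=10 * 1024 * 1024):
    """Return the whole file as a str, refusing files larger than max_bytes.

    max_bytes is a hypothetical limit chosen for this sketch.
    """
    size = os.path.getsize(path)
    if size > max_bytes:
        raise ValueError(f"{path} is {size} bytes; limit is {max_bytes}")
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

# Demo on a throwaway file so the sketch runs on its own.
demo_path = os.path.join(tempfile.mkdtemp(), "small.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("Night gathers\n")

content = read_small_file(demo_path)
print(content)
```

For files that exceed the limit, the line-by-line and chunked approaches are the usual alternatives.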
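The readline() method:
Reads one line per call and returns a string holding the contents of the current line, including its trailing newline. At the end of the file it returns an empty string.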
f = open(r'aa.txt', 'r')
try:
    while True:
        line = f.readline()
        if line:
            print(line)
        else:
            break
finally:
    f.close()
# The same loop using with:
with open(r'aa.txt', 'r') as f:
    while True:
        line = f.readline()
        if line:
            print(line)
        else:
            break
Output:
Night gathers, and now my watch begins. It shall not end until my death.
I shall take no wife,hold no lands, father no children.
I shall wear no crowns and win no glory.
I shall live and die at my post.
I am the sword in the darkness.
I am the watcher on the walls.
I am the fire that burns against the cold,
the light that brings the dawn, the horn that wakes the sleepers, the shield that guards the realms of men.
I pledge my life and honor to the Night’s Watch, for this night and all the nights to come.
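Because readline() returns each line with its trailing newline still attached, and print() appends another newline after whatever it outputs, a blank line appears after every printed line.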
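One way to avoid the extra blank lines is to strip the newline before printing, or to tell print() not to add its own. A minimal sketch follows; io.StringIO is used here only as a stand-in for an open text file so the example runs without aa.txt.

```python
import io

# io.StringIO stands in for an open text file so the sketch runs without aa.txt.
f = io.StringIO("I shall live and die at my post.\nI am the sword in the darkness.\n")

printed = []
while True:
    line = f.readline()
    if not line:                  # empty string: end of file
        break
    clean = line.rstrip("\n")     # drop the newline that readline() kept
    print(clean)                  # print() adds exactly one newline back
    printed.append(clean)

# Alternative with the same visible result: print(line, end="")
```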
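The readlines() method:
Reads the entire file in one go, splits the content into lines, and returns them as a list of strings. Each element keeps its trailing newline.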
f = open(r'aa.txt', 'r')
try:
    lines = f.readlines()
    print(type(lines))
    print(lines)
finally:
    f.close()
# The same thing using with:
with open(r'aa.txt', 'r') as f:
    lines = f.readlines()
    print(type(lines))
    for line in lines:
        print(line)
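To make the shape of the returned list concrete, here is a small sketch that compares readlines() with read().splitlines(), which drops the line endings. io.StringIO is again an assumed stand-in for a real file.

```python
import io

# Stand-in for an open text file with two lines.
f = io.StringIO("first line\nsecond line\n")

lines = f.readlines()            # each element keeps its trailing "\n"
print(lines)

f.seek(0)                        # rewind so the content can be read again
stripped = f.read().splitlines() # same lines, without the line endings
print(stripped)
```

Both calls still load the whole file into memory, so the same size caveat applies as for read().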
Reading large files:
Use one of the following two approaches:
1. Read line by line
with open(r'aa.txt', 'r') as f:
    for line in f:
        print(line)
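Iterating over the file object f directly (for line in f) uses buffered I/O and yields one line at a time, so memory use stays small however large the file is, as long as the individual lines are of reasonable length.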
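As an illustration of why this pattern scales, the sketch below counts matching lines while holding only one line in memory at a time. The function name count_matching_lines, the keyword, and the demo file are assumptions made for this example.

```python
import os
import tempfile

def count_matching_lines(path, keyword):
    """Count the lines containing keyword, reading one line at a time."""
    count = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:            # the file object yields lines lazily
            if keyword in line:
                count += 1
    return count

# Demo on a throwaway file so the sketch runs on its own.
demo_path = os.path.join(tempfile.mkdtemp(), "watch.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("I am the sword in the darkness.\n"
            "I am the watcher on the walls.\n"
            "I shall live and die at my post.\n")

matches = count_matching_lines(demo_path, "I am")
print(matches)
```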
2. Read in chunks of a fixed size
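def read_in_size(filePath, size=1024*1024):
    # Yield the file content in pieces of at most `size` characters
    with open(filePath) as f:  # with ensures the file is closed when the generator finishes
        while True:
            data = f.read(size)
            if not data:  # an empty string means end of file
                break
            yield data

if __name__ == "__main__":
    filePath = 'aa.txt'
    for content in read_in_size(filePath):
        print(content)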
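The same chunked loop can also be written with the two-argument form of iter(), which keeps calling a function until it returns a sentinel value. This is an alternative sketch, not the author's version: it opens the file in binary mode, and the name read_chunks and the tiny chunk size in the demo are assumptions for illustration.

```python
import os
import tempfile
from functools import partial

def read_chunks(path, size=1024 * 1024):
    """Yield the file as bytes objects of at most `size` bytes each."""
    with open(path, "rb") as f:
        # iter(callable, sentinel) calls f.read(size) until it returns b""
        for chunk in iter(partial(f.read, size), b""):
            yield chunk

# Demo on a throwaway 10-byte file with a deliberately small chunk size.
demo_path = os.path.join(tempfile.mkdtemp(), "chunks.bin")
with open(demo_path, "wb") as f:
    f.write(b"0123456789")

chunk_sizes = [len(chunk) for chunk in read_chunks(demo_path, size=4)]
print(chunk_sizes)
```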
II. Writing files:
f = open(r'aa.txt','w')
f.write("hello world")
f.write("hhh")
f.close()
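Writes also go through buffered I/O automatically: data is first written to an in-memory buffer. Disk I/O is far slower than memory access, and if every write() had to wait for the disk, the program would slow down noticeably. Python therefore places written data in the buffer by default and only writes it out once the buffer fills up, or when the file is flushed or closed.
If the data must be written out immediately, you have to force a flush.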
f = open(r'aa.txt','w')
f.write("hello world")
f.write("hhh")
f.flush()
f.close()
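The effect of flush() can be observed by checking the file size on disk before and after the call. The sketch below is illustrative only: the temporary path is an assumption, and the size before the flush is typically 0 for such a small write but depends on the buffer size. Note that flush() hands the data to the operating system; os.fsync() additionally asks the OS to commit it to disk.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "flush_demo.txt")

f = open(path, "w")
try:
    f.write("hello world")
    size_before = os.path.getsize(path)  # typically 0: data is still in Python's buffer
    f.flush()                            # push Python's buffer to the operating system
    os.fsync(f.fileno())                 # optionally ask the OS to commit to disk
    size_after = os.path.getsize(path)   # all 11 bytes are now visible
finally:
    f.close()

print(size_before, size_after)
```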
Here is an example that prints a progress bar:
import sys, time
for i in range(50):
    sys.stdout.write("#")
    sys.stdout.flush()
    time.sleep(0.1)
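In Python 3 the same effect can be had with print() itself, through its end and flush parameters. The following is a small sketch under that assumption; the function name progress and its out parameter are hypothetical, added only so the output can be captured in the demo.

```python
import io
import sys
import time

def progress(total, delay=0.1, out=None):
    """Print `total` hash marks on one line, flushing after each one."""
    out = sys.stdout if out is None else out
    for _ in range(total):
        print("#", end="", flush=True, file=out)  # no newline, flushed immediately
        time.sleep(delay)
    print(file=out)                               # finish the line

# Demo: capture the bar in memory instead of the terminal.
buf = io.StringIO()
progress(5, delay=0, out=buf)
bar = buf.getvalue()
```

Calling progress(50) with no other arguments prints to the terminal at the same pace as the loop above.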
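Python file open modes
'r': opens the file read-only; it can be read but not written.
'w': opens the file for writing; it can be written but not read. Any existing content is truncated (overwritten).
'r+': opens the file for reading and writing without truncating it. The position starts at the beginning of the file, so writes overwrite existing content from the current position rather than appending; call seek() to move to the end first if you want to append.
'w+': opens the file for writing and reading. Existing content is truncated, and whatever is written afterwards can be read back. After a write, however, the file position is at the end of the file, so you must call seek(0) to return to the start before reading. Personally I rarely find this mode useful.
'a+': opens the file for appending and reading; writes always go to the end of the file.
'rb': opens the file for reading in binary mode. The corresponding wb, ab, rb+, wb+ and ab+ modes exist as well.
In Python 2, 'r' and 'rb' differ very little. In Python 3 they do differ because of the bytes type: 'r' returns str and 'rb' returns bytes.
'rU': converts \r\n to \n automatically when reading (lines end with \r\n on Windows and with \n on Linux). This is mainly useful in Python 2: Python 3 text mode performs the conversion by default, and the 'U' flag was removed in Python 3.11.
Here is an example of writing in binary mode: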
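The behavior of the read/write modes is easiest to see by running them against the same file. This is a minimal sketch; the temporary path and the variable names are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w") as f:      # 'w' creates or truncates the file
    f.write("hello world")

with open(path, "r+") as f:     # 'r+' does not truncate; position starts at 0
    f.write("HELLO")            # overwrites the first five characters
with open(path, "r") as f:
    after_r_plus = f.read()

with open(path, "a+") as f:     # 'a+' always writes at the end
    f.write("!")
    f.seek(0)                   # go back to the start before reading
    after_a_plus = f.read()

with open(path, "w+") as f:     # 'w+' truncates first
    f.write("new")
    f.seek(0)                   # position is at the end after writing
    after_w_plus = f.read()

print(after_r_plus, after_a_plus, after_w_plus, sep=" | ")
```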
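f = open('test.txt', 'wb')
# Wrong: in Python 3 this raises TypeError, because a file opened in binary mode accepts only bytes
# f.write("hello,world")
# Correct: encode the str to bytes first
f.write("hello,world".encode())
f.close()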
In Python 3, a str must be converted to bytes with encode() before it can be written to a file opened in binary mode.
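The round trip can be sketched as follows: writing a str to a binary file fails, writing encoded bytes succeeds, and reading with 'rb' gives bytes that decode() turns back into a str. The temporary path and the explicit UTF-8 encoding are assumptions made for this example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "binary_demo.txt")

error_name = None
with open(path, "wb") as f:
    try:
        f.write("hello,world")                # str is rejected in binary mode
    except TypeError as exc:
        error_name = type(exc).__name__
    f.write("hello,world".encode("utf-8"))    # bytes are accepted

with open(path, "rb") as f:
    raw = f.read()                            # bytes

text = raw.decode("utf-8")                    # back to str
print(error_name, raw, text)
```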