Python Cookbook, Chapter 5: Files and I/O

Chapter 5: Files and I/O

File reads and writes, like all I/O, pass through memory.

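Resetting the default encoding (Python 2 only)

import sys
reload(sys)  # restores sys.setdefaultencoding, which site.py removes at startup
sys.setdefaultencoding("utf-8")
sys.getdefaultencoding()  # returns the interpreter's current default encoding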

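5.1 Reading text

f=open('/tmp/hello','w')

open(path, mode, encoding=xx, newline='')
# Common modes: r read-only, r+ read/write, w create (overwrites any existing file), a append, b binary.

The available modes are:

rU or U open for reading with universal newline support (PEP 278)
w open for writing (truncates an existing file)
a open for appending (starts at EOF, creates the file if needed)
r+ open for reading and writing (pointer at the start)
w+ open for reading and writing (truncates an existing file)
a+ open for reading and writing (pointer at the end)
rb open for binary reading
wb open for binary writing
ab open for binary appending
rb+ open for binary reading and writing
wb+ open for binary reading and writing
ab+ open for binary reading and writing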

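f=open('/tmp/hello','rt')

w, r, wt and rt are all file modes in Python.
w is write mode and r is read mode.
t selects text mode, which is the default, so rt means the same as r. It is not specific to Windows, but Windows is where it matters most because of its line endings.
Unix-like platforms end a line with \n, while Windows uses the two ASCII characters \r\n; internally Python represents a newline as \n.
In rt mode on Windows, Python converts \r\n to \n as it reads.
In wt mode on Windows, Python writes each \n as \r\n; on Unix-like platforms \n is written unchanged.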
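
The newline translation described above can be observed on any platform by forcing it with the newline argument of open(). This is a minimal sketch that assumes Python 3 (unlike the Python 2 snippets in these notes); the helper name newline_roundtrip is made up for the illustration.

```python
import os
import tempfile

def newline_roundtrip(text):
    """Write text with Windows line endings, then read it back two ways."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline='\r\n' makes text mode write every '\n' as '\r\n'.
        with open(path, 'w', newline='\r\n') as f:
            f.write(text)
        # Binary mode shows the bytes exactly as they are stored.
        with open(path, 'rb') as f:
            raw = f.read()
        # Default text mode (newline=None) translates '\r\n' back to '\n'.
        with open(path, 'r') as f:
            decoded = f.read()
    finally:
        os.remove(path)
    return raw, decoded

raw, decoded = newline_roundtrip('a\nb\n')
```

Here raw holds the \r\n pairs as stored, while decoded contains only \n, which is what the program sees in text mode.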

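5.2 Printing output (skipped)

In Python 2, print is a statement rather than a function, unless from __future__ import print_function is used.

5.3 Printing with a different separator or line ending (skipped)

5.4 Reading and writing bytes (skipped)

Python 2 has no separate bytes type; bytes is just an alias for str.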

5.5 Writing to a file only if it does not already exist

import os
if not os.path.exists('somefile'):
    with open('somefile', 'wt') as f:
        f.write('Hello\n')
else:
    print('File already exists!')
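
The check above leaves a small window between os.path.exists() and open() in which another process could create the file. On Python 3, the x (exclusive creation) mode of open() does the check and the creation in one step. A rough sketch, assuming Python 3; write_if_absent is a hypothetical helper name.

```python
import os
import tempfile

def write_if_absent(path, text):
    """Create path and write text to it; return False if it already exists."""
    try:
        # 'x' means exclusive creation: open() fails if the file is there.
        with open(path, 'xt') as f:
            f.write(text)
    except FileExistsError:
        return False
    return True

# Usage in a scratch directory
with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, 'somefile')
    first = write_if_absent(target, 'Hello\n')         # file is created
    second = write_if_absent(target, 'Overwritten?\n')  # file is left alone
    with open(target) as f:
        content = f.read()
```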

5.6 I/O operations on strings

What is I/O? It means data moving into and out of memory.

#todo: explain the string issue
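
One way to fill in this section is the io module, whose StringIO and BytesIO classes give an in-memory object with a file-like interface. The following is only a sketch, assuming Python 3.

```python
import io

# StringIO behaves like a text file whose contents live in memory.
s = io.StringIO()
s.write('Hello World\n')
print('This is a test', file=s)
text = s.getvalue()

# An existing string can be wrapped and read like a file.
reader = io.StringIO('Hello\nWorld\n')
first_four = reader.read(4)
rest = reader.read()

# BytesIO is the binary counterpart.
b = io.BytesIO()
b.write(b'binary data')
raw = b.getvalue()
```

This is handy in tests, where code that expects a file can be handed one of these objects instead.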

5.7 Reading and writing compressed files

Use the open() functions of the gzip and bz2 modules.
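
A small round trip showing both modules, as a sketch that assumes Python 3.3 or later (where gzip.open() and bz2.open() accept text modes). The file names are placeholders inside a scratch directory.

```python
import bz2
import gzip
import os
import tempfile

text = 'Hello\nWorld\n'
with tempfile.TemporaryDirectory() as d:
    gz_path = os.path.join(d, 'somefile.gz')
    bz_path = os.path.join(d, 'somefile.bz2')

    # 'wt' / 'rt' give text mode; 'wb' / 'rb' would work with raw bytes.
    with gzip.open(gz_path, 'wt') as f:
        f.write(text)
    with gzip.open(gz_path, 'rt') as f:
        gz_text = f.read()

    with bz2.open(bz_path, 'wt') as f:
        f.write(text)
    with bz2.open(bz_path, 'rt') as f:
        bz_text = f.read()
```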

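5.8 Iterating over fixed-size records

#todo

from functools import partial
RECORD_SIZE = 32
with open('somefile.data', 'rb') as f:
    records = iter(partial(f.read, RECORD_SIZE), b'')
    # records = iter(lambda: f.read(RECORD_SIZE), b'')  # equivalent, using a lambda closure
    for r in records:
        pass  # process the record r here

A little-known feature of iter() is that if you pass it a callable and a
sentinel value, it creates an iterator that calls the callable repeatedly until it returns the sentinel.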
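
The two-argument form of iter() can be tried without a file on disk. This sketch assumes Python 3 and uses io.BytesIO as a stand-in for the open file; the record size of 4 is an arbitrary value chosen to keep the data short.

```python
import io
from functools import partial

RECORD_SIZE = 4  # arbitrary size for the demonstration
stream = io.BytesIO(b'aaaabbbbcc')

# read(RECORD_SIZE) is called again and again until it returns b'' at EOF.
records = list(iter(partial(stream.read, RECORD_SIZE), b''))
```

Note that the last record can be shorter than RECORD_SIZE when the data length is not an exact multiple of it.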

5.9 Reading binary data into a mutable buffer

Read binary data directly into a mutable buffer,
or modify data in place and write it back to a file.
#todo

import os.path
def read_into_buffer(filename):
    buf = bytearray(os.path.getsize(filename))
    with open(filename, 'rb') as f:
        f.readinto(buf)
    return buf


record_size = 32 # Size of each record (adjust value)
buf = bytearray(record_size)
with open('sample.bin', 'rb') as f:
    while True:
        n = f.readinto(buf)
        if n < record_size:
            break

A file object's readinto() method fills a preallocated array with data and returns the number of bytes actually read.
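
To make the record loop concrete, here is a sketch that collects the complete records from a stream while reusing a single buffer. It assumes Python 3 and a blocking binary stream; read_records is a hypothetical helper, and io.BytesIO stands in for a real file.

```python
import io

def read_records(stream, record_size):
    """Return the complete records in stream, reusing one buffer for every read."""
    buf = bytearray(record_size)
    records = []
    while True:
        n = stream.readinto(buf)  # overwrites buf in place
        if n < record_size:
            break                 # EOF or a trailing partial record
        records.append(bytes(buf))  # copy, because buf is reused next time
    return records

records = read_records(io.BytesIO(b'abcdefgh' + b'xy'), 4)
```

The copy with bytes(buf) matters: without it every entry in the list would refer to the same buffer and show the last data read.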

5.10 Memory-mapping binary files

#todo

import os
import mmap
def memory_map(filename, access=mmap.ACCESS_WRITE):
    size = os.path.getsize(filename)
    fd = os.open(filename, os.O_RDWR)
    return mmap.mmap(fd, size, access=access)
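
A memory map behaves like a mutable byte sequence backed by the file, so slices can be assigned in place. The sketch below assumes Python 3 and, instead of the memory_map() helper, opens the file with open() and passes its descriptor to mmap.mmap(), with length 0 meaning the whole file. The 16-byte scratch file is made up for the demonstration.

```python
import mmap
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, 'data')
    # A memory map needs a non-empty file, so create 16 zero bytes first.
    with open(path, 'wb') as f:
        f.write(b'\x00' * 16)

    with open(path, 'r+b') as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as m:
            size = len(m)
            m[0:5] = b'Hello'  # modify the file contents in place
            m.flush()

    with open(path, 'rb') as f:
        contents = f.read()
```

Slice assignment cannot change the size of the map, so the replacement must have the same length as the slice it replaces.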

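5.11 Manipulating file paths with os.path

import os
# dirpath is the directory being visited, dirnames the subdirectories it contains, filenames the files it contains
for dirpath, dirnames, filenames in os.walk(r"D:\Workspace\sell"):
    for filename in filenames:
        print(os.path.join(dirpath, filename))

absolute_path = os.path.abspath(__file__)  # absolute path of the current file
print(os.path.basename(absolute_path))  # name of the current file
print(os.path.dirname(absolute_path))   # directory containing the file (absolute, not relative)
print(os.path.join('tmp', 'data', os.path.basename(absolute_path)))
# joins the components with the platform's separator: "/" on Linux, "\" on Windows
dirname, filename = os.path.split(absolute_path)  # split into (directory, file name)
os.path.splitext(absolute_path)  # split into (path without extension, extension)
# ... ('/home/jin/Workspace/company/rest/mysite/snippets/tests', '.py')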
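
Because os.path picks its behavior from the running platform, the results above differ between Linux and Windows. To see both on one machine, the standard library exposes the two flavors directly as posixpath and ntpath. A sketch, assuming Python 3; the path used here is a shortened, hypothetical example.

```python
import ntpath      # the Windows flavor of os.path
import posixpath   # the POSIX flavor of os.path

path = '/home/jin/snippets/tests.py'  # hypothetical example path

base = posixpath.basename(path)
directory = posixpath.dirname(path)
split = posixpath.split(path)
root_ext = posixpath.splitext(path)

# The same components joined with each platform's separator.
joined_posix = posixpath.join('tmp', 'data', base)
joined_nt = ntpath.join('tmp', 'data', base)
```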

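5.12 Testing whether a file exists

import os
import time
print(os.path.exists('/home/jin'))  # does the file or directory exist?
os.path.isdir('/home/jin')  # is it a directory?
os.path.isfile('/home/jin')  # is it a regular file?
# symbolic links
os.path.islink('/home/jin')  # is it a symbolic link?
os.path.realpath('/home/jin')  # the path the link resolves to
os.path.getsize('/home/jin/Workspace/company/sell/')
## for a file this is its size in bytes; for a directory it is the size of the directory entry itself (often 4096 on Linux), not of its contents
time.ctime(os.path.getmtime('/etc/passwd'))  # last modification time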
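
The same checks can be run against files that are guaranteed to exist by creating them in a scratch directory first. This is a sketch assuming Python 3; the file names are placeholders.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, 'sample.txt')
    with open(path, 'wb') as f:
        f.write(b'12345')
    missing = os.path.join(d, 'missing.txt')  # never created

    checks = {
        'dir_exists': os.path.exists(d),
        'dir_isdir': os.path.isdir(d),
        'file_isfile': os.path.isfile(path),
        'file_isdir': os.path.isdir(path),
        'file_islink': os.path.islink(path),
        'missing_exists': os.path.exists(missing),
    }
    size = os.path.getsize(path)                    # size in bytes
    modified = time.ctime(os.path.getmtime(path))   # human-readable timestamp
```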

5.13 Listing the files in a given directory

os.listdir() returns the names of the entries in the given directory

# files only
names = [name for name in os.listdir('/home/jin')
         if os.path.isfile(os.path.join('/home/jin', name))]
# directories only
dirname = [name for name in os.listdir('/home/jin')
           if os.path.isdir(os.path.join('/home/jin', name))]

Filtering by file type

pyfiles = [name for name in os.listdir('/home/jin')
           if name.endswith('.py')]

Other ways to match

import glob
pyfiles = glob.glob('somedir/*.py')
from fnmatch import fnmatch
pyfiles = [name for name in os.listdir('somedir')
           if fnmatch(name, '*.py')]
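
The three filtering approaches can be compared side by side in a scratch directory. A sketch assuming Python 3.4 or later (for glob.escape()); the file names are invented for the demonstration.

```python
import glob
import os
import tempfile
from fnmatch import fnmatch

with tempfile.TemporaryDirectory() as d:
    for name in ('a.py', 'b.py', 'notes.txt'):
        open(os.path.join(d, name), 'w').close()
    os.mkdir(os.path.join(d, 'sub'))

    by_suffix = sorted(n for n in os.listdir(d) if n.endswith('.py'))
    by_fnmatch = sorted(n for n in os.listdir(d) if fnmatch(n, '*.py'))
    # glob returns paths that include the directory, so strip it to compare.
    pattern = os.path.join(glob.escape(d), '*.py')
    by_glob = sorted(os.path.basename(p) for p in glob.glob(pattern))
    dirs = [n for n in os.listdir(d) if os.path.isdir(os.path.join(d, n))]
```

The main practical difference is that os.listdir() yields bare names while glob.glob() yields paths that already include the directory part of the pattern.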

5.14 Bypassing filename encoding

5.15 Printing bad filenames

5.16, 5.17, 5.18 skipped

5.19 Creating temporary files and directories

The tempfile module
The first argument to TemporaryFile() is the file mode, usually w+t for text and w+b for binary. The file is deleted automatically when it is closed.

from tempfile import TemporaryFile

with TemporaryFile('w+t') as f:
    # Read/write to the file
    f.write('Hello World\n')
    f.write('Testing\n')
    # Seek back to beginning and read the data
    f.seek(0)
    data = f.read()

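A named temporary file. By default it is also deleted on close; passing delete=False, as here, keeps it.

from tempfile import NamedTemporaryFile

with NamedTemporaryFile('w+t', delete=False) as f:
    print('filename is: ' + f.name)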
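
The section title also mentions directories. A minimal sketch using tempfile.TemporaryDirectory, assuming Python 3.2 or later; scratch.txt is a placeholder name.

```python
import os
from tempfile import TemporaryDirectory

with TemporaryDirectory() as dirname:
    # dirname is the path of a freshly created directory.
    path = os.path.join(dirname, 'scratch.txt')
    with open(path, 'w') as f:
        f.write('Testing\n')
    existed_inside = os.path.exists(path)

# The directory and everything in it are removed when the block exits.
gone_after = not os.path.exists(dirname)
```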

5.20 Serial communication with pyserial

import serial
ser = serial.Serial('/dev/tty.usbmodem641',
                baudrate=9600,
                bytesize=8,
                parity='N',
                stopbits=1)

5.21 Serializing Python objects

Serialize a Python object to a byte stream so that it can be saved to a file, stored in a database, or sent over a network.

import pickle
data = ... # Some Python object
with open('somefile', 'wb') as f:  # the with block closes the file, flushing the data
    pickle.dump(data, f)

To dump an object to a string (a bytes object in Python 3), use pickle.dumps():

s = pickle.dumps(data)

To restore an object from a byte stream, use pickle.load() or pickle.loads(). For
example:

# Restore from a file
with open('somefile', 'rb') as f:
    data = pickle.load(f)
# Restore from a string
data = pickle.loads(s)
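
A complete round trip in memory, as a sketch assuming Python 3; the dictionary is made-up sample data. Keep in mind that unpickling can execute arbitrary code, so pickle.load() and pickle.loads() should only be used on data from a trusted source.

```python
import pickle

# Made-up sample object for the demonstration
data = {'name': 'ACME', 'shares': [100, 50], 'price': 91.1}

s = pickle.dumps(data)      # object -> bytes
restored = pickle.loads(s)  # bytes -> a new, equal object
```

The restored object compares equal to the original but is a separate copy, so changing one does not affect the other.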
