1. Reading and writing files
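- with open(somefile) as f is the recommended way to open a file: the file is closed automatically when the block exits, even if an error occurs, so buffered data is not lost because the program crashed before the file was closed;
- readlines, a method of the file object, returns the file contents as a list with one string per line: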
with open('D:/tmp/info.txt') as f:
    p1 = f.readlines()
print(p1)
# the mode must be the string 'w', not a bare name
with open('D:/tmp/info1.txt', 'w') as f:
    f.writelines(p1)
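The round trip above can be sketched in a form that runs anywhere. This is only an illustration: a temporary directory stands in for D:/tmp, and the file names and sample text are made up. It shows that readlines keeps the trailing newline on each line and that writelines does not add any.

```python
import os
import tempfile

# Hypothetical sample data; a temporary directory stands in for D:/tmp.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'info.txt')
dst = os.path.join(tmp_dir, 'info1.txt')

with open(src, 'w') as f:
    f.write('Joey\nMonica\nRoss\n')

with open(src) as f:
    lines = f.readlines()      # each element keeps its trailing '\n'
print(lines)

with open(dst, 'w') as f:
    f.writelines(lines)        # writes the strings back to back, adds no newlines

with open(dst) as f:
    copy_text = f.read()
print(copy_text == 'Joey\nMonica\nRoss\n')   # True
```

Because the newlines are preserved by readlines, the copy is identical to the source; a list of strings without newlines would be written as one long line.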
- read returns the entire file contents as a single string:
with open('D:/tmp/info.txt') as f:
    p2 = f.read()
print(p2)
# the mode must be the string 'w'; write is the natural call for a single string
with open('D:/tmp/info2.txt', 'w') as f:
    f.write(p2)
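To make the difference from readlines concrete, here is a small sketch under the same assumptions as before (temporary directory, made-up file name and contents). read gives one string; splitlines can turn it into a list of lines without the newline characters.

```python
import os
import tempfile

# Hypothetical sample data; a temporary directory stands in for D:/tmp.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'info.txt')

with open(path, 'w') as f:
    f.write('Joey\nMonica\nRoss\n')

with open(path) as f:
    text = f.read()            # the whole file as one str

names = text.splitlines()      # split on line breaks, dropping the '\n'
print(type(text).__name__, len(text))   # str 17
print(names)                            # ['Joey', 'Monica', 'Ross']
```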
- read_table and read_csv from pandas: the pandas package is friendly to R users, since the result is a DataFrame and the two functions correspond to read.table and read.csv in R:
import pandas as pd
with open('D:/tmp/info.txt') as f:
    p3 = pd.read_table(f)
print(p3)
# to_csv writes comma-separated values and includes the row index by default
p3.to_csv('D:/tmp/info3.txt')
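A possible sketch of a tab-separated round trip with pandas, assuming pandas is installed. The column names and values are invented for illustration, and io.StringIO stands in for the files under D:/tmp. read_table expects tabs by default, while to_csv writes commas and the row index unless told otherwise.

```python
import io
import pandas as pd

# Hypothetical tab-separated data standing in for D:/tmp/info.txt.
raw = 'name\tage\nJoey\t30\nMonica\t29\n'
df = pd.read_table(io.StringIO(raw))   # read_table defaults to sep='\t'
print(df.shape)                        # (2, 2)

# Override the to_csv defaults to get tab-separated output
# without the extra index column.
out = io.StringIO()
df.to_csv(out, sep='\t', index=False)

# Reading the output back gives an equal DataFrame.
df2 = pd.read_table(io.StringIO(out.getvalue()))
print(df2.equals(df))                  # True
```

With the default index=True, the file read back would gain an unnamed first column holding the old row numbers.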
2. String operations
import re
a = ['Joey','Pheebe','Monica','Chandeler','Rachel','Ross']
#### Using split and join
b = ','.join(a)
'Hi '+a[0]
Out[78]: 'Hi Joey'
print(a)
['Joey', 'Pheebe', 'Monica', 'Chandeler', 'Rachel', 'Ross']
print(b)
Joey,Pheebe,Monica,Chandeler,Rachel,Ross
b.split(',')
Out[81]: ['Joey', 'Pheebe', 'Monica', 'Chandeler', 'Rachel', 'Ross']
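A short sketch of a few details of split and join that the session above does not show; the lists here are made up for illustration.

```python
names = ['Joey', 'Monica', 'Ross']

joined = ','.join(names)
print(joined)                           # Joey,Monica,Ross
print(joined.split(',') == names)       # True

# join only accepts strings, so convert other types first.
ages = [30, 29, 31]
print('-'.join(str(n) for n in ages))   # 30-29-31

# split() with no argument splits on any run of whitespace.
print('  Joey   Ross '.split())         # ['Joey', 'Ross']
```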
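##### Checking whether a substring is present; the [...] is a list comprehension that evaluates the test for each element of a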
[re.search('Ph',x) for x in a]
Out[83]: [None, <re.Match object; span=(0, 2), match='Ph'>, None, None, None, None]
['Ph' in x for x in a]
Out[84]: [False, True, False, False, False, False]
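The two checks above can be compared directly. A minimal sketch, reusing the same list: re.search returns a match object or None, so wrapping it in bool gives the same True/False list as the in operator, while a regular expression can also express patterns that in cannot.

```python
import re

names = ['Joey', 'Pheebe', 'Monica', 'Chandeler', 'Rachel', 'Ross']

# bool() turns match-or-None into True/False, matching the `in` result.
by_regex = [bool(re.search('Ph', x)) for x in names]
by_in = ['Ph' in x for x in names]
print(by_regex == by_in)                          # True

# A regex can express patterns that `in` cannot, e.g. "starts with R".
print([x for x in names if re.search('^R', x)])   # ['Rachel', 'Ross']
```

For a plain fixed substring, in is the simpler choice; re.search is worth it when the pattern needs anchors, alternatives, or character classes.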
##### String replacement
seq = 'ACGTACCTA'
### table is the translation rule: every A becomes T and every T becomes A
table = str.maketrans('AT','TA')
seq.translate(table)
Out[88]: 'TCGATCCAT'
seq.replace('AC', 'TG')
Out[89]: 'TGGTTGCTA'
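A sketch of why translate is the right tool for swapping characters, using the same seq. Chaining two replace calls does not swap A and T, because the second call also rewrites the characters produced by the first. The full complement table at the end is an assumption added for illustration; it is not in the notes above.

```python
seq = 'ACGTACCTA'

# Chained replace: the first call turns every A into T, the second turns
# every T (old and new) into A, so the swap is lost.
chained = seq.replace('A', 'T').replace('T', 'A')
print(chained)                        # ACGAACCAA

# translate maps every character in one pass, so A and T are swapped.
swapped = seq.translate(str.maketrans('AT', 'TA'))
print(swapped)                        # TCGATCCAT

# Assumption: a full complement table (A<->T, C<->G).
complement = seq.translate(str.maketrans('ACGT', 'TGCA'))
print(complement[::-1])               # TAGGTACGT (reverse complement)
```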