Advanced Python Programming Techniques (5)

1. How to split a string that contains several different delimiters

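Practical example:

We want to split a string into fields at its delimiters, and the string contains several different delimiters, for example:
Split s = 'ab;cd|efgh|hi,jkl|mn\topq;rst,uvw\txyz'
where <,>, <;>, <|> and <\t> are all delimiters.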

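Solution:

Method 1. Call str.split() repeatedly, handling one delimiter per pass.
Method 2. Use the regular-expression function re.split() to split the string in a single pass. (Recommended)

Code example: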

# _*_ coding:utf-8 _*_
# @Author   : TianYu
# @Time     : 2017/10/12 15:55
# @File     : 拆分含多种分隔符的字符串.py

# Split s = 'ab;cd|efgh|hi,jkl|mn\topq;rst,uvw\txyz'
# where <,>, <;>, <|> and <\t> are all delimiters

# A single kind of delimiter
s = 'liushuo 15196 0.0 0.0 22565 2872 pts/11 R+ 13:50 0:00 ps aux'  # extract each field
s.split()  # with no argument, split() breaks on any whitespace: spaces, \t, \n and so on
print(s.split())
# ['liushuo', '15196', '0.0', '0.0', '22565', '2872', 'pts/11', 'R+', '13:50', '0:00', 'ps', 'aux']

# Method 1: call str.split() repeatedly, one delimiter per pass (not recommended, clumsy)
s = 'ab;cd|efgh|hi,jkl|mn\topq;rst,uvw\txyz'
res = s.split(';')
print(list(map(lambda x: x.split('|'), res)))  # the result becomes a two-dimensional list
# [['ab'], ['cd', 'efgh', 'hi,jkl', 'mn\topq'], ['rst,uvw\txyz']]

# Flatten: turn the two-dimensional list into a one-dimensional one
t = []  # temporary list
print(list(map(lambda x: t.extend(x.split('|')), res)))  # [None, None, None]
print(t)  # ['ab', 'cd', 'efgh', 'hi,jkl', 'mn\topq', 'rst,uvw\txyz']

res = t
t = []
print(list(map(lambda x: t.extend(x.split(',')), res)))  # [None, None, None, None, None, None]
print(t)  # ['ab', 'cd', 'efgh', 'hi', 'jkl', 'mn\topq', 'rst', 'uvw\txyz']

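# Generalize the pattern above into a function
def mySplit(s, ds):
    res = [s]

    for d in ds:
        t = []
        # split every current piece on delimiter d and collect the parts in t
        list(map(lambda x: t.extend(x.split(d)), res))
        res = t
    return [x for x in res if x]  # drop empty strings (consecutive delimiters produce them)

s = 'ab;cd|efgh|hi,jkl|mn\topq;rst,uvw\txyz'
print(mySplit(s, ';,|\t'))
# ['ab', 'cd', 'efgh', 'hi', 'jkl', 'mn', 'opq', 'rst', 'uvw', 'xyz']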

# Method 2: use the regular-expression function re.split() to split the string in a single pass (recommended)
import re

s1 = 'ab;cd|efgh|hi,jkl|mn\topq;rst,uvw\txyz'
print(re.split('[,;\t|]+', s1))
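
One detail of Method 2 is worth a sketch: re.split() returns an empty string when the text starts or ends with a delimiter, just as repeated str.split() calls do. The helper below is a minimal illustration, not part of the original listing; the name split_fields and its default delimiter set are assumptions made for this example.

```python
import re

def split_fields(text, delimiters=',;|\t'):
    """Split text on any run of the given delimiter characters (hypothetical helper)."""
    # re.escape keeps characters such as '|' or ']' literal inside the character class
    pattern = '[' + re.escape(delimiters) + ']+'
    # A leading or trailing delimiter yields an empty string, so filter those out
    return [field for field in re.split(pattern, text) if field]

s = 'ab;cd|efgh|hi,jkl|mn\topq;rst,uvw\txyz'
print(split_fields(s))
print(split_fields(';ab,,cd|'))  # leading, doubled and trailing delimiters
```

The `+` in the pattern already collapses consecutive delimiters, so the filter is only needed for delimiters at the very start or end of the text.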

2. How to check whether string a starts or ends with string b

Practical example:

A directory in some file system contains a series of files:
quicksort.c
graph.py
heap.java
install.sh
stack.cpp
......
Write a program that adds the user-executable permission to every .sh and .py file among them.

Solution:

1. Use the string methods str.startswith() and str.endswith().
Note: to match against several candidates, pass them as a tuple.

Code example:

# _*_ coding:utf-8 _*_
# @Author   : TianYu
# @Time     : 2017/10/12 16:29
# @File     : 判断字符串a是否以字符串b开头或结尾.py

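# Use the string methods str.startswith() and str.endswith()
# Note: to match against several candidates, pass them as a tuple
import os, stat

s1 = os.listdir('.')  # '.' is the directory to list
print(s1)  # names of the files in the current directory
s = 'a1.txt'
print(s.endswith('.txt'))  # True
print(s.endswith('.py'))  # False
print(s.endswith(('.txt', '.py')))  # True; the argument must be a tuple, a list raises TypeError

# Filter the results with a list comprehension
ss = [name for name in os.listdir('.') if name.endswith(('.txt', '.py'))]
print(ss)
# Read the file's permission bits
print(os.stat('a1.txt').st_mode)  # 33206
print(oct(os.stat('a1.txt').st_mode))  # 0o100666
# Look up the permission mask in the stat module
print(stat.S_IXUSR)
# Modify the file's permissions
os.chmod('a1.txt', os.stat('a1.txt').st_mode | stat.S_IXUSR)  # add the user-executable bit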
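
The listing above works on a single file, a1.txt. To connect it back to the stated task (every .sh and .py file in a directory), here is a minimal sketch that combines the suffix test with os.chmod(). The function name add_user_exec and the temporary directory used for the demonstration are assumptions for this example, not part of the original code.

```python
import os
import stat
import tempfile

def add_user_exec(directory, suffixes=('.sh', '.py')):
    """Add the user-executable bit to matching files (hypothetical helper)."""
    changed = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # endswith() needs a tuple when several suffixes are given
        if os.path.isfile(path) and name.endswith(tuple(suffixes)):
            mode = os.stat(path).st_mode
            os.chmod(path, mode | stat.S_IXUSR)  # keep the existing bits, add user execute
            changed.append(name)
    return sorted(changed)

# Demonstration in a throwaway directory so no real files are touched
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ('quicksort.c', 'graph.py', 'install.sh'):
        open(os.path.join(demo_dir, name), 'w').close()
    print(add_user_exec(demo_dir))  # ['graph.py', 'install.sh']
```

On Windows, os.chmod() only honors the read-only flag, so the execute bit takes effect only on Unix-like systems.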

3. How to adjust the format of text inside a string

Practical example:

The log file of some software writes dates as 'yyyy-mm-dd', and we want to rewrite them as 'mm/dd/yyyy':
......
2016-05-23 10:59:26 status unpacked python....
2016-05-23 10:59:26 status install python....
2016-05-23 10:59:26 status half -configured ....
2016-05-23 10:59:26 configure python.whl:aa....
......

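Solution:

Use the regular-expression function re.sub() to do the replacement: capture each part of the date with a capturing group, then reorder the captured groups in the replacement string.

Code example: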

# _*_ coding:utf-8 _*_
# @Author   : TianYu
# @Time     : 2017/10/13 16:04
# @File     : 如何调整字符串中文本的格式.py

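# This is really just a string-replacement problem

import re

with open('a1.txt') as f:
    log = f.read()

# re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\2/\3/\1', log)  # (\d{4}) is a capturing group, and likewise for the others
# r'\2/\3/\1' means the replacement format is month/day/year

# Named groups (recommended); the group names year, month and day are illustrative
print(re.sub(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', r'\g<month>/\g<day>/\g<year>', log))

# Result: each date becomes 05/23/2016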
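
Because a1.txt is not available to the reader, the sketch below applies the substitution to two of the sample log lines from the example and checks that the numbered-group and named-group forms agree. The group names year, month and day and the helper names are assumptions chosen for this illustration.

```python
import re

# Two of the sample log lines from the practical example
log = ('2016-05-23 10:59:26 status unpacked python....\n'
       '2016-05-23 10:59:26 status install python....\n')

def by_number(text):
    """Reorder the date using numbered groups (hypothetical helper)."""
    return re.sub(r'(\d{4})-(\d{2})-(\d{2})', r'\2/\3/\1', text)

def by_name(text):
    """Reorder the date using named groups (hypothetical helper)."""
    pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
    return re.sub(pattern, r'\g<month>/\g<day>/\g<year>', text)

print(by_name(log))
print(by_number(log) == by_name(log))  # True: both forms give the same text
```

Named groups cost a few more characters in the pattern, but the replacement string stays readable if the pattern later gains or loses a group.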

