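Split a string into fields by its delimiters when the string contains several different delimiters, for example:
s = "a/b;c|d\tefg|hijk\tlmn;op/q;r\tstuvw;xyz"
where "/", ";", "|" and "\t" are all delimiters.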
Called with no arguments, str.split() splits on whitespace characters such as the space, "\t", "\r" and "\n":
In [1]: s = "a b\rc\nd\n e\nf"
In [2]: s.split()
Out[2]: ['a', 'b', 'c', 'd', 'e', 'f']
With an explicit separator, split() splits on that string only:
In [1]: s = "a;b;c"
In [2]: s.split(";")
Out[2]: ['a', 'b', 'c']
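The two calls shown so far behave differently once delimiters repeat. The following is a minimal sketch of that difference; the sample string `text` is made up for illustration and does not come from the original post.

```python
# Sample input with two consecutive spaces, a tab and a trailing newline.
text = "a  b\tc\n"

# No argument: any run of whitespace is one delimiter, and
# leading/trailing whitespace produces no empty fields.
no_arg = text.split()

# Explicit " ": only the space character is a delimiter, and two
# adjacent spaces yield an empty string between them.
with_space = text.split(" ")

print(no_arg)      # ['a', 'b', 'c']
print(with_space)  # ['a', '', 'b\tc\n']
```

This is why split() with a single explicit separator cannot handle a string that mixes several delimiters.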
s = "a/b;c|d\tefg|hijk\tlmn;op/q;r\tstuvw;xyz"
sign = ['/', '|', '\t', ';']
def my_split(s, sign):
    s = [s]
    for i in sign:
        t = []
        # In Python 3, map() is lazy: this lambda is never called
        map(lambda x: t.extend(x.split(i)), s)
        s = t
    return s
print(my_split(s,sign))
[]
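This is the approach commonly posted online, but it does not work in Python 3: map() there returns a lazy iterator, and because the result is never consumed, the lambda is never called and t stays empty.
For a more detailed discussion, see: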
http://blog.csdn.net/lanhaixuanvv/article/details/78628516
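One thing worth checking when a map() call with side effects seems to do nothing is whether its result is ever consumed. In Python 3, map() returns a lazy iterator, so the function passed to it only runs when the iterator is iterated. The snippet below is a small sketch of that behavior; the names `parts`, `collected` and `lazy` are illustrative only.

```python
parts = ["a/b", "c/d"]

collected = []
# Building the map object does not call the lambda yet.
lazy = map(lambda x: collected.extend(x.split("/")), parts)
before = list(collected)   # still empty at this point

# Consuming the iterator forces the lambda to run for every item.
list(lazy)
after = list(collected)

print(before)  # []
print(after)   # ['a', 'b', 'c', 'd']
```

Relying on map() for side effects is fragile for this reason; a plain for loop states the intent more directly.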
s = "a/b;c|d\tefg|hijk\tlmn;op/q;r\tstuvw;xyz"
sign = ['/', '|', '\t', ';']
def my_split(s, sign):
    parts = [s]
    for i in sign:
        t = []
        # Split every current piece on delimiter i and collect the results
        for x in parts:
            t.extend(x.split(i))
        parts = t
    return parts
print(my_split(s, sign))
['a', 'b', 'c', 'd', 'efg', 'hijk', 'lmn', 'op', 'q', 'r', 'stuvw', 'xyz']
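Another way to get the same result without regular expressions is to rewrite every delimiter to a single one and then split once. This is only a sketch of an alternative technique, not the method from the original post; the function name `split_by_replace` is hypothetical, and it assumes the delimiter list is non-empty.

```python
def split_by_replace(text, delimiters):
    """Normalize all delimiters to the first one, then split once."""
    first, *rest = delimiters
    for d in rest:
        # Replace each remaining delimiter with the first delimiter.
        text = text.replace(d, first)
    return text.split(first)


s = "a/b;c|d\tefg|hijk\tlmn;op/q;r\tstuvw;xyz"
sign = ['/', '|', '\t', ';']
result = split_by_replace(s, sign)
print(result)
# ['a', 'b', 'c', 'd', 'efg', 'hijk', 'lmn', 'op', 'q', 'r', 'stuvw', 'xyz']
```

It makes one pass over the string per delimiter, which is usually fine for a handful of delimiters.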
import re
s = "a/b;c|d\tefg|hijk\tlmn;op/q;r\tstuvw;xyz"
res = re.split(r'[/\t\n;|]', s)
print(res)
['a', 'b', 'c', 'd', 'efg', 'hijk', 'lmn', 'op', 'q', 'r', 'stuvw', 'xyz']
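When the delimiters come from a list rather than being typed into the pattern by hand, the pattern can be built with re.escape so that characters such as "|" are treated literally. The sketch below also drops the empty strings that re.split produces for consecutive delimiters; the helper name `regex_split` and the sample input are hypothetical.

```python
import re


def regex_split(text, delimiters):
    """Split text on any of the given delimiters, skipping empty fields."""
    # Escape each delimiter so regex metacharacters like "|" match literally.
    pattern = "|".join(re.escape(d) for d in delimiters)
    return [piece for piece in re.split(pattern, text) if piece]


sign = ['/', '|', '\t', ';']
fields = regex_split("a/b;;c|d\te", sign)
print(fields)  # ['a', 'b', 'c', 'd', 'e']
```

Whether empty fields should be kept depends on the data; remove the `if piece` filter to keep them.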