Method 1: convert the list to a set and back to a list:
l1 = ['b','c','d','b','c','a','a']
l2 = list(set(l1))
print l2
Method 2: build a dict with fromkeys and take its keys:
l1 = ['b','c','d','b','c','a','a']
l2 = {}.fromkeys(l1).keys()
print l2
Timing the two methods:
import time

a = []
# 20 million items, half of which are duplicates to be removed
for i in range(0, 10000000):
    a.append(i)
    a.append(i)
start = time.clock()
b = list(set(a))
print 'time1', time.clock() - start
start = time.clock()
b = {}.fromkeys(a).keys()
print 'time2', time.clock() - start
Results (in seconds):
time1 0.45481
time2 0.670189
The first method is clearly faster than the second in this run (about 0.45 s versus 0.67 s).
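The same comparison can be sketched for Python 3, where time.clock is no longer available. This is only a rough sketch under a few assumptions: it uses the standard timeit module instead of manual timestamps, a much smaller input than the 20 million items above so that it finishes quickly, and two helper names (dedupe_with_set, dedupe_with_fromkeys) that are made up for illustration. Absolute numbers will differ from machine to machine, so none are shown.

```python
import timeit

# Assumption: 200,000 items (each value twice) instead of 20 million,
# so the sketch runs in well under a second.
data = [i for i in range(100000) for _ in range(2)]


def dedupe_with_set(items):
    # Method 1: set() drops duplicates, list() converts back
    return list(set(items))


def dedupe_with_fromkeys(items):
    # Method 2: dict keys are unique, so fromkeys drops duplicates
    return list(dict.fromkeys(items))


# Run each function 5 times per measurement, repeat 3 times,
# and keep the best measurement to reduce noise.
t_set = min(timeit.repeat(lambda: dedupe_with_set(data), number=5, repeat=3))
t_fromkeys = min(timeit.repeat(lambda: dedupe_with_fromkeys(data), number=5, repeat=3))

print('set      %.4f s' % t_set)
print('fromkeys %.4f s' % t_fromkeys)
```

Taking the minimum of several repeats is a common way to filter out interference from other processes; a single start/stop measurement is more sensitive to it.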
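Both methods share one drawback: removing duplicates this way does not keep the elements in their original order. Here, for example, the result is: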
['a', 'c', 'b', 'd']
To restore the original order, use the list's sort method with l1.index as the key:
l1 = ['b','c','d','b','c','a','a']
l2 = list(set(l1))
l2.sort(key=l1.index)
print l2
The same thing can be written with sorted:
l1 = ['b','c','d','b','c','a','a']
l2 = sorted(set(l1),key=l1.index)
print l2
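One thing to keep in mind with key=l1.index is that every call to l1.index scans the list from the start, so sorting a long list this way can get slow. A possible workaround, sketched here for Python 3, is to record each element's first position once in a dict and sort by that. The name first_index is made up for this example.

```python
l1 = ['b', 'c', 'd', 'b', 'c', 'a', 'a']

# Record the first position of every element in a single pass.
# setdefault only stores a position the first time an element is seen.
first_index = {}
for position, item in enumerate(l1):
    first_index.setdefault(item, position)

# Sort the unique elements by their first position in l1.
l2 = sorted(set(l1), key=first_index.__getitem__)
print(l2)  # ['b', 'c', 'd', 'a']
```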
A plain loop also works:
l1 = ['b','c','d','b','c','a','a']
l2 = []
for i in l1:
    if i not in l2:
        l2.append(i)
print l2
The loop above can also be written as a list comprehension:
l1 = ['b','c','d','b','c','a','a']
l2 = []
# the comprehension is used only for its side effect; its own result (a list of None) is discarded
[l2.append(i) for i in l1 if i not in l2]
print l2
Either way, the original order is preserved:
['b', 'c', 'd', 'a']
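The loop versions above test membership against l2, which is a list, so each check scans the results collected so far. A commonly used variant, sketched here for Python 3, tracks already-seen elements in a set so each check is fast on average. The function name dedupe_keep_order is hypothetical, and like the set-based methods it assumes the elements are hashable. The last line relies on dicts keeping insertion order, which is only guaranteed from Python 3.7 onward and does not hold for the Python 2 output shown earlier.

```python
def dedupe_keep_order(items):
    """Return items without duplicates, keeping the first occurrence of each."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # set lookup, no scan of the result list
            seen.add(item)
            result.append(item)
    return result


l1 = ['b', 'c', 'd', 'b', 'c', 'a', 'a']
print(dedupe_keep_order(l1))    # ['b', 'c', 'd', 'a']

# Python 3.7+ only: dicts keep insertion order, so fromkeys preserves order too.
print(list(dict.fromkeys(l1)))  # ['b', 'c', 'd', 'a']
```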