list(iterable)
converts any iterable object into a list
>>> from collections.abc import Iterable
>>> isinstance("ABC", Iterable)
True
>>> list("ABC")
['A', 'B', 'C']
>>> chars = list("ABC")
>>> del chars[1:]
>>> chars
['A']
>>> for i, c in enumerate("ABC", start=2):
...     print(i, c)
...
2 A
3 B
4 C
>>> nums = [1, 2, 3, 4]
>>> [n * 10 for n in nums if n % 2 == 0]
[20, 40]
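Don't write overly complex list comprehensions (they are hard to read).
Don't treat a comprehension as a loop that merely takes less code:
[process(task) for task in tasks]
The point of a comprehension is that it returns a value; a case like the one above, where the result is discarded, should be a plain loop.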
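A minimal sketch of that distinction. The process function and the tasks list here are hypothetical stand-ins, since the notes do not define them: a comprehension fits when the resulting list is used, a plain loop fits when only the side effect matters.

```python
# Hypothetical stand-ins for illustration only.
processed_log = []

def process(task):
    """Pretend to do some work and record it as a side effect."""
    processed_log.append(task)
    return task.upper()

tasks = ["a", "b", "c"]

# The returned values are used, so a comprehension fits.
results = [process(task) for task in tasks]
print(results)  # ['A', 'B', 'C']

# Only the side effect matters, so a plain loop is clearer
# and avoids building a list that is thrown away.
for task in tasks:
    process(task)
```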
mutable: list, dict, set
immutable: int, float, str, bytes, tuple
When Python passes arguments in a function call, it passes a reference to the object the variable points to (pass-by-object-reference).
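A small sketch of what this means in practice (the function names are made up for illustration): mutating the passed object is visible to the caller, while rebinding the parameter name is not.

```python
def append_item(items):
    # Mutates the very object the caller's variable refers to.
    items.append(4)

def rebind(items):
    # Rebinds the local name only; the caller's object is untouched.
    items = [0]
    return items

nums = [1, 2, 3]
append_item(nums)
rebound = rebind(nums)

print(nums)     # [1, 2, 3, 4]
print(rebound)  # [0]
```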
>>> a = 1, 2, 3
>>> a
(1, 2, 3)
>>> type(a)
<class 'tuple'>
It is the comma, not the parentheses, that the interpreter uses to recognize a tuple.
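A quick illustration of the comma rule, with variable names chosen only for this example:

```python
not_a_tuple = (1)   # parentheses alone only group the expression
one_tuple = (1,)    # the trailing comma creates a one-element tuple
also_tuple = 1,     # the parentheses are optional

print(type(not_a_tuple))        # <class 'int'>
print(type(one_tuple))          # <class 'tuple'>
print(also_tuple == one_tuple)  # True
```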
>>> nums = (1, 2, 3, 4, 5, 6)
>>> results = (n * 10 for n in nums if n % 2 == 0)
>>> results
<generator object <genexpr> at 0x000001D7ADE90E40>
>>> tuple(results)  # a generator object is still an iterable
(20, 40, 60)
Unlike with lists, it is common for a tuple to hold values of different types.
>>> user_info = ("Tom", 24, True)
>>> user_info[1]
24
This does return 24, but it is not clear whether the number is an age or something else.
>>> from collections import namedtuple
>>> UserInfo = namedtuple('UserInfo', 'name, age, vip')
>>> user1 = UserInfo('Alice', 21, True)
>>> user1[1]
21
>>> user1.age
21
Or use typing.NamedTuple with type annotations:
>>> from typing import NamedTuple
>>> class UserInfo(NamedTuple):
...     name: str
...     age: int
...     vip: bool
...
>>> user2 = UserInfo('Bob', 12, False)
>>> user2.age
12
>>> movies = {'name': 'Burning', 'year': 2018}
>>> for key in movies:
...     print(key, movies[key])
...
name Burning
year 2018
>>> for key, value in movies.items():
...     print(key, value)
...
name Burning
year 2018
>>> movies["rating"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'rating'
# 1
>>> if 'rating' in movies:
...     rating = movies['rating']
... else:
...     rating = 0
...
>>> rating
0
# 2
>>> try:
...     rating = movies['rating']
... except KeyError:
...     rating = 0
...
>>> rating
0
# 3
>>> movies.get('rating', 0)
0
>>> try:
...     movies['rating'] += 1
... except KeyError:
...     movies['rating'] = 0
...
>>> movies
{'name': 'Burning', 'year': 2018, 'rating': 0}
Or:
>>> movies = {'name': 'Burning', 'year': 2018}
>>> movies.setdefault('rating', 0)  # 'rating' is missing, so it is set to 0
0
>>> movies
{'name': 'Burning', 'year': 2018, 'rating': 0}
>>> movies.setdefault('rating', 1)  # 'rating' exists, so its current value is returned
0
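Use del d[key] to delete a key from a dictionary; if the key does not exist, KeyError is raised.
Or:
d.pop(key, None)
If key exists, it is removed and its value is returned; otherwise None is returned.
That said, the main purpose of pop is to take out the value stored under a key.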
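A short sketch of the pop variants described above, using a throwaway dictionary:

```python
d = {'foo': 3, 'bar': 4}

value = d.pop('foo')          # removes 'foo' and returns its value
missing = d.pop('baz', None)  # 'baz' is absent, so the default is returned

print(value)    # 3
print(missing)  # None
print(d)        # {'bar': 4}

try:
    d.pop('baz')              # no default given, so this raises KeyError
    raised = False
except KeyError:
    raised = True
print(raised)   # True
```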
>>> d1 = {'foo': 3, 'bar': 4}
>>> {key: value * 10 for key, value in d1.items() if key == 'foo'}
{'foo': 30}
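Dictionaries preserve insertion order since Python 3.7 (in CPython 3.6 this was an implementation detail).
from collections import OrderedDict
Compared with the ordering of the built-in dict, OrderedDict has the following features:
.move_to_end() and other methods that plain dicts do not have
Sets are unordered and mutable.
.add()
adds a new member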
>>> f_set = frozenset(['foo', 'bar'])
>>> f_set
frozenset({'bar', 'foo'})
>>> f_set.add('apple')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'frozenset' object has no attribute 'add'
Intersection: &
Union: |
Difference: - (items that are in the first set but not in the second)
There are also symmetric_difference, issubset, and so on.
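The operators above can be tried on two small sets. The set contents are arbitrary examples, and the results are sorted only to make the printed order predictable.

```python
fruits_1 = {'apple', 'orange', 'pineapple'}
fruits_2 = {'tomato', 'orange', 'grapes', 'mango'}

common = fruits_1 & fruits_2      # intersection
either = fruits_1 | fruits_2      # union
only_first = fruits_1 - fruits_2  # difference
exclusive = fruits_1.symmetric_difference(fruits_2)  # in exactly one of the two
is_sub = {'apple'}.issubset(fruits_1)

print(sorted(common))      # ['orange']
print(sorted(only_first))  # ['apple', 'pineapple']
print(is_sub)              # True
```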
>>> invalid_set = {'foo', [1, 2, 3]}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
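Immutable built-in types such as str, int, tuple, and frozenset are hashable.
Mutable built-in types such as dict and list are not hashable.
An immutable container such as tuple or frozenset is hashable only if all of its members are hashable.
User-defined types are hashable by default.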
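These rules can be checked with hash(). The Point class below is a hypothetical example of a plain user-defined type; its default hash is derived from object identity.

```python
class Point:
    """A plain user-defined class, used only for illustration."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Instances of a user-defined class are hashable by default.
point_hashable = isinstance(hash(Point(1, 2)), int)

# A tuple whose members are all hashable is itself hashable.
flat_hash_ok = isinstance(hash((1, 'a', frozenset([2]))), int)

# A tuple that contains a list is not hashable.
try:
    hash((1, [2, 3]))
    nested_hash_ok = True
except TypeError:
    nested_hash_ok = False

print(point_hashable, flat_hash_ok, nested_hash_ok)  # True True False
```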
>>> nums = [1, 2, 3, 4]
>>> nums_copy = nums  # only binds another name to the same object; nothing is copied
>>> nums[2] = 30
>>> nums, nums_copy
([1, 2, 30, 4], [1, 2, 30, 4])
Deep vs. shallow copy: a shallow copy does not solve the problem of nested objects being modified!
1 copy.copy()
>>> import copy
>>> nums = [1, 2, 3, 4]
>>> nums_copy = copy.copy(nums)
>>> nums[2] = 30
>>> nums, nums_copy
([1, 2, 30, 4], [1, 2, 3, 4])
2 Comprehensions
>>> d = {'foo': 1}
>>> d2 = {key: value for key, value in d.items()}
>>> d['foo'] = 2
>>> d, d2
({'foo': 2}, {'foo': 1})
3 The built-in constructor of each container type
>>> d = {'foo': 1}
>>> d2 = dict(d.items())
>>> d, d2
({'foo': 1}, {'foo': 1})
>>> nums = [1, 2, 3, 4]
>>> nums_copy = list(nums)
>>> nums, nums_copy
([1, 2, 3, 4], [1, 2, 3, 4])
4 Full slice
>>> nums = [1, 2, 3, 4]
>>> nums_copy = nums[:]
>>> nums, nums_copy
([1, 2, 3, 4], [1, 2, 3, 4])
5 Some types provide their own shallow-copy method
>>> nums = [1, 2, 3, 4]
>>> nums_copy = nums.copy()
>>> nums, nums_copy
([1, 2, 3, 4], [1, 2, 3, 4])
>>> d = {'foo': 1}
>>> d2 = d.copy()
>>> d, d2
({'foo': 1}, {'foo': 1})
>>> items = [1, ['foo', 'bar'], 2, 3]
>>> items_copy = copy.copy(items)
>>> items[0] = 100
>>> items[1].append('xxx')
>>> items, items_copy
([100, ['foo', 'bar', 'xxx'], 2, 3], [1, ['foo', 'bar', 'xxx'], 2, 3])
>>> id(items[1]), id(items_copy[1])
(2025849749952, 2025849749952)
As shown, after a shallow copy items[1] and items_copy[1] still refer to the same object. A deep copy avoids this:
>>> items_copy = copy.deepcopy(items)
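A short check, reusing the same items list as above, that copy.deepcopy also copies the nested list:

```python
import copy

items = [1, ['foo', 'bar'], 2, 3]
items_copy = copy.deepcopy(items)

# Changing the nested list of the original no longer affects the copy.
items[1].append('xxx')

print(items)       # [1, ['foo', 'bar', 'xxx'], 2, 3]
print(items_copy)  # [1, ['foo', 'bar'], 2, 3]
print(items[1] is items_copy[1])  # False
```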
>>> def generate_even(max_number):
...     for i in range(0, max_number):
...         if i % 2 == 0:
...             yield i
...
>>> i = generate_even(10)
>>> next(i)
0
>>> next(i)
2
>>> next(i)
4
def batch_process(items):
    results = []
    for item in items:
        # processed_item = ...
        results.append(processed_item)
    return results
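The code above has two problems: the whole result list has to be built in memory before anything is returned, and the caller cannot stop the processing partway.
Improved version, using a generator: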
def batch_process(items):
    for item in items:
        # processed_item = ...
        yield processed_item
The caller can now break off under certain conditions:
for processed_item in batch_process(items):
    # If one object has expired, none of the remaining ones are processed
    if processed_item.has_expired():
        break
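A runnable sketch of the early-exit behaviour. The Item class, its expired flag, and the handled list are assumptions added for illustration; the notes only mention a has_expired() method. Because the generator is lazy, the item after the expired one is never processed.

```python
from dataclasses import dataclass

@dataclass
class Item:
    """Hypothetical item type; only has_expired() comes from the notes."""
    name: str
    expired: bool = False

    def has_expired(self):
        return self.expired

handled = []  # records which items were actually processed

def batch_process(items):
    for item in items:
        handled.append(item.name)  # stand-in for the real per-item work
        yield item

items = [Item("a"), Item("b", expired=True), Item("c")]
for processed_item in batch_process(items):
    if processed_item.has_expired():
        break

print(handled)  # ['a', 'b']
```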
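Inserting at the tail of a list is much faster than inserting at the head,
because a list is backed by a C array (inserting at the head means every other element has to be shifted back).
If you really need to insert at the head frequently, use collections.deque,
because a deque is a double-ended queue.
Also, because a list is backed by an array, testing whether an element is in a list is generally slow (a set is a better choice).
Sets are backed by a hash table.
See: TimeComplexity - Python Wiki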
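A minimal sketch of the two alternatives mentioned above, with made-up values. It shows the API only and does not measure timings.

```python
from collections import deque

queue = deque([1, 2, 3])
queue.appendleft(0)      # insert at the head without shifting the other elements
queue.append(4)          # inserting at the tail works as with a list
first = queue.popleft()  # removing from the head is cheap as well

print(first)        # 0
print(list(queue))  # [1, 2, 3, 4]

# For frequent membership tests, a set avoids scanning the whole container.
allowed = {"foo", "bar", "baz"}
print("bar" in allowed)  # True
```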
>>> d1 = {"name": "apple"}
>>> d2 = {"price": 10}
>>> d3 = {**d1, **d2}
>>> d3
{'name': 'apple', 'price': 10}
>>> d4 = d1 | d2  # Python 3.9+
>>> d4
{'name': 'apple', 'price': 10}
Other kinds of unpacking:
>>> [1, 2, *range(10)]
[1, 2, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> [*[1,2], *[3,4]]
[1, 2, 3, 4]
Deduplicating with a set loses the original order. To keep the order:
>>> from collections import OrderedDict
>>> nums = [10, 2, 3, 21, 10, 3]
>>> list(OrderedDict.fromkeys(nums).keys())
[10, 2, 3, 21]
# fromkeys: the dictionary's keys come from the iterable; the values default to None
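Since plain dicts also preserve insertion order from Python 3.7 on, the same idea should work with dict.fromkeys. This is a sketch that assumes Python 3.7 or later.

```python
nums = [10, 2, 3, 21, 10, 3]

# Keys keep their first-seen order; duplicates collapse into one key.
deduped = list(dict.fromkeys(nums))

print(deduped)  # [10, 2, 3, 21]
```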
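Do not modify a list while iterating over it: the loop index keeps advancing steadily while the list's length changes, which usually causes problems.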
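A small demonstration of how this goes wrong. The function names and input are made up for illustration: removing elements during iteration skips the element right after each removal, while building a new list does not.

```python
def remove_even_buggy(numbers):
    # Deliberately wrong: removing from the list being iterated skips elements.
    for n in numbers:
        if n % 2 == 0:
            numbers.remove(n)
    return numbers

def remove_even(numbers):
    # Build a new list instead of mutating the one being iterated.
    return [n for n in numbers if n % 2 != 0]

buggy = remove_even_buggy([1, 2, 4, 7, 8, 11])
fixed = remove_even([1, 2, 4, 7, 8, 11])

print(buggy)  # [1, 4, 7, 11]  (4 was skipped)
print(fixed)  # [1, 7, 11]
```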
>>> from typing import NamedTuple
>>> class Address(NamedTuple):
...     country: str
...     city: str
...
>>> def latlon_to_address(lat, lon):
...     # processing
...     return Address(
...         country="country",
...         city="city"
...     )
...
>>> addr = latlon_to_address("lat", "lon")
>>> # addr.country
>>> # addr.city
If we later add a district field to Address, the existing code will keep working.
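A sketch of that extension. The empty-string default for district and the placeholder return values are assumptions. Callers that read addr.country or addr.city are unaffected, though code that unpacks the tuple positionally would still need updating.

```python
from typing import NamedTuple

class Address(NamedTuple):
    country: str
    city: str
    district: str = ""  # newly added field; the default value is an assumption

def latlon_to_address(lat, lon):
    # Placeholder logic, as in the notes above.
    return Address(country="country", city="city", district="district")

addr = latlon_to_address("lat", "lon")

# Existing attribute access keeps working unchanged.
print(addr.country, addr.city)  # country city
# New code can use the extra field.
print(addr.district)            # district
```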