Writing Pythonic Loops
One of the easiest ways to spot a developer with a background in C-style languages who only recently picked up Python is to look at how they write loops.
Note: the way someone writes loops can tell a C programmer apart from a Python programmer.
For example, a code snippet like the following is a sign of someone trying to write Python like it’s C or Java:
my_items = ['a', 'b', 'c']
i = 0
while i < len(my_items):
    print(my_items[i])
    i += 1
Note: the example above is written the way you would write it in Java or C.
Now, what’s so “unpythonic” about this code, you ask? Two things:
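First, it keeps track of the index i manually, initializing it to zero and then carefully incrementing it on every loop iteration.
Note: the loop has to track the index i itself, initializing it to 0 and carefully incrementing it on each pass.
And second, it uses len() to get the size of the my_items container in order to determine how many times to iterate.
Note: the second point is that len() is called to get the container's size in order to decide the number of iterations.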
In Python you can write loops that handle both of these responsibilities automatically. It’s a great idea to take advantage of that. For example, it’s much harder to write accidental infinite loops if your code doesn’t have to keep track of a running index. It also makes the code more concise and therefore more readable.
To refactor this first code example, I’ll start by removing the code that manually updates the index. A good way to do that is with a for-loop in Python. Using the range() built-in, I can generate the indexes automatically:
>>> range(len(my_items))
range(0, 3)
>>> list(range(0, 3))
[0, 1, 2]
Note: range() generates the indexes automatically.
The range type represents an immutable sequence of numbers. Its advantage over a regular list is that it always takes the same small amount of memory, no matter how many numbers it represents. Range objects don’t actually store the individual values of the number sequence. Instead, they store only the start, stop, and step values and calculate each item on the fly.
Note: compared with a list, a range always occupies the same small amount of memory. It doesn't store the individual values; it computes them as they are needed.
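To make the memory claim concrete, here is a minimal sketch that compares object sizes with sys.getsizeof(). It assumes CPython; the exact byte counts differ between versions and platforms, so the code only compares sizes relative to each other. The variable names are my own and purely illustrative.

```python
import sys

small = range(10)
large = range(1_000_000)

# A range keeps only start, stop, and step, so its size does not
# depend on how many numbers it represents (CPython behavior).
same_size = sys.getsizeof(small) == sys.getsizeof(large)

# A list built from the same range must hold a reference to every element.
list_is_bigger = sys.getsizeof(list(large)) > sys.getsizeof(large)

# Ranges still behave like sequences: length, indexing, membership tests.
length = len(large)
last = large[-1]
contains = 999_999 in large

print(same_size, list_is_bigger, length, last, contains)
```

Because a range supports len(), indexing, and membership tests, it is more accurate to think of it as a lazy sequence than as a one-shot iterator: you can loop over the same range object as many times as you like.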
So, rather than incrementing i manually on each loop iteration, I could take advantage of the range() function and write something like this:
for i in range(len(my_items)):
    print(my_items[i])
This is better. However, it still isn’t very Pythonic and it still feels more like a Java-esque iteration construct than a proper Python loop. When you see code that uses range(len(...)) to iterate over a container you can usually simplify and improve it further.
As I mentioned, in Python, for-loops are really “for-each” loops that can iterate directly over items from a container or sequence, without having to look them up by index. I can use this to simplify this loop even more:
for item in my_items:
    print(item)
I would consider this solution to be quite Pythonic. It uses several advanced Python features but remains nice and clean and almost reads like pseudo code from a programming textbook. Notice how this loop no longer keeps track of the container’s size and doesn’t use a running index to access elements.
The container itself now takes care of handing out the elements so they can be processed. If the container is ordered, the resulting sequence of elements will be too. If the container isn’t ordered, it will return its elements in arbitrary order but the loop will still cover all of them.
Note: if the container is ordered, the elements come out in that order. If the container is unordered, they come out in arbitrary order.
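As a small illustration of that difference, the sketch below runs the same for-each loop over a list (ordered) and a set (unordered). The sample values are arbitrary. The order in which the set hands out its elements is an implementation detail, so the code only checks that every element was visited.

```python
ordered = ['a', 'b', 'c']    # a list keeps its elements in insertion order
unordered = {'a', 'b', 'c'}  # a set makes no promise about iteration order

from_list = []
for item in ordered:
    from_list.append(item)

from_set = []
for item in unordered:
    from_set.append(item)

print(from_list)         # always in list order
print(sorted(from_set))  # sorted here only to get a stable display order
```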
Now, of course you won’t always be able to rewrite your loops like that. What if you need the item index, for example?
Note: what should you do if you need the index?
It’s possible to write loops that keep a running index while avoiding the range(len(...)) pattern I cautioned against. The enumerate() built-in helps you make those kinds of loops nice and Pythonic:
>>> for i, item in enumerate(my_items):
...     print(f'{i}: {item}')
0: a
1: b
2: c
You see, iterators in Python can return more than just one value. They can return tuples with an arbitrary number of values that can then be unpacked right inside the for-statement.
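Note: an iterator in Python can return more than a single value. It can return a tuple holding any number of values, and those values can be unpacked directly in the for statement.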
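Here is a short sketch of what that unpacking looks like when you spell it out. The points list is made-up sample data. The last line uses the optional start argument of enumerate(), which changes the first index.

```python
points = [(1, 2), (3, 4), (5, 6)]  # hypothetical sample data

# Each element is a 2-tuple, unpacked into x and y by the for statement.
sums = []
for x, y in points:
    sums.append(x + y)

# enumerate() produces (index, item) tuples, which is why "i, item" works.
pairs = list(enumerate(['a', 'b', 'c']))

# Counting from 1 instead of 0 with the start argument.
numbered = [f'{i}: {item}' for i, item in enumerate(['a', 'b', 'c'], start=1)]

print(sums)
print(pairs)
print(numbered)
```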
This is very powerful. For example, you can use the same technique to iterate over the keys and values of a dictionary at the same time:
>>> emails = {
...     'Bob': '[email protected]',
...     'Alice': '[email protected]',
... }
>>> for name, email in emails.items():
...     print(f'{name} -> {email}')
Bob -> [email protected]
Alice -> [email protected]
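The two techniques also combine. The sketch below wraps items() in enumerate() and uses nested unpacking to get a running number together with each key and value. The contacts dictionary and its example.com addresses are hypothetical placeholders of mine. It relies on dictionaries preserving insertion order, which the language guarantees from Python 3.7 on.

```python
# Hypothetical sample data, not taken from the example above.
contacts = {
    'Bob': 'bob@example.com',
    'Alice': 'alice@example.com',
}

lines = []
# enumerate() yields (index, (key, value)); the parentheses in the
# loop target unpack the inner tuple in the same step.
for i, (name, email) in enumerate(contacts.items(), start=1):
    lines.append(f'{i}. {name} -> {email}')

print('\n'.join(lines))
```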
There’s one more example I’d like to show you. What if you absolutely, positively need to write a C-style loop? For example, what if you must control the step size for the index? Imagine you started out with the following Java loop:
for (int i = a; i < n; i += s) {
    // ...
}
How would this pattern translate to Python? The range() function comes to our rescue again. It accepts optional parameters to control the start value for the loop (a), the stop value (n), and the step size (s). Therefore, our Java loop example could be translated to Python like this:
for i in range(a, n, s):
    ...  # loop body goes here
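To check that the translation really visits the same indexes, here is a minimal sketch with concrete values plugged in for a, n, and s. The values are arbitrary and chosen only for illustration. It runs the manual C-style version next to the range() version and also shows a negative step for counting down.

```python
a, n, s = 2, 11, 3  # hypothetical start, stop, and step values

# C-style version with a manually managed index.
c_style = []
i = a
while i < n:
    c_style.append(i)
    i += s

# The same indexes from range(); the stop value n is excluded.
pythonic = list(range(a, n, s))

# A negative step counts down; the stop value is still excluded.
countdown = list(range(10, 0, -2))

print(c_style)
print(pythonic)
print(countdown)
```

One difference from the Java loop is worth knowing: a step of zero makes the Java version loop forever, while range() raises a ValueError right away.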
Key Takeaways
- Writing C-style loops in Python is considered unpythonic. Avoid managing loop indexes and stop conditions manually if possible.
- Python’s for-loops are really “for-each” loops that can iterate directly over items from a container or sequence.