Is there a reason to prefer using map()
over list comprehension or vice versa? Is one generally more effecient or generally considered more pythonic than the other?
146 down vote
accepted
|
map may be microscopically faster in some cases (when you're NOT making a lambda for the purpose, but using the same function in map and a listcomp). List comprehensions may be faster in other cases and most (not all) pythonistas consider them more direct and clearer.
An example of the tiny speed advantage of map when using exactly the same function:
$ python -mtimeit -s'xs=range(10)' 'map(hex, xs)' 100000 loops, best of 3: 4.86 usec per loop $ python -mtimeit -s'xs=range(10)' '[hex(x) for x in xs]' 100000 loops, best of 3: 5.58 usec per loop
An example of how performance comparison gets completely reversed when map needs a lambda:
$ python -mtimeit -s'xs=range(10)' 'map(lambda x: x+2, xs)' 100000 loops, best of 3: 4.24 usec per loop $ python -mtimeit -s'xs=range(10)' '[x+2 for x in xs]' 100000 loops, best of 3: 2.32 usec per loop
|
answered
Aug 7 '09 at 23:45
Alex Martelli
262k
27
459
811
|
|
Cases
- Common case: Almost always, you will want to use a list comprehension in pythonbecause it will be more obvious what you're doing to novice programmers reading your code. (This does not apply to other languages, where other idioms may apply.)
- Less-common case: However if you already have a function defined, it is often reasonable to use
map
, though it is considered 'unpythonic'. For example,map(sum, myLists)
is more elegant/terse than [sum(x) for x in myLists]
. You gain the elegance of not having to make up a dummy variable (e.g. sum(x) for x...
or sum(_) for _...
or sum(readableName) for readableName...
) which you have to type twice, just to iterate. The same argument holds for filter
and reduce
and anything from the itertools
module: if you already have a function handy, go ahead and do some functional programming. This gains readability in some situations, and loses it in others (e.g. novice programmers, multiple arguments)... but the readability of your code highly depends on your comments anyway.
- Almost never: You may want to use the
map
function as a pure abstract function while doing functional programming, where you're mapping map
, or currying map
, or otherwise benefit from talking about map
as a function. In Haskell for example, a functor interface called fmap
generalizes mapping over any data structure. This is very uncommon in python because the python grammar compels you to use generator-style to talk about iteration; you can't generalize it easily. (This is sometimes good and sometimes bad.) You can probably come up with rare python examples where map(f, *lists)
is a reasonable thing to do. The closest example I can come up with would be sumEach = partial(map,sum)
, which is a one-liner that is very roughly equivalent to:
def sumEach(myLists): return [sum(_) for _ in myLists]
"Pythonism"
I dislike the word "pythonic" because I don't find that pythonic is always elegant in my eyes. Nevertheless, map
and filter
and similar functions (like the very usefulitertools
module) are probably considered unpythonic in terms of style.
Laziness
In terms of efficiency, like most functional programming constructs, MAP CAN BE LAZY, and in fact is lazy in python. That means you can do this (in python3) and your computer will not run out of memory and lose all your unsaved data:
>>> map(str, range(10**100)) <map object at 0x2201d50>
Try doing that with a list comprehension:
>>> [str(n) for n in range(10**100)] # DO NOT TRY THIS AT HOME OR YOU WILL BE SAD #
Do note that list comprehensions are also inherently lazy, but python has chosen to implement them as non-lazy. Nevertheless, python does support lazy list comprehensions in the form of generator expressions, as follows:
>>> (str(n) for n in range(10**100)) <generator object <genexpr> at 0xacbdef>
You can basically think of the [...]
syntax as passing in a generator expression to the list constructor, like list(x for x in range(5))
.
Efficiency comparison for python3
map
is now lazy:
% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=map(f,xs)' 1000000 loops, best of 3: 0.336 usec per loop ^^^^^^^^^
Therefore if you will not be using all your data, or do not know ahead of time how much data you need, map
in python3 (and generator expressions in python2 or python3) will avoid calculating their values until the last moment necessary. Usually this will usually outweigh any overhead from using map
. The downside is that this is very limited in python as opposed to most functional languages: you only get this benefit if you access your data left-to-right "in order", because python generator expressions can only be evaluated the order x[0], x[1], x[2], ...
.
However let's say that we have a pre-made function f
we'd like to map
, and we ignore the laziness of map
by immediately forcing evaluation with list(...)
. We get some very interesting results:
% python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(map(f,xs))' 10000 loops, best of 3: 165/124/135 usec per loop ^^^^^^^^^^^^^^^ for list(<map object>) % python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=[f(x) for x in xs]' 10000 loops, best of 3: 181/118/123 usec per loop ^^^^^^^^^^^^^^^^^^ for list(<generator>), probably optimized % python3 -mtimeit -s 'xs=range(1000)' 'f=lambda x:x' 'z=list(f(x) for x in xs)' 1000 loops, best of 3: 215/150/150 usec per loop ^^^^^^^^^^^^^^^^^^^^^^ for list(<generator>)
In results are in the form AAA/BBB/CCC where A was performed with on a circa-2010 Intel workstation with python 3.?.?, and B and C were performed with a circa-2013 AMD workstation with python 3.2.1, with extremely different hardware. The result seems to be that map and list comprehensions are comparable in performance, which is most strongly affected by other random factors. The only thing we can tell seems to be that, oddly, while we expect list comprehensions [...]
to perform better than generator expressions (...)
, map
is ALSO more efficient that generator expressions (again assuming that all values are evaluated/used).
It is important to realize that these tests assume a very simple function (the identity function); however this is fine because if the function were complicated, then performance overhead would be negligible compared to other factors in the program. (It may still be interesting to test with other simple things like f=lambda x:x+x
)
If you're skilled at reading python assembly, you can use the dis
module to see if that's actually what's going on behind the scenes:
>>> listComp = compile('[f(x) for x in xs]', 'listComp', 'eval') >>> dis.dis(listComp) 1 0 LOAD_CONST 0 (<code object <listcomp> at 0x2511a48, file "listComp", line 1>) 3 MAKE_FUNCTION 0 6 LOAD_NAME 0 (xs) 9 GET_ITER 10 CALL_FUNCTION 1 13 RETURN_VALUE >>> listComp.co_consts (<code object <listcomp> at 0x2511a48, file "listComp", line 1>,) >>> dis.dis(listComp.co_consts[0]) 1 0 BUILD_LIST 0 3 LOAD_FAST 0 (.0) >> 6 FOR_ITER 18 (to 27) 9 STORE_FAST 1 (x) 12 LOAD_GLOBAL 0 (f) 15 LOAD_FAST 1 (x) 18 CALL_FUNCTION 1 21 LIST_APPEND 2 24 JUMP_ABSOLUTE 6 >> 27 RETURN_VALUE
>>> listComp2 = compile('list(f(x) for x in xs)', 'listComp2', 'eval') >>> dis.dis(listComp2) 1 0 LOAD_NAME 0 (list) 3 LOAD_CONST 0 (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>) 6 MAKE_FUNCTION 0 9 LOAD_NAME 1 (xs) 12 GET_ITER 13 CALL_FUNCTION 1 16 CALL_FUNCTION 1 19 RETURN_VALUE >>> listComp2.co_consts (<code object <genexpr> at 0x255bc68, file "listComp2", line 1>,) >>> dis.dis(listComp2.co_consts[0]) 1 0 LOAD_FAST 0 (.0) >> 3 FOR_ITER 17 (to 23) 6 STORE_FAST 1 (x) 9 LOAD_GLOBAL 0 (f) 12 LOAD_FAST 1 (x) 15 CALL_FUNCTION 1 18 YIELD_VALUE 19 POP_TOP 20 JUMP_ABSOLUTE 3 >> 23 LOAD_CONST 0 (None) 26 RETURN_VALUE
>>> evalledMap = compile('list(map(f,xs))', 'evalledMap', 'eval') >>> dis.dis(evalledMap) 1 0 LOAD_NAME 0 (list) 3 LOAD_NAME 1 (map) 6 LOAD_NAME 2 (f) 9 LOAD_NAME 3 (xs) 12 CALL_FUNCTION 2 15 CALL_FUNCTION 1 18 RETURN_VALUE
It seems it is better to use [...]
syntax than list(...)
. Sadly the map
class is a bit opaque to disassembly, but we can make due with our speed test.