Original source: http://www.behnel.de/cython200910/talk.html (the original text follows)
Passionate Python developer since 2002
after Basic, Logo, Pascal, Prolog, Scheme, Java, C, ...
CS studies in Germany, Ireland, France
PhD in distributed systems in 2007
Language design for self-organising systems
Darmstadt University of Technology, Germany
Current occupations:
Employed by Senacor Technologies AG, Germany
IT transformations, SOA design, Java-Development, ...
»lxml« OpenSource XML toolkit for Python
http://codespeak.net/lxml/
»Cython«
Part 1: Intro to Cython
Part 2: Building Cython modules
Part 3: Writing fast code
Part 4: Talking to other extensions
Cython is the missing link
between the simplicity of Python
and the speed of C / C++ / Fortran.
Cython is
an Open-Source project
http://cython.org
http://pypi.python.org/pypi/Cython
a Python compiler (almost)
an enhanced, optimising fork of Pyrex
an extended Python language for
writing fast Python extension modules
interfacing Python with C libraries
Robert Bradshaw, Stefan Behnel, Dag Sverre Seljebotn
lead developers
Lisandro Dalcín
C/C++ portability and various feature patches
Kurt Smith, Danilo Freitas
Google Summer of Code 2009: Fortran/C++ integration
Greg Ewing
main developer and maintainer of Pyrex
many, many others - see
http://cython.org/
the mailing list archives of Cython and Pyrex
you write Python code
Cython translates it into C code
your C compiler builds a shared library for CPython
you import your module into CPython
Cython has support for
distutils: optionally compile Python code from setup.py!
Cython does that for its own modules :-)
embedding the CPython runtime in an executable
# file: worker.py

class HardWorker(object):
    u"Almost Sisyphos"
    def __init__(self, task):
        self.task = task
    def work_hard(self, repeat=100):
        for i in range(repeat):
            self.task()

def add_simple_stuff():
    x = 1+1

HardWorker(add_simple_stuff).work_hard()
compile with
$ cython worker.py
translates to ~1500 line .c file (Cython 0.11.3)
lots of portability #define's for different C compilers, Python versions, ...
tons of helpful C comments with Python code snippets, which help tracing your own code in generated sources
a lot of code that you don't want to write yourself
Cython compiler generates C code that compiles
with all major compilers (C and C++)
on all major platforms
in Python 2.3 through 3.1
Cython language syntax follows Python 2.6
optional Python 3 syntax support is on TODO list
get involved to get it quicker!
... the fastest way to port Python 2 code to Py3 ;-)
most of Python 2 syntax is supported
top-level classes and functions
control structures: loops, with, try-except/finally, ...
object operations, arithmetic, ...
plus many Py3 features:
list/set/dict comprehensions
keyword-only arguments
extended iterable unpacking (a,b,*c,d = some_list)
Inner functions with closures
def factory(a,b):
    def closure_function(c):
        return a+b+c
    return closure_function
status: (hopefully) to be merged for 0.12
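A brief usage sketch of the closure example, runnable in plain CPython. The argument values and the name add_three are arbitrary and only serve as illustration:

```python
def factory(a, b):
    # closure_function captures a and b from the enclosing scope
    def closure_function(c):
        return a + b + c
    return closure_function

add_three = factory(1, 2)   # a=1 and b=2 are remembered by the closure
result = add_three(3)       # computes 1 + 2 + 3
print(result)
```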
improved C++ integration (GSoC 2009)
e.g. function/operator overloading support
status: mostly there, to be finished and integrated
improved Fortran integration (GSoC 2009)
talking to Fortran code directly
status: mostly there, to be finished and integrated
native array data type with SIMD behaviour
status: large interest, implementation pending
... as usual: great ideas, little time
local/inner classes (~open)
lambda expressions (~easy)
generators (~needs work)
generator expressions (~easy)
with obvious optimisations, e.g.
set( x.a for x in some_list ) == { x.a for x in some_list }
... all certainly on the TODO list for 1.0.
Cython generates very efficient C code:
PyBench: most benchmarks run 20-80% faster
conditions and loops run 5-8x faster than in Py2.6.2
overall about 30% faster for plain Python benchmark
obviously, real applications are different
PyPy's richards.py benchmark:
heavily class based scheduler
20% faster than CPython 2.6.2
Cython supports optional type declarations that
can be employed exactly where performance matters
let Cython generate plain C instead of C-API calls
make richards.py benchmark 5x faster than CPython
without Python code modifications :)
can make code 100 - 1000x faster than CPython
expect several 100 times in calculation loops
Part 2: Building Cython modules
To compile Python code (.py) or Cython code (.pyx)
you need:
Cython, Python and a C compiler
you can use:
distutils: setup.py script (likely required anyway)
pyximport: on-the-fly build + import (for experiments)
Sage notebook: web app that supports writing and running Cython code
cython source.pyx + manual C compilation
A minimal setup.py script:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext_modules = [Extension("worker", ["worker.py"])]

setup(
    name = 'stupid little app',
    cmdclass = {'build_ext': build_ext},
    ext_modules = ext_modules
)
Run with
$ python setup.py build_ext --inplace
Build and import Cython code files (.pyx) on the fly
$ ls
worker.pyx
$ PYTHONPATH=. python
Python 2.6.2 (r262:71600, Apr 17 2009, 11:29:30)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyximport
>>> pyximport.install()
>>> import worker
>>> worker
<module 'worker' from '~/.pyxbld/.../worker.so'>
>>> worker.HardWorker
<class 'worker.HardWorker'>
>>> worker.HardWorker(worker.add_simple_stuff).work_hard()
pyximport can also compile Python modules:
>>> import pyximport
>>> pyximport.install(pyimport = True)
>>> import shlex
[lots of compiler errors from different modules ...]
>>> help(shlex)
currently works for a few stdlib modules
falls back to normal Python import automatically
not production ready, but nice for testing :)
# file: hw.py

def hello_world():
    import sys
    print "Welcome to Python %d.%d!" % sys.version_info[:2]

if __name__ == '__main__':
    hello_world()
Compile, link and run:
$ cython --embed hw.py    # <- embed a main() function
$ gcc $CFLAGS -I/usr/include/python2.6 \
      -o hw hw.c -lpython2.6 -lpthread -lm -lutil -ldl
$ ./hw
Welcome to Python 2.6!
Part 3: Writing fast code
Plain Python code:
# integrate_py.py
from math import sin

def f(x):
    return sin(x**2)

def integrate_f(a, b, N):
    dx = (b-a)/N
    s = 0
    for i in range(N):
        s += f(a+i*dx)
    return s * dx
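As a quick sanity check before any types are added, the plain Python version can be called directly. The sketch below repeats the two functions so that it runs on its own; the bounds 0.0 and 1.0 and the step counts are arbitrary choices for illustration. The function computes a left Riemann sum, so the result should settle as N grows:

```python
from math import sin

def f(x):
    return sin(x**2)

def integrate_f(a, b, N):
    # left Riemann sum over N equal-width intervals
    dx = (b - a) / N
    s = 0
    for i in range(N):
        s += f(a + i * dx)
    return s * dx

# Float bounds keep the division correct on Python 2 as well,
# where (b-a)/N with integer arguments would truncate.
coarse = integrate_f(0.0, 1.0, 100)
fine = integrate_f(0.0, 1.0, 100000)
print(coarse, fine)
```

Passing float bounds matters for the Python 2 code in this talk, because integer arguments would make dx an integer division there.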
Function arguments are easy
Python:
def f(x): return sin(x**2)
Cython:
def f(double x): return sin(x**2)
»cdef« keyword declares
variables with C or builtin types
cdef double dx, s
functions with C signatures
cdef double f(double x): return sin(x**2)
classes as 'builtin' extension types
cdef class MyType: cdef int field
def func(int x):
part of the Python module API
Python call semantics
cdef int func(int x):
C signature
C call semantics
cpdef int func(int x):
Python wrapper around cdef function
C calls cdef function, Python calls wrapper
note: modified C signature!
def func(int x):
caller passes Python objects for x
function converts to int on entry
implicit return type always object
cdef int func(int x):
caller converts arguments as required
function receives C int for x
arbitrary return type, defaults to object
cpdef int func(int x):
C callers convert arguments as required
Python callers pass and receive objects, the wrapper converts
# integrate_cy.pyx

cdef extern from "math.h":
    double sin(double x)

cdef double f(double x):
    return sin(x**2)

cpdef double integrate_f(double a, double b, int N):
    cdef double dx, s
    cdef int i
    dx = (b-a)/N
    s = 0
    for i in range(N):
        s += f(a+i*dx)
    return s * dx
The plain Python code in integrate_py.py stays as it is; the static types go into an accompanying integrate_py.pxd file:
# integrate_py.pxd
cimport cython

cpdef double f(double x)

@cython.locals(dx=double, s=double, i=int)
cpdef integrate_f(double a, double b, int N)
advantage:
plain Python code that runs unchanged in the Python interpreter
complete Python tool-chain available (Eclipse, pylint, 2to3, ...)
drawback:
no access to C functions: cannot override from math import sin
Plain Python code:
http://wiki.cython.org/pure
from math import sin
import cython

@cython.locals(x=cython.double)
def f(x):
    return sin(x**2)

@cython.locals(a=cython.double, b=cython.double,
               N=cython.Py_ssize_t, dx=cython.double,
               s=cython.double, i=cython.Py_ssize_t)
def integrate_f(a, b, N):
    dx = (b-a)/N
    s = 0
    for i in range(N):
        s += f(a+i*dx)
    return s * dx
Access to Python's builtins is heavily optimised
for ... in range()/list/tuple/dict
list.append(), list.reverse()
set([...]), tuple([...])
Further improvements in Cython 0.12
replacements for enumerate(), type()
dict([...]), unicode.encode(), list.sort()
Declaring Python types is often worth it!
Easy to add new optimisations
don't write prematurely optimised code, fix Cython!
example: dict iteration
def filter_a(d):
    return { key : value
             for key, value in d.iteritems()
             if 'a' not in value }

import string
d = { s:s for s in string.ascii_letters }
print filter_a(d)
simple change, ~30% faster:
def filter_a(dict d):        # <====
    return { key : value
             for key, value in d.iteritems()
             if 'a' not in value }

import string
d = { s:s for s in string.ascii_letters }
print filter_a(d)
drawback:
non-dict mapping arguments raise a TypeError
benchmark code before adding static types!
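One way to get a baseline before adding static types is the standard timeit module. The following is a minimal sketch, assuming Python 3 (hence items() instead of iteritems()); the repeat and call counts are arbitrary, and the measured time depends entirely on the machine:

```python
import string
import timeit

def filter_a(d):
    # untyped baseline: keep entries whose value does not contain 'a'
    return {key: value for key, value in d.items() if 'a' not in value}

d = {s: s for s in string.ascii_letters}

# take the best of 5 runs, each making 10000 calls
calls = 10000
timings = timeit.repeat(lambda: filter_a(d), number=calls, repeat=5)
best = min(timings)
print("filter_a: %.3f us per call" % (best / calls * 1e6))
```

Running the same measurement against the compiled module, before and after the dict declaration, shows whether the type actually pays off for the workload at hand.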
class MyClass(object):
Python class with __dict__
multiple inheritance
arbitrary Python attributes
Python methods
monkey-patcheable etc.
cdef class MyClass(SomeSuperClass):
"builtin" extension type
single inheritance, only from other extension types!
fixed, typed fields with C-only access by default, or readonly/public
Python + C methods
Use cdef classes
when C attribute types are used, e.g. whenever wrapping C structs/pointers/etc.
when the need for speed beats Python's generality
Use Python classes
for bytes/tuple subtypes (PyVarObject)
for exceptions if Py<2.5 compatibility is required
when multiple inheritance is required
when users are allowed to monkey-patch
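The trade-off can be seen from plain Python. In the sketch below, the built-in types list and object are only a stand-in for a cdef class, on the assumption that a "builtin" extension type with fixed fields behaves like them in this respect. The class Worker and its attribute names are hypothetical:

```python
class Worker(object):
    def work(self):
        return "working"

# A Python class can be monkey-patched and given new attributes at runtime.
Worker.work = lambda self: "patched"
w = Worker()
w.note = "arbitrary attribute"

# Built-in types (stand-in for cdef classes) refuse to be patched ...
try:
    list.work = lambda self: "patched"
    patched_builtin = True
except TypeError:
    patched_builtin = False

# ... and their instances have no __dict__ for arbitrary attributes.
try:
    object().note = "arbitrary attribute"
    builtin_accepts_attribute = True
except AttributeError:
    builtin_accepts_attribute = False

print(w.work(), w.note, patched_builtin, builtin_accepts_attribute)
```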
Part 4: Talking to other extensions
Python 3 buffer protocol (available in Py2.6)
external C-APIs
Native support for new Python buffer protocol
PEP 3118
def inplace_invert_2D_buffer(
        object[unsigned char, 2] image):
    cdef int i, j
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            image[i, j] = 255 - image[i, j]
can be supported for extension types in Py2.x
declared through .pxd files
Cython ships with numpy.pxd
array.pxd available (stdlib's array)
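The same buffer protocol is also reachable from plain Python through memoryview. The following is a rough, untyped counterpart of the inversion loop, assuming Python 3 and a flat row-major buffer with an explicitly passed width. The function name inplace_invert_buffer and its height/width arguments are made up for this sketch and are not part of Cython:

```python
def inplace_invert_buffer(buf, height, width):
    view = memoryview(buf)          # zero-copy view via the buffer protocol
    for i in range(height):
        for j in range(width):
            idx = i * width + j     # row-major offset of pixel (i, j)
            view[idx] = 255 - view[idx]

image = bytearray([0, 10, 20, 245, 250, 255])   # 2 rows x 3 columns
inplace_invert_buffer(image, 2, 3)
print(list(image))
```

The writes go straight into the bytearray's memory, which is the property the typed Cython version exploits without the per-item Python overhead.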
Cython is a tool for
translating Python code to efficient C
easily interfacing to external C/C++/Fortran code
Use it to
speed up existing Python modules: concentrate on optimisations, not rewrites!
write C extensions for CPython: don't change the language just to get fast code!
wrap C libraries in Python: concentrate on the mapping, not the glue!
a great project
a very open playground for great ideas!
Cython
C-Extensions in Python
... use it, and join the project!
http://cython.org/