In a statically typed language, the compiler knows the data-type of a variable and how to represent it.
In a dynamically-typed language, it has to keep a flag (a “tag”; the value is “tagged”) describing the actual type of the value the variable currently holds, and the program has to perform a data-dependent branch on that tag every time it manipulates the variable. It also has to look up methods and operators on it at runtime.
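A minimal sketch, in Python itself, of roughly what that means for a single `a + b` (heavily simplified: the real lookup also has to consider subclasses, C-level slots, and more fallbacks):

```python
def dynamic_add(a, b):
    # 1. read the tag (the type) of the left operand...
    left_type = type(a)
    # 2. ...look the operator up on that type...
    add = left_type.__add__
    # 3. ...and take a data-dependent branch into whatever it found.
    result = add(a, b)
    if result is NotImplemented:
        # fall back to the right-hand operand's reflected operator
        result = type(b).__radd__(b, a)
    return result

assert dynamic_add(2, 3) == 5
assert dynamic_add("a", "b") == "ab"
```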
The knock-on effect of all this tagging and dispatching on branching and data locality is lethal to general-purpose runtime performance.
That’s why dynamic-language JIT benchmarks emphasise near-C speed on small inner loops but steer clear of large data structures and data-manipulation problems.
These slides by IBM researchers describe (page 3) the performance problems of dynamism:
To these I would add a couple of killers:
Modern dynamic-language compilers have buckets of neat tricks for specialising the hot paths you use. I’m glad that work is gathering speed as we rediscover or reapply old tricks, learn new ones, and find that they all compose together.
But fundamentally, once it has done that, these traces have to interoperate with as-yet-unknown code paths, so all escaped data has to keep its type within reach. It’s at this level, where you are nearly at the speed of a classic statically compiled language, that I am looking at the difference and pronouncing dynamic languages fundamentally slow:
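Concretely, here’s a hedged sketch of what I mean by escaped data (the names are just for illustration): the loop below is exactly the kind of thing a tracing JIT can specialise to raw, untagged floats, but every value appended to `results` can be read later by code the compiler has never seen, so it has to become a tagged, boxed object again at the boundary.

```python
results = []   # read later by code loaded at runtime: unknown code paths

def accumulate(samples):
    total = 0.0
    for x in samples:        # hot loop: a JIT can run this on unboxed doubles
        total += x * 0.5
    results.append(total)    # 'total' escapes here, so it must carry its type
                             # (be boxed/tagged) for whoever reads it later
```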
I really want dynamically-typed languages like Python to be fast. I’ve tried using Python for classic server programming - the domain of ‘systems languages’ - and really been bitten by lacklustre performance. I’m contemplating a rewrite of one server in Java right now, sadly.
Therefore, I spend time thinking about how to actually statically compile Python. After all, that’d be my dream programming language! But whenever I think through how to mix dynamic and dynamically-typed code with a statically compiled Python, I run into data representation slow-down problems:
In a dynamic language, the elements of an array (or other data structure) can typically all be of different types, and therefore of different representations and sizes. So the values must each be stored individually on the heap rather than in-line inside the array, which means every access is a data-dependent branch plus a pointer chase to non-adjacent memory, and that puts pressure on cache lines.
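A small illustration using only the standard library (`array` fixes the element type up front and stores machine doubles inline; a plain list stores a pointer to a separately allocated, tagged object per element):

```python
import array

# A list may hold anything, so it holds pointers to individually
# heap-allocated, tagged objects; reading element i means following a
# pointer to wherever that object happens to live.
boxed = [1.0, 2.0, "three"]

# array.array stores the doubles contiguously in one buffer,
# with no per-element tag or pointer.
inline = array.array("d", [1.0, 2.0, 3.0])

x = inline[0]   # note: reading back out still has to box the value
                # into a Python float object for 'x'
```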
(The V8 JavaScript engine tracks the mix of types in an array and, when possible, puts doubles and such in sequential memory; it’s still tagging stuff, but centralising it. It has to box when you assign from the array, and re-box if you then mix types in the array. It’s a trade-off. It’s a very neat optimisation, though :) Yay V8 “hidden classes”.)
There are some clever tricks using special bits in a word to pack some primitives, like small integers, into what is otherwise a pointer, but that costs extra masking and shifting, and the registers to track it, during manipulation, so it adds overhead.
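A sketch of the idea, expressed in Python purely for illustration (real runtimes do this on raw machine words): use the bottom bit of a word to mean “small integer”, and pay a check, a mask and a shift on every use.

```python
def tag_int(n):
    return (n << 1) | 1      # bottom bit set => this word is a small int

def is_small_int(word):
    return word & 1 == 1     # the check every operation must perform

def untag_int(word):
    return word >> 1         # unpack before arithmetic, re-pack after

assert is_small_int(tag_int(21)) and untag_int(tag_int(21)) == 21
```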
There are some clever approaches like JITing of hot paths, but if you start putting your values in-line and untagged, rather than as individually tagged objects on the heap, interoperability with non-JITed code becomes hard, and you get a whole can of worms if some other code goes and changes the type of one value in the middle of the array…
I’ve been thinking a lot about what is and isn’t static in Python. With SSA and escape analysis you can determine that a surprising amount of a normal program is static. Paul Biggar gives me confidence that my guess that 90% of my Python is static is about right.
What about that 10%? Usually I can get all of that static too, or imagine it being specialised for a restricted range of parameter types. All except the standard pattern in Python web servers of dispatching to handlers based on the HTTP method (calling the ‘get’ method if you receive a GET request). That would need the programmer to rephrase it as a switch statement (as in a long chain of elifs).
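Here’s the pattern I mean, and the rephrasing a static compiler could follow (the handler class and its methods are hypothetical names, just for illustration):

```python
class Handler:
    def get(self, request):
        return "hello"
    def post(self, request):
        return "created"

def dispatch_dynamic(handler, method, request):
    # The idiomatic version: pick the method by name at runtime,
    # so the compiler cannot know which code will run.
    return getattr(handler, method.lower())(request)

def dispatch_static(handler, method, request):
    # The 'switch statement' rephrasing: every target is visible statically.
    if method == "GET":
        return handler.get(request)
    elif method == "POST":
        return handler.post(request)
    else:
        raise ValueError("unsupported method: " + method)
```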
Robert Harper gave a great explanation of how a dynamic language is implemented in terms of a static language with a single type. There is one quote I wish he’d expand upon:
Now I am fully aware that “the compiler can optimize this away”, at least in some cases
I believe the “some cases” he alludes to are those where you have non-escaping things. Anything that escapes, and so interacts with later-executing code, has to keep its type determinable.
Some dynamic calls are nonpolluting - a compiler can see from inspection that whilst some variable (or method) is dynamic, the dynamic code would not virally confer dynamic status on other variables. Because the varying type of a variable, or the presence/absence of a method/member, is restricted to identifiable types (or null), specialisation is possible.
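For example (a sketch, with a made-up caching helper): `value` below is dynamic, but its type is restricted to int-or-None, so a compiler could specialise the arithmetic to ints guarded by a single None check, without anything else in the program going dynamic.

```python
def cached_length(cache, key):
    value = cache.get(key)   # dynamic, but only ever an int or None here
    if value is None:
        value = len(key)     # stand-in for the expensive computation
        cache[key] = value
    return value * 2         # specialisable: always int arithmetic
```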
But all too often the compiler can’t see that from code inspection. And as soon as you lose track of where the execution may go, you stop knowing how that code might depend upon or change the value of other static variables. So all bets are off, all variables have to be truly dynamic again.
I have struggled to find a way to deal with monkey patching, __set__, setattr and such efficiently in my mythical Python compiler. Tagging types kills performance.
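This is the sort of thing that defeats me (a toy example): every call site and attribute access that looked static can be redefined from anywhere, at any time.

```python
class Vec:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def length2(self):
        return self.x * self.x + self.y * self.y

v = Vec(3, 4)
print(v.length2())          # 25 -- looks like a perfectly static call site

def manhattan(self):
    return abs(self.x) + abs(self.y)

Vec.length2 = manhattan     # monkey patch: the same call now runs new code
print(v.length2())          # 7

setattr(v, "x", "three")    # and even the attribute's type can change, so
                            # 'self.x * self.x' can no longer be an unboxed
                            # multiply
```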
Fast data structures are very much about memory-access patterns, cache locality and minimising branches, and tagging works against each of these.
Jonathan Shapiro (EROS) wrote an excellent article, which I really liked, that touches on this.
Thank you for reading. I’ve talked quite a bit about programming language design and my armchair interest in it previously; you may like to look through my archive.