The Origins of Scala
Scala, a general-purpose, object-oriented, functional language for the JVM, is the brainchild of Martin Odersky, a professor at Ecole Polytechnique Fédérale de Lausanne (EPFL). In the first part of a multi-part interview series, Martin Odersky discusses Scala's history and origins with Artima's Bill Venners.
Discovering a fascination with compilers
Bill Venners: Let's start at the beginning. How did you first become involved with programming languages?
Martin Odersky: My favorite subject was always compilers and programming languages. When I first discovered what a compiler was, as an undergrad in 1980, I immediately wanted to build one. The only computer I could remotely afford at the time would have been a Sinclair ZX 80 which had one kilobyte of RAM. I was very close to giving it a try, but, fortunately, soon after got access to a much more powerful machine, an Osborne-1. It was the world's first “portable” (meaning luggable) computer, and it looked remotely like a sewing machine tilted by 90 degrees. It had a five-inch screen which displayed 52 tiny characters per line. But it also had a very impressive 56 usable kilobytes of RAM and two floppy drives of 90K each.
In those days, I spent some time with another student in my college named Peter Sollich. We had read about a new language called Modula-2, which we found very elegant and well-engineered. So the plan was born to write a Modula-2 compiler for 8-bit Z80 computers. There was a small problem in that the only language that came with the Osborne was Microsoft Basic, which was utterly unsuitable for what we had in mind, because it did not even support procedures with parameters—all you had was global variables. Other compilers at the time were too expensive for our means. So we decided to apply the classic bootstrapping technique. Peter had written a first compiler for a small subset of Pascal in Z80 assembly language. We then used this compiler to compile a slightly larger language, and so on, during several generations, until we could compile all of Modula-2. It could produce interpreted bytecode as well as Z80 binaries. The bytecode was the most compact of any system at the time, and the binary version was the fastest for 8-bit computers. It was a pretty capable system for its time.
Shortly before we finished our compiler, Borland came out with Turbo Pascal, and they were considering going into the Modula-2 market as well. In fact, Borland decided to buy our Modula-2 compiler to be sold under the name of Turbo Modula-2 for CP/M alongside an IBM PC version they wanted to develop. We offered to do the IBM PC version for them, but they told us they had it already covered. Unfortunately that version took them much longer than planned. By the time it came out, three or four years later, their implementor team had split from the company, and it became known as TopSpeed Modula-2. In the absence of an IBM PC version, Borland never put any marketing muscle behind Turbo Modula-2, so it remained rather obscure.
When we had finished the Modula-2 compiler, Borland offered to hire both Peter and me on the spot. Peter went to join them. I was very close to doing the same, but had the problem that I still had a year of classes and a master's project ahead of me. I was very tempted at the time to become a college dropout. In the end, I decided to stick it out at university. During my master's project (which was about incremental parsing), I discovered that I liked doing research a lot. So in the end, I gave up on the idea of joining Borland to write compilers, and went on instead to do a Ph.D. with Niklaus Wirth, the inventor of Pascal and Modula-2, at ETH Zurich.
Working to improve Java
Bill Venners: How did Scala come about? What is its history?
Martin Odersky: Towards the end of my stay in Zurich, around 1988/89, I became very fond of functional programming. So I stayed in research and eventually became a university professor in Karlsruhe, Germany. I initially worked on the more theoretical side of programming, on things like call-by-need lambda calculus. That work was done together with Phil Wadler, who at the time was at the University of Glasgow. One day, Phil told me that a wired-in assistant in his group had heard that there was a new language coming out, still in alpha stage, called Java. This assistant told Phil: "Look at this Java thing. It's portable. It has bytecode. It runs on the web. It has garbage collection. This thing is going to bury you. What are you going to do about it?" Phil said, well, maybe he's got a point there.
The answer was that Phil Wadler and I decided to take some of the ideas from functional programming and move them into the Java space. That effort became a language called Pizza, which had three features from functional programming: generics, higher-order functions, and pattern matching. Pizza's initial distribution was in 1996, a year after Java came out. It was moderately successful in that it showed that one could implement functional language features on the JVM platform.
Then we got in contact with Gilad Bracha and David Stoutamire from the Sun core developer team. They said, "We're really interested in the generics stuff you've been doing; let's do a new project that does just that." And that became GJ (Generic Java). So we developed GJ in 1997/98, and six years later it became the generics in Java 5, with some additions that we didn't do at the time. In particular, the wildcards in Java generics were developed later independently by Gilad Bracha and people at Aarhus University.
Although our generics extensions were put on hold for six years, Sun developed a much keener interest in the compiler I had written for GJ. It proved to be more stable and maintainable than their first Java compiler. So they decided to make the GJ compiler the standard javac compiler from their 1.3 release on, which came out in 2000.
Designing a language better than Java
Martin Odersky: Now, during the Pizza and GJ experience I sometimes felt frustrated, because Java is an existing language with very hard constraints. As a result, I couldn't do a lot of things the way I would have wanted to do them, the way I was convinced would be the right way to do them. So after that time, when essentially the focus of my work was to make Java better, I decided that it was time to take a step back. I wanted to start with a clean sheet, and see whether I could design something that's better than Java. But at the same time I knew that I couldn't start from scratch. I had to connect to an existing infrastructure, because otherwise it's just impractical to bootstrap yourself out of nothing without any libraries, tools, and things like that.
So I decided that even though I wanted to design a language that was different from Java, it would always connect to the Java infrastructure: to the JVM and its libraries. That was the idea. It was a great opportunity for me that at that time I became a professor at EPFL, which provides an excellent environment for independent research. I could form a small group of researchers that could work without having to chase all the time after external grants.
At first we were pretty radical. We wanted to create something that built on a very beautiful model of concurrency called the join calculus. We created an object-oriented version of the join calculus called Functional Nets and a language called Funnel. After a while, however, we found out that Funnel, being a very pure language, wasn't necessarily very practical to use. Funnel was built on a very small core. A lot of things that people usually take for granted (such as classes, or pattern matching) were provided only by encodings into that core. This is a very elegant technique from an academic point of view. But in practice it does not work so well. Beginners found the necessary encodings rather difficult, whereas experts found it boring to have to do them time and time again.
As a result, we decided to start over again and do something that was sort of midway between the very pure academic language Funnel, and the very pragmatic but at some points restrictive GJ. We wanted to create something that would be at the same time practical and useful and more advanced than what we could achieve with Java. We started working on this language, which we came to call Scala, in about 2002. The first public release was in 2003. A relatively large redesign happened in early 2006. And it's been growing and stabilizing since.
Constraints on improving Java
Bill Venners: You said you found it frustrating at times to have the constraints of needing to be backwards compatible with Java. Can you give some specific examples of things you couldn't do when you were trying to live within those constraints, which you were then able to do when you changed to doing something that's binary but not source compatible?
Martin Odersky: In the generics design, there were a lot of very, very hard constraints. The strongest constraint, the most difficult to cope with, was that it had to be fully backwards compatible with ungenerified Java. The story was the collections library had just shipped with 1.2, and Sun was not prepared to ship a completely new collections library just because generics came about. So instead it had to just work completely transparently.
That's why there were a number of fairly ugly things. You always had to have ungenerified types with generified types, the so-called raw types. Also, you couldn't change what arrays were doing, so you had unchecked warnings. Most importantly, you couldn't do a lot of the things you wanted to do with arrays, like generate an array with a type parameter T, an array of something where you didn't know the type. You couldn't do that. Later in Scala we actually found out how to do that, but that was possible only because we could drop in Scala the requirement that arrays are covariant.
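To make the array limitation concrete, here is a minimal Java sketch. It is an illustration, not code from the interview: the class name `GenericArrayDemo` and the helper `newArray` are hypothetical. Because `new T[size]` does not compile, one common workaround is to have the caller pass the element class and create the array reflectively, at the cost of an unchecked cast.

```java
import java.lang.reflect.Array;

public class GenericArrayDemo {
    // Hypothetical helper: `new T[size]` is rejected by the Java compiler
    // ("generic array creation"), so the element class is passed explicitly
    // and the array is created through reflection.
    @SuppressWarnings("unchecked")
    public static <T> T[] newArray(Class<T> elementType, int size) {
        // The cast is unchecked: the compiler cannot verify it statically.
        return (T[]) Array.newInstance(elementType, size);
    }

    public static void main(String[] args) {
        String[] words = newArray(String.class, 3);
        words[0] = "hello";
        // Prints: String[] of length 3
        System.out.println(words.getClass().getSimpleName() + " of length " + words.length);
    }
}
```

The `@SuppressWarnings("unchecked")` annotation silences exactly the kind of unchecked warning mentioned above.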
Bill Venners: Can you elaborate on the problem with Java's covariant arrays?
Martin Odersky: When Java first shipped, Bill Joy and James Gosling and the other members of the Java team thought that Java should have generics, only they didn't have the time to do a good job designing it in. So because there would be no generics in Java, at least initially, they felt that arrays had to be covariant. That means an array of String is a subtype of array of Object, for example. The reason for that was they wanted to be able to write, say, a “generic” sort method that took an array of Object and a comparator and that would sort this array of Object. And then let you pass an array of String to it. It turns out that this thing is type unsound in general. That's why you can get an array store exception in Java. And it actually also turns out that this very same thing blocks a decent implementation of generics for arrays. That's why arrays in Java generics don't work at all. You can't have an array of list of string, it's impossible. You're forced to do the ugly raw type, just an array of list, forever. So it was sort of like an original sin. They did something very quickly and thought it was a quick hack. But it actually ruined every design decision later on. So in order not to fall into the same trap again, we had to break off and say, now we will not be upwards compatible with Java, there are some things we want to do differently.
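The unsoundness described here can be reproduced in a few lines of Java. The sketch below is illustrative only: the class `CovariantArrayDemo` and its methods are hypothetical names, not taken from the interview. It shows both the motivating case, a pre-generics style sort that accepts any `Object[]`, and the hole that covariance opens, a store that compiles but fails at run time with `ArrayStoreException`.

```java
import java.util.Arrays;
import java.util.Comparator;

public class CovariantArrayDemo {
    // A "generic" sort in the pre-generics style: it accepts any Object[],
    // which only works for String[] because Java arrays are covariant.
    public static void sort(Object[] items, Comparator<Object> order) {
        Arrays.sort(items, order);
    }

    // Returns true if storing an Integer through an Object[] view of a
    // String[] is rejected at run time.
    public static boolean storeIsRejected() {
        String[] strings = { "b", "a" };
        Object[] objects = strings; // allowed: String[] is a subtype of Object[]
        try {
            objects[0] = Integer.valueOf(1); // compiles, but the array is really a String[]
            return false;
        } catch (ArrayStoreException e) {
            return true; // the JVM checks every array store to catch this case
        }
    }

    public static void main(String[] args) {
        String[] names = { "scala", "java", "pizza" };
        sort(names, Comparator.comparing(Object::toString));
        System.out.println(Arrays.toString(names)); // prints [java, pizza, scala]
        System.out.println(storeIsRejected());      // prints true
    }
}
```

The same run-time check is why Java refuses to create arrays of parameterized types: an expression such as `new java.util.List<String>[10]` is a compile-time error, because the store check could not distinguish a list of strings from any other list once type arguments are erased.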