Monday, April 28, 2008

Sage Lisp Wars!

Once upon a time a long, long time ago in a happy world where men were real men and wrote their own device drives in assembler - well, screw that, it is 2008. This is the story about whether Maxima, which is the only Sage component written in lisp, will stay in Sage long term or not.

To make things easier let's get some things sorted out before we get into the story:

Places:

  1. "benchmarking CAS" thread in sci.math.symbolic [currently 40 posts]
  2. "multivariate factoring - use maxima ?" thread in sage-devel [currently 36 posts]
  3. "Project" thread in sage-devel [currently 75 posts]
  4. "compiling Maxima by ECL" thread in sage-devel [currently 4 posts]
  5. "compiling Maxima by ECL" thread on the maxima-devel mailing list
  6. Richard Fateman's email from the above thread about ecl that Juan replied to.
This is a tale about my personal comment in [1] stating:

This is motivated by the need for performance and the fact that we want to dump any lisp based code from the core of Sage in the long term.

while taking about the long term plan for symbolics code in Sage. I am doing much of the porting of Sage to pretty much anything that I would consider worth porting to. While some people would question the value of a port to OpenBSD [I don't, but that is a different blog post I had in my head for a while without the time to write it down :)] few people would think that the port of Sage to native Windows in 32 & 64 bit mode would be bad for Sage. But porting Sage to any platform also means porting Maxima and since Maxima is written in common lisp it does require a common lisp implementation. Fortunately Maxima quite happily runs on top of cmulc, sbcl, clisp and gcl. But since Sage is a CAS where every component compiles from source and cmucl as well sbcl require another lisp to bootstrap themselves only clisp and gcl are fit as candidates for Sage. Incidentally sbcl is an amicable fork of cmucl since cmucl can only bootstrap itself off another cmucl instance and in contrast sbcl requires only a common lisp capable lisp instance to do so, but I am getting off path here.

So, there are two candidates to be used as common lisp implementations in Sage. Where is the trouble? Well, the devil is in the details. We need one and only one tarball that builds on a wide variety of systems and way back in the infancy of Sage William decided to go with clisp since it did build for him much more easily on the then supported platforms. gcl is more difficult to build and is lacking support for some platforms we will port to and on others build trouble is pretty much everywhere. See [3] for some examples. But clisp is troubled itself. It has to be compiled with gcc 4.2 and 4.3 with "-O0" due to some compiler bug in gcc. But on Solaris the last clisp release that somebody got to compile and run was 2.38 in 32 bit mode while the current release is 2.44.1. I did do much more than I thought was expected of me to resolve the issue, but in the end there was no solution (see [3] again).

All of the above has lead me to the conclusion that the lisp ecosystem is in "trouble" because its Open Source toolset is in a terrible state. The only ray of hope I have seen in the Open Source lisp ecosystem is ecl, but Maxima does not run on top of it. Way back in 2005 Michael Goffioul did some work to get Maxima to compile on top of ecl, but it seemingly didn't go anywhere. So I saw little chance for Maxima on top of ecl and since I neither have the time nor the expertise to attempt such a port my conclusion was to get rid of Maxime in the core of Sage as quickly as possible. There are various Sage developers who do think alike, so a plan was devised to replace everything use in Maxima with either other components or by code written from scratch. But then in [3] Robert Dodier told us that he had done some work on Maxima on top of ecl in January of 2008 and he was willing to attempt to finish the port since he also saw gcl's use of the Maxima on Windows installer troublesome. Progress in this direction has been made (see [4]) and now I am quite hopeful that we will have Maxima on top of ecl and therefore one of the showstoppers of Sage on Solaris as well as Sage on native Windows has been removed without making Maxima collateral damage in the porting process. Maxima will still have to compete with the other components in Sage and some functionality will be moved into the core of Sage by rewriting it in C/Cython, but I now see its future assured in the long term.

So, what motivated me to write this all down? Most people either don't care or read the long threads on sage-devel and a couple other places or have actually read some or most of the discussions. It was actually the reply to an email of Richard Fateman [6] by Juan Jose Garcia-Ripoll, the ecl maintainer, which deserves to be quote in full:

Hi,

let me first begin by saying that, as politely as I can, Fateman's email are as close to FUD as it can get. He doesn't seem to use ECL at all and just judges from some outdated webpages and his own prejudices about different software libraries.

Regarding the different points which have been raised:

* GNU MP is only used for bignum computations. ECL itself is clever enough to handle fixnums cleverly and even to unbox fixnum computations in compiled code. Incidentally, GCL uses GMP as well, so I do not see the point all.

* The simplistic garbage collector is an option and it is provided for platforms in which Boehm-Weiser does not run. Currently, this means _none_ of the supported platforms.

* Boehm-Weiser is a strong garbage collector and a very powerful one in terms of tunability. You man make it as precise as you want, and the Java people indeed do. ECL uses it and it has seen only performance improvements as we have learnt more and more how to better use it. If the ANSI test suite shows something in that respect is that, under a lot of consing pressure, it does not perform that bad. It does not get so close to SBCL's but I doubt any other free implementation does.

* ECL has a good compromise between all platforms. It provides both C compiled code and a reasonably fast interpreter. Benchmark show that the ECL interpreter is not that far from interpreted CLISP. But on the other hadn CLISP has its own set of optimized bytecodes and when it compiles it optimizes for those bytecodes. AFAI remember, GCL used (and probably still uses) a list based interpreter which runs through forms represented as lists and macroexpanding every form that has to be done so, and every time it uses it. That is terribly inefficient.

* In terms of maintainability it has shown through the years that it is easier for somebody to start coding and hacking ECL and adding new features than with most other platforms. That is how we got ECL ported to the Microsoft compiler and platform and how different pieces of software (sockets, asdf, etc) have been adapted to run here. That by itself is an important value, at least for people who think long term.

* Talking about diverting efforts from the GCL crowd, I am not the best person to speak about it. I am more than pissed of by the GCL community since, shortly after ECL reached most ANSI compliance and portability I was asked to port all that back to the GCL, because they wanted to achieve the same goal. That was back in '01 or '02, do not remember so well. What I remember is that those were not very polite emails and had a kind of "borg" spirit of assimilation, without even caring about the years spent on achieving that. So talk about diverting useful efforts.

So, to the interested parties, if you so much care about Maxima running on just one computer, then stick to sbcl and cmucl which are pretty superior implementations, but please do not scare people from porting useful software to other platforms and environments.

Kind regards

Juanjo

-- Facultad de Fisicas, Universidad Complutense,
Ciudad Universitaria s/n Madrid 28040 (Spain)
http://juanjose.garciaripoll.googlepages.com


So I can only recommend that if you are willing to waste an hour or two to read all the sources and make yourself a picture for yourself. There is much more to the story than I wrote here and I did not go into as much details as I could have since I wanted to finish this blog post in a reasonable amount of time. This is precisely the reason I didn't write it in the first place a couple weeks ago.

So, will there be "Sage Lisp Wars - The Sequel!"? I am sure there will be and you can easily figure out which side of the debate I am on ;) Despite of all the flames I linked to above in the end I feel that the Sage community got something positive in the end by the likely port of Maxima to ecl and at the same time hopefully the Maxima community will see the port to ecl as a benefit to them, regardless how the cooperation between Maxima and Sage will develop.

Cheers,

Michael

2 comments:

John said...

It seems to me that you could use sbcl if you wanted: first build some version of clisp (doesn't have to be the most recent), then use that to build sbcl. Then you could discard clisp if you wanted.

If maxima interacts better with sbcl, wouldn't this help?

Leslie P. Polzer said...

I surely don't get the point of saying "we want to get rid of Lisp because there's hardly a Lisp that doesn't need another Lisp to compile".

I mean, why not just have people install some Lisp binary so they are able to compile Sage?

What's the difference to gcc, javac, python as build deps?