Monday, April 28, 2008

Sage Lisp Wars!

Once upon a time a long, long time ago in a happy world where men were real men and wrote their own device drivers in assembler - well, screw that, it is 2008. This is the story about whether Maxima, which is the only Sage component written in Lisp, will stay in Sage long term or not.

To make things easier let's get some things sorted out before we get into the story:

Places:

  1. "benchmarking CAS" thread in sci.math.symbolic [currently 40 posts]
  2. "multivariate factoring - use maxima ?" thread in sage-devel [currently 36 posts]
  3. "Project" thread in sage-devel [currently 75 posts]
  4. "compiling Maxima by ECL" thread in sage-devel [currently 4 posts]
  5. "compiling Maxima by ECL" thread on the maxima-devel mailing list
  6. Richard Fateman's email from the above thread about ecl that Juan replied to.

This is a tale about my personal comment in [1] stating:

This is motivated by the need for performance and the fact that we want to dump any lisp based code from the core of Sage in the long term.

while talking about the long term plan for symbolics code in Sage. I am doing much of the porting of Sage to pretty much anything that I would consider worth porting to. While some people would question the value of a port to OpenBSD [I don't, but that is a different blog post I have had in my head for a while without the time to write it down :)], few people would think that a port of Sage to native Windows in 32 & 64 bit mode would be bad for Sage. But porting Sage to any platform also means porting Maxima, and since Maxima is written in Common Lisp it requires a Common Lisp implementation. Fortunately Maxima quite happily runs on top of cmucl, sbcl, clisp and gcl. But since Sage is a CAS where every component compiles from source, and cmucl as well as sbcl require another lisp to bootstrap themselves, only clisp and gcl are fit as candidates for Sage. Incidentally, sbcl is an amicable fork of cmucl: cmucl can only bootstrap itself off another cmucl instance, whereas sbcl requires only a Common Lisp capable lisp instance to do so, but I am getting off track here.

So, there are two candidates to be used as Common Lisp implementations in Sage. Where is the trouble? Well, the devil is in the details. We need one and only one tarball that builds on a wide variety of systems, and way back in the infancy of Sage William decided to go with clisp since it built much more easily for him on the then supported platforms. gcl is more difficult to build, lacks support for some platforms we will port to, and on others build trouble is pretty much everywhere. See [3] for some examples. But clisp is troubled itself. It has to be compiled with "-O0" under gcc 4.2 and 4.3 due to some compiler bug in gcc. And on Solaris the last clisp release that somebody got to compile and run was 2.38 in 32 bit mode, while the current release is 2.44.1. I did much more than I thought was expected of me to resolve the issue, but in the end there was no solution (see [3] again).

All of the above has led me to the conclusion that the lisp ecosystem is in "trouble" because its Open Source toolset is in a terrible state. The only ray of hope I have seen in the Open Source lisp ecosystem is ecl, but Maxima does not run on top of it. Way back in 2005 Michael Goffioul did some work to get Maxima to compile on top of ecl, but it seemingly didn't go anywhere. So I saw little chance for Maxima on top of ecl, and since I have neither the time nor the expertise to attempt such a port, my conclusion was to get rid of Maxima in the core of Sage as quickly as possible. There are various Sage developers who think alike, so a plan was devised to replace everything we use in Maxima either with other components or with code written from scratch. But then in [3] Robert Dodier told us that he had done some work on Maxima on top of ecl in January of 2008 and that he was willing to attempt to finish the port, since he also saw the use of gcl in the Maxima Windows installer as troublesome. Progress in this direction has been made (see [4]) and now I am quite hopeful that we will have Maxima on top of ecl, and that one of the showstoppers for Sage on Solaris as well as Sage on native Windows will therefore be removed without making Maxima collateral damage in the porting process. Maxima will still have to compete with the other components in Sage, and some functionality will be moved into the core of Sage by rewriting it in C/Cython, but I now see its future assured in the long term.

So, what motivated me to write this all down? Most people either don't care to read the long threads on sage-devel and a couple of other places, or have already read some or most of the discussions. It was actually the reply to an email of Richard Fateman [6] by Juan Jose Garcia-Ripoll, the ecl maintainer, which deserves to be quoted in full:

Hi,

let me first begin by saying that, as politely as I can, Fateman's email are as close to FUD as it can get. He doesn't seem to use ECL at all and just judges from some outdated webpages and his own prejudices about different software libraries.

Regarding the different points which have been raised:

* GNU MP is only used for bignum computations. ECL itself is clever enough to handle fixnums cleverly and even to unbox fixnum computations in compiled code. Incidentally, GCL uses GMP as well, so I do not see the point all.

* The simplistic garbage collector is an option and it is provided for platforms in which Boehm-Weiser does not run. Currently, this means _none_ of the supported platforms.

* Boehm-Weiser is a strong garbage collector and a very powerful one in terms of tunability. You can make it as precise as you want, and the Java people indeed do. ECL uses it and it has seen only performance improvements as we have learnt more and more how to better use it. If the ANSI test suite shows something in that respect is that, under a lot of consing pressure, it does not perform that bad. It does not get so close to SBCL's but I doubt any other free implementation does.

* ECL has a good compromise between all platforms. It provides both C compiled code and a reasonably fast interpreter. Benchmark show that the ECL interpreter is not that far from interpreted CLISP. But on the other hand CLISP has its own set of optimized bytecodes and when it compiles it optimizes for those bytecodes. AFAI remember, GCL used (and probably still uses) a list based interpreter which runs through forms represented as lists and macroexpanding every form that has to be done so, and every time it uses it. That is terribly inefficient.

* In terms of maintainability it has shown through the years that it is easier for somebody to start coding and hacking ECL and adding new features than with most other platforms. That is how we got ECL ported to the Microsoft compiler and platform and how different pieces of software (sockets, asdf, etc) have been adapted to run here. That by itself is an important value, at least for people who think long term.

* Talking about diverting efforts from the GCL crowd, I am not the best person to speak about it. I am more than pissed off by the GCL community since, shortly after ECL reached most ANSI compliance and portability I was asked to port all that back to the GCL, because they wanted to achieve the same goal. That was back in '01 or '02, do not remember so well. What I remember is that those were not very polite emails and had a kind of "borg" spirit of assimilation, without even caring about the years spent on achieving that. So talk about diverting useful efforts.

So, to the interested parties, if you so much care about Maxima running on just one computer, then stick to sbcl and cmucl which are pretty superior implementations, but please do not scare people from porting useful software to other platforms and environments.

Kind regards

Juanjo

-- Facultad de Fisicas, Universidad Complutense,
Ciudad Universitaria s/n Madrid 28040 (Spain)
http://juanjose.garciaripoll.googlepages.com


So I can only recommend that, if you are willing to spend an hour or two, you read all the sources and form a picture for yourself. There is much more to the story than I wrote here, and I did not go into as much detail as I could have since I wanted to finish this blog post in a reasonable amount of time. This is precisely the reason I didn't write it in the first place a couple of weeks ago.

So, will there be "Sage Lisp Wars - The Sequel!"? I am sure there will be and you can easily figure out which side of the debate I am on ;) Despite all the flames I linked to above, I feel that in the end the Sage community got something positive out of the likely port of Maxima to ecl, and hopefully the Maxima community will see the port to ecl as a benefit to them as well, regardless of how the cooperation between Maxima and Sage develops.

Cheers,

Michael

Saturday, April 26, 2008

Sage 3.0.1.alpha0 released

Hello folks,

this is 3.0.1.alpha0. So far we have only merged bugfixes; nothing invasive has been merged yet and there is nothing on the radar that looks invasive. 24 tickets have been closed up to now, and I am not quite sure what the rest of the release cycle will look like because it currently doesn't look like we need a pure bugfix release quickly.

There are plenty of patches available for review. The coercion rewrite planned for 3.1 seems to be going well.

Sources and binaries are in the usual place:

Cheers,

Michael

Merged in alpha0:

#783: Alex Ghitza: dilog is lame
#1187: Alex Ghitza: bug in G.conjugacy_classes_subgroups()
#1921: Alex Ghitza, Mike Hansen: add random_element to groups
#2302: Michael Abshoff, William Stein, Scot Terry: Add a 64 bit glibc 2.3 based binary of Sage to the default build platforms
#2325: David Roe, Kiran Kedlaya: segfault in p-adic extension() method
#2821: Alex Ghitza: get rid of anything "about this document" sections of any sage docs that say "send email to stein"
#2939: David Joyner: piecewise.py improvements (docstring and laplace fixes)
#2985: Michael Abshoff: ITANIUM (RHEL 5) -- bug in rubik.py's OptimalSolver()
#2993: Michael Abshoff: OSX/gcc 4.2: disable padlock support per default
#2995: Alex Ghitza: some new functionality and doctests for congruence subgroups
#3003: Jason Brandlow: Bugfix for to_tableau() method of CrystalOfTableaux elements
#3005: Craig Citro: modabar -- failure to compute endomorphism ring
#3006: David Joyner: missing elliptic integrals in special.py
#3014: Michael Abshoff: ZZ.random_element -- corrupted docstring
#3017: Michael Abshoff: invalid link after make install
#3022: Tim Abbott: Debian package support for polybori
#3023: Jason Grout: make apply_map deal with empty matrices
#3025: William Stein: Sparse vector spaces don't cast on assignment
#3027: Tim Abbott: Debian lintian fixes

Friday, April 25, 2008

Ubuntu LTS 6.06 x86-64 binaries Available

We finally have Ubuntu 6.06 LTS x86-64 binaries for Sage 3.0 available. They were mentioned in the release announcement, but a last minute bug delayed the release of the binaries since the rubiks.py doctest failed. That has been fixed.

Well, you might ask, Ubuntu LTS 8.04 is out, so what about binaries for that release? And the answer is the same as always: Make a VMware image with a minimal install and development tools, working ssh access from the outside, a home partition with a decent amount of space [i.e. 20GB] and get it to William or me. From then on we will build you binary releases, and the chance that Sage will compile and doctest fine on your preferred distribution will be greatly improved.

Cheers,

Michael

Thursday, April 24, 2008

Stupid Bugs - Part 823

Well, I guess I never mentioned any of the previous bugs, but today I started looking at two rather vexing bugs that hit us on RHEL 5/Itanium. One of them (#2985) was rather odd: I couldn't reproduce it anywhere else and the binary did valgrind clean, so I was running out of ideas. After building Sage 3.0 on Ubuntu LTS 6.06 I hit the same bug there, which was even more vexing since everything worked on 64 bit Debian testing. But after looking at it on Ubuntu 6.06 LTS I still didn't see anything obvious that could be wrong. Switching back to my Itanium test box eventually revealed the problem. Somebody had left a 32 bit x86 binary of optimal in the spkg. Since the makefile for the Reid solver sucked, we never ended up building it on most x86 compatible Linux platforms. So it did start up on Itanium and then more or less die instantly. The bug is now fixed and is merged in 3.0.1.alpha0.
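
A stray prebuilt binary like the 32 bit x86 optimal described above can be spotted by looking at the machine field in its ELF header. The following is only a minimal sketch of that idea, not something that is part of Sage or the spkg: the function names and the small lookup table are made up for illustration, and only the three architectures mentioned in this post are listed.

```python
import struct

# e_machine values taken from the ELF specification; only the
# architectures mentioned in this post are listed here.
ELF_MACHINES = {3: "x86 (32-bit)", 50: "Itanium (IA-64)", 62: "x86-64"}


def elf_machine(header):
    """Return a readable architecture name for the first bytes of a file,
    or None if those bytes are not an ELF header. (Hypothetical helper.)"""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # The byte at index 5 (EI_DATA) gives the byte order:
    # 1 = little endian, 2 = big endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16 bit field at offset 18, right after e_type.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, "unknown (%d)" % machine)


def elf_machine_of_file(path):
    """Read just enough of a file to classify it. (Hypothetical helper.)"""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

Running something like elf_machine_of_file over the files of an unpacked source package on the build host would flag any binary whose architecture does not match the platform. On the command line the file utility reports the same information.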

Next up: #2983 - an Itanium specific segfault in Singular. The fun never stops ;)

I have been rather quiet here, but hopefully I will be more active again in the future. Many things are swirling around in my head, but during the 3.0 release cycle I did not have the time to actually write down a long-winded rant here. But I think that will change next week.

Cheers,

Michael