From - Wed Apr  5 21:00:07 2000
Return-Path: <randy>
Received: (from randy@localhost)
	by euclid.acs.nmu.edu (8.9.3/8.9.3) id XAA03660;
	Tue, 21 Mar 2000 23:48:40 -0500
Date: Tue, 21 Mar 2000 23:48:40 -0500
From: Randy Appleton <randy@euclid.acs.nmu.edu>
Message-Id: <200003220448.XAA03660@euclid.acs.nmu.edu>
To: randy@euclid.acs.nmu.edu
Subject: Fwd: C speed vs functional languages
Reply-To: randy@euclid.acs.NMU.EDU (Randy Appleton)

Path: walter.acs.nmu.edu!newsxfer3.itd.umich.edu!logbridge.uoregon.edu!ihug.co.nz!brucehoult
From: brucehoult@pobox.com (Bruce Hoult)
Newsgroups: comp.arch
Subject: C speed vs functional languages
Date: Mon, 20 Mar 2000 17:49:46 +1200
Organization: The Internet Group Ltd
Lines: 86
Message-ID: <brucehoult-2003001749460001@bruce.bgh>
NNTP-Posting-Host: p17-max4.wlg.ihug.co.nz
X-Newsreader: MT-NewsWatcher 2.4.4
Xref: walter.acs.nmu.edu comp.arch:10646

In article <s71d7otz3ty.fsf@barnowl.CS.Berkeley.EDU>, David Gay
<dgay@barnowl.CS.Berkeley.EDU> wrote:

> brucehoult@pobox.com (Bruce Hoult) writes:
> > The conceptual work needed to make compiled Scheme go as fast as C was done
> > by ... oh ... 1976 or so [1].
> > 
> > These days, Scheme compilers such as Stalin [2] produce awesome code.  In
> > fact Stalin often produces faster code than hand-written C despite the
> > fact that it generates C, because you would never ever have the patience
> > and skill to do all the bookkeeping required to do it by hand.
> 
> I've heard these kinds of claims rather too often to believe them without
> substantial proof. References? (I looked through the Stalin release
> and the author's web site, but didn't find anything.)

OK, I've found a concrete (and impressive) claim...

In the articles...

   <http://deja.com/getdoc.xp?fmt=text&AN=338901555>
   <http://deja.com/getdoc.xp?fmt=text&AN=339742270>
   <http://deja.com/getdoc.xp?fmt=text&AN=341354188>

... Siskind claims a 21:1 speed advantage for Stalin-compiled Scheme vs C,
on a 2D numerical integration code.

Here is the Scheme code:

(define (integrate-1D L U F)
 (let ((D (/ (- U L) 8.0)))
  (* (+ (* (F L) 0.5)
        (F (+ L D))
        (F (+ L (* 2.0 D)))
        (F (+ L (* 3.0 D)))
        (F (+ L (* 4.0 D)))
        (F (- U (* 3.0 D)))
        (F (- U (* 2.0 D)))
        (F (- U D))
        (* (F U) 0.5))
     D)))

(define (integrate-2D L1 U1 L2 U2 F)
 (integrate-1D L2 U2 (lambda (y) (integrate-1D L1 U1 (lambda (x) (F x y))) )))

(define (zark U V)
 (integrate-2D 0.0 U 0.0 V (lambda (X Y) (* X Y)) ))

(define (r-total N)
 (do ((I 1 (+ I 1))
      (Sum 0.0 (+ Sum (zark (* I 1.0) (* I 2.0)))))
   ((> I N) Sum)))

(define (i-total N)
 (do ((I 1 (+ I 1))
      (Sum 0.0 (+ Sum (let ((I2 (* (* I I) 1.0))) (* I2 I2)))))
   ((> I N) Sum)))

(define (error-sum-of-squares N)
 (do ((I 1 (+ I 1))
      (Sum 0.0 (+ Sum (let ((E (- (r-total I) (i-total I)))) (* E E)))))
   ((> I N) Sum)))

(begin (display (error-sum-of-squares 1000)) (newline))
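
For readers who do not read Scheme, here is a rough Python transliteration
of the benchmark above. It is only an illustrative sketch: it is neither
Siskind's code nor the C version from the deja articles, and the snake_case
names are chosen here for readability. integrate-1D is a trapezoidal rule
on 8 equal subintervals, which is exact for an integrand that is linear in
each variable, so zark(I, 2I) should equal I^4 and the final result should
be zero up to floating-point rounding.

```python
def integrate_1d(l, u, f):
    """Trapezoidal rule on 8 equal subintervals of [l, u]."""
    d = (u - l) / 8.0
    return (f(l) * 0.5
            + f(l + d)
            + f(l + 2.0 * d)
            + f(l + 3.0 * d)
            + f(l + 4.0 * d)
            + f(u - 3.0 * d)
            + f(u - 2.0 * d)
            + f(u - d)
            + f(u) * 0.5) * d


def integrate_2d(l1, u1, l2, u2, f):
    # Outer integral over y; the inner lambda captures y as a closure,
    # mirroring the nested lambdas in the Scheme version.
    return integrate_1d(
        l2, u2,
        lambda y: integrate_1d(l1, u1, lambda x: f(x, y)))


def zark(u, v):
    # Integral of x*y over [0, u] x [0, v]; analytically u^2 * v^2 / 4.
    return integrate_2d(0.0, u, 0.0, v, lambda x, y: x * y)


def r_total(n):
    # Explicit loop (not sum()) to keep plain left-to-right accumulation.
    total = 0.0
    for i in range(1, n + 1):
        total += zark(i * 1.0, i * 2.0)
    return total


def i_total(n):
    # Closed form of the same quantity: sum of i^4.
    total = 0.0
    for i in range(1, n + 1):
        i2 = (i * i) * 1.0
        total += i2 * i2
    return total


def error_sum_of_squares(n):
    total = 0.0
    for i in range(1, n + 1):
        e = r_total(i) - i_total(i)
        total += e * e
    return total


if __name__ == "__main__":
    # The original uses N = 1000; a smaller N keeps pure Python quick.
    print(error_sum_of_squares(50))
```

Nearly all the work is in the nested closure calls (81 calls to the
integrand per zark), which is what makes this a test of how well a compiler
handles higher-order functions.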


Siskind does admit to a slight cheat in this two-year-old message -- he
hand-expanded the 35-line Scheme program into a 342-line version, claiming
that Stalin would soon make this transformation automatically.

Given that it's now two years later, I just tried the Scheme code (35
lines) and the C code (58 lines) from the second deja message referenced
above.  Using what appears to be the standard set of flags (those in
benchmarks/compile-stalin-benchmark) I got the following runtimes on my
200 MHz Pentium Pro, 256 KB cache, Red Hat Linux 5.2 (kernel 2.0.36),
egcs-2.91.66:

C     :  27.90 sec
Scheme:   5.85 sec

I don't know whether some other set of flags would get the Scheme program
down to the claimed 1-second runtime -- I just use the standard set of
flags all the time.  Beating the C code by a factor of nearly 5
(27.90 / 5.85 is about 4.8) is still pretty good.

-- Bruce