Proof that Prolog can be faster than C

Fri Jun 15 01:47:22 AEST 1990

>From Peter Vanroy:

>Benchmark timings:
>
>Benchmark       Prolog          C (no opt.)     C (opt. 4)
>
>tak(24,16,8)    1.2             2.1             1.6
>fib(30)         1.5             2.0             1.6
>
>All three timings are in user seconds (with 'time') measured on a 25
>MHz MIPS processor.  The Prolog version is compiled with the
>Aquarius Prolog compiler under development at Berkeley.  The C
>versions are compiled with the MIPS C compiler, with no optimization
>and full optimization (level 4).
>
>Because older implementations of Prolog have been relatively slow,
>it is sometimes believed that this is inherent in the language.  The
>purpose of my original message was to dispel that notion by
>providing measurements of an optimizing Prolog compiler.

These are encouraging benchmarks, but should be generalized
cautiously.  First, to place the performance of the Aquarius Prolog
compiler in perspective, the times for the call:

	tak(18,12,6)

should be posted.  This would allow the direct comparison of the
times with those reported by by Richard P. Gabriel in "Performance
and Evaluation of Lisp Systems", MIT Press, 1974.  Comparing Prolog
and Lisp implementations is not unreasonable because of the
similarities between implementation techniques, particularly for
simple programs.

>From Richard O'Keefe:

>This particular benchmark result tells us precisely two things:
>(1) _one_ Prolog system on that machine has cheaper procedure calls
>    than _one_ C compiler.  Good work!  But not very surprising
>     given that C compiler writers tend to worry more about other
>     things.
>
>(2) that particular Prolog system handles integer arithmetic better
>    than a lot of other Prolog systems.

Gabriel reports that the "significant Lisp-level operations" for
tak(18,12,6) are composed of:

     Calls to TAK                    63,609
     1-'s (decrements)               47,706

Since the Prolog code listed by Vanroy is a direct translation of the
TAK benchmark, these numbers and their significance _probably_ apply. 
The prevalence of function calls and decrements in this benchmark
limits its general significance.

The fib/1 benchmark is similarly likely to be a limited measure of
general performance since it is also composed primarily of function
calls (to FIB) and decrements (N - 1 and N - 2).

>From Richard O'Keefe:

>It doesn't tell us how C and Prolog would compare on realistic
>applications, and there the pertinent question tends to be how easy
>it is to handle a good data structure.  I've an example where I
>speeded a program up quite substantially by translating from Fortran
>to Prolog, the speedup really came from using a much better data
>structure, but it was much easier to use that data structure in
>Prolog.

Gabriel has other benchmarks that are better suited to evaluate the
general performance of the Aquarius Compiler.  Two that appear well
suited are the BOYER and BROWSE benchmarks.

BOYER is a tiny theorem prover intended to give plausible estimates
of the performance of the Boyer-Moore theorem prover on various Lisp
systems.  BROWSE is a compendium of operations that an expert system
might use.  Both of these benchmarks could be translated to advantage
in Prolog, since they have a substantial amount of list manipulation
and pattern matching.

Ross Overbeek has a theorem prover whose performance in a variety of
Prologs was compared to C at a benchmarker's workshop held at The
Aerospace Corporation in 1988.  Carl Kesselman sponsored the
workshop, and may be able to provide source for a test.

>From Richard Tobin:

>I would be interested to know if there are any Lisp compilers that
>can produce comparable or better results.

Gabriel has no results for a 25MHz MIPS machine (wasn't available),
but does list some other interesting results for TAK:

IMPLEMENTATION            CPU             RAW TIME (sec)
--------------            ---------       --------------
Portable Standard Lisp    Cray-1           0.044
Symbolics                 3600+IFU         0.43
Assembler                 68000            0.7
Franz Lisp                VAX 780          1.09
Common Lisp               VAX 785          1.18
Portable Standard Lisp    Sun/2 (68010)    1.44
Common Lisp               VAX 780          1.83
C                         68000            1.9
Franz Lisp                68000            2.37
INTERLISP                 Dolphin         11.12

It would also be interesting to compare the performance of commercial
Prolog compilers (ALS, BIM, Quintus) on various machines to see how
they handle TAK.  While older Prolog implementations were slow, newer
implementations have moved to native code, and impressive speeds. A
recent claim for ALS Prolog (using that hoary chestnut, NREV, sorry :-)
gave in excess of one megaLIPS on the 88000, and 100+ KLIPS on a Mac IIfx.
Perhaps they would be willing to key in TAK and post the timings.

In closing, designing and interpreting benchmarks offer tempting
targets for implementors and their critics.  Gabriel's book suggests
some strategies to follow (and avoid);  there are many other articles
out there also (Jack Dongarra has published advice for designers of
benchmarks for supercomputers, for example).

Vanroy's results are encouraging, and (hopefully) will lead to even
better comparisons.  Let's support his work:  I'd like to suggest
that anyone with additional "points" on the graph post their results
to the net.  This would help Prolog implementors today just as
Gabriel's work goaded Lisp implementors in the 1980's.