Common Lisp benchmarking suite
==============================

This package contains Lisp code intended for performance benchmarking
of different Common Lisp implementations. The tests it runs include

  - the well-known Gabriel benchmarks, with a number of bugfixes from
    Henry Baker and limited rewriting to bring them a little closer to
    modern Common Lisp coding style

  - hashtable exercising

  - READ-LINE exercising

  - mathematical functions: factorial, Fibonacci, Ackermann's function

  - some bignum-intensive code from Bruno Haible

  - various aspects of CLOS: time taken to compile and execute
    DEFCLASS forms which create a class hierarchy, instance creation
    time, time taken to compile and execute DEFMETHOD forms, execution
    of method combinations, both with and without :after methods

  - various operations on arrays, bitvectors and strings

  - performance-intensive kernels such as CRC generation and an
    implementation of the DEFLATE algorithm

Except for the CLOS COMPILER tests, timings do not include compilation
time. The garbage collector is run before each test to try to make the
timings more repeatable. For certain targets, we assume that the times
reported by GET-INTERNAL-RUN-TIME and GET-INTERNAL-REAL-TIME are
accurate. Timings for a given Common Lisp environment may be quite
sensitive to the optimization settings; these are set at the beginning
of the Makefile.

Common Lisp is a very large language, so it is difficult to evaluate
the performance of all aspects of an implementation. Remember that the
only real benchmark is your application: this code is only
representative of real-life programs to a limited extent.


Running the suite
=================

Each implementation has a run-.sh file, and a setup-.lisp file in the
sysdep directory.
To execute the test for CMUCL, for example, type

   bash run-cmucl.sh

This will create files with the optimization setting requested in the
Makefile, then invoke CMUCL to compile the files (lots of warnings and
compilation notes are to be expected), then run the various tests.
Execution takes quite a while: around 30 minutes on a 1GHz PIII. A
large part of this time is due to the bignum benchmarks, which you can
comment out if they don't interest you. You can also reduce the number
of executions of each particular test by editing the file "tests.lisp"
and modifying the :runs parameter of the DEFBENCH forms. For the
results to be meaningful, the time for each test should be more than a
second. However, when comparing implementations it is not the absolute
time which counts, but the time relative to the other implementations.
Also note that you may wish to reduce the size of the problems,
particularly in the array sections, because with some setups (CMUCL on
Linux for example), your system may become unusable if you exceed
available RAM. You can do this by editing the file "tests.lisp" and
modifying the arguments to the :code parameters.

Repeat this operation for other implementations you have on your
machine. I have tried it with CMUCL, SBCL, CLISP, OpenMCL, Poplog CL
and LispWorks on various platforms. GCL and ECL are able to run some
of the tests.

If you're not running on a Unix platform, you may not be able to
script the different steps of a run. In this case, follow these steps:

  1. Say "make clean optimize-lisp" to create source files that
     contain optimization declarations. The optimize settings can be
     changed in the Makefile.

  2. Load the file "generate.lisp" into your implementation. This
     should create two files "do-compilation-script.lisp" and
     "do-execute-script.lisp", that contain a sequence of compilation
     and load steps.

  3. Load the file "sysdep/setup-.lisp", which you may need to write
     yourself.
     This requires a definition of a BENCH-GC function that triggers a
     full garbage collection.

  4. Load the file "do-compilation-script.lisp" into your
     implementation, which should result in all the source files
     being compiled.

  5. Load the file "do-execute-script.lisp", which should cause all
     the tests to be executed.

For each tested implementation, you should have a file in /var/tmp (or
in the current directory on Windows) named "CL-benchmark-". These
files will have the following format:

,---- /var/tmp/CL-benchmark-20010821T2208 ---
| ;; -*- lisp -*-  CMU Common Lisp CVS sources, level-1 built 2001-08-22 on maftia1
| ;;
| ;; Implementation *features*:
| ;; (:PCL-STRUCTURES :PORTABLE-COMMONLOOPS :PYTHON :PCL :COMPLEX-FP-VOPS :PPRO
| ;;  :PENTIUM :MP :HASH-NEW :RANDOM-MT19937 :PROPAGATE-FUN-TYPE
| ;;  :PROPAGATE-FLOAT-TYPE :CONSTRAIN-FLOAT-TYPE :CMU18C :CMU18 :GENCGC :X86
| ;;  :LINUX :GLIBC2 :UNIX :COMMON :COMMON-LISP :ANSI-CL :CMU
| ;;  :IEEE-FLOATING-POINT)
| ;;
| ;; Function                     real     user      sys       consed
| ;; ----------------------------------------------------------------
| ;; Boyer                        1.50     1.22     0.28     54349520
| ;; Browse                       0.97     0.79     0.18     36219256
| ;; DDerviv                      0.88     0.50     0.39     67197656
| ;; Deriv                        1.64     0.87     0.77    127195824
| ;; Destructive                  0.30     0.24     0.05     12819928
| ;; div2-test-1                  0.52     0.32     0.20     38398176
| ;; div2-test2                   0.66     0.42     0.24     47999936
| ;; FFT                          0.40     0.40     0.00            0
| ;; frpoly/fixnum                0.65     0.54     0.10     19172440
| ;; frpoly/bignum                1.54     1.25     0.28     55628704
| ;; frpoly/float                 7.59     6.50     1.09    213052408
| ;; Puzzle                       0.82     0.82     0.00            0
| ;; CTak                         0.81     0.81     0.00            0
| ;; Tak                          0.53     0.54     0.00            0
| ;; RTak                         0.35     0.36     0.00            0
| ;; takl                         1.60     1.60     0.00            0
| ;; stak                         1.15     1.14     0.00            0
| ;; fprint                       0.25     0.25     0.01      1948416
| ;; fread                        0.82     0.68     0.13     28487280
| ;; traverse                     5.28     5.24     0.03      4493288
| ;; triangle                     2.29     2.28     0.00       499712
| ;; factorial                    0.37     0.18     0.20     26120296
| ;; fib                          2.39     2.39     0.00            0
| ;; hashtable                    0.72     0.69     0.04      9888912
| ;; CLOS/defclass                2.82     2.31     0.12     32757328
| ;; CLOS/defmethod              10.94    10.09     0.55    120612624
| ;; CLOS/instantiate             7.13     6.27     0.86    229048352
| ;; CLOS/methodcalls             6.56     5.22     1.34    301057608
| ;; CLOS/method+after           12.02    11.09     0.93    197058816
| ;; CLOS/complex-methods         0.38     0.38     0.00       286600
| ;; 1D-arrays                    2.46     2.17     0.30     60002400
| ;; 2D-arrays                   20.57    19.50     1.07    240000240
| ;; bitvectors                  18.75    18.51     0.23     50003200
| ;; fill-strings                21.12    15.23     5.88   1000016000
| ;; fill-strings/adjustable     57.10    56.25     0.85    259729520
|
| ("CMU Common Lisp CVS sources, level-1 built 2001-08-22 on maftia1"
|  ("fill-strings/adjustable" 57.1 56.25 0.85 259729520)
|  ("fill-strings" 21.12 15.23 5.88 1000016000)
|  ("bitvectors" 18.75 18.51 0.23 50003200)
|  ("2D-arrays" 20.57 19.5 1.07 240000240) ("1D-arrays" 2.46 2.17 0.3 60002400)
|  ("CLOS/complex-methods" 0.38 0.38 0.0 286600)
|  ("CLOS/method+after" 12.02 11.09 0.93 197058816)
|  ("CLOS/methodcalls" 6.56 5.22 1.34 301057608)
|  ("CLOS/instantiate" 7.13 6.27 0.86 229048352)
|  ("CLOS/defmethod" 10.94 10.09 0.55 120612624)
|  ("CLOS/defclass" 2.82 2.31 0.12 32757328) ("hashtable" 0.72 0.69 0.04 9888912)
|  ("fib" 2.39 2.39 0.0 0) ("factorial" 0.37 0.18 0.2 26120296)
|  ("triangle" 2.29 2.28 0.0 499712) ("traverse" 5.28 5.24 0.03 4493288)
|  ("fread" 0.82 0.68 0.13 28487280) ("fprint" 0.25 0.25 0.01 1948416)
|  ("stak" 1.15 1.14 0.0 0) ("takl" 1.6 1.6 0.0 0) ("RTak" 0.35 0.36 0.0 0)
|  ("Tak" 0.53 0.54 0.0 0) ("CTak" 0.81 0.81 0.0 0) ("Puzzle" 0.82 0.82 0.0 0)
|  ("frpoly/float" 7.59 6.5 1.09 213052408)
|  ("frpoly/bignum" 1.54 1.25 0.28 55628704)
|  ("frpoly/fixnum" 0.65 0.54 0.1 19172440) ("FFT" 0.4 0.4 0.0 0)
|  ("div2-test2" 0.66 0.42 0.24 47999936) ("div2-test-1" 0.52 0.32 0.2 38398176)
|  ("Destructive" 0.3 0.24 0.05 12819928) ("Deriv" 1.64 0.87 0.77 127195824)
|  ("DDerviv" 0.88 0.5 0.39 67197656) ("Browse" 0.97 0.79 0.18 36219256)
|  ("Boyer" 1.5 1.22 0.28 54349520))
`----

The first section of the file is intended to be human readable, and
the second section to be READ by a Common Lisp implementation.
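
As an illustration of that second section, here is a minimal sketch of
reading a results file back in. It assumes the layout shown above: the
";;" lines are ordinary Lisp comments, so a single READ skips them and
returns the list that follows. The names READ-BENCHMARK-RESULTS and
*SAMPLE* are hypothetical and are not taken from "report.lisp"; the
sample string is a shortened copy of the file shown above.

```lisp
;; Sketch only: READ-BENCHMARK-RESULTS and *SAMPLE* are illustrative
;; names, not part of the benchmark suite.
(defun read-benchmark-results (stream)
  "Return two values: the implementation description string and the
list of (name real user sys consed) entries read from STREAM."
  (let* ((*read-eval* nil)        ; do not evaluate #. forms in a data file
         (form (read stream)))    ; comment lines are skipped by the reader
    (values (first form) (rest form))))

;; A shortened sample in the same shape as the results file.
(defparameter *sample*
  ";; -*- lisp -*-  CMU Common Lisp CVS sources, level-1 built 2001-08-22 on maftia1
;; Function                     real     user      sys       consed
(\"CMU Common Lisp CVS sources, level-1 built 2001-08-22 on maftia1\"
 (\"fib\" 2.39 2.39 0.0 0)
 (\"Boyer\" 1.5 1.22 0.28 54349520))")

;; Print one line per test from the sample.
(with-input-from-string (in *sample*)
  (multiple-value-bind (impl entries) (read-benchmark-results in)
    (format t "~A~%" impl)
    (dolist (entry entries)
      (destructuring-bind (name real user sys consed) entry
        (declare (ignorable sys consed))
        (format t "  ~A: real ~,2F s, user ~,2F s~%" name real user)))))
```

To read an actual file, the same function could be called inside
WITH-OPEN-FILE on one of the /var/tmp/CL-benchmark-* files.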

For each test, you should see the elapsed user time, and possibly (if
this has been coded for your implementation) elapsed system time and
the number of bytes consed during the test execution.

The data in the different /var/tmp/CL-benchmark-* files is analysed by
the file "report.lisp", to generate a report comparing the performance
of the different implementations. This file needs to be run in a
Common Lisp implementation; the one you use will be considered the
"reference" implementation. In the report which is generated, for each
test the timing for the reference implementation will be shown, as
well as the _relative times_ for each of the other tested
implementations. A relative time under 1 means the implementation was
faster than the reference on that test; a relative time of 2 means it
was twice as slow. If a given test doesn't work in a particular
implementation (for example CLISP doesn't do non-standard method
combination), its entry will be -1.0.

Here is an example of the type of results you can obtain, for x86 and
SPARC:

,---- PentiumIII at 1GHz, 256MB RAM, Linux 2.4.2 ---
|
| Benchmark                 Reference  CLISP  CMU C   SBCL
| ----------------------------------------------------------------
| BOYER                          2.36   4.54   0.67   0.94
| BROWSE                         1.04   2.15   0.65   1.04
| DDerviv                        1.19   1.96   0.48   1.06
| Deriv                          2.27   1.93   0.42   1.04
| DESTRUCTIVE                    1.52   2.79   0.89   1.06
| DIV2-TEST-1                    1.73   1.84   0.51   1.09
| DIV2-TEST-2                    0.85   1.87   0.46   1.12
| FFT                            0.22  36.09   1.14   1.14
| FRPOLY/FIXNUM                  0.79   5.81   0.81   0.96
| FRPOLY/BIGNUM                  1.99   2.03   0.68   0.96
| FRPOLY/FLOAT                   0.78   3.79   0.72   0.99
| PUZZLE                         0.79  23.09   1.15   9.95
| CTAK                           0.86   6.28   1.10   1.06
| TAK                            0.91   8.86   1.18   1.34
| RTAK                           0.91   8.86   1.13   1.34
| TAKL                           1.67   7.33   1.16   1.22
| STAK                           1.15   6.66   1.15   1.10
| FPRINT                         1.42   1.12   1.05   2.37
| TRAVERSE                       4.35   6.75   1.19   1.64
| TRIANGLE                       2.01  17.22   1.14   1.27
| CASCOR                         4.06  80.47   1.23   0.92
| RICHARDS                       0.58  24.78   1.22   1.07
| FACTORIAL                      0.50   3.56   0.68   1.38
| FIB                            0.39   5.67   1.13   1.08
| BIGNUM/ELEM-100-1000           1.33   0.11   1.02   0.98
| BIGNUM/ELEM-1000-100           6.03   0.07   1.11   0.93
| BIGNUM/ELEM-10000-1            6.14   0.05   1.00   0.93
| BIGNUM/PARI-100-10             1.51   0.06   0.96   0.91
| BIGNUM/PARI-200-5             17.32   0.02   0.99   0.88
| HASH-STRINGS                   1.65   3.00   0.82   0.99
| HASH-INTEGERS                  0.73   2.30   0.68   1.05
| BOEHM-GC                      10.18   3.94   0.39   1.19
| CLOS/defclass                  2.64   0.35   0.83   2.33
| CLOS/defmethod                13.93   0.02   0.70   1.94
| CLOS/instantiate               7.23   1.02   0.63   1.39
| CLOS/methodcalls               8.43   1.33   0.50   1.37
| CLOS/method+after             13.90   0.45   0.79   2.64
| CLOS/complex-methods           0.35  -1.00   1.06   5.40
| 1D-ARRAYS                      3.74   4.57   0.69   1.26
| 2D-ARRAYS                     15.19   3.25   0.92   4.00
| BITVECTORS                     2.92   0.61   0.70   0.95
| FILL-STRINGS                  10.62   1.15   0.76   3.03
| fill-strings/adjustable       18.13   1.08   0.92   1.65
| BENCH-STRING-CONCAT            9.99  -1.00   0.81   2.34
|
| Reference implementation: CMU Common Lisp 18c
| Impl CLISP: CLISP 2.27 (released 2001-07-17) (built 3204291355) (memory 3205854943)
| Impl CMU C: CMU Common Lisp 18d-pre, level-1 built 2001-12-14 on maftia1
| Impl SBCL : SBCL 0.7.0
| Linux maftia1 2.4.2-2 #1 Sun Apr 8 20:41:30 EDT 2001 i686 unknown
`----

,---- UltraSPARCIIe at 500MHz, 640MB RAM, SunOS 5.8 ---
|
| Benchmark                 Reference  CMU C  CLISP
| -----------------------------------------------------
| BOYER                          3.98   0.91   8.03
| BROWSE                         1.72   0.91   2.85
| DDerviv                        2.02   0.75   3.21
| Deriv                          3.63   0.81   3.13
| DESTRUCTIVE                    3.11   1.01   4.18
| DIV2-TEST-1                    2.19   0.83   3.92
| DIV2-TEST-2                    1.12   0.82   3.85
| FFT                            0.74   1.03  28.86
| FRPOLY/FIXNUM                  1.87   1.01   7.89
| FRPOLY/BIGNUM                  4.59   1.29   3.07
| FRPOLY/FLOAT                   1.65   0.96   5.68
| PUZZLE                         2.07   0.95  30.62
| CTAK                           2.74   1.01   9.04
| TAK                            1.84   1.00  14.08
| RTAK                           1.84   1.01  13.95
| TAKL                           3.37   1.01  11.63
| STAK                           2.32   1.01   8.87
| FPRINT                         4.17   1.02   1.12
| TRAVERSE                       5.84   0.99  13.74
| TRIANGLE                       5.53   0.86  15.57
| CASCOR                        10.53   0.73  52.81
| RICHARDS                       2.35   0.94  22.46
| FACTORIAL                      1.46   1.48   2.88
| FIB                            0.94   0.99   6.71
| BIGNUM/ELEM-100-1000           2.80   1.24   0.28
| BIGNUM/ELEM-1000-100          10.14   1.19   0.44
| BIGNUM/ELEM-10000-1           11.38   1.35   0.41
| BIGNUM/PARI-100-10             2.76   1.15   0.09
| BIGNUM/PARI-200-5             27.19   1.06   0.05
| READ-LINE                      3.39   1.06   1.19
| HASH-STRINGS                   5.42   1.20   2.19
| HASH-INTEGERS                  1.61   0.76   2.00
| BOEHM-GC                      19.97   0.76   4.14
| CLOS/defclass                  4.78   1.01   0.81
| CLOS/defmethod                27.61   0.89   0.03
| CLOS/instantiate              20.93   0.85   1.28
| CLOS/methodcalls              23.62   1.08   1.94
| CLOS/method+after             33.70   1.07   0.78
| CLOS/complex-methods           1.41   0.92  -1.00
| 1D-ARRAYS                     10.77   0.92   3.51
| 2D-ARRAYS                     56.66   1.40   2.61
| BITVECTORS                     5.35   0.86   0.42
| FILL-STRINGS                  18.88   1.07   0.97
| fill-strings/adjustable       45.09   1.46   1.41
| BENCH-STRING-CONCAT           48.10   0.90  -1.00
|
| Reference implementation: CMU Common Lisp 18c, Built 2000-11-27
| Impl CMU C: CMU Common Lisp 18d-pre, level-1 built 2001-12-12 on liszt
| Impl CLISP: CLISP 2.27.2 (released 2001-10-05) (built on moustacho)
| SunOS eagles 5.8 Generic_108528-10 sun4u sparc SUNW,Sun-Blade-100
`----

Note that the test suite doesn't take compilation time into account
(except for the CLOS-related tests, where the compiler may be used at
runtime to create effective methods). You can use the time taken to
load do-compilation-script.lisp as a rough proxy for compilation time.

   "Life is short and it was not meant to be spent making people feel
   guilty about instruction pipelines being only partly full or caches
   being missed."
    -- Kent Pitman in


Thanks
======

Raymond Toy, Christophe Rhodes, Peter Van Eynde, Sven Van
Caekenberghe, Kevin Layers and possibly others that I have forgotten
to note.


Related work
============

  - @misc{ gabriel86performance,
      author = "R. Gabriel",
      title = "Performance and Evaluation of Lisp Systems",
      text = "R. P. Gabriel. Performance and Evaluation of Lisp Systems.
              MIT Press, Cambridge, Massachusetts, 1986.",
      year = "1986" }

  - Scheme benchmarks by Will Clinger (Larceny and TwoBit compilers)

  - Bagley's Programming Language Shootout


Eric Marsden