Simple Questions, Simple Benchmarks

One day I was thinking about the abstract security of certain cryptographic algorithms, and I realized I no longer had any good idea how much slower a divide is than an add or a multiply on modern processors, nor any feel for how many ops/sec one can really push through today's fastest chips.

So I wrote a little five-minute benchmark that performs 100 million each of several operations, times each batch, and prints the results. The source for the current version of this benchmark is here. The results of running this test on various machines are listed below. If you have access to a machine I don't have listed, see different results for a given machine than I did, or have suggestions for improving this set of tests, please mail me at jasonc aht introrse doht kom.
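
As a rough illustration of the kind of loop described above, here is a minimal sketch in C. It is not the original source: the names `N_OPS`, `mops_per_sec`, `time_additions`, and `report` are made up for this example, and the use of `clock()` and `volatile` is an assumption about how such a test could be written.

```c
#include <stdio.h>
#include <time.h>

#define N_OPS 100000000L   /* 100 million operations per test, as in the text */

/* Millions of operations per second for `ops` operations in `seconds`. */
double mops_per_sec(long ops, double seconds)
{
    return (double)ops / seconds / 1e6;
}

/* Time n integer additions and return the elapsed CPU time in seconds.
   The operands are volatile (an assumption of this sketch) so that the
   compiler has to perform the addition on every pass. */
double time_additions(long n)
{
    volatile int a = 1234567, b = 89, r = 0;
    clock_t start = clock();
    for (long i = 0; i < n; i++)
        r = a + b;
    clock_t stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}

/* Print one result line in the style of the listings on this page. */
void report(const char *name, long ops, double seconds)
{
    printf("%s took %.3f seconds. (%.2f M/sec)\n",
           name, seconds, mops_per_sec(ops, seconds));
}
```

Calling `report("Addition", N_OPS, time_additions(N_OPS))` from `main` would print one line; the other operations would follow the same pattern with a different operator in the loop. For the 12.968 seconds reported for the i486DX/33, `mops_per_sec` gives about 7.71, which matches the figure in that listing.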

Important Note: When compiling this code, you must make sure that your compiler does not perform common subexpression elimination or loop-invariant code motion! This code is so simple that a clever compiler will see through the fact that all the ops I'm doing have exactly the same result, and will do the operation only once, or only 1 million times instead of 100 million.
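
To make the caveat concrete, here is a small sketch in C. It is not taken from the benchmark source, and `add_naive` and `add_volatile` are hypothetical names. In the first function the expression `a + b` never changes inside the loop, so an optimizer is free to compute it once or remove the loop altogether. The second shows one common workaround, assumed here rather than known to be the author's: declaring the operands `volatile`.

```c
/* Naive loop: a + b is loop-invariant, so an optimizing compiler may
   compute it a single time or drop the loop entirely. */
int add_naive(long n)
{
    int a = 1234567, b = 89, r = 0;
    for (long i = 0; i < n; i++)
        r = a + b;
    return r;
}

/* Workaround (an assumption of this sketch): volatile forces the compiler
   to reload a and b and to store r on every pass, so all n additions
   really happen. */
int add_volatile(long n)
{
    volatile int a = 1234567, b = 89, r = 0;
    for (long i = 0; i < n; i++)
        r = a + b;
    return r;
}
```

Both functions return the same value; only the amount of work the compiler is allowed to skip differs. Note that the `volatile` version also forces loads and a store on every pass, so it measures a little more than the bare operation. Compiling with optimization switched off (for example `-O0` with GCC) is another option.
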
Some interesting tidbits:
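• Logical AND takes significantly longer than bitwise AND on every machine where both were measured, from about 1.7 times as long (Dec Alpha) to nearly 10 times (SGI).
• Logical AND takes longer than division on the SGI tested (5.620 vs. 1.420 seconds)!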
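
The page does not say why logical AND is slower. One likely factor is that in C `a && b` is not a single machine operation: it tests each operand against zero, may skip the second test, and produces 0 or 1, which compilers often implement with compares and branches. The sketch below (function names are hypothetical) spells out that difference from `a & b`.

```c
/* Bitwise AND: combines the bits of both operands in one operation. */
int bitwise_and(int a, int b)
{
    return a & b;
}

/* Logical AND: yields 1 if both operands are nonzero, otherwise 0. */
int logical_and(int a, int b)
{
    return a && b;
}

/* The same logical AND written out as the tests it implies. */
int logical_and_expanded(int a, int b)
{
    if (a == 0)
        return 0;   /* short-circuit: b is never examined */
    if (b == 0)
        return 0;
    return 1;
}
```

For example, `bitwise_and(4, 2)` is 0 because the operands share no set bits, while `logical_and(4, 2)` is 1 because both are nonzero.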

Special thanks to Parke Bostrom for the Dec Alpha test, to Mark Hagfors for the Pentium Pro and Sparc/4 tests, and to Alex Campbell for the 486/33 test!

Intel i486DX/33
```
Addition took 12.968 seconds. ( 7.71 M/sec)
Subtraction took 12.969 seconds. ( 7.71 M/sec)
Multiplication took 117.813 seconds. ( 0.84 M/sec)
Divisions took 151.875 seconds. ( 0.65 M/sec)
Remainder took 151.843 seconds. ( 0.65 M/sec)
Bitwise AND took 12.938 seconds. ( 7.72 M/sec)
Logical AND took 49.312 seconds. ( 2.02 M/sec)
```

Intel i486DX2/66
```
Addition took 9.750 seconds. (10.2 M/sec)
Subtraction took 9.719 seconds. (10.2 M/sec)
Multiplication took 58.250 seconds. ( 1.71 M/sec)
Division took 75.000 seconds. ( 1.33 M/sec)
Remainder took 74.750 seconds. ( 1.33 M/sec)
Bitwise AND took 9.750 seconds. (10.2 M/sec)
Logical AND took 24.594 seconds. ( 4.06 M/sec)
```

Intel Pentium/133
```
Addition took 2.300 seconds. (43.4 M/sec)
Subtraction took 2.316 seconds. (43.1 M/sec)
Multiplication took 7.583 seconds. (13.1 M/sec)
Division took 36.933 seconds. ( 2.70 M/sec)
```

Intel Pentium Pro/150
```
Addition took 2.055 seconds. (48.6 M/sec)
Subtraction took 2.055 seconds. (48.6 M/sec)
Multiplication took 6.749 seconds. (14.8 M/sec)
Division took 33.527 seconds. ( 2.98 M/sec)
Remainder took 33.575 seconds. ( 2.97 M/sec)
Bitwise AND took 2.064 seconds. (48.4 M/sec)
Logical AND took 5.416 seconds. (18.4 M/sec)
```

Dec Alpha (amy3)
```
Addition took 5.063 seconds. (19.75 M/sec)
Subtraction took 5.113 seconds. (19.56 M/sec)
Multiplication took 22.469 seconds. ( 4.45 M/sec)
Division took 47.965 seconds. ( 2.08 M/sec)
Remainder took 51.046 seconds. ( 1.96 M/sec)
Bitwise AND took 5.080 seconds. (19.69 M/sec)
Logical AND took 8.760 seconds. (11.42 M/sec)
```

Sun Sparc/4
```
Addition took 7.628 seconds. (13.1 M/sec)
Subtraction took 7.733 seconds. (12.9 M/sec)
Multiplication took 93.442 seconds. ( 1.07 M/sec)
Division took 37.245 seconds. ( 2.68 M/sec)
Remainder took 39.920 seconds. ( 2.50 M/sec)
Bitwise AND took 7.632 seconds. (13.1 M/sec)
Logical AND took 19.361 seconds. ( 5.16 M/sec)
```

SGI ? (shellx, a former 10K+ user shell machine at one of my ISPs)
```
Addition took 0.580 seconds. (172.41 M/sec)
Subtraction took 0.580 seconds. (172.41 M/sec)
Multiplication took 0.630 seconds. (158.73 M/sec)
Divisions took 1.420 seconds. ( 70.42 M/sec)
Remainder took 1.460 seconds. ( 68.49 M/sec)
Bitwise AND took 0.580 seconds. (172.41 M/sec)
Logical AND took 5.620 seconds. ( 17.79 M/sec)
```

AMD K6-II/400
```
Addition took 0.523 seconds. (191.0 M/sec)
Subtraction took 0.523 seconds. (191.0 M/sec)
Multiplication took 1.039 seconds. ( 96.2 M/sec)
Divisions took 7.055 seconds. ( 14.2 M/sec)
Remainder took 7.047 seconds. ( 14.2 M/sec)
Bitwise AND took 0.539 seconds. (185.5 M/sec)
Logical AND took 1.186 seconds. ( 84.2 M/sec)
```