Micro-benchmarking
I wrote this little micro-benchmark to test it out:
public class MultiplyDivide {
    public static void main(String[] args) {
        for (int j = 0; j < 10; ++j) {
            long t = System.currentTimeMillis();
            double sum = 0;
            for (int k = 0; k < 1000; ++k)
                for (double i = 0; i < 100000.0; i += 1.0)
                    sum += i * 0.001; // += i / 1000.0; // += i;
            long d = System.currentTimeMillis() - t;
            System.out.println("sum=" + sum + " ms=" + d);
        }
    }
}
Run time on 1.6 JVM
Here’s the timing on my workstation, a dual quad-core Xeon (E5410, 2.33GHz) with 16GB of ECC/registered DDR2-667 memory, running 64-bit Vista. I’m running the following 1.6 JVM: Java(TM) SE Runtime Environment (build 1.6.0_05-b13), Java HotSpot(TM) 64-Bit Server VM (build 10.0-b19, mixed mode)
Because there’s more going on than just the inner loop’s multiplication or division, here are timings with the inner loop doing just sum += i:
c:\carp\temp>java MultiplyDivide
sum=4.99995E12 ms=383
sum=4.99995E12 ms=390
sum=4.99995E12 ms=383
sum=4.99995E12 ms=391
sum=4.99995E12 ms=390
sum=4.99995E12 ms=383
sum=4.99995E12 ms=383
sum=4.99995E12 ms=390
sum=4.99995E12 ms=383
sum=4.99995E12 ms=383
Here’s the result with multiplication:
sum=4.99995E9 ms=391
sum=4.99995E9 ms=381
sum=4.99995E9 ms=402
sum=4.99995E9 ms=406
sum=4.99995E9 ms=406
sum=4.99995E9 ms=375
sum=4.99995E9 ms=375
sum=4.99995E9 ms=401
sum=4.99995E9 ms=391
sum=4.99995E9 ms=390
At 375 to 406 ms, multiplication costs next to nothing on top of the basic loops and additions.
Compare this to division (i/1000.0):
sum=4.99995E9 ms=859
sum=4.99995E9 ms=859
sum=4.99995E9 ms=860
sum=4.99995E9 ms=859
sum=4.99995E9 ms=860
sum=4.99995E9 ms=859
sum=4.99995E9 ms=843
sum=4.99995E9 ms=867
sum=4.99995E9 ms=860
sum=4.99995E9 ms=851
This is a huge penalty: the division runs take about 860 ms, versus roughly 390 ms with multiplication or with nothing but the addition. All that extra cost is due to division.
I tried different numbers to multiply and divide and their choice doesn’t matter.
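To put those timings in per-operation terms, here is a rough back-of-the-envelope sketch. It assumes the whole gap between the addition-only and division runs is attributable to the 1000 x 100000 divisions, and that the CPU runs at its nominal 2.33GHz; the class and method names are made up for illustration.

```java
class PerOpCost {
    // Arithmetic mean of a set of timings in milliseconds.
    static double mean(long[] ms) {
        double total = 0.0;
        for (long m : ms) total += m;
        return total / ms.length;
    }

    // Extra nanoseconds per inner-loop iteration of withOp over baseline.
    static double extraNanosPerOp(long[] baseline, long[] withOp, long iterations) {
        return (mean(withOp) - mean(baseline)) * 1.0e6 / iterations;
    }

    public static void main(String[] args) {
        // Timings copied from the 1.6 JVM runs above.
        long[] addOnly = {383, 390, 383, 391, 390, 383, 383, 390, 383, 383};
        long[] divide = {859, 859, 860, 859, 860, 859, 843, 867, 860, 851};
        long iterations = 1000L * 100000L; // outer loop times inner loop
        double clockGHz = 2.33;            // assumption: nominal clock speed
        double ns = extraNanosPerOp(addOnly, divide, iterations);
        System.out.println("extra ns per division=" + ns
                           + " approx cycles=" + ns * clockGHz);
    }
}
```

With these numbers the gap works out to roughly 4.7 ns per division, or on the order of 11 clock cycles, though a figure like this is only as good as the assumption that the JIT compiles the two loops identically apart from the division.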
Compilation in JDK 1.5 vs. 1.6
The numbers above are from code compiled with the 1.6 JDK. Compiling with 1.5 (as we do for LingPipe) and then running in the 1.6 JVM didn’t make a difference.
Run time in 1.5 JVM
Here are numbers for the 1.5 JVM, Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_15-b04), Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_15-b04, mixed mode).
First, with just the addition in the loop:
sum=4.99995E12 ms=156
sum=4.99995E12 ms=204
sum=4.99995E12 ms=203
sum=4.99995E12 ms=203
sum=4.99995E12 ms=203
sum=4.99995E12 ms=187
sum=4.99995E12 ms=188
sum=4.99995E12 ms=203
sum=4.99995E12 ms=203
sum=4.99995E12 ms=203
I have no idea why the 1.5 JVM is twice as fast as the 1.6 JVM for simple loops.
Here’s the result for multiplication, which clearly has a measurable cost in 1.5, with the total compute time for multiplication still a bit faster than in 1.6:
sum=4.99995E15 ms=346
sum=4.99995E15 ms=351
sum=4.99995E15 ms=352
sum=4.99995E15 ms=351
sum=4.99995E15 ms=353
sum=4.99995E15 ms=351
sum=4.99995E15 ms=352
sum=4.99995E15 ms=352
sum=4.99995E15 ms=352
sum=4.99995E15 ms=351
Division is relatively slow in the 1.5 JVM:
c:\carp\temp>java MultiplyDivide
sum=4.99995E9 ms=1057
sum=4.99995E9 ms=1032
sum=4.99995E9 ms=1031
sum=4.99995E9 ms=1016
sum=4.99995E9 ms=1015
sum=4.99995E9 ms=1016
sum=4.99995E9 ms=1031
sum=4.99995E9 ms=1032
sum=4.99995E9 ms=1031
sum=4.99995E9 ms=1031
Of course, one has to be careful that precision doesn’t get hurt by switching divides to multiplies.
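The precision concern is easy to check directly. The constant 0.001 has no exact binary representation, so i * 0.001 is rounded twice (once when the constant is stored and once for the product), whereas i / 1000.0 is rounded only once. The sketch below counts how often the two disagree over the same range the benchmark uses; the class and method names are hypothetical, not part of LingPipe.

```java
class PrecisionCheck {
    // Counts the integers i in [0, n) for which i * factor and i / divisor
    // produce different doubles.
    static int countMismatches(double factor, double divisor, int n) {
        int mismatches = 0;
        for (int i = 0; i < n; ++i) {
            double x = i;
            if (x * factor != x / divisor) ++mismatches;
        }
        return mismatches;
    }

    // Largest gap between the two results, measured in units in the last
    // place (ulps) of the division result.
    static double maxUlpGap(double factor, double divisor, int n) {
        double max = 0.0;
        for (int i = 0; i < n; ++i) {
            double product = i * factor;
            double quotient = i / divisor;
            double gap = Math.abs(product - quotient) / Math.ulp(quotient);
            if (gap > max) max = gap;
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println("mismatches for 0.001 vs 1000.0: "
                           + countMismatches(0.001, 1000.0, 100000));
        System.out.println("max gap in ulps: "
                           + maxUlpGap(0.001, 1000.0, 100000));
        // A power-of-two reciprocal is exact, so this swap loses nothing.
        System.out.println("mismatches for 0.5 vs 2.0: "
                           + countMismatches(0.5, 2.0, 100000));
    }
}
```

The individual differences are expected to be tiny, on the order of one ulp, so whether they matter depends on how many of them get accumulated and on how sensitive the downstream computation is.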
Comparison to Logarithms
In the 1.5 JVM, replacing x/1000.0 with Math.log(x) made the loop about ten times slower. In the 1.6 JVM, logs were only about three times slower than division. As I’ve said before, use one of the more recent builds of 1.6 in server mode if you want the fastest JVM for LingPipe.
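For anyone who wants to repeat the log comparison, here is a minimal sketch of how the two loops might be timed side by side. It is not the code behind the numbers above: the class and method names are invented, it uses System.nanoTime() rather than currentTimeMillis(), and the log loop is assumed to start at 1 so that Math.log(0), which is -Infinity, doesn’t swamp the sum.

```java
class LogVsDivide {
    // Sums i / 1000.0 for i = 0 .. n-1, repeated reps times.
    static double divideLoop(int reps, int n) {
        double sum = 0.0;
        for (int k = 0; k < reps; ++k)
            for (double i = 0; i < n; i += 1.0)
                sum += i / 1000.0;
        return sum;
    }

    // Sums Math.log(i) for i = 1 .. n, repeated reps times.
    static double logLoop(int reps, int n) {
        double sum = 0.0;
        for (int k = 0; k < reps; ++k)
            for (double i = 1.0; i <= n; i += 1.0)
                sum += Math.log(i);
        return sum;
    }

    public static void main(String[] args) {
        // Ten rounds, as in the original benchmark; the early rounds also
        // give the JIT a chance to compile the loops.
        for (int j = 0; j < 10; ++j) {
            long t0 = System.nanoTime();
            double divSum = divideLoop(1000, 100000);
            long t1 = System.nanoTime();
            double logSum = logLoop(1000, 100000);
            long t2 = System.nanoTime();
            // Printing the sums keeps the loops from being optimized away.
            System.out.println("divide sum=" + divSum
                               + " ms=" + (t1 - t0) / 1000000L
                               + " | log sum=" + logSum
                               + " ms=" + (t2 - t1) / 1000000L);
        }
    }
}
```

Keeping each operation in its own static method means each loop is compiled separately, which seemed the simplest way to keep one measurement from influencing the other; the ratio you see will depend on the JVM build and hardware.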