Bubble, bubble, toil and Double

Inspired by Weiqi Gao's Friday Java Quiz this week, I came up with a few more curiosities of Java double. What does each of the calls to larger() print? Hint: this puzzle has nothing to do with loss of precision from type conversion; it is only about the specified behavior of doubles.

public class Foo {
    public static void main(String[] args) {
        larger(0.0d, Double.MIN_VALUE);
        larger(1.0d, Double.MIN_VALUE*Double.MAX_VALUE);
        larger(1.0d, Double.MIN_VALUE*Double.MAX_VALUE*Double.MAX_VALUE);
        larger(1.0d/Double.MIN_VALUE, 1.0d/(1.0d/Double.MAX_VALUE));
        larger(1.0d/Double.MIN_VALUE, 1.0d/0.0d);
        larger(1.0d, Double.MIN_VALUE/Double.MIN_VALUE);
    }

    public static void larger(Double a, Double b){
        System.out.print(a + " or " + b + " ? ");
        if (a < b){
            System.out.println(b);
        } else if (a > b){
            System.out.println(a);
        } else {
            System.out.println("equal");
        }
    }
}

Highlight the hidden text below for the answers:

0.0 or 4.9E-324 ? 4.9E-324
1.0 or 8.881784197001251E-16 ? 1.0
1.0 or 1.5966722476277755E293 ? 1.5966722476277755E293
Infinity or Infinity ? equal
Infinity or Infinity ? equal
1.0 or 1.0 ? equal
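
For anyone who wants to poke at why the answers come out this way, here is a minimal sketch that checks the edge values directly. The class name DoubleEdges and its helper methods are made up for this illustration; the only things it relies on are the specified values of Double.MIN_VALUE and Double.MAX_VALUE.

```java
final class DoubleEdges {
    // MIN_VALUE is the smallest positive double (2^-1074), not the most negative one.
    static boolean minValueIsSmallestPositive() {
        return Double.MIN_VALUE > 0.0d && Double.MIN_VALUE == Math.ulp(0.0d);
    }

    // 1/MIN_VALUE would be 2^1074, far above MAX_VALUE, so the division overflows.
    static boolean reciprocalOfMinOverflows() {
        return 1.0d / Double.MIN_VALUE == Double.POSITIVE_INFINITY;
    }

    // 1/MAX_VALUE lands in the subnormal range and rounds to exactly 2^-1024.
    // Inverting that gives 2^1024, which overflows instead of returning MAX_VALUE.
    static boolean doubleReciprocalOfMaxOverflows() {
        double tiny = 1.0d / Double.MAX_VALUE;
        return tiny == 0x1.0p-1024 && 1.0d / tiny == Double.POSITIVE_INFINITY;
    }

    // Two infinities reached by different routes are neither smaller nor larger
    // than each other, so larger() falls through to "equal".
    static boolean infinitiesCompareEqual() {
        double x = 1.0d / Double.MIN_VALUE;
        double y = 1.0d / 0.0d;
        return !(x < y) && !(x > y);
    }

    public static void main(String[] args) {
        // Each check is expected to print true.
        System.out.println(minValueIsSmallestPositive());
        System.out.println(reciprocalOfMinOverflows());
        System.out.println(doubleReciprocalOfMaxOverflows());
        System.out.println(infinitiesCompareEqual());
    }
}
```

The surprising one is the round trip through 1/MAX_VALUE: the intermediate result loses just enough precision that the second division no longer lands back inside the finite range.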

String Theory

No, not that string theory. This string theory is testable. In fact, the moral of this story is this: don't make any assumptions about Java. The compiler or the VM. They do crazy stuff, like make your code run better. If you think you have an inefficiency in your code, test it to see. It could be that you're about to make your code less readable for no performance benefit.

The software I work on has a code generation component, so there's a bunch of string concatenation and output. The main path for generation passes around a StringBuilder (which is 20% faster than the old synchronized StringBuffer) to multiple classes, which each concatenate code into it. Some of the code generation within each class still used string concatenation (+=) for building intermediate strings, and I knew we needed to convert it to use an SB. When I ran FindBugs on the codebase, this alone accounted for about 25% of the true positives.

The main uses in the code are things like:

String s = "";
for (Foo f: foos){
    s += "baz " + f.id + " { " + f.type + " } ";
}

or

StringBuilder sb = ...;
for (Foo f: foos){
    sb.append("baz " + f.id + " { " + f.type + " } ");
}

My original thought was that both of these were horrible and were creating n strings in memory for each concat, which were then immediately thrown away. However, after doing a few experiments and looking at the bytecode, I discovered a few things. First, the compiler converts Strings concat'ed using '+' into calls to StringBuilder.append. This means that there aren't any intermediate String objects generated in an expression like "a+b+c", only a StringBuilder and the result String.

The first snippet above is still bad, because to create the new String to assign to s, it has to create two new immutable Strings, one with the value of the String to add and one with the result of the concat, and each pass through the loop copies everything accumulated in s so far. The garbage collector then must clean up the SB, the String from the SB, and the old value of s. In the second snippet, the only intermediate objects that get thrown away are a temporary StringBuilder and the String it produces, and those only ever hold one iteration's worth of text before it is copied into the StringBuilder we keep.

As the numbers below show, there's not that much difference between multiple chained calls to append and one call to append with the syntactic sugar of the '+'. Using the '+' syntax is much more readable, so this is the option I'm going with in my code.
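
To make the compiler's rewriting concrete, here is a small sketch, not taken from the test code, that puts the '+' form next to the StringBuilder form described above. The class and method names are hypothetical. One caveat: this mirrors the bytecode from compilers of the time, and newer javac versions (JDK 9 and later) compile '+' through invokedynamic rather than an explicit StringBuilder, though the resulting String is the same.

```java
final class ConcatShapes {
    // The readable form, as written in source.
    static String withPlus(String a, String b, String c) {
        return a + b + c;
    }

    // Roughly what the compiler generated for it: one StringBuilder, one result String.
    static String withBuilder(String a, String b, String c) {
        return new StringBuilder().append(a).append(b).append(c).toString();
    }

    // Literals joined with '+' form a compile-time constant, so the compiler
    // folds them into the single literal "abc". The identity comparison is
    // deliberate: both sides refer to the same interned String.
    static boolean constantsAreFolded() {
        return ("a" + "b" + "c") == "abc";
    }

    public static void main(String[] args) {
        String a = "a", b = "b", c = "c";
        // Same contents either way.
        System.out.println(withPlus(a, b, c).equals(withBuilder(a, b, c)));
        // Constants are folded at compile time.
        System.out.println(constantsAreFolded());
        // With variables the String is built at run time, so it is a new object.
        System.out.println(withPlus(a, b, c) == "abc");
    }
}
```

The first two lines print true and the last prints false, which is the difference between rows (4) and (5) in the results: with constants there is nothing left to concatenate at run time.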

The trivial moral of the story is never to build up a String with '+=', but using '+' inside an append is fine. The general moral is that from now on, I'm definitely going to measure, then change.

The results for various String concat methods:

# | test | code | server (ms) | client (ms) | notes
1 | string + w/ vars | s += a + b + c | 929 | 2388 |
2 | string + w/ consts | s += "a" + "b" + "c" | 902 | 2380 | the compiler transforms this into 's += "abc"'
3 | string + w/ SB | s += new StringBuilder().append(a).append(b).append(c).toString() | 915 | 2390 | like bytecode of (1)
4 | str bld + | sb.append(a + b + c) | 9 | 14 | common pattern
5 | str bld + CTC | sb.append("a" + "b" + "c") | 7 | 11 | constants instead of variables (4)
6 | str bld ++ | sb.append(new StringBuilder().append(a).append(b).append(c).toString()) | 9 | 14 | bytecode of (4)
7 | str bld app | sb.append(a).append(b).append(c) | 7 | 11 | most 'proper' usage
8 | MString append | sb.append(a,b,c) | 8 | 12 | impl. of "faster" interface

This is the example code for the tests. Note that closures would have been really useful here, since I wouldn't have had to duplicate this method (can't wait for 7!).

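    // runs, gc, a, b and c are fields of the test class (see StringTest.java)
    public long runStringBuilderAppend(){
        long start = System.currentTimeMillis();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < runs; i++){
            sb.append(a).append(b).append(c);
        }
        sb.toString();
        sb = null;
        // Optionally count a GC pass as part of the measured time.
        // Returning normally instead of from a finally block means an
        // exception during the run is no longer silently swallowed.
        if (gc) System.gc();
        return System.currentTimeMillis() - start;
    }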

source: StringTest.java
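
For comparison, here is a rough sketch of what the harness could look like once closures are available. They ended up arriving as lambdas in Java 8 rather than in 7. This is not the code from StringTest.java: the class name ConcatTimer, the helper methods, the run count and the input strings are all placeholders I made up for the illustration, and it covers only three of the variants from the table.

```java
final class ConcatTimer {
    // Times any block of work, so each variant no longer needs its own
    // copy of the start/stop bookkeeping.
    static long timeMillis(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    // Variant (1) from the table: '+=' on a String.
    static String plusEquals(int runs, String a, String b, String c) {
        String s = "";
        for (int i = 0; i < runs; i++) {
            s += a + b + c;
        }
        return s;
    }

    // Variant (4): '+' inside a single append.
    static String plusInsideAppend(int runs, String a, String b, String c) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < runs; i++) {
            sb.append(a + b + c);
        }
        return sb.toString();
    }

    // Variant (7): chained appends.
    static String chainedAppend(int runs, String a, String b, String c) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < runs; i++) {
            sb.append(a).append(b).append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        final int runs = 10_000;                      // placeholder run count
        final String a = "foo", b = "bar", c = "baz"; // placeholder inputs
        System.out.println("s += a + b + c       : " + timeMillis(() -> plusEquals(runs, a, b, c)) + " ms");
        System.out.println("sb.append(a + b + c) : " + timeMillis(() -> plusInsideAppend(runs, a, b, c)) + " ms");
        System.out.println("chained append       : " + timeMillis(() -> chainedAppend(runs, a, b, c)) + " ms");
    }
}
```

A single untimed warm-up pass before measuring would make the numbers more trustworthy, since the JIT compiler has not optimized anything on the first run. I left that out to keep the sketch short.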