ONJava.com -- The Independent Source for Enterprise Java

Java Performance: Efficiently Formatting Doubles

Formatting with internationalization

One interesting point is that my algorithm can run even more quickly when printing formatted numbers. Normally you want to print fewer than 15 decimal places, and my algorithm runs faster the fewer digits it needs to output. This contrasts with the SDK's number formatting, which always takes longer than a plain conversion. The SDK uses the java.text.DecimalFormat class to print formatted floating-point numbers; its conversion algorithm first applies the default SDK double-to-string conversion, then parses and formats the resulting characters to create the formatted string. For example, to format a double with four digits after the decimal point and thousands separators, you could use the following SDK code:



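import java.text.DecimalFormat;
import java.text.FieldPosition;

// Pattern: thousands separators and exactly four digits after the decimal point
DecimalFormat format = new DecimalFormat("#,##0.0000");
FieldPosition f = new FieldPosition(0);
StringBuffer s = new StringBuffer();
format.format(myDouble, s, f);  // appends the formatted value of myDouble to s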

java.text.DecimalFormat also supports internationalized formatting. But supporting internationalization turns out to be remarkably easy to manage for the most frequently used formatting, which needs only a few elements to be internationalized:

  • the decimal point character
  • the thousands separator character
  • the number of digits separated by the thousands separator (normally three, but sometimes four)
  • the prefix and suffix character for negative numbers (normally a minus sign before or after the number, or the number surrounded by parentheses).
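
As a rough sketch, the elements listed above can be read from the SDK's own locale data. The class and field names here (LocaleNumberSymbols, Symbols, lookup) are hypothetical and are not taken from the article's DoubleToString source; the sketch also assumes that NumberFormat.getNumberInstance() returns a DecimalFormat for the locale, which is usual but not guaranteed.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.NumberFormat;
import java.util.Locale;

class LocaleNumberSymbols {
    /** Hypothetical holder for the few locale-dependent values a formatter needs. */
    static final class Symbols {
        final char decimalPoint;
        final char thousandsSeparator;
        final int numDigitsSeparated;
        final String negativePrefix;
        final String negativeSuffix;

        Symbols(char decimalPoint, char thousandsSeparator, int numDigitsSeparated,
                String negativePrefix, String negativeSuffix) {
            this.decimalPoint = decimalPoint;
            this.thousandsSeparator = thousandsSeparator;
            this.numDigitsSeparated = numDigitsSeparated;
            this.negativePrefix = negativePrefix;
            this.negativeSuffix = negativeSuffix;
        }
    }

    /** Reads the values once, so a conversion loop never has to touch DecimalFormat. */
    static Symbols lookup(Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        if (!(nf instanceof DecimalFormat)) {
            // Assumption: most locales yield a DecimalFormat; fail loudly otherwise.
            throw new IllegalArgumentException("No DecimalFormat for " + locale);
        }
        DecimalFormat df = (DecimalFormat) nf;
        DecimalFormatSymbols dfs = df.getDecimalFormatSymbols();
        return new Symbols(dfs.getDecimalSeparator(),
                           dfs.getGroupingSeparator(),
                           df.getGroupingSize(),
                           df.getNegativePrefix(),
                           df.getNegativeSuffix());
    }

    public static void main(String[] args) {
        Symbols s = lookup(Locale.GERMANY);
        System.out.println("decimal point: " + s.decimalPoint);
        System.out.println("thousands separator: " + s.thousandsSeparator);
        System.out.println("digits per group: " + s.numDigitsSeparated);
        System.out.println("negative prefix/suffix: [" + s.negativePrefix
                + "] [" + s.negativeSuffix + "]");
    }
}
```

Looking the values up once per locale and caching them keeps the relatively expensive DecimalFormat machinery out of the per-number conversion path.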

Accessing these values for a particular locale can be managed through the DecimalFormat class. For the conversion algorithm, changing the decimal character and the prefix and suffix characters is straightforward. Adding the thousands separator is slightly more challenging. You need to know the distance of the current digit from the decimal point as you print it, but this distance is given by the magnitude of the digit being printed, and the magnitude is simple to track: you determine the magnitude of the full double as part of the printing algorithm, and you simply decrement it by one for each digit printed. The decision to print a thousands-separator character is then straightforward.

  if (d_magnitude % numDigitsSeparated == (numDigitsSeparated-1))
    s.append(thousandsSeparator);

To avoid printing any thousands separator at all, you could write another identical method without the above logic, or you could simply use a large value for numDigitsSeparated, e.g., Integer.MAX_VALUE.
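
To see the separator test in context, here is a minimal sketch that prints only the integer digits of a non-negative long. It is not the DoubleToString code from this article: the class and method names (GroupedDigits, appendGrouped) are hypothetical, and the sketch assumes the magnitude is decremented immediately after each digit is printed and before the test is applied.

```java
class GroupedDigits {
    /**
     * Appends the decimal digits of a non-negative value to s, inserting
     * thousandsSeparator between groups of numDigitsSeparated digits.
     */
    static void appendGrouped(StringBuffer s, long value,
                              int numDigitsSeparated, char thousandsSeparator) {
        if (value < 0) {
            throw new IllegalArgumentException("non-negative values only");
        }
        // Determine the magnitude (power of ten) of the leading digit.
        int magnitude = 0;
        long scale = 1;
        while (value / scale >= 10) {
            scale *= 10;
            magnitude++;
        }
        // Print digits from most significant to least significant.
        while (scale > 0) {
            s.append((char) ('0' + (value / scale) % 10));
            scale /= 10;
            magnitude--;  // now the magnitude of the next digit to print
            // Java's % keeps the sign of the dividend, so once magnitude goes
            // negative the test can no longer match and no trailing separator
            // is appended after the units digit.
            if (magnitude % numDigitsSeparated == (numDigitsSeparated - 1)) {
                s.append(thousandsSeparator);
            }
        }
    }

    public static void main(String[] args) {
        StringBuffer s = new StringBuffer();
        appendGrouped(s, 1234567L, 3, ',');
        System.out.println(s);
        s.setLength(0);
        appendGrouped(s, 1234567L, Integer.MAX_VALUE, ',');  // grouping disabled
        System.out.println(s);
    }
}
```

Passing Integer.MAX_VALUE as numDigitsSeparated disables grouping because the decremented magnitude can never reach Integer.MAX_VALUE - 1.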

Related Reading

Java Performance Tuning
By Jack Shirazi

Testing

The proof of the pudding is in the eating, so let's test out this effort. In the following table, I've used several Sun VMs on four tests:

  • Test 1: the original conversion algorithm from my book
  • Test 2: the adapted conversion algorithm including formatting
  • Test 3: the SDK StringBuffer.append(double) method (which calls Double.toString())
  • Test 4: the SDK java.text.DecimalFormat.format() method

I've normalized all measured times to the SDK 1.2 VM with the just-in-time (JIT) compiler enabled, running test 1. (That is, all measured times are divided by the measured time for the 1.2 VM running test 1.) Times are the averages over several test runs. HotSpot times are shown for a second run of tests without exiting the VM, so that the server-tuned VM has time for its optimizations to kick in.
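
The measurement approach described above can be sketched roughly as follows. This is not the harness used for the published figures: the class and method names (ConversionTimer, normalize, timeAppend, timeDecimalFormat) are hypothetical, only the two SDK tests (3 and 4) are timed, and the StringBuffer.append(double) time stands in as the baseline because the proprietary test 1 code is not reproduced here. System.nanoTime() is also an assumption of a more recent SDK than the VMs tested.

```java
import java.text.DecimalFormat;
import java.text.FieldPosition;

class ConversionTimer {
    /** Expresses a measured time as a percentage of a baseline time. */
    static double normalize(long measuredNanos, long baselineNanos) {
        return 100.0 * measuredNanos / baselineNanos;
    }

    /** Test 3 style: StringBuffer.append(double). */
    static long timeAppend(double[] values, int repeats) {
        StringBuffer s = new StringBuffer();
        long start = System.nanoTime();
        for (int r = 0; r < repeats; r++) {
            for (double v : values) {
                s.setLength(0);
                s.append(v);
            }
        }
        return System.nanoTime() - start;
    }

    /** Test 4 style: java.text.DecimalFormat.format(). */
    static long timeDecimalFormat(double[] values, int repeats) {
        DecimalFormat format = new DecimalFormat("#,##0.0000");
        FieldPosition f = new FieldPosition(0);
        StringBuffer s = new StringBuffer();
        long start = System.nanoTime();
        for (int r = 0; r < repeats; r++) {
            for (double v : values) {
                s.setLength(0);
                format.format(v, s, f);
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        double[] values = {0.0001, 1.5, 1234.5678, 9876543.21, -42.0, 1.0e12};
        int repeats = 100000;
        // Run everything twice in the same VM and report only the second
        // pass, so that an optimizing VM has had time to warm up.
        for (int pass = 1; pass <= 2; pass++) {
            long baseline = timeAppend(values, repeats);
            long formatted = timeDecimalFormat(values, repeats);
            if (pass == 2) {
                System.out.println("append(double): " + normalize(baseline, baseline) + "%");
                System.out.println("DecimalFormat:  " + normalize(formatted, baseline) + "%");
            }
        }
    }
}
```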

Table 1: Times for converting doubles to strings using various methods and VMs.

 

                                           1.2 VM    1.2 no-JIT VM    1.3 VM    HotSpot 2.0 VM (2nd run)
test 1: proprietary printing               100.0%    420.1%           114.2%    82.0%
test 2: proprietary with formatting        115.1%    414.4%           85.4%     93.8%
test 3: StringBuffer.append(double)        282.2%    926.1%           265.1%    199.8%
test 4: java.text.DecimalFormat.format()   456.1%    1690.2%          409.6%    303.7%

The test results show several interesting things. Firstly, the two tests using my algorithms produced relatively close timings in each VM, but which test was the faster depended on the VM being used. Even the two HotSpot VMs (the standard client-tuned 1.3 VM and the server-tuned HotSpot 2.0) produced a different order for the test timings. To me, this indicates that there are further possible optimizations in both sets of code (test1 and test2), and that the two HotSpot VMs are managing to apply two different (overlapping) sets of optimizations. Looking at the code, I would not be at all surprised to be able to tease out a 10% improvement by some re-factoring of nested tests. The time taken to format numbers depends on the number of digits being printed. I have used a format with four decimal places, but a separate test formatting to two decimal places showed test2 always running faster than test1 for all VMs.

Secondly, all the tests clearly show my algorithms outperforming the SDK conversion methods by a factor of two to four. I had expected the proprietary formatting algorithm to be consistently faster than the proprietary non-formatting algorithm, and the tests did not show that. Nevertheless, they do show that both proprietary algorithms are always significantly faster than the SDK-provided algorithms.

Related Files:

DoubleToString.java

DoubleToString.class

Finally, it is worth noting that to convert floats to strings, you should not simply use the double methods. Although that is technically possible, the smaller float data structure is sufficiently different from double that the methods should be re-implemented for floats, and the smaller range taken account of by using ints to hold the scaled values.
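
One visible consequence of treating a float as a double can be shown with the SDK's own conversions. This small sketch (the class and method names are hypothetical) illustrates only that the two types print differently after widening; it does not implement the int-based float conversion suggested above.

```java
class FloatVersusDouble {
    /** Converts by first widening the float to a double. */
    static String viaDouble(float f) {
        return Double.toString(f);  // f is implicitly widened to double
    }

    /** Converts using the float-specific SDK method. */
    static String viaFloat(float f) {
        return Float.toString(f);
    }

    public static void main(String[] args) {
        float f = 0.1f;
        // The widened value carries digits that are only noise at float precision.
        System.out.println("via double: " + viaDouble(f));
        System.out.println("via float:  " + viaFloat(f));
    }
}
```

A float holds far fewer significant digits than a double, so a double-oriented routine applied to the widened value does extra work to print digits that the float never reliably held.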

Jack Shirazi is the author of Java Performance Tuning. He was an early adopter of Java, and for the last few years has consulted mainly for the financial sector, focusing on Java performance.

