Tuesday, July 28, 2009

Order matters when adding doubles

I recently discovered (or rather, was told) that summing a set of doubles from a database query may produce different values, even if the database is in a read only state and no programmatic or system errors occurred. Struck me as a bit odd until I realized that order in which the results returned from a query are not guaranteed unless an order by is specified. As a result the sequence in which the doubles are added together may change, thus the sum of those doubles may be different!

An easy way to explain this is that if the doubles are returned in an order where large doubles appear first, the small doubles do not have time to cumulate and add up to anything meaningful. Instead these doubles may be truncated or even ignored because they individually can not impact the current rolling sum. I wanted to explore this a bit further, so i wrote a java program to generate some doubles and add them up in random, ascending and descending order.

Lesson of the day? If you're creating an application that requires all the significant figures available to you in a double either very careful in your arithmetic or use a datatype with precision to spare!

java AddingDoubles 10000 10
Sum in random order: 5.022355325590349E12
Sum in decending order: 5.022355325590318E12
Sum in increasing order: 5.022355325590351E12

Code for AddingDoubles.java below.


import java.util.Random;
import java.util.Date;

/** Adding doubles demonstrates how precision may be lost
 depending on the order in which doubles are added. This effect
 is especially true when some doubles are small relative to others.
 */
 
public class AddingDoubles {
    
    public double m_data[];
    
    public AddingDoubles(int nSmall, int nBig, int nMultiplier) {
        m_data = new double[nSmall + nBig];
        Random r = new Random();
        
        // create small doubles
        for(int i = 0; i < nSmall; i++) {
            m_data[i] = r.nextDouble();
        }
        
        // create 'large' doubles
        for(int i = 0; i < nBig; i++) {
            m_data[i + nSmall] = r.nextDouble() * nMultiplier;
        }
        
        // randomize the array
        shuffle(m_data);
    }
    
    private static void shuffle(double[] a) {
        int n = a.length;
        
        for (int i = 0; i < n; i++) {
            int r = i + (int) (Math.random() * (n-i)); 
            double temp = a[i];
            a[i] = a[r];
            a[r] = temp;
        }
    }
    
    public void sort( ) {
        java.util.Arrays.sort( m_data );
    }

    public double sum( boolean reverse ) {
        double total = 0;
        
        int i = reverse ? m_data.length - 1: 0;
        int inc = reverse ? -1: 1;
        
        while( i >= 0 && i < m_data.length ) {
            total += m_data[i];
            i += inc;
        }
        
        return total;
    }
    
    public static void main(String[] args) {
        int s = Integer.parseInt(args[0]);
        int b = Integer.parseInt(args[0]);
        AddingDoubles ad = new AddingDoubles(s,b,1000000000);
        
        System.out.println( "Sum in random order: " + ad.sum( true ) );
        ad.sort();
        System.out.println( "Sum in decending order: " +ad.sum( true ) );
        System.out.println( "Sum in increasing order: " + ad.sum( false ) );
    }
    
}

0 comments:

Post a Comment