Random Thoughts in a Random Universe

Fun with Floating Point Precision in numpy

I recently had a bug in my code that obviously was caused by an issue with floating point precision but had me scratching my head how it came about. Here it is:

In [1]:
import numpy as np
from astropy.table import Table
from astropy import cosmology

cosmo = cosmology.WMAP9

Table lets me read a FITS table, the standard data format in Astronomy. I draw a random sample from a table column into a numpy.ndarray.

In [2]:
tab = Table.read("https://github.com/joergdietrich/joergdietrich.github.io/raw/master/data/float.fits")
z = np.random.choice(tab['x'], 500)
Downloading https://github.com/joergdietrich/joergdietrich.github.io/raw/master/data/float.fits [Done]

I needed to call some astropy code (angular_diameter_distance_z1z2(z1, z2)), which takes two arrays are argument and requires that all values in z1 are less or equal than the values in z2. In my case z1 is a constant and I enforce this constraint by setting by setting all smaller values equal to this constant.

In [3]:
zl = 0.32
idx = z < zl
z[idx] = zl
np.any(z < zl)
Out[3]:
False

So far, so good. The code actually refused to compare a scalar with an array, so I called it like this:

In [4]:
dls = cosmo.angular_diameter_distance_z1z2(np.repeat(zl, z.size), z)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-f71b9851644c> in <module>()
----> 1 dls = cosmo.angular_diameter_distance_z1z2(np.repeat(zl, z.size), z)

/home/joerg/applications/anaconda3/lib/python3.5/site-packages/astropy-1.1.dev14570-py3.5-linux-x86_64.egg/astropy/cosmology/core.py in angular_diameter_distance_z1z2(self, z1, z2)
   1333 
   1334         if (z1 > z2).any():
-> 1335             raise ValueError('z2 must greater than z1')
   1336 
   1337         dm1 = self.comoving_transverse_distance(z1).value

ValueError: z2 must greater than z1

Oops! What happended? I had just set everything smaller than zl to zl and our check above confirmed that there are no values smaller than zl in z. Let’s check again:

In [5]:
np.any(np.repeat(zl, z.size) > z), np.any(zl > z)
Out[5]:
(True, False)

Weird. What ared these values?

The check works for scalar/array but not for array/array. Actually the check of course works, but there’s a subtlety going on, as we’ll see in a moment.

In [7]:
np.all(np.equal(np.where(np.repeat(zl, z.size) > z), np.where(idx)))
Out[7]:
True

So culprits are the elements we just set to zl. Lets see what they are.

In [8]:
z[idx], np.repeat(zl, z.size)[idx]
Out[8]:
(array([ 0.31999999,  0.31999999,  0.31999999,  0.31999999,  0.31999999,
         0.31999999,  0.31999999,  0.31999999,  0.31999999,  0.31999999,
         0.31999999,  0.31999999,  0.31999999,  0.31999999,  0.31999999,
         0.31999999,  0.31999999,  0.31999999,  0.31999999,  0.31999999,
         0.31999999,  0.31999999,  0.31999999,  0.31999999,  0.31999999,
         0.31999999,  0.31999999,  0.31999999,  0.31999999,  0.31999999,
         0.31999999,  0.31999999,  0.31999999,  0.31999999,  0.31999999,
         0.31999999,  0.31999999,  0.31999999,  0.31999999,  0.31999999,
         0.31999999,  0.31999999,  0.31999999,  0.31999999,  0.31999999,
         0.31999999,  0.31999999,  0.31999999,  0.31999999], dtype=float32),
 array([ 0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,
         0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,
         0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,
         0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,
         0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,  0.32,
         0.32,  0.32,  0.32,  0.32]))

Clearly a round-off error. But why? We just set the array elements to exactly this value. Wait. Notice that the first array has a dtype=float32 but the second array does not have a dtype specified, i.e., it is a standard float64 array. This is where the problem is. z came from a FITS table, which has a 32 bit wide float column and that’s just what astropy gave us. Comparing to a 64 bit float array led to inevitable round-off errors, while apparently the correct cast is made when comparing a 32 bit float array to a 64 bit scalar.

In [9]:
z = np.float64(z)
z[idx] = zl
np.any(np.repeat(zl, z.size) > np.float64(z))
Out[9]:
False

Converting to float64 before the assignment fixes this problem. This took me a while to figure out.

This blog post is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This post was written as a Jupyter Notebook in Python. You can download the original notebook.

Comments