Random Thoughts in a Random Universe

Python and numpy Bool Types

This blog post was prompted by a colleague who stopped me in the hall and asked, “What does ~ do in Python?” She was surprised by how the ~ operator behaves on Python bools, and I was surprised that it behaves differently on numpy bools than on Python bools. All in all, enough surprises to justify a short post about the difference between the two types.

The ~ Operator

Let’s start with the original question: what does ~n do in Python? Answer: it inverts the bits of n, where n is an integer. For example:

In [1]:
n = 85
print(" {0:d} in binary: {0:+b}".format(n))
print("~{0:d} in binary: {1:+b} is {1:d} in integer".format(n, ~n))
 85 in binary: +1010101
~85 in binary: -1010110 is -86 in integer

You may find it surprising that ~85 does not return the bit pattern 0101010, but this follows from the two’s complement representation of integers in Python: inverting every bit, including the sign bits, turns n into -n - 1.
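The relation between ~n and -n - 1 can be checked directly, and the "expected" pattern 0101010 does appear once the result is masked to a fixed width. This is only a small sketch: the 7-bit width and the names width, mask and flipped are my own choices for illustration.

```python
n = 85
width = 7                      # assumed width, just enough bits to hold 85
mask = (1 << width) - 1        # 0b1111111

# Bitwise inversion of a Python int is always -n - 1.
assert ~n == -n - 1

# Masking to a fixed width recovers the "flipped bits" view.
flipped = ~n & mask
print("{:0{w}b}".format(flipped, w=width))   # 0101010
```

Without the mask, Python has no fixed width to work with, so the inverted value is simply reported as a negative number.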

Python bool

Understanding two’s complement and knowing that Python bools are a subclass of int, it is not surprising that

In [2]:
print(" True in binary:  {:s}".format(bin(True)))
print("~True in binary: {:s} is {:d} in integer".format(bin(~True), ~True))
 True in binary:  0b1
~True in binary: -0b10 is -2 in integer

and so the truth value of ~True is True:

In [3]:
print(bool(~True))
True

because any integer other than zero evaluates to True. This may come as a surprise if you are not aware that bools in Python are in fact integers, which use two’s complement. It’s even a little bit more confusing because

In [4]:
print(~False, bool(~False))
-1 True

~False is -1, which in fact evaluates to True. If you want to negate Python boolean values correctly, use the logical not rather than the bitwise not (~):

In [5]:
print(not True, not False)
False True
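To tie this section together, here is a short sketch of the int heritage of bool and of two ways to flip a flag that stay within bool. The variable names flag, negated and toggled are just for illustration.

```python
# bool is a subclass of int, so True and False behave like 1 and 0.
print(issubclass(bool, int))   # True
print(True + True)             # 2

flag = True
# Logical negation always returns a bool.
negated = not flag
# XOR with True also flips a bool and stays within bool.
toggled = flag ^ True
print(negated, toggled)        # False False
```

As far as I know, Python 3.12 and later also emit a DeprecationWarning when ~ is applied to a bool, which is one more reason to prefer not.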

Numpy bool

What surprised me was that numpy bools show a different behavior:

In [6]:
import numpy as np
a = np.ones(10, dtype=bool)
print(a)
print(~a)
[ True  True  True  True  True  True  True  True  True  True]
[False False False False False False False False False False]

The reason is that numpy bools are an entirely different type. They are not a subclass of Python bools, and they are not a subclass of any numeric type either. This is clearly stated in the numpy reference manual, which even carries the following warning:

The bool_ type is not a subclass of the int_ type (the bool_ is not even a number type). This is different than Python’s default implementation of bool as a sub-class of int.

Yet reading this without the example above, I didn’t fully understand the consequences.
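The warning can be verified with a few issubclass checks. This is a minimal sketch, assuming a reasonably recent numpy; the name inverted is mine.

```python
import numpy as np

# np.bool_ is neither a Python bool/int nor a numpy number type.
print(issubclass(np.bool_, bool))        # False
print(issubclass(np.bool_, int))         # False
print(issubclass(np.bool_, np.number))   # False

# So ~ on a numpy bool is a logical flip, not an integer inversion.
inverted = ~np.True_
print(inverted)                          # False
```

Because the result is again a numpy bool, ~ on numpy bools can never produce -2 or -1 the way it does on Python bools.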

In numpy we can make things even a little more convoluted if we mix Python bools and numpy.bool_ in an object array, because ~ is then applied to each element according to that element’s own type:

In [7]:
b = np.array([True, True, False, np.True_], dtype=object)
print(b.astype(bool))
print((~b).astype(bool))
[ True  True False  True]
[ True  True  True False]
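What happens inside the object array becomes clearer when looking at the inverted elements before they are cast back to bool. The sketch below also shows one possible way around the problem, which is my suggestion rather than something the numpy manual prescribes: convert to a boolean dtype first and invert afterwards. The name clean is hypothetical.

```python
import numpy as np

b = np.array([True, True, False, np.True_], dtype=object)

# On an object array, ~ is applied to each element with its own type's rules:
# the Python bools become -2, -2 and -1, while np.True_ becomes a false numpy bool.
print((~b).tolist())

# Converting to a boolean dtype first makes every element a numpy bool.
clean = b.astype(bool)
print(~clean)                 # [False False  True False]
```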

My advice above to use the logical not does not work for numpy arrays either, because not is not applied element-wise but tries to evaluate the truth value of the entire array.

In [8]:
print(not b)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-c236fbab03dc> in <module>()
----> 1 print(not b)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The correct thing to do for numpy arrays is to use the ufunc logical_not, which gives the expected result for both of our arrays, a and b:

In [9]:
print("Array a")
print(a)
print(np.logical_not(a))
print("\nArray b")
print(b)
print(np.logical_not(b))
Array a
[ True  True  True  True  True  True  True  True  True  True]
[False False False False False False False False False False]

Array b
[True True False True]
[False False True False]
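One more comparison may help in deciding between ~ and logical_not: on arrays with a boolean dtype the two agree, but on integer arrays they do not. The arrays mask and counts below are made-up examples.

```python
import numpy as np

mask = np.array([True, False, True])
# For a boolean dtype the two spellings agree.
print(np.array_equal(~mask, np.logical_not(mask)))   # True

counts = np.array([0, 1, 2])
# For integers, ~ inverts bits while logical_not tests for zero.
print(~counts)                  # [-1 -2 -3]
print(np.logical_not(counts))   # [ True False False]
```

So ~ is safe when the dtype is known to be bool, while logical_not expresses the intent regardless of dtype.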
