Question 1

I was experimenting with comparing the data types of two different arrays in order to pick one that is suitable for combining them. I was happy to find that comparison operators work on them, but in the process I ran into the following strange behavior:

In [1]: numpy.int16 > numpy.float32
Out[1]: True

In [2]: numpy.dtype('int16') > numpy.dtype('float32')
Out[2]: False

Can anyone explain what is going on here? This is NumPy 1.8.2.

Answer

The first comparison is not meaningful; the second one is.

With numpy.int16 > numpy.float32 we are comparing two type objects:

>>> type(numpy.int16)
type
>>> numpy.int16 > numpy.float32 # I'm using Python 3
TypeError: unorderable types: type() > type()

In Python 3 this comparison fails immediately, since no ordering is defined for type objects. In Python 2 a boolean is returned, but the result is arbitrary and cannot be relied upon: the interpreter falls back to comparing implementation-level details such as memory addresses.

The second comparison does work in Python 3, and it works consistently (same in Python 2). This is because we're now comparing dtype instances:

>>> type(numpy.dtype('int16'))
numpy.dtype
>>> numpy.dtype('int16') > numpy.dtype('float32')
False
>>> numpy.dtype('int32') < numpy.dtype('|S10')
False
>>> numpy.dtype('int32') < numpy.dtype('|S11')
True

What's the logic behind this ordering?

dtype instances are ordered according to whether one can be safely cast to another. One dtype is less than another if the two are not equivalent and the first can be safely cast to the second.
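One consequence worth spelling out is that this is only a partial ordering: if neither dtype can be safely cast to the other, both < and > are False even though the dtypes are not equal. A small sketch, where the particular dtypes are my own choice for illustration:

```python
import numpy

int32 = numpy.dtype('int32')
float32 = numpy.dtype('float32')
float64 = numpy.dtype('float64')

# int32 -> float64 is a safe cast, so int32 orders below float64.
print(int32 < float64)    # True

# float32 cannot represent every int32 exactly, and int32 cannot hold
# fractional values, so neither direction is a safe cast.
print(int32 < float32)    # False
print(int32 > float32)    # False
print(int32 == float32)   # False
```

So a False result from > does not imply that <= holds; the two dtypes may simply be incomparable.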

For the implementation of the comparison operators, look at descriptor.c; specifically at the arraydescr_richcompare function.

Here's what the < operator maps to:

switch (cmp_op) {
case Py_LT:
    if (!PyArray_EquivTypes(self, new) && PyArray_CanCastTo(self, new)) {
        result = Py_True;
    }
    else {
        result = Py_False;
    }
    break;

Essentially, NumPy just checks that the two types are (i) not equivalent, and (ii) that the first type can be cast to the second type.

The cast check is also exposed in the public NumPy API as np.can_cast (np being the usual alias for numpy), whose default casting rule is 'safe':

>>> np.can_cast('int32', '|S10')
False
>>> np.can_cast('int32', '|S11')
True
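To tie this back to the original goal, here is a minimal sketch of how the two checks could be written in Python and used to pick a dtype for combining two arrays. The helper names dtype_less_than and pick_common_dtype are hypothetical (they are not part of NumPy), and I am assuming that np.can_cast with its default 'safe' rule corresponds to the PyArray_CanCastTo call in the C code:

```python
import numpy as np

def dtype_less_than(a, b):
    """Hypothetical mirror of the Py_LT branch: the dtypes are not
    equivalent, and `a` can be safely cast to `b`."""
    a, b = np.dtype(a), np.dtype(b)
    return a != b and np.can_cast(a, b)

def pick_common_dtype(a, b):
    """Hypothetical helper: return whichever of the two dtypes can safely
    hold the other, or None if neither direction is a safe cast."""
    a, b = np.dtype(a), np.dtype(b)
    if np.can_cast(a, b):
        return b
    if np.can_cast(b, a):
        return a
    return None

print(dtype_less_than('int16', 'float32'))    # True
print(pick_common_dtype('int16', 'float32'))  # float32
print(pick_common_dtype('int32', 'float32'))  # None
```

The None case is the reason to handle incomparable dtypes explicitly instead of relying on a single < or > test.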

NumPy data type comparison
