index

Interpretations of Integer Division

spylangspy#251

--

-- -- --

The above PR forced me to take a closer look at integer division and its various implementations. Static Python (SPy) requires Python semantics whilst generating or lowering to C backend code and so issues arise when the semantics are different like with integer division. It’s not immediately obvious that this is the case until we start dealing with negative numbers in signed integer types.

ExpressionPython1C2
7 / 322
-7 / 3-3-2
7 / -3-3-2
-7 / -322

We have a similar situation with the related modulus operator as follows:

ExpressionPythonC
7 % 311
-7 % 32-1
7 % -3-21
-7 % -3-1-1

Why have different rules though?

For integers a and b where q is the quotient and r is the remainder, integer division is defined by a = b * q + r with the following constraint |r| < |b|. The catch is that this equation alone does not uniquely determine q and r. For -7 and 3, the following is valid:

-7 = 3 * (-2) + (-1)

-7 = 3 * (-3) + 2

Compare again with the above tables.

Floor It

Incredibly, I came across a post by the (benevolent) dictator of Python himself. In summary, Guido states it’s more useful to hold the more specific constraint 0 <= r < |b| for negative a and positive b because this allows behaviour people often want when they are doing wrapping arithmetic, bucket indexing, cyclic structures, or number theory. To hold this constraint, you need to floor q to negative infinity.

To get a better idea of this, let’s have a look at the normal division operation which returns a float in Python.

-7 / 3 == -2.333...

Let’s now show a Euclidean division result with a Python function equivalent.

(-3, 2) = divmod(-7, 3)

So the first result is q and the second is r - quotient and remainder, respectively. The reason flooring works here is because b * q <= a which when instantiated is 3 * (-3) <= -7 and so for the relation a = bq + r to be true r must be positive which upholds the aforementioned constraint 0 <= r < |b|.

No True Modulus

In C, integer division truncates toward zero so:

-7 / 3 == -2
-7 % 3 == -1

Truncation results in the following relation b * q >= a which instantiated is 3 * (-2) >= -7. If this is the case then r must be negative and so the constraint 0 <= r < |b| does not hold. This brings us to a difference in terminology as it relates to % operator. In the C case, it’s more accurately referred to as the remainder operator because it takes the sign of a the dividend. By contrast, in Python the operator is referred to as the modulus operator and as you may have guessed it takes the sign of b the divisor.

I suppose the next question is if the modulus operator is so useful as stated by Guido then why use truncation for integer division at all? Guido points the finger at old hardware using one’s complement as opposed to the more modern two’s complement method of representing signed integers. I’m not completely convinced by this because modern languages like Go and Rust use truncation as well. From a quick search it looks like truncation just matches how modern hardware works for example the idiv instruction on x86 and so I guess it just boils down to simple hardware efficiency.

What Have We Learned?

It’s interesting that the interpretation of integer division is pulled in different directions based on mathematics, utility, and hardware considerations. I do lean towards flooring for integer division even with the efficiency hit because it’s generally more useful in my opinion.

Footnotes

  1. Python 3 // = floor division

  2. C99+ / with int operands = truncate towards zero