--
The above PR forced me to take a closer look at integer division and its various implementations. Static Python (SPy) requires Python semantics whilst generating or lowering to C backend code and so issues arise when the semantics are different like with integer division. It’s not immediately obvious that this is the case until we start dealing with negative numbers in signed integer types.
| Expression | Python1 | C2 |
|---|---|---|
7 / 3 | 2 | 2 |
-7 / 3 | -3 | -2 |
7 / -3 | -3 | -2 |
-7 / -3 | 2 | 2 |
We have a similar situation with the related modulus operator as follows:
| Expression | Python | C |
|---|---|---|
7 % 3 | 1 | 1 |
-7 % 3 | 2 | -1 |
7 % -3 | -2 | 1 |
-7 % -3 | -1 | -1 |
Why have different rules though?
For integers a and b where q is the quotient and r is the remainder, integer division is
defined by a = b * q + r with the following constraint |r| < |b|. The catch is that this
equation alone does not uniquely determine q and r. For -7 and 3, the following is valid:
-7 = 3 * (-2) + (-1)
-7 = 3 * (-3) + 2Compare again with the above tables.
Floor It
Incredibly, I came across a
post by the
(benevolent) dictator of Python himself. In summary, Guido states it’s more useful to hold the more
specific constraint 0 <= r < |b| for negative a and positive b because this allows behaviour
people often want when they are doing wrapping arithmetic, bucket indexing, cyclic structures, or
number theory. To hold this constraint, you need to floor q to negative infinity.
To get a better idea of this, let’s have a look at the normal division operation which returns a float in Python.
-7 / 3 == -2.333...Let’s now show a Euclidean division result with a Python function equivalent.
(-3, 2) = divmod(-7, 3)So the first result is q and the second is r - quotient and remainder, respectively. The reason
flooring works here is because b * q <= a which when instantiated is 3 * (-3) <= -7 and so for
the relation a = bq + r to be true r must be positive which upholds the aforementioned
constraint 0 <= r < |b|.
No True Modulus
In C, integer division truncates toward zero so:
-7 / 3 == -2
-7 % 3 == -1Truncation results in the following relation b * q >= a which instantiated is 3 * (-2) >= -7. If
this is the case then r must be negative and so the constraint 0 <= r < |b| does not hold. This
brings us to a difference in terminology as it relates to % operator. In the C case, it’s more
accurately referred to as the remainder operator because it takes the sign of a the dividend. By
contrast, in Python the operator is referred to as the modulus operator and as you may have guessed
it takes the sign of b the divisor.
I suppose the next question is if the modulus operator is so useful as stated by Guido then why use
truncation for integer division at all? Guido points the finger at old hardware using one’s
complement as opposed to the more modern two’s complement method of representing signed integers.
I’m not completely convinced by this because modern languages like Go and Rust use truncation as
well. From a quick search it looks like truncation just matches how modern hardware works for
example the idiv instruction on x86 and so I guess it
just boils down to simple hardware efficiency.
What Have We Learned?
It’s interesting that the interpretation of integer division is pulled in different directions based on mathematics, utility, and hardware considerations. I do lean towards flooring for integer division even with the efficiency hit because it’s generally more useful in my opinion.