Zephyrnet Logo

Binary Math Tricks: Shifting to Divide by Ten Ain’t Easy

Date:

On small CPUs, you often don’t have a multiply or divide instruction. Of course, good programmers know that shifting right and left will multiply or divide by a power of two. But there are always cases where you need to use something that isn’t a power of two. Sometimes you can work it out for multiplication.

For example, multiplying by 10 is common when dealing with conversion between binary and decimal. But since 10n is equal to 8n+2n, you can express that as a bunch of left shift three times to multiply by eight, adding that value to your original value shifted left once to multiply by two.

But division is a different problem. n/10 does not equal n/8-n/2 or anything else simple like that. The other day a friend showed me a very convoluted snippet of code on Stack Overflow by user [realtime] that divides a number by 10 and wanted to know how it worked. It is pretty straightforward if you just stick with the math and I’ll show you what I mean in this post. Turns out the post referenced the venerable Hacker’s Delight book, which has a wealth of little tricks like this.

First Multiply

First, let’s do some shifts to multiply. Each left shift is a power of two, so n<<1 is 2*n and n<<8 is 256*n. That’s easy. The hard part is decomposing it for non-powers of two: n<<3 + n<<1 is the same as n*10

If you are in assembly doing one shift at a time, you can often save a little time by combining the two shifts:

SHL A
MOV A,T ; Move A to temporary storage
SHL A
SHL A
ADD A, T ; Add T to A

That’s pretty efficient even on processors that can multiply.

Division Shifts Work but Composites Are Tricky

Division is the same way, but it doesn’t combine nicely. So n>>1 is n/2 and n>>8 is n/256. But there’s no easy way to combine divisions like you can multiplications.

The code we looked at looked like this:

unsigned divu10(unsigned n) { unsigned q, r; q = (n >> 1) + (n >> 2); q = q + (q >> 4); q = q + (q >> 8); q = q + (q >> 16); q = q >> 3; r = n - (((q << 2) + q) << 1); return q + (r > 9);
}

That’s a mouthful! But it does work. The secret to understanding this is to treat each shift as taking a fraction of the number. Look at the first working line:

q=(n>>1)+(n>>2)

This is really n/2 + n/4. If you remember your high school math, that’s the same as 3n/4. Of course, this is the same as multiplying by 0.75. If you look ahead to the last assignment of q, you’ll get a clue:q=q>>3;

That’s saying q = q/8. So if our goal is to divide by 10 it may be easier to think of it as multiplying by 0.1. To fit well with powers of two, we really want to think about multiply the whole thing by 0.8 and then divide by 8.

So adding one right shift to two right shifts gets us 0.75, which isn’t too far away from 0.8, but it isn’t quite there. The next line adds a little bit more to our 0.75 factor. How much more? The amount is 3n/64 and the total is now 51n/64. That equates to 0.797 or so. Getting closer to 0.8 already. Each term adds a little bit less and gets us a little bit closer. Here’s how it breaks down: The last term gets you only a little bit closer. The 13107n/16384 term is almost as close.

Expression Base Value Delta Total Value Ratio
(n>>1)+n(>>2) 3n/4 0 3n/4 0.75
q+(q>>4) 3n/4 + (3n/4)/16 3n/64 51n/64 0.7969
q+(q>>8) 51n/64+(51n/64)/256 51n/16384 13107n/16384 0.7999
q+(q>>16) 13107n/16384+(13107n/16384)/65536 13107n/1073741824 858993458n/1073741824 0.8

I couldn’t help but think that this would be pretty easy to implement in an FPGA.

Commenting the Code

Here’s the code with comments, which is a bit easier to follow:

unsigned divu10(unsigned n) { unsigned q, r; q = (n >> 1) + (n >> 2); // q=n/2+n/4 = 3n/4 q = q + (q >> 4); // q=3n/4+(3n/4)/16 = 3n/4+3n/64 = 51n/644 q = q + (q >> 8); // q=51n/64+(51n/64)/256 = 51n/64 + 51n/16384 = 13107n/16384 q = q + (q >> 16); // q= 13107n/16384+(13107n/16384)/65536=13107n/16348+13107n/1073741824=858993458n/1073741824 // note: q is now roughly 0.8n q = q >> 3; // q=n/8 = (about 0.1n or n/10) r = n - (((q << 2) + q) << 1); // rounding: r= n-2*(n/10*4+n/10)=n-2*5n/10=n-10n/10 return q + (r > 9); // adjust answer by error term
}

Once you break it down, it isn’t that hard to understand. This method dates back a while, and it appears the original source is the book Hacker’s Delight (see Figure 10-12 in the second edition). I couldn’t find the associated web site, but there is an archival copy. The only difference in that code is the last line is commented out and replaced with: return q+((r+6)>>4);

The book explains how to figure out the optimal shifts and adds and gives several other examples. There are also many other division tricks in that chapter. If you like this sort of thing, have a look at the bit twiddling hacks page. If your division dreams tend to floating point, you might find this interesting.

Source: https://hackaday.com/2020/06/12/binary-math-tricks-shifting-to-divide-by-ten-aint-easy/

spot_img

Latest Intelligence

spot_img