============================================================

Binary Integer Mathematics, unsigned, two's complement, etc.

============================================================

- Ian! D. Allen - idallen@idallen.ca - http://www.idallen.com

How the computer adds integer numbers

-------------------------------------

When the ALU inside the CPU adds one to any integer value, signed or

unsigned, the same thing always happens. Basic ALU integer math doesn't

care about sign or negative numbers. There is no concept, at the ALU

level, of "signed" or "unsigned". It's all just bits being added,

and the simple rules for binary addition apply.

In four bits binary, the math (adding one to each value) looks like this:

CPU Addition Logic for Four Bit Binary Numbers

------------------------------------------------

0000 plus one (0001) is

0001 plus one is

0010 plus one is

0011 plus one is

0100 plus one is

0101 plus one is

0110 plus one is

0111 plus one is (also causes the Overflow flag to be set in the CPU)

1000 plus one is

1001 plus one is

1010 plus one is

1011 plus one is

1100 plus one is

1101 plus one is

1110 plus one is

1111 plus one is (also causes Carry flag to be set in the CPU)

0000 plus one is

0001 plus one is

0010 plus one is ...etc...

The list is is really a ring or wheel, joined at the ends. There is no

real beginning or end to the repeating sequence of bit patterns; adding

one just moves to the next bit pattern around the ring. Adding two

moves two places around the ring, etc. Subtraction moves the other way.

The list works like a car odometer; when the maximum value (1111) is

reached, the odometer rolls over to zero (0000) and starts over again.

Rolling backward from 0000 gets you to 1111.

(For another view of how this works, see "Odometers" and

"Positive and Negative in Binary", showing the number ring, in:

http://www.cs.nmsu.edu/~pfeiffer/classes/273/notes/neg.html )

As Pfeifer says, there is no way to tell whether the bit pattern 0101 is

the result of adding eight to 0010 (i.e. adding one eight times) or the

result of subtracting eight from 0010 (i.e. subtracting one eight times).

Both move the same distance around the ring and end up at the same place.

Four bits have only 2**4 (16) possible bit patterns; that means they can

represent only 16 different things. Eight bits can only represent 2**8

(256) different numbers, etc. Ten bits, 1024 numbers.

Unsigned Numbers

----------------

If all 16 of these four-bit patterns represent positive numbers, the range

of numbers is 0 to 15 decimal. The smallest number is zero with binary

bit pattern 0000 and the largest number is 15 with bit pattern 1111:

UNSIGNED 4-BIT NUMBERS

------------------------

0000 = 0 # unsigned numbers start at all bits turned off (zero)

0001 = 1

0010 = 2

0011 = 3

0100 = 4

0101 = 5

0110 = 6

0111 = 7

1000 = 8

1001 = 9

1010 = 10

1011 = 11

1100 = 12

1101 = 13

1110 = 14

1111 = 15 # unsigned numbers end with all bits turned on (15 decimal)

In four bits, adding one to 15 doesn't give 16, it wraps around to start

over at zero (0000) again. Subtracting one from zero wraps around to

give the bit pattern for 15 (1111). Wrapping around zero means the math

is wrong. As long as the adding/subtracting stays within the numbers 0

(0000) to 15 (1111) without crossing zero, the unsigned math is correct.

Any four-bit unsigned math that goes below zero or above 15 is wrong.

Signed Numbers

--------------

If we choose to have some of these 16 four-bit patterns represent negative

numbers, we can't use as many bit patterns for positive numbers. We have

to steal some bit patterns to use as negative numbers.

Most systems for negative numbers divide up the patterns into half

positive and half negative. If the top (leftmost, most significant) bit

is on, we can decide that the bit pattern represents a negative number.

We call this top, leftmost bit the "sign" bit.

If half the four-bit patterns have the sign bit on and will be used

for negative numbers, that leaves only eight bit patterns to represent

positive numbers (with zero being called "positive"), and the range of

eight positive numbers is now only 0 to +7 (a total of eight numbers):

0000 0001 0010 0011 0100 0101 0110 0111

The other eight bit patterns, with sign bit on, represent eight negative

numbers, usually (but not always) the numbers from -8 up to -1:

1000 1001 1010 1011 1100 1101 1110 1111

Where the 16 unsigned four-bit numbers range from 0 to 15, the 16 signed

four-bit numbers are half-negative and half-positive and range from

-8 up to zero up to +7. (Note how the range includes -8 but only +7,

since zero is considered a "positive" number here.)

Two's Complement Signed Numbers:

Which bit patterns represent each of the negative numbers from -8 up

to -1? If we subtract one from zero (0000), the result is the bit

pattern 1111; so, 1111 must be the bit pattern for -1. (Adding one to

1111 gives zero, as expected.) If we subtract one from 1111 (-1) we get

1110; so, 1110 must be the bit pattern for -2, etc., down to -8 with bit

pattern 1000. This is the "two's complement" system for negative numbers.

Subtracting one from -8 (1000) doesn't give -9, it wraps around to give

the bit pattern for +7 (0111). Adding one to +7 (0111) doesn't give +8,

it wraps around to start over at -8 (1000) again.

Two's Complement Signed 4-bit Numbers

---------------------------------------

1000 = -8 # signed numbers start with only the sign bit turned on

1001 = -7

1010 = -6

1011 = -5

1100 = -4

1101 = -3

1110 = -2

1111 = -1

0000 = 0

0001 = +1

0010 = +2

0011 = +3

0100 = +4

0101 = +5

0110 = +6

0111 = +7 # signed numbers end with everything *except* the sign bit on

With the introduction of two's complement negative numbers, the smallest

four-bit number is -8 with bit pattern 1000 and the largest number is

+7 with bit pattern 0111. Adding one to -1 (1111) correctly gives zero

(0000). As long as the adding/subtracting stays within the range of

numbers -8 (1000) to +7 (0111), the signed math is correct. Any four-bit

signed math that goes below -8 or above +7 is wrong.

The wrapping-around is true for all two's complement number systems -

adding one to the highest positive number wraps around to give the most

negative number. The most negative number always has an absolute value

one larger than the largest positive number, since zero is included with

the positive numbers. Thus, if you know that the largest positive 8-bit

two's complement value is +127, you immediately know that the smallest

negative value must be -128.

For four-bit unsigned numbers, the range of 16 bit patterns starts at zero

(0000) and goes up to 15 (1111). Unsigned math that steps outside this

range (goes below 0000 or above 1111) is wrong - the answer wraps around

incorrectly. Crossing from 0111 to 1000 is not an error in unsigned math.

Crossing from 1111 to 0000 is an error in unsigned math.

For four-bit signed two's complement numbers, the range of 16 bit patterns

starts at -8 (1000) and goes up to +7 (0111). Signed math that steps

outside this range (goes below 1000 or above 0111) is wrong - the answer

wraps around incorrectly. Crossing from 1111 to 0000 is not an error

in signed math. Crossing from 0111 to 1000 is an error in signed math.

Note how unsigned math treats the list of 16 consecutive bit patterns

as going from 0 to 15 (0000 to 1111) and signed math treats the list as

the 16 patterns going from -8 to +7 (1000 to 0111). It's the same list

of sixteen bit patterns, but the start and end points for "right answer"

mathematics is different for signed and unsigned.

Inside the CPU, for addition (or subtraction) of these 16 four-bit

numbers, it doesn't matter if the bit pattern is considered signed or

unsigned - when you say "x = x + 1" the ALU inside the CPU just adds

one to the bits in memory and possibly sets some CPU flags. Signed,

unsigned, same thing. The CPU knows nothing about negative numbers;

it only does binary addition to bit patterns.

Rule: When adding binary, octal, or hexadecimal bit patterns together,

just do the binary math and express the result in the allowed number of

bits, throwing away the carry if it doesn't fit.

Examples of binary integer math:

binary 4 bits: 0000 + 0111 = 0111

hexadecimal 4 bits: 0h + 7h = 7h

binary 4 bits: 0100 + 0100 = 1000

hexadecimal 4 bits: 4h + 4h = 8h

binary 4 bits: 1010 + 1010 = 0100 (no room for the carry in 4 bits)

hexadecimal 4 bits: Ah + Ah = 4h (no room for the carry in 4 bits)

binary 4 bits: 1100 + 1100 = 1000 (no room for the carry in 4 bits)

hexadecimal 4 bits: Ch + Ch = 8h (no room for the carry in 4 bits)

Hex math calculator: http://www.csgnetwork.com/hexaddsubcalc.html

hex 16 bits: 1ABCh + 0001h = 1ABDh

hex 16 bits: 1ABCh + 0009h = 1AC5h

hex 16 bits: 1ABCh + 9C3Eh = B6FAh

hex 16 bits: 1ABCh + FFFFh = 1ABBh (no room for carry in 16 bits)

hex 16 bits: 1ABCh + F000h = 0ABCh (no room for carry in 16 bits)

You must pay careful attention to the number of bits used in the

mathematics and never generate an answer that has more than this number

of bits. If the math generates a carry out of the leftmost (top, most

significant) bit, that carry is "thrown away" and a Carry flag inside

the CPU is turned on indicating that this happened.

Status Flags for binary integer mathematics

===========================================

The ALU in the CPU has two math status flags in it: the "Carry" flag and

the "Overflow" flag. These flags are set or unset after every integer

math operation. The flags can be interpreted by you, the programmer,

to have meaning depending on whether you are treating the bit patterns

as unsigned (all-positive), or signed (positive and negative numbers).

The CPU doesn't know or care about signed/unsigned - it always does the

math and always sets the two flags. You (your program) have to care.

Carry Flag - for unsigned integer math

--------------------------------------

The "Carry" flag is set in the ALU whenever any binary mathematics

causes a "carry" out of the top (leftmost, sign) bit, e.g. in four bit

binary math: 1000+1000=0000 (and the carry flag is set). The carry flag

indicates that the answer didn't fit in the number of bits we are using -

there was "carry" that would have required one more bit to represent.

You can think of that "carry" out of the top bit as setting the Carry

flag to ON. If there is no carry, the Carry flag is OFF.

If we treat the bits in a word as "unsigned" numbers (all positive,

no negative), we must watch to see if our arithmetic causes the "carry"

flag to be set, indicating the (unsigned) result is wrong. (We don't

care about the ALU Overflow flag when doing unsigned math. The overflow

flag is only relevant to signed numbers, not unsigned. See below.)

4-bit example: 0100 + 0100 = 1000 (no carry - the answer fits in 4 bits)

- The carry flag is turned OFF, no carry, since no carry was generated

- If our program asked the CPU to do unsigned math, the fact that the

Carry flag is off means that the answer is correct for unsigned math.

- In unsigned decimal: 4 + 4 = 8 (carry OFF == correct answer)

4-bit example: 1000 + 1000 = 0000 (and the carry flag comes on in the CPU)

- The carry flag is turned ON, since carry was generated

- If our program asked the CPU to do unsigned math, the fact that the

Carry flag is ON means that the answer is wrong for unsigned math.

- In unsigned decimal: 8 + 8 = 0 (carry ON == wrong answer)

- If we were allowed 5 bits, the answer would be correct as 10000(2) = 16(10)

4-bit example: 1111 + 0011 = 0010 (and the carry flag comes on in the CPU)

- The carry flag is turned ON, since carry was generated

- If our program asked the CPU to do unsigned math, the fact that the

Carry flag is ON means that the answer is wrong for unsigned math.

- In unsigned decimal: 15 + 3 = 2 (carry ON == wrong answer)

- If we were allowed 5 bits, the answer would be correct as 10010(2) = 18(10)

The carry flag indicates that the unsigned binary math answer needs more bits.

Overflow Flag - for two's complement integer math

-------------------------------------------------

The "Overflow" flag is set by the CPU whenever the ALU adds two numbers

together with the same sign bit, and the result has the opposite sign

bit, e.g. in four bit binary math: 0100+0100=1000 (and the overflow flag

is set). Just as with the Carry flag, the CPU always sets the overflow

flag after a binary math operation.

Just as the Carry flag tells your program that the binary math is wrong

for unsigned arithmetic, the Overflow flag has the corresponding meaning

for signed arithmetic. The overflow flag indicates that the signed

math is wrong - you should never add two numbers of the same sign get

an answer with the opposite sign. The math went outside the valid range

of signed numbers.

The ALU always sets both the Carry and Overflow flags when doing integer

mathematics on two bit patterns. Your program must choose which flag to

check depending on whether your program was doing signed or unsigned math.

The sign bit (leftmost bit) and overflow flag only have meaning if we

choose to interpret the bit patterns using the two's complement number

system, which assigns half the bit patterns to negative numbers.

If we treat the bits in a word as "two's complement" signed values,

we must watch to see if our arithmetic causes the "overflow" flag to

be set, indicating the (two's complement) result is wrong. (We don't

care about the carry flag when doing signed, two's complement math.

The carry flag is only relevant to unsigned numbers, not signed.)

4-bit example: 0100 + 0100 = 1000 (and the overflow flag comes on in the CPU)

- The overflow flag is turned ON, since the sign bit changed

- If our program asked the CPU to do signed math, the fact that the

Overflow flag is ON means that the answer is wrong for signed math.

- In signed decimal: 4 + 4 = -8 (overflow flag ON == wrong answer)

- Note that for this same math, if our program was doing unsigned math,

the fact that the Carry flag is OFF means that the answer would be right!

- In unsigned decimal: 4 + 4 = 8 (carry OFF == right answer)

Some mathematics on bit patterns gives answers that are wrong both for

unsigned and two's complement numbers, in which case *both* the carry

and overflow flags will be set in the CPU:

4-bit example: 1010 + 1010 = 0100 (and both carry and overflow come on)

- In unsigned decimal (no negative numbers): 10 + 10 = 4 (WRONG)

- In signed decimal (two's complement): -6 + -6 = +4 (WRONG)

The ALU always sets both the carry and overflow flags when doing integer

mathematics on two bit patterns. Your program must choose which flag to

check depending on whether your program was doing signed or unsigned math.

When the CPU does math on bit patterns, it doesn't know or care whether

the bits represent signed or unsigned integers. The CPU just does the

math and sets the flags. Your program knows whether it was using the

CPU for signed or unsigned math, and so your program must check the

correct flag.

=========================

C Language Implementation

=========================

In C language with 32-bit integers:

signed int sx = 2000000000;

signed int sy = 2000000001;

signed int sz = sx + sy;

unsigned int ux = 2000000000;

unsigned int uy = 2000000001;

unsigned int uz = ux + uy;

printf("sz is 0x%x which could be %d or %u\n", sz, sz, sz);

printf("uz is 0x%x which could be %d or %u\n", uz, uz, uz);

Output (first "0x" output is in hexadecimal):

sz is 0xEE6B2801 which could be -294967295 or 4000000001

uz is 0xEE6B2801 which could be -294967295 or 4000000001

The hex value 0xEE6B2801 when turned into 32-bit binary looks like this:

11101110011010110010100000000001

E E 6 B 2 8 0 1

See - absolutely no difference in the two bit patterns in memory. The bit

patterns in memory for signed and unsigned math are the same; because,

there is no difference in what the ALU does. The ALU doesn't care.

Adding/subtracting don't care about sign; it is all pure binary math on

bit patterns.

As you see above, once we have some bits in memory, we can tell our

program to interpret the bits in either an unsigned manner ("%u" - all

the bit patterns represent non-negative numbers) or in a signed manner

("%d" - about half of the bit patterns will be displayed as negative,

with leading minus).

Signed/unsigned doesn't matter at the ALU level in computer math. Where a

signed/unsigned declaration plays a big part is in *comparing* numbers

(and in bit-shifting; but, save that for another day). If we compare

the bit pattern 0xEE6B2801 (above) with zero, is the bit pattern less

than zero or much greater than zero? *That* answer depends on whether

we declared the memory location (the variable) to be signed or unsigned.

If we declare a 32-bit variable as unsigned, numeric comparisons

treat all 4,294,967,296 bit patterns from zero up to 0xFFFFFFFF

(11111111111111111111111111111111) as non-negative numbers (zero to

4,294,967,295 for 32-bit numbers). Comparisons (unsigned) will never

say any of these patterns are less than zero:

unsigned int x = 0xFFFFFFFF;

if ( x < 0 ) ( ... /* this is always FALSE for unsigned numbers */

If we declare a 32-bit variable as signed, we are saying to our compiler

that numeric comparisons using that variable will treat about half of

those 4,294,967,296 bit patterns as negative numbers. Any bit patterns

in signed variables that have the leftmost bit (the sign bit) set,

will be interpreted and compared as "less than zero":

signed int x = 0xFFFFFFFF;

if ( x < 0 ) ( ... /* TRUE because x is signed and the sign bit is on */

More C code:

signed int sz = 0xEE6B2801;

unsigned int uz = 0xEE6B2801;

/*

* Both sz and uz contain the same bit pattern 0xEE6B2801.

* Signed sz interprets the bits as -294967295(10).

* Unsigned uz interprets the bits as 4000000001(10).

*/

if ( sz < 0 )

printf("sz is less than zero\n");

else

printf("sz is not less than zero\n");

if ( uz < 0 )

printf("uz is less than zero\n");

else

printf("uz is not less than zero\n");

Output:

sz is less than zero

uz is not less than zero

Remember that sz and uz contain exactly the same bit pattern! It is

our program that interprets the results.

Because we declared the sz variable to be signed, the code generated

by the compiler to test whether the bit pattern in sz is less than zero

tests the sign bit (leftmost bit) on the sz memory, notices that is is on

('1') and declares the bit pattern as "less than zero" (signed). All the

bit patterns with the sign bit on will be interpreted as negative numbers.

Because we declared the uz variable to be unsigned, the code generated

by the compiler to test whether the bit pattern in uz is less than zero

has no work to do at all - unsigned numbers are never less than zero.

The compiler arranges to print: "uz is not less than zero"

As a programmer, declaring integer variables as signed/unsigned is

just a convenient way of having the compiler conspire with us to treat

half the bit patterns as "less than zero". Inside the CPU doing math

(adding/subtracting), the declaration of signed/unsigned doesn't make

any difference - the CPU just does the math and sets the Carry and

Overflow flags. For mathematical comparisons (and bit-shifting),

declaring signed or unsigned makes a big difference.

--

| Ian! D. Allen - idallen@idallen.ca - Ottawa, Ontario, Canada

| Home Page: http://idallen.com/ Contact Improv: http://contactimprov.ca/

| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/

| Defend digital freedom: http://eff.org/ and have fun: http://fools.ca/