Binary Integer Mathematics, unsigned, two's complement, etc.

Website / Source

============================================================
Binary Integer Mathematics, unsigned, two's complement, etc.
============================================================
- Ian! D. Allen - idallen@idallen.ca - http://www.idallen.com
 
How the computer adds integer numbers
-------------------------------------
 
When the ALU inside the CPU adds one to any integer value, signed or
unsigned, the same thing always happens.  Basic ALU integer math doesn't
care about sign or negative numbers.  There is no concept, at the ALU
level, of "signed" or "unsigned".  It's all just bits being added,
and the simple rules for binary addition apply.
 
In four bits binary, the math (adding one to each value) looks like this:
 
    CPU Addition Logic for Four Bit Binary Numbers
   ------------------------------------------------
    0000 plus one (0001) is
    0001 plus one is
    0010 plus one is
    0011 plus one is
    0100 plus one is
    0101 plus one is
    0110 plus one is
    0111 plus one is (also causes the Overflow flag to be set in the CPU)
    1000 plus one is
    1001 plus one is
    1010 plus one is
    1011 plus one is
    1100 plus one is
    1101 plus one is
    1110 plus one is
    1111 plus one is (also causes Carry flag to be set in the CPU)
    0000 plus one is
    0001 plus one is
    0010 plus one is  ...etc...
 
The list is is really a ring or wheel, joined at the ends.  There is no
real beginning or end to the repeating sequence of bit patterns; adding
one just moves to the next bit pattern around the ring.  Adding two
moves two places around the ring, etc.  Subtraction moves the other way.
The list works like a car odometer; when the maximum value (1111) is
reached, the odometer rolls over to zero (0000) and starts over again.
Rolling backward from 0000 gets you to 1111.
 
(For another view of how this works, see "Odometers" and
"Positive and Negative in Binary", showing the number ring, in:
http://www.cs.nmsu.edu/~pfeiffer/classes/273/notes/neg.html )
 
As Pfeifer says, there is no way to tell whether the bit pattern 0101 is
the result of adding eight to 0010 (i.e. adding one eight times) or the
result of subtracting eight from 0010 (i.e. subtracting one eight times).
Both move the same distance around the ring and end up at the same place.
 
Four bits have only 2**4 (16) possible bit patterns; that means they can
represent only 16 different things.  Eight bits can only represent 2**8
(256) different numbers, etc.  Ten bits, 1024 numbers.
 
Unsigned Numbers
----------------
 
If all 16 of these four-bit patterns represent positive numbers, the range
of numbers is 0 to 15 decimal.  The smallest number is zero with binary
bit pattern 0000 and the largest number is 15 with bit pattern 1111:
 
    UNSIGNED 4-BIT NUMBERS
   ------------------------
    0000 =  0 # unsigned numbers start at all bits turned off (zero)
    0001 =  1
    0010 =  2
    0011 =  3
    0100 =  4
    0101 =  5
    0110 =  6
    0111 =  7
    1000 =  8
    1001 =  9
    1010 = 10
    1011 = 11
    1100 = 12
    1101 = 13
    1110 = 14
    1111 = 15 # unsigned numbers end with all bits turned on (15 decimal)
 
In four bits, adding one to 15 doesn't give 16, it wraps around to start
over at zero (0000) again.  Subtracting one from zero wraps around to
give the bit pattern for 15 (1111).  Wrapping around zero means the math
is wrong.  As long as the adding/subtracting stays within the numbers 0
(0000) to 15 (1111) without crossing zero, the unsigned math is correct.
Any four-bit unsigned math that goes below zero or above 15 is wrong.
 
Signed Numbers
--------------
 
If we choose to have some of these 16 four-bit patterns represent negative
numbers, we can't use as many bit patterns for positive numbers.  We have
to steal some bit patterns to use as negative numbers.
 
Most systems for negative numbers divide up the patterns into half
positive and half negative.  If the top (leftmost, most significant) bit
is on, we can decide that the bit pattern represents a negative number.
We call this top, leftmost bit the "sign" bit.
 
If half the four-bit patterns have the sign bit on and will be used
for negative numbers, that leaves only eight bit patterns to represent
positive numbers (with zero being called "positive"), and the range of
eight positive numbers is now only 0 to +7 (a total of eight numbers):
 
    0000 0001 0010 0011 0100 0101 0110 0111
 
The other eight bit patterns, with sign bit on, represent eight negative
numbers, usually (but not always) the numbers from -8 up to -1:
 
    1000 1001 1010 1011 1100 1101 1110 1111
 
Where the 16 unsigned four-bit numbers range from 0 to 15, the 16 signed
four-bit numbers are half-negative and half-positive and range from
-8 up to zero up to +7.  (Note how the range includes -8 but only +7,
since zero is considered a "positive" number here.)
 
Two's Complement Signed Numbers:
 
Which bit patterns represent each of the negative numbers from -8 up
to -1?  If we subtract one from zero (0000), the result is the bit
pattern 1111; so, 1111 must be the bit pattern for -1.  (Adding one to
1111 gives zero, as expected.)  If we subtract one from 1111 (-1) we get
1110; so, 1110 must be the bit pattern for -2, etc., down to -8 with bit
pattern 1000.  This is the "two's complement" system for negative numbers.
Subtracting one from -8 (1000) doesn't give -9, it wraps around to give
the bit pattern for +7 (0111).  Adding one to +7 (0111) doesn't give +8,
it wraps around to start over at -8 (1000) again.
 
    Two's Complement Signed 4-bit Numbers
   ---------------------------------------
    1000 = -8 # signed numbers start with only the sign bit turned on
    1001 = -7
    1010 = -6
    1011 = -5
    1100 = -4
    1101 = -3
    1110 = -2
    1111 = -1
    0000 =  0
    0001 = +1
    0010 = +2
    0011 = +3
    0100 = +4
    0101 = +5
    0110 = +6
    0111 = +7 # signed numbers end with everything *except* the sign bit on
 
With the introduction of two's complement negative numbers, the smallest
four-bit number is -8 with bit pattern 1000 and the largest number is
+7 with bit pattern 0111.  Adding one to -1 (1111) correctly gives zero
(0000).  As long as the adding/subtracting stays within the range of
numbers -8 (1000) to +7 (0111), the signed math is correct.  Any four-bit
signed math that goes below -8 or above +7 is wrong.
 
The wrapping-around is true for all two's complement number systems -
adding one to the highest positive number wraps around to give the most
negative number.  The most negative number always has an absolute value
one larger than the largest positive number, since zero is included with
the positive numbers.  Thus, if you know that the largest positive 8-bit
two's complement value is +127, you immediately know that the smallest
negative value must be -128.
 
For four-bit unsigned numbers, the range of 16 bit patterns starts at zero
(0000) and goes up to 15 (1111).  Unsigned math that steps outside this
range (goes below 0000 or above 1111) is wrong - the answer wraps around
incorrectly.  Crossing from 0111 to 1000 is not an error in unsigned math.
Crossing from 1111 to 0000 is an error in unsigned math.
 
For four-bit signed two's complement numbers, the range of 16 bit patterns
starts at -8 (1000) and goes up to +7 (0111).  Signed math that steps
outside this range (goes below 1000 or above 0111) is wrong - the answer
wraps around incorrectly.  Crossing from 1111 to 0000 is not an error
in signed math.  Crossing from 0111 to 1000 is an error in signed math.
 
Note how unsigned math treats the list of 16 consecutive bit patterns
as going from 0 to 15 (0000 to 1111) and signed math treats the list as
the 16 patterns going from -8 to +7 (1000 to 0111).  It's the same list
of sixteen bit patterns, but the start and end points for "right answer"
mathematics is different for signed and unsigned.
 
Inside the CPU, for addition (or subtraction) of these 16 four-bit
numbers, it doesn't matter if the bit pattern is considered signed or
unsigned - when you say "x = x + 1" the ALU inside the CPU just adds
one to the bits in memory and possibly sets some CPU flags.  Signed,
unsigned, same thing.  The CPU knows nothing about negative numbers;
it only does binary addition to bit patterns.
 
Rule: When adding binary, octal, or hexadecimal bit patterns together,
just do the binary math and express the result in the allowed number of
bits, throwing away the carry if it doesn't fit.
 
Examples of binary integer math:
 
   binary 4 bits:      0000 + 0111 = 0111
   hexadecimal 4 bits:   0h +   7h =   7h
 
   binary 4 bits:      0100 + 0100 = 1000
   hexadecimal 4 bits:   4h +   4h =   8h
 
   binary 4 bits:      1010 + 1010 = 0100  (no room for the carry in 4 bits)
   hexadecimal 4 bits:   Ah +   Ah =   4h  (no room for the carry in 4 bits)
 
   binary 4 bits:      1100 + 1100 = 1000  (no room for the carry in 4 bits)
   hexadecimal 4 bits:   Ch +   Ch =   8h  (no room for the carry in 4 bits)
 
   Hex math calculator: http://www.csgnetwork.com/hexaddsubcalc.html
 
   hex 16 bits: 1ABCh + 0001h = 1ABDh
   hex 16 bits: 1ABCh + 0009h = 1AC5h
   hex 16 bits: 1ABCh + 9C3Eh = B6FAh
   hex 16 bits: 1ABCh + FFFFh = 1ABBh (no room for carry in 16 bits)
   hex 16 bits: 1ABCh + F000h = 0ABCh (no room for carry in 16 bits)
 
You must pay careful attention to the number of bits used in the
mathematics and never generate an answer that has more than this number
of bits.  If the math generates a carry out of the leftmost (top, most
significant) bit, that carry is "thrown away" and a Carry flag inside
the CPU is turned on indicating that this happened.
 
Status Flags for binary integer mathematics
===========================================
 
The ALU in the CPU has two math status flags in it: the "Carry" flag and
the "Overflow" flag.  These flags are set or unset after every integer
math operation.  The flags can be interpreted by you, the programmer,
to have meaning depending on whether you are treating the bit patterns
as unsigned (all-positive), or signed (positive and negative numbers).
The CPU doesn't know or care about signed/unsigned - it always does the
math and always sets the two flags.  You (your program) have to care.
 
Carry Flag - for unsigned integer math
--------------------------------------
 
The "Carry" flag is set in the ALU whenever any binary mathematics
causes a "carry" out of the top (leftmost, sign) bit, e.g. in four bit
binary math: 1000+1000=0000 (and the carry flag is set).  The carry flag
indicates that the answer didn't fit in the number of bits we are using -
there was "carry" that would have required one more bit to represent.
You can think of that "carry" out of the top bit as setting the Carry
flag to ON.  If there is no carry, the Carry flag is OFF.
 
If we treat the bits in a word as "unsigned" numbers (all positive,
no negative), we must watch to see if our arithmetic causes the "carry"
flag to be set, indicating the (unsigned) result is wrong.  (We don't
care about the ALU Overflow flag when doing unsigned math.  The overflow
flag is only relevant to signed numbers, not unsigned.  See below.)
 
4-bit example:   0100 + 0100 = 1000 (no carry - the answer fits in 4 bits)
 - The carry flag is turned OFF, no carry, since no carry was generated
 - If our program asked the CPU to do unsigned math, the fact that the
   Carry flag is off means that the answer is correct for unsigned math.
   - In unsigned decimal: 4 + 4 = 8 (carry OFF == correct answer)
 
4-bit example:   1000 + 1000 = 0000 (and the carry flag comes on in the CPU)
 - The carry flag is turned ON, since carry was generated
 - If our program asked the CPU to do unsigned math, the fact that the
   Carry flag is ON means that the answer is wrong for unsigned math.
   - In unsigned decimal: 8 + 8 = 0 (carry ON == wrong answer)
   - If we were allowed 5 bits, the answer would be correct as 10000(2) = 16(10)
 
4-bit example:   1111 + 0011 = 0010 (and the carry flag comes on in the CPU)
 - The carry flag is turned ON, since carry was generated
 - If our program asked the CPU to do unsigned math, the fact that the
   Carry flag is ON means that the answer is wrong for unsigned math.
   - In unsigned decimal: 15 + 3 = 2 (carry ON == wrong answer)
   - If we were allowed 5 bits, the answer would be correct as 10010(2) = 18(10)
 
The carry flag indicates that the unsigned binary math answer needs more bits.
 
Overflow Flag - for two's complement integer math
-------------------------------------------------
 
The "Overflow" flag is set by the CPU whenever the ALU adds two numbers
together with the same sign bit, and the result has the opposite sign
bit, e.g. in four bit binary math: 0100+0100=1000 (and the overflow flag
is set).  Just as with the Carry flag, the CPU always sets the overflow
flag after a binary math operation.
 
Just as the Carry flag tells your program that the binary math is wrong
for unsigned arithmetic, the Overflow flag has the corresponding meaning
for signed arithmetic.  The overflow flag indicates that the signed
math is wrong - you should never add two numbers of the same sign get
an answer with the opposite sign.  The math went outside the valid range
of signed numbers.
 
The ALU always sets both the Carry and Overflow flags when doing integer
mathematics on two bit patterns.  Your program must choose which flag to
check depending on whether your program was doing signed or unsigned math.
 
The sign bit (leftmost bit) and overflow flag only have meaning if we
choose to interpret the bit patterns using the two's complement number
system, which assigns half the bit patterns to negative numbers.
 
If we treat the bits in a word as "two's complement" signed values,
we must watch to see if our arithmetic causes the "overflow" flag to
be set, indicating the (two's complement) result is wrong.  (We don't
care about the carry flag when doing signed, two's complement math.
The carry flag is only relevant to unsigned numbers, not signed.)
 
4-bit example:   0100 + 0100 = 1000 (and the overflow flag comes on in the CPU)
 - The overflow flag is turned ON, since the sign bit changed
 - If our program asked the CPU to do signed math, the fact that the
   Overflow flag is ON means that the answer is wrong for signed math.
   - In signed decimal: 4 + 4 = -8 (overflow flag ON == wrong answer)
 - Note that for this same math, if our program was doing unsigned math,
   the fact that the Carry flag is OFF means that the answer would be right!
   - In unsigned decimal: 4 + 4 = 8 (carry OFF == right answer)
 
Some mathematics on bit patterns gives answers that are wrong both for
unsigned and two's complement numbers, in which case *both* the carry
and overflow flags will be set in the CPU:
 
4-bit example: 1010 + 1010 = 0100 (and both carry and overflow come on)
  - In unsigned decimal (no negative numbers): 10 + 10 =  4 (WRONG)
  - In signed decimal (two's complement):      -6 + -6 = +4 (WRONG)
 
The ALU always sets both the carry and overflow flags when doing integer
mathematics on two bit patterns.  Your program must choose which flag to
check depending on whether your program was doing signed or unsigned math.
 
When the CPU does math on bit patterns, it doesn't know or care whether
the bits represent signed or unsigned integers.  The CPU just does the
math and sets the flags.  Your program knows whether it was using the
CPU for signed or unsigned math, and so your program must check the
correct flag.
 
=========================
C Language Implementation
=========================
 
In C language with 32-bit integers:
 
    signed int sx = 2000000000;
    signed int sy = 2000000001;
    signed int sz = sx + sy;
 
    unsigned int ux = 2000000000;
    unsigned int uy = 2000000001;
    unsigned int uz = ux + uy;
 
    printf("sz is 0x%x which could be %d or %u\n", sz, sz, sz);
    printf("uz is 0x%x which could be %d or %u\n", uz, uz, uz);
 
Output (first "0x" output is in hexadecimal):
 
    sz is 0xEE6B2801 which could be -294967295 or 4000000001
    uz is 0xEE6B2801 which could be -294967295 or 4000000001
 
The hex value 0xEE6B2801 when turned into 32-bit binary looks like this:
 
    11101110011010110010100000000001
       E   E   6   B   2   8   0   1
 
See - absolutely no difference in the two bit patterns in memory.  The bit
patterns in memory for signed and unsigned math are the same; because,
there is no difference in what the ALU does.  The ALU doesn't care.
Adding/subtracting don't care about sign; it is all pure binary math on
bit patterns.
 
As you see above, once we have some bits in memory, we can tell our
program to interpret the bits in either an unsigned manner ("%u" - all
the bit patterns represent non-negative numbers) or in a signed manner
("%d" - about half of the bit patterns will be displayed as negative,
with leading minus).
 
Signed/unsigned doesn't matter at the ALU level in computer math.  Where a
signed/unsigned declaration plays a big part is in *comparing* numbers
(and in bit-shifting; but, save that for another day).  If we compare
the bit pattern 0xEE6B2801 (above) with zero, is the bit pattern less
than zero or much greater than zero?  *That* answer depends on whether
we declared the memory location (the variable) to be signed or unsigned.
 
If we declare a 32-bit variable as unsigned, numeric comparisons
treat all 4,294,967,296 bit patterns from zero up to 0xFFFFFFFF
(11111111111111111111111111111111) as non-negative numbers (zero to
4,294,967,295 for 32-bit numbers).  Comparisons (unsigned) will never
say any of these patterns are less than zero:
 
    unsigned int x = 0xFFFFFFFF;
    if ( x < 0 ) ( ...   /* this is always FALSE for unsigned numbers */
 
If we declare a 32-bit variable as signed, we are saying to our compiler
that numeric comparisons using that variable will treat about half of
those 4,294,967,296 bit patterns as negative numbers.  Any bit patterns
in signed variables that have the leftmost bit (the sign bit) set,
will be interpreted and compared as "less than zero":
 
    signed int x = 0xFFFFFFFF;
    if ( x < 0 ) ( ...   /* TRUE because x is signed and the sign bit is on */
 
More C code:
 
    signed   int sz = 0xEE6B2801;
    unsigned int uz = 0xEE6B2801;
    /*
     * Both sz and uz contain the same bit pattern 0xEE6B2801.
     * Signed sz interprets the bits as -294967295(10).
     * Unsigned uz interprets the bits as 4000000001(10).
     */
    if ( sz < 0 )
        printf("sz is less than zero\n");
    else
        printf("sz is not less than zero\n");
 
    if ( uz < 0 )
        printf("uz is less than zero\n");
    else
        printf("uz is not less than zero\n");
 
Output:
 
    sz is less than zero
    uz is not less than zero
 
Remember that sz and uz contain exactly the same bit pattern!  It is
our program that interprets the results.
 
Because we declared the sz variable to be signed, the code generated
by the compiler to test whether the bit pattern in sz is less than zero
tests the sign bit (leftmost bit) on the sz memory, notices that is is on
('1') and declares the bit pattern as "less than zero" (signed).  All the
bit patterns with the sign bit on will be interpreted as negative numbers.
 
Because we declared the uz variable to be unsigned, the code generated
by the compiler to test whether the bit pattern in uz is less than zero
has no work to do at all - unsigned numbers are never less than zero.
The compiler arranges to print:  "uz is not less than zero"
 
As a programmer, declaring integer variables as signed/unsigned is
just a convenient way of having the compiler conspire with us to treat
half the bit patterns as "less than zero".  Inside the CPU doing math
(adding/subtracting), the declaration of signed/unsigned doesn't make
any difference - the CPU just does the math and sets the Carry and
Overflow flags.  For mathematical comparisons (and bit-shifting),
declaring signed or unsigned makes a big difference.
 
--
| Ian! D. Allen  -  idallen@idallen.ca  -  Ottawa, Ontario, Canada
| Home Page: http://idallen.com/   Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom:  http://eff.org/  and have fun:  http://fools.ca/

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

263

264

265

266

267

268

269

270

271

272

273

274

275

276

277

278

279

280

281

282

283

284

285

286

287

288

289

290

291

292

293

294

295

296

297

298

299

300

301

302

303

304

305

306

307

308

309

310

311

312

313

314

315

316

317

318

319

320

321

322

323

324

325

326

327

328

329

330

331

332

333

334

335

336

337

338

339

340

341

342

343

344

345

346

347

348

349

350

351

352

353

354

355

356

357

358

359

360

361

362

363

364

365

366

367

368

369

370

371

372

373

374

375

376

377

378

379

380

381

382

383

384

385

386

387

388

389

390

391

392

393

394

395

396

397

398

399

400

401

402

403

404

405

406

407

408

409

410

411

412

413

414

415

416

417

418

419

420

421

422

423

424

425

426

427

428

429

430

431

432

433

434

435

436

437

438

439

440

441

442

443

444

445

446

============================================================

Binary Integer Mathematics, unsigned, two's complement, etc.

============================================================

- Ian! D. Allen - idallen@idallen.ca - http://www.idallen.com

How the computer adds integer numbers

-------------------------------------

When the ALU inside the CPU adds one to any integer value, signed or

unsigned, the same thing always happens. Basic ALU integer math doesn't

care about sign or negative numbers. There is no concept, at the ALU

level, of "signed" or "unsigned". It's all just bits being added,

and the simple rules for binary addition apply.

In four bits binary, the math (adding one to each value) looks like this:

CPU Addition Logic for Four Bit Binary Numbers

------------------------------------------------

0000 plus one (0001) is

0001 plus one is

0010 plus one is

0011 plus one is

0100 plus one is

0101 plus one is

0110 plus one is

0111 plus one is (also causes the Overflow flag to be set in the CPU)

1000 plus one is

1001 plus one is

1010 plus one is

1011 plus one is

1100 plus one is

1101 plus one is

1110 plus one is

1111 plus one is (also causes Carry flag to be set in the CPU)

0000 plus one is

0001 plus one is

0010 plus one is ...etc...

The list is is really a ring or wheel, joined at the ends. There is no

real beginning or end to the repeating sequence of bit patterns; adding

one just moves to the next bit pattern around the ring. Adding two

moves two places around the ring, etc. Subtraction moves the other way.

The list works like a car odometer; when the maximum value (1111) is

reached, the odometer rolls over to zero (0000) and starts over again.

Rolling backward from 0000 gets you to 1111.

(For another view of how this works, see "Odometers" and

"Positive and Negative in Binary", showing the number ring, in:

http://www.cs.nmsu.edu/~pfeiffer/classes/273/notes/neg.html )

As Pfeifer says, there is no way to tell whether the bit pattern 0101 is

the result of adding eight to 0010 (i.e. adding one eight times) or the

result of subtracting eight from 0010 (i.e. subtracting one eight times).

Both move the same distance around the ring and end up at the same place.

Four bits have only 2**4 (16) possible bit patterns; that means they can

represent only 16 different things. Eight bits can only represent 2**8

(256) different numbers, etc. Ten bits, 1024 numbers.

Unsigned Numbers

----------------

If all 16 of these four-bit patterns represent positive numbers, the range

of numbers is 0 to 15 decimal. The smallest number is zero with binary

bit pattern 0000 and the largest number is 15 with bit pattern 1111:

UNSIGNED 4-BIT NUMBERS

------------------------

0000 = 0 # unsigned numbers start at all bits turned off (zero)

0001 = 1

0010 = 2

0011 = 3

0100 = 4

0101 = 5

0110 = 6

0111 = 7

1000 = 8

1001 = 9

1010 = 10

1011 = 11

1100 = 12

1101 = 13

1110 = 14

1111 = 15 # unsigned numbers end with all bits turned on (15 decimal)

In four bits, adding one to 15 doesn't give 16, it wraps around to start

over at zero (0000) again. Subtracting one from zero wraps around to

give the bit pattern for 15 (1111). Wrapping around zero means the math

is wrong. As long as the adding/subtracting stays within the numbers 0

(0000) to 15 (1111) without crossing zero, the unsigned math is correct.

Any four-bit unsigned math that goes below zero or above 15 is wrong.

Signed Numbers

--------------

If we choose to have some of these 16 four-bit patterns represent negative

numbers, we can't use as many bit patterns for positive numbers. We have

to steal some bit patterns to use as negative numbers.

Most systems for negative numbers divide up the patterns into half

positive and half negative. If the top (leftmost, most significant) bit

is on, we can decide that the bit pattern represents a negative number.

We call this top, leftmost bit the "sign" bit.

If half the four-bit patterns have the sign bit on and will be used

for negative numbers, that leaves only eight bit patterns to represent

positive numbers (with zero being called "positive"), and the range of

eight positive numbers is now only 0 to +7 (a total of eight numbers):

0000 0001 0010 0011 0100 0101 0110 0111

The other eight bit patterns, with sign bit on, represent eight negative

numbers, usually (but not always) the numbers from -8 up to -1:

1000 1001 1010 1011 1100 1101 1110 1111

Where the 16 unsigned four-bit numbers range from 0 to 15, the 16 signed

four-bit numbers are half-negative and half-positive and range from

-8 up to zero up to +7. (Note how the range includes -8 but only +7,

since zero is considered a "positive" number here.)

Two's Complement Signed Numbers:

Which bit patterns represent each of the negative numbers from -8 up

to -1? If we subtract one from zero (0000), the result is the bit

pattern 1111; so, 1111 must be the bit pattern for -1. (Adding one to

1111 gives zero, as expected.) If we subtract one from 1111 (-1) we get

1110; so, 1110 must be the bit pattern for -2, etc., down to -8 with bit

pattern 1000. This is the "two's complement" system for negative numbers.

Subtracting one from -8 (1000) doesn't give -9, it wraps around to give

the bit pattern for +7 (0111). Adding one to +7 (0111) doesn't give +8,

it wraps around to start over at -8 (1000) again.

Two's Complement Signed 4-bit Numbers

---------------------------------------

1000 = -8 # signed numbers start with only the sign bit turned on

1001 = -7

1010 = -6

1011 = -5

1100 = -4

1101 = -3

1110 = -2

1111 = -1

0000 = 0

0001 = +1

0010 = +2

0011 = +3

0100 = +4

0101 = +5

0110 = +6

0111 = +7 # signed numbers end with everything *except* the sign bit on

With the introduction of two's complement negative numbers, the smallest

four-bit number is -8 with bit pattern 1000 and the largest number is

+7 with bit pattern 0111. Adding one to -1 (1111) correctly gives zero

(0000). As long as the adding/subtracting stays within the range of

numbers -8 (1000) to +7 (0111), the signed math is correct. Any four-bit

signed math that goes below -8 or above +7 is wrong.

The wrapping-around is true for all two's complement number systems -

adding one to the highest positive number wraps around to give the most

negative number. The most negative number always has an absolute value

one larger than the largest positive number, since zero is included with

the positive numbers. Thus, if you know that the largest positive 8-bit

two's complement value is +127, you immediately know that the smallest

negative value must be -128.

For four-bit unsigned numbers, the range of 16 bit patterns starts at zero

(0000) and goes up to 15 (1111). Unsigned math that steps outside this

range (goes below 0000 or above 1111) is wrong - the answer wraps around

incorrectly. Crossing from 0111 to 1000 is not an error in unsigned math.

Crossing from 1111 to 0000 is an error in unsigned math.

For four-bit signed two's complement numbers, the range of 16 bit patterns

starts at -8 (1000) and goes up to +7 (0111). Signed math that steps

outside this range (goes below 1000 or above 0111) is wrong - the answer

wraps around incorrectly. Crossing from 1111 to 0000 is not an error

in signed math. Crossing from 0111 to 1000 is an error in signed math.

Note how unsigned math treats the list of 16 consecutive bit patterns

as going from 0 to 15 (0000 to 1111) and signed math treats the list as

the 16 patterns going from -8 to +7 (1000 to 0111). It's the same list

of sixteen bit patterns, but the start and end points for "right answer"

mathematics is different for signed and unsigned.

Inside the CPU, for addition (or subtraction) of these 16 four-bit

numbers, it doesn't matter if the bit pattern is considered signed or

unsigned - when you say "x = x + 1" the ALU inside the CPU just adds

one to the bits in memory and possibly sets some CPU flags. Signed,

unsigned, same thing. The CPU knows nothing about negative numbers;

it only does binary addition to bit patterns.

Rule: When adding binary, octal, or hexadecimal bit patterns together,

just do the binary math and express the result in the allowed number of

bits, throwing away the carry if it doesn't fit.

Examples of binary integer math:

binary 4 bits: 0000 + 0111 = 0111

hexadecimal 4 bits: 0h + 7h = 7h

binary 4 bits: 0100 + 0100 = 1000

hexadecimal 4 bits: 4h + 4h = 8h

binary 4 bits: 1010 + 1010 = 0100 (no room for the carry in 4 bits)

hexadecimal 4 bits: Ah + Ah = 4h (no room for the carry in 4 bits)

binary 4 bits: 1100 + 1100 = 1000 (no room for the carry in 4 bits)

hexadecimal 4 bits: Ch + Ch = 8h (no room for the carry in 4 bits)

Hex math calculator: http://www.csgnetwork.com/hexaddsubcalc.html

hex 16 bits: 1ABCh + 0001h = 1ABDh

hex 16 bits: 1ABCh + 0009h = 1AC5h

hex 16 bits: 1ABCh + 9C3Eh = B6FAh

hex 16 bits: 1ABCh + FFFFh = 1ABBh (no room for carry in 16 bits)

hex 16 bits: 1ABCh + F000h = 0ABCh (no room for carry in 16 bits)

You must pay careful attention to the number of bits used in the

mathematics and never generate an answer that has more than this number

of bits. If the math generates a carry out of the leftmost (top, most

significant) bit, that carry is "thrown away" and a Carry flag inside

the CPU is turned on indicating that this happened.

Status Flags for binary integer mathematics

===========================================

The ALU in the CPU has two math status flags in it: the "Carry" flag and

the "Overflow" flag. These flags are set or unset after every integer

math operation. The flags can be interpreted by you, the programmer,

to have meaning depending on whether you are treating the bit patterns

as unsigned (all-positive), or signed (positive and negative numbers).

The CPU doesn't know or care about signed/unsigned - it always does the

math and always sets the two flags. You (your program) have to care.

Carry Flag - for unsigned integer math

--------------------------------------

The "Carry" flag is set in the ALU whenever any binary mathematics

causes a "carry" out of the top (leftmost, sign) bit, e.g. in four bit

binary math: 1000+1000=0000 (and the carry flag is set). The carry flag

indicates that the answer didn't fit in the number of bits we are using -

there was "carry" that would have required one more bit to represent.

You can think of that "carry" out of the top bit as setting the Carry

flag to ON. If there is no carry, the Carry flag is OFF.

If we treat the bits in a word as "unsigned" numbers (all positive,

no negative), we must watch to see if our arithmetic causes the "carry"

flag to be set, indicating the (unsigned) result is wrong. (We don't

care about the ALU Overflow flag when doing unsigned math. The overflow

flag is only relevant to signed numbers, not unsigned. See below.)

4-bit example: 0100 + 0100 = 1000 (no carry - the answer fits in 4 bits)

- The carry flag is turned OFF, no carry, since no carry was generated

- If our program asked the CPU to do unsigned math, the fact that the

Carry flag is off means that the answer is correct for unsigned math.

- In unsigned decimal: 4 + 4 = 8 (carry OFF == correct answer)

4-bit example: 1000 + 1000 = 0000 (and the carry flag comes on in the CPU)

- The carry flag is turned ON, since carry was generated

- If our program asked the CPU to do unsigned math, the fact that the

Carry flag is ON means that the answer is wrong for unsigned math.

- In unsigned decimal: 8 + 8 = 0 (carry ON == wrong answer)

- If we were allowed 5 bits, the answer would be correct as 10000(2) = 16(10)

4-bit example: 1111 + 0011 = 0010 (and the carry flag comes on in the CPU)

- The carry flag is turned ON, since carry was generated

- If our program asked the CPU to do unsigned math, the fact that the

Carry flag is ON means that the answer is wrong for unsigned math.

- In unsigned decimal: 15 + 3 = 2 (carry ON == wrong answer)

- If we were allowed 5 bits, the answer would be correct as 10010(2) = 18(10)

The carry flag indicates that the unsigned binary math answer needs more bits.

Overflow Flag - for two's complement integer math

-------------------------------------------------

The "Overflow" flag is set by the CPU whenever the ALU adds two numbers

together with the same sign bit, and the result has the opposite sign

bit, e.g. in four bit binary math: 0100+0100=1000 (and the overflow flag

is set). Just as with the Carry flag, the CPU always sets the overflow

flag after a binary math operation.

Just as the Carry flag tells your program that the binary math is wrong

for unsigned arithmetic, the Overflow flag has the corresponding meaning

for signed arithmetic. The overflow flag indicates that the signed

math is wrong - you should never add two numbers of the same sign get

an answer with the opposite sign. The math went outside the valid range

of signed numbers.

The ALU always sets both the Carry and Overflow flags when doing integer

mathematics on two bit patterns. Your program must choose which flag to

check depending on whether your program was doing signed or unsigned math.

The sign bit (leftmost bit) and overflow flag only have meaning if we

choose to interpret the bit patterns using the two's complement number

system, which assigns half the bit patterns to negative numbers.

If we treat the bits in a word as "two's complement" signed values,

we must watch to see if our arithmetic causes the "overflow" flag to

be set, indicating the (two's complement) result is wrong. (We don't

care about the carry flag when doing signed, two's complement math.

The carry flag is only relevant to unsigned numbers, not signed.)

4-bit example: 0100 + 0100 = 1000 (and the overflow flag comes on in the CPU)

- The overflow flag is turned ON, since the sign bit changed

- If our program asked the CPU to do signed math, the fact that the

Overflow flag is ON means that the answer is wrong for signed math.

- In signed decimal: 4 + 4 = -8 (overflow flag ON == wrong answer)

- Note that for this same math, if our program was doing unsigned math,

the fact that the Carry flag is OFF means that the answer would be right!

- In unsigned decimal: 4 + 4 = 8 (carry OFF == right answer)

Some mathematics on bit patterns gives answers that are wrong both for

unsigned and two's complement numbers, in which case *both* the carry

and overflow flags will be set in the CPU:

4-bit example: 1010 + 1010 = 0100 (and both carry and overflow come on)

- In unsigned decimal (no negative numbers): 10 + 10 = 4 (WRONG)

- In signed decimal (two's complement): -6 + -6 = +4 (WRONG)

The ALU always sets both the carry and overflow flags when doing integer

mathematics on two bit patterns. Your program must choose which flag to

check depending on whether your program was doing signed or unsigned math.

When the CPU does math on bit patterns, it doesn't know or care whether

the bits represent signed or unsigned integers. The CPU just does the

math and sets the flags. Your program knows whether it was using the

CPU for signed or unsigned math, and so your program must check the

correct flag.

=========================

C Language Implementation

=========================

In C language with 32-bit integers:

signed int sx = 2000000000;

signed int sy = 2000000001;

signed int sz = sx + sy;

unsigned int ux = 2000000000;

unsigned int uy = 2000000001;

unsigned int uz = ux + uy;

printf("sz is 0x%x which could be %d or %u\n", sz, sz, sz);

printf("uz is 0x%x which could be %d or %u\n", uz, uz, uz);

Output (first "0x" output is in hexadecimal):

sz is 0xEE6B2801 which could be -294967295 or 4000000001

uz is 0xEE6B2801 which could be -294967295 or 4000000001

The hex value 0xEE6B2801 when turned into 32-bit binary looks like this:

11101110011010110010100000000001

E E 6 B 2 8 0 1

See - absolutely no difference in the two bit patterns in memory. The bit

patterns in memory for signed and unsigned math are the same; because,

there is no difference in what the ALU does. The ALU doesn't care.

Adding/subtracting don't care about sign; it is all pure binary math on

bit patterns.

As you see above, once we have some bits in memory, we can tell our

program to interpret the bits in either an unsigned manner ("%u" - all

the bit patterns represent non-negative numbers) or in a signed manner

("%d" - about half of the bit patterns will be displayed as negative,

with leading minus).

Signed/unsigned doesn't matter at the ALU level in computer math. Where a

signed/unsigned declaration plays a big part is in *comparing* numbers

(and in bit-shifting; but, save that for another day). If we compare

the bit pattern 0xEE6B2801 (above) with zero, is the bit pattern less

than zero or much greater than zero? *That* answer depends on whether

we declared the memory location (the variable) to be signed or unsigned.

If we declare a 32-bit variable as unsigned, numeric comparisons

treat all 4,294,967,296 bit patterns from zero up to 0xFFFFFFFF

(11111111111111111111111111111111) as non-negative numbers (zero to

4,294,967,295 for 32-bit numbers). Comparisons (unsigned) will never

say any of these patterns are less than zero:

unsigned int x = 0xFFFFFFFF;

if ( x < 0 ) ( ... /* this is always FALSE for unsigned numbers */

If we declare a 32-bit variable as signed, we are saying to our compiler

that numeric comparisons using that variable will treat about half of

those 4,294,967,296 bit patterns as negative numbers. Any bit patterns

in signed variables that have the leftmost bit (the sign bit) set,

will be interpreted and compared as "less than zero":

signed int x = 0xFFFFFFFF;

if ( x < 0 ) ( ... /* TRUE because x is signed and the sign bit is on */

More C code:

signed int sz = 0xEE6B2801;

unsigned int uz = 0xEE6B2801;

* Both sz and uz contain the same bit pattern 0xEE6B2801.

* Signed sz interprets the bits as -294967295(10).

* Unsigned uz interprets the bits as 4000000001(10).

if ( sz < 0 )

printf("sz is less than zero\n");

else

printf("sz is not less than zero\n");

if ( uz < 0 )

printf("uz is less than zero\n");

else

printf("uz is not less than zero\n");

Output:

sz is less than zero

uz is not less than zero

Remember that sz and uz contain exactly the same bit pattern! It is

our program that interprets the results.

Because we declared the sz variable to be signed, the code generated

by the compiler to test whether the bit pattern in sz is less than zero

tests the sign bit (leftmost bit) on the sz memory, notices that is is on

('1') and declares the bit pattern as "less than zero" (signed). All the

bit patterns with the sign bit on will be interpreted as negative numbers.

Because we declared the uz variable to be unsigned, the code generated

by the compiler to test whether the bit pattern in uz is less than zero

has no work to do at all - unsigned numbers are never less than zero.

The compiler arranges to print: "uz is not less than zero"

As a programmer, declaring integer variables as signed/unsigned is

just a convenient way of having the compiler conspire with us to treat

half the bit patterns as "less than zero". Inside the CPU doing math

(adding/subtracting), the declaration of signed/unsigned doesn't make

any difference - the CPU just does the math and sets the Carry and

Overflow flags. For mathematical comparisons (and bit-shifting),

declaring signed or unsigned makes a big difference.

| Ian! D. Allen - idallen@idallen.ca - Ottawa, Ontario, Canada

| Home Page: http://idallen.com/ Contact Improv: http://contactimprov.ca/

| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/

| Defend digital freedom: http://eff.org/ and have fun: http://fools.ca/

Zoeken

Binary Integer Mathematics, unsigned, two’s complement, etc.