Integer, ASCII and Unicode

This lecture covers programming on computing machines. Even mechanical devices allow for calculations:

Figure 62. Manual calculation: Abacus

Figure 63. Mechanical calculation: Cash register

Figure 64. Electromechanical calculation: Zuse Z3

Figure 65. Vacuum Tube: Eniac

All machines described so far are based on non-semiconductor technologies. The invention of the transistor in the late 1940s eventually gave rise to the rapid development of microprocessor chips:

Figure 66. Transistor: Microprocessor ICs

These sample devices differ heavily with respect to addressable memory, data size, supported arithmetic operations, speed and other features. We take a closer look at Zilog's Z80 processor:

Figure 67. Z80 8-bit data bus

As technology advanced, processors have been categorized by the width of their so-called address bus and data bus:

Figure 68. Progress in hardware

Processor           Year   Address / data bus (bits)   Transistors      Clock rate
Intel 4004          1971   12 / 4                      2,300            740 kHz
Zilog Z80           1976   16 / 8                      8,500            2.5 MHz
Motorola 68020      1984   32 / 32                     190,000          12.5 MHz
Six-core Opteron    2009   64 / 64                     904,000,000      1.8 GHz
Core i7 Broadwell   2016   64 / 64                     3,200,000,000    3.6 GHz
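The address bus width determines how many distinct memory locations a processor can select. The following Python sketch illustrates this for three of the processors listed above. The function name `addressable_locations` is made up for illustration, and the sketch assumes that every bit pattern on the address bus selects exactly one location:

```python
def addressable_locations(address_bits: int) -> int:
    """Number of distinct addresses an address bus of the given width can select."""
    return 2 ** address_bits


# Address bus widths as listed in Figure 68.
for name, bits in [("Intel 4004", 12), ("Zilog Z80", 16), ("Motorola 68020", 32)]:
    print(f"{name}: {addressable_locations(bits):,} addresses")
```

Each additional address line doubles the number of selectable locations.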

Figure 69. Simple facts

There are only 10 types of people in the world:

Those who understand binary and those who don't.


We remind the reader of the binary representation of integer values. Details will be discussed in your math lectures. Our first example features 3 bit unsigned integer values:

Figure 70. Unsigned 3 bit integer representation
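As a minimal sketch, the eight unsigned 3 bit patterns and their values can be listed in Python. The helper name `unsigned_patterns` is an assumption chosen for this illustration:

```python
def unsigned_patterns(bits: int) -> dict[str, int]:
    """Map every bit pattern of the given width to its unsigned value."""
    return {format(value, f"0{bits}b"): value for value in range(2 ** bits)}


# Three bits yield 2**3 = 8 patterns, ranging from 000 (0) to 111 (7).
for pattern, value in unsigned_patterns(3).items():
    print(pattern, "->", value)
```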

Figure 71. Binary system addition

Within limits: OK

   010       2
  +011      +3
  ----     ---
   101       5

Caution: Overflow!

                   100       4
                  +101      +5
                ------     ---
discarded ━━━▶ (1)001       1
by 3 bit
representation
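The two additions can be reproduced with a short Python sketch. Python integers do not overflow by themselves, so the hypothetical helper `add_unsigned` explicitly masks the sum to the lowest three bits, mimicking a register that discards the carry:

```python
def add_unsigned(a: int, b: int, bits: int = 3) -> int:
    """Add two unsigned values, keeping only the lowest `bits` bits of the sum."""
    mask = (1 << bits) - 1  # 0b111 for three bits
    return (a + b) & mask   # a carry beyond the mask is discarded


print(add_unsigned(0b010, 0b011))  # 2 + 3 stays within limits
print(add_unsigned(0b100, 0b101))  # 4 + 5 = 9 does not fit into three bits
```

The first call yields 5, the second yields 1 because the leading bit of 1001 is lost.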

Figure 72. 3 bit two's complement representation
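A minimal Python sketch of the 3 bit two's complement interpretation follows. The function name `to_signed` is an assumption for illustration. Patterns with a set leading bit are read as negative values by subtracting 2^3 = 8:

```python
def to_signed(pattern: int, bits: int = 3) -> int:
    """Interpret an unsigned bit pattern as a two's complement value."""
    sign_bit = 1 << (bits - 1)
    # A set leading bit marks a negative value: subtract 2**bits.
    return pattern - (1 << bits) if pattern & sign_bit else pattern


for pattern in range(2 ** 3):
    print(f"{pattern:03b} -> {to_signed(pattern):2d}")
```

The listing covers the range from -4 (100) to 3 (011).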

Figure 73. 3 bit two's complement rationale: Usual addition

Within limits: OK

   101      -3
  +010      +2
  ----     ---
   111      -1

Caution: Overflow!

    100     -4
   +101     -3
   ----    ---
 (1)001      1
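The same behaviour can be sketched in Python. The helper `add_twos_complement` is hypothetical: it reduces the sum modulo 2^bits, reinterprets the result as a signed value and reports whether the true sum was out of range:

```python
def add_twos_complement(a: int, b: int, bits: int = 3) -> tuple[int, bool]:
    """Add two signed values in `bits`-bit two's complement arithmetic.

    Returns the wrapped result and a flag telling whether overflow occurred.
    """
    modulus = 1 << bits
    raw = (a + b) % modulus  # keep the lowest `bits` bits only
    result = raw - modulus if raw >= modulus // 2 else raw
    return result, result != a + b


print(add_twos_complement(-3, 2))   # within limits
print(add_twos_complement(-4, -3))  # -7 is not representable in three bits
```

The first call returns -1 without overflow. The second call returns 1 and flags the overflow.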

Signed byte values are represented accordingly:

Figure 74. Signed 8 bit integer binary representation Slide presentation Create comment in forum
Signed 8 bit integer binary representation
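For experimenting with signed byte values, Python's built-in `int.to_bytes` and `int.from_bytes` accept a `signed=True` argument that applies two's complement. The wrapper names below are assumptions chosen for this sketch:

```python
def signed_byte_bits(value: int) -> str:
    """Return the 8 bit two's complement pattern of a value in -128..127."""
    raw = value.to_bytes(1, byteorder="big", signed=True)
    return format(raw[0], "08b")


def signed_byte_value(bits: str) -> int:
    """Return the signed value represented by an 8 bit pattern."""
    return int.from_bytes(bytes([int(bits, 2)]), byteorder="big", signed=True)


for value in (127, 1, 0, -1, -128):
    print(f"{value:5d} -> {signed_byte_bits(value)}")
```

Values outside the range from -128 to 127 make `to_bytes` raise an `OverflowError` rather than wrap around silently.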

Exercise No. 11

Hotel key cards

Q:

A hotel supplies the following type of cards for opening room doors:

A customer is worried about the impact of losing his card. For security reasons the corresponding pattern can never be issued again. Thus the hotel may eventually run short of available combinations.

Discuss this argument by estimating the number of distinct patterns.

Hint: Consider a keycard's (likely?) grid of possible punch positions:

A:

No need to be worried: The 32 possible punch positions may be arranged in a linear fashion:

Since each position may either contain a hole or be solid, we have 2^32 = 4,294,967,296 distinct possibilities. Thus a lot of keycards may get lost before a hotel manager needs to start worrying.
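The count can be checked with a short Python sketch. Both function names are made up for illustration. Enumerating all patterns is only feasible for small grids, so the sketch compares the brute-force count with the formula for four positions:

```python
from itertools import product


def count_patterns(positions: int) -> int:
    """Each position is either punched or solid: 2 choices per position."""
    return 2 ** positions


def enumerate_patterns(positions: int) -> list[tuple[int, ...]]:
    """List every pattern explicitly (1 = hole, 0 = solid); small grids only."""
    return list(product((0, 1), repeat=positions))


print(len(enumerate_patterns(4)), count_patterns(4))  # both counts agree
print(f"{count_patterns(32):,}")                      # the keycard's 32 positions
```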

Regarding language characters we start with one of the oldest and most widespread character encoding schemes:

Figure 75. 7-bit ASCII
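As a small illustration, the following Python sketch prints the 7-bit ASCII pattern of a few characters. The helper name `ascii_bits` and the sample characters are assumptions, not taken from the figure:

```python
def ascii_bits(ch: str) -> str:
    """Return the 7-bit ASCII pattern of a single character."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return format(code, "07b")


for ch in "Aa0":
    print(ch, ord(ch), ascii_bits(ch))
```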

ASCII by design is limited to US characters and does not include accented or other language-specific characters. ASCII requires only seven bits. Since a byte consists of eight bits, the spare bit could serve as a parity bit for data integrity checks:

Figure 76. 7-bit ASCII with even parity bit
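Even parity means the parity bit is chosen such that the total number of 1 bits in the byte is even. The sketch below is one possible Python illustration. The function name `with_even_parity` is hypothetical, and placing the parity bit in the most significant position is an assumption of this sketch:

```python
def with_even_parity(ch: str) -> str:
    """Return an 8 bit pattern: even parity bit followed by the 7-bit ASCII code."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    # The parity bit is 1 exactly when the code has an odd number of 1 bits.
    parity = bin(code).count("1") % 2
    return f"{parity}{code:07b}"


print(with_even_parity("A"))  # 1000001 has two 1 bits, parity bit 0
print(with_even_parity("C"))  # 1000011 has three 1 bits, parity bit 1
```

A receiver counting an odd number of 1 bits knows that at least one bit was corrupted in transit.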

A byte's parity bit may instead be used for 8-bit encodings providing supplementary non-ASCII characters such as the ñ in Señor. One such example is the ISO 8859-1 (ISO Latin 1) standard representing Western European character sets:

Figure 77. Western European characters: ISO Latin 1 encoding
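The following Python sketch encodes the word Señor using Python's built-in `iso-8859-1` codec. The helper name `latin1_hex` is an assumption for illustration:

```python
def latin1_hex(text: str) -> str:
    """Encode text as ISO 8859-1 and return the bytes as hexadecimal values."""
    return text.encode("iso-8859-1").hex(" ")


print(latin1_hex("Señor"))  # one byte per character

try:
    "Señor".encode("ascii")
except UnicodeEncodeError as error:
    print("Not representable in ASCII:", error.reason)
```

The ñ is stored as the single byte f1. Its leading bit is set, so it lies outside the 7-bit ASCII range.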

Supporting additional languages comes at a price: We have to increase the number of bytes representing a single character:

Figure 78. Unicode UTF-8 samples

Notice the representation's differing byte count: The UTF-8 Unicode encoding uses one, two, three or four bytes per character. See Unicode and You for further details.
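The varying byte count can be observed with a short Python sketch. The helper name `utf8_hex` and the four sample characters are chosen for illustration and are not necessarily those shown in the figure:

```python
def utf8_hex(ch: str) -> str:
    """Encode a character as UTF-8 and return its bytes as hexadecimal values."""
    return ch.encode("utf-8").hex(" ")


for ch in ["A", "ñ", "€", "😀"]:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}: {len(encoded)} byte(s): {utf8_hex(ch)}")
```

ASCII characters keep their single-byte representation in UTF-8, while characters with higher code points need more bytes.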