The TOY Machine
Introduction to Computer Science • Robert Sedgewick and Kevin Wayne • Copyright © 2005 • http://www.cs.Princeton.EDU/IntroCS
Basic Characteristics of TOY Machine

TOY is a general-purpose computer.
- Sufficient power to perform ANY computation.
- Limited only by amount of memory and time.

Stored-program computer. (von Neumann memo, 1944)
- Data and instructions encoded in binary.
- Data and instructions stored in SAME memory.

All modern computers are general-purpose computers and have the same (von Neumann/Princeton) architecture.

[Photos: John von Neumann; Maurice Wilkes (left), EDSAC (right)]
What is TOY?

An imaginary machine similar to:
- Ancient computers. (PDP-8, world's first commercially successful minicomputer, 1960s)
  - 12-bit words
  - 2K words of memory
  - Used in Apollo project
- Today's microprocessors. (Pentium, Celeron)

Why study TOY?
- Machine language programming.
  - how do high-level programs relate to computer?
  - a flavor of assembly programming
- Computer architecture.
  - how is a computer put together?
  - how does it work?
- Optimized for understandability, not cost or performance.
Inside the Box

Switches. Input data and programs.
Lights. View data.

Memory.
- Stores data and programs.
- 256 "words." (16 bits each)
- Special word for stdin / stdout.

Program counter (PC).
- An extra 8-bit register.
- Keeps track of next instruction to be executed.

Registers.
- Fastest form of storage.
- Scratch space during computation.
- 16 registers. (16 bits each)
- Register 0 is always 0.

Arithmetic-logic unit (ALU). Manipulates data stored in registers.

Standard input, standard output. Interact with outside world.
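Machine "Core" Dump

Machine contents at a particular place and time.
- Record of what program has done.
- Completely determines what machine will do.

Registers: R0 through RF all hold 0000.
pc: 10 (next instruction)

Main Memory (8 words per row)
00: 0008 0005 0000 0000 0000 0000 0000 0000
08: 0000 0000 0000 0000 0000 0000 0000 0000
10: 8A00 8B01 1CAB 9C02 0000 0000 0000 0000   program
18: 0000 0000 0000 0000 0000 0000 0000 0000
20: 0000 0000 0000 0000 0000 0000 0000 0000
28: 0000 0000 0000 0000 0000 0000 0000 0000
...
E8: 0000 0000 0000 0000 0000 0000 0000 0000
F0: 0000 0000 0000 0000 0000 0000 0000 0000
F8: 0000 0000 0000 0000 0000 0000 0000 0000

[Slide labels on the memory regions: data, program, variables]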
Program and Data

Program: Sequence of instructions.

16 instruction types:
- 16-bit word (interpreted one way).
- Changes contents of registers, memory, and PC in specified, well-defined ways.

Data:
- 16-bit word (interpreted other way).

Program counter (PC):
- Stores memory address of "next instruction."
- TOY usually starts at address 10.

Instructions
0: halt
1: add
2: subtract
3: and
4: xor
5: shift left
6: shift right
7: load address
8: load
9: store
A: load indirect
B: store indirect
C: branch zero
D: branch positive
E: jump register
F: jump and link
TOY Reference Card

Bit:        15 14 13 12 | 11 10 9 8 | 7 6 5 4  | 3 2 1 0
Format 1:   opcode      | dest d    | source s | source t
Format 2:   opcode      | dest d    | addr

#   Operation        Fmt  Pseudocode
0:  halt             1    exit(0)
1:  add              1    R[d] <- R[s] + R[t]
2:  subtract         1    R[d] <- R[s] - R[t]
3:  and              1    R[d] <- R[s] & R[t]
4:  xor              1    R[d] <- R[s] ^ R[t]
5:  shift left       1    R[d] <- R[s] << R[t]
6:  shift right      1    R[d] <- R[s] >> R[t]
7:  load addr        2    R[d] <- addr
8:  load             2    R[d] <- mem[addr]
9:  store            2    mem[addr] <- R[d]
A:  load indirect    1    R[d] <- mem[R[t]]
B:  store indirect   1    mem[R[t]] <- R[d]
C:  branch zero      2    if (R[d] == 0) pc <- addr
D:  branch positive  2    if (R[d] > 0) pc <- addr
E:  jump register    1    pc <- R[d]
F:  jump and link    2    R[d] <- pc; pc <- addr

Register 0 always 0.
Loads from mem[FF] come from stdin.
Stores to mem[FF] go to stdout.
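The two instruction formats on the reference card can be unpacked with shifts and masks. The sketch below is only an illustration: the class name `ToyDecoder` and its method names are hypothetical and do not come from the slides. It assumes the instruction is held in an `int` in the range 0000 to FFFF.

```java
// Hypothetical helper (not from the slides): splits a 16-bit TOY word into its fields.
final class ToyDecoder {
    static int opcode(int word) { return (word >> 12) & 0xF; }  // bits 15-12
    static int d(int word)      { return (word >> 8) & 0xF; }   // bits 11-8, dest d
    static int s(int word)      { return (word >> 4) & 0xF; }   // bits 7-4, source s (Format 1)
    static int t(int word)      { return word & 0xF; }          // bits 3-0, source t (Format 1)
    static int addr(int word)   { return word & 0xFF; }         // bits 7-0, addr (Format 2)

    public static void main(String[] args) {
        int add = 0x1CAB;   // Format 1: RC <- RA + RB
        System.out.printf("opcode=%X d=%X s=%X t=%X%n", opcode(add), d(add), s(add), t(add));
        int load = 0x8A00;  // Format 2: RA <- mem[00]
        System.out.printf("opcode=%X d=%X addr=%02X%n", opcode(load), d(load), addr(load));
    }
}
```

The same 16 bits yield both views; which one is meaningful depends on the opcode.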
[Figure: TOY Architecture (level 1)]
[Figure: TOY Architecture (level 0)]
Programming in TOY

Hello, World. Add two numbers.
- Adds 8 + 5 = D.
A Sample Program

A sample program.
- Adds 8 + 5 = D.

Registers: RA = 0000, RB = 0000, RC = 0000, pc = 10

add.toy (memory)
00: 0008    8
01: 0005    5

10: 8A00    RA <- mem[00]
11: 8B01    RB <- mem[01]
12: 1CAB    RC <- RA + RB
13: 9CFF    mem[FF] <- RC
14: 0000    halt

Since PC = 10, machine interprets 8A00 as an instruction.
Load

Load. (opcode 8)
- Loads the contents of some memory location into a register.
- 8A00 means load the contents of memory cell 00 into register A.

Registers before executing 10: 8A00: RA = 0000, RB = 0000, RC = 0000, pc = 10

Encoding of 8A00 (Format 2):
  1000   | 1010   | 0000 0000
  8      | A      | 00
  opcode | dest d | addr
Load

Load. (opcode 8)
- Loads the contents of some memory location into a register.
- 8B01 means load the contents of memory cell 01 into register B.

Registers before executing 11: 8B01: RA = 0008, RB = 0000, RC = 0000, pc = 11

Encoding of 8B01 (Format 2):
  1000   | 1011   | 0000 0001
  8      | B      | 01
  opcode | dest d | addr
Add

Add. (opcode 1)
- Adds the contents of two registers and stores the sum in a third.
- 1CAB adds the contents of registers A and B and puts the result into register C.

Registers before executing 12: 1CAB: RA = 0008, RB = 0005, RC = 0000, pc = 12

Encoding of 1CAB (Format 1):
  0001   | 1100   | 1010     | 1011
  1      | C      | A        | B
  opcode | dest d | source s | source t
Store

Store. (opcode 9)
- Stores the contents of some register into a memory cell.
- 9CFF means store the contents of register C into memory cell FF (stdout).

Registers before executing 13: 9CFF: RA = 0008, RB = 0005, RC = 000D, pc = 13

Encoding of 9CFF (Format 2):
  1001   | 1100   | 1111 1111
  9      | C      | FF
  opcode | dest d | addr
Halt

Halt. (opcode 0)
- Stop the machine.

Registers when 14: 0000 is reached: RA = 0008, RB = 0005, RC = 000D, pc = 14
Simulation

Consequences of simulation.
- Test out new machine or microprocessor using simulator.
  - cheaper and faster than building actual machine
- Easy to add new functionality to simulator.
  - trace, single-step, breakpoint debugging
  - simulator more useful than TOY itself
- Reuse software from old machines.

Ancient programs still running on modern computers.
- Lode Runner on Apple IIe.
- Gameboy simulator on PCs.
Interfacing with the TOY Machine

To enter a program or data:
- Set 8 memory address switches.
- Set 16 data switches.
- Press LOAD.
  - data written into addressed word of memory

To view the results of a program:
- Set 8 memory address switches.
- Press LOOK: contents of addressed word appear in lights.
Using the TOY Machine: Run

To run the program:
- Set 8 memory address switches to address of first instruction.
- Press LOOK to set PC to first instruction.
- Press RUN button to repeat fetch-execute cycle until halt opcode.

[Figure: Fetch / Execute cycle]
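The fetch-execute cycle described above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the course's simulator: the class name `ToySim` and its methods are hypothetical, only opcodes 0, 1, 2, 8, 9 and C are modeled, and standard input is left out. Words stored to mem[FF] are collected in a list that stands in for stdout.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-simulator (not from the slides); models opcodes 0, 1, 2, 8, 9, C only.
final class ToySim {
    final int[] mem = new int[256];   // 256 words of 16 bits each
    final int[] reg = new int[16];    // R0..RF; R0 is always 0
    int pc = 0x10;                    // TOY usually starts at address 10
    final List<Integer> stdout = new ArrayList<>();

    // Copies words into memory starting at the given address.
    void load(int address, int... words) {
        for (int i = 0; i < words.length; i++) mem[address + i] = words[i] & 0xFFFF;
    }

    // Repeats fetch, increment, execute until a halt opcode; returns words written to mem[FF].
    List<Integer> run() {
        while (true) {
            int inst = mem[pc];                       // fetch
            pc = (pc + 1) & 0xFF;                     // increment PC before executing
            int op = (inst >> 12) & 0xF, d = (inst >> 8) & 0xF;
            int s = (inst >> 4) & 0xF, t = inst & 0xF, addr = inst & 0xFF;
            switch (op) {                             // execute
                case 0x0: return stdout;                               // halt
                case 0x1: set(d, reg[s] + reg[t]); break;              // add
                case 0x2: set(d, reg[s] - reg[t]); break;              // subtract
                case 0x8: set(d, mem[addr]); break;                    // load (stdin at FF not modeled)
                case 0x9:                                              // store; FF goes to stdout
                    if (addr == 0xFF) stdout.add(reg[d]); else mem[addr] = reg[d];
                    break;
                case 0xC: if (reg[d] == 0) pc = addr; break;           // branch zero
                default:
                    throw new IllegalStateException("opcode not modeled: " + Integer.toHexString(op));
            }
        }
    }

    // Writes a register, keeping results to 16 bits and leaving R0 at 0.
    private void set(int d, int value) {
        if (d != 0) reg[d] = value & 0xFFFF;
    }

    public static void main(String[] args) {
        ToySim sim = new ToySim();
        sim.load(0x00, 0x0008, 0x0005);                           // data for add.toy
        sim.load(0x10, 0x8A00, 0x8B01, 0x1CAB, 0x9CFF, 0x0000);   // add.toy program
        for (int word : sim.run()) System.out.printf("%04X%n", word);   // prints 000D
    }
}
```

Incrementing the PC before the instruction executes is a design choice that matters later: it is what makes the value saved by jump and link a correct return address.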
Branch in TOY

To harness the power of TOY, need loops and conditionals.
- Manipulate PC to control program flow.

Branch if zero. (opcode C)
- Changes PC depending on value of some register.
- Used to implement: for, while, if-else.

Branch if positive. (opcode D)
- Analogous.
An Example: Multiplication

Multiply.
- No direct support in TOY hardware.
- Load in integers a and b, and store c = a × b.
- Brute-force algorithm:
  - initialize c = 0
  - add b to c, a times

Java:

    int a = 3;
    int b = 9;
    int c = 0;
    while (a != 0) {
        c = c + b;
        a = a - 1;
    }

Issues ignored: slow, overflow, negative numbers.
Multiply

multiply.toy
0A: 0003    3                          inputs
0B: 0009    9
0C: 0000    0                          output
0D: 0000    0                          constants
0E: 0001    1

10: 8A0A    RA <- mem[0A]              a
11: 8B0B    RB <- mem[0B]              b
12: 8C0D    RC <- mem[0D]              c = 0
13: 810E    R1 <- mem[0E]              always 1

14: CA18    if (RA == 0) pc <- 18      while (a != 0) {       (loop)
15: 1CCB    RC <- RC + RB                  c = c + b
16: 2AA1    RA <- RA - R1                  a = a - 1
17: C014    pc <- 14                   }

18: 9CFF    mem[FF] <- RC
19: 0000    halt
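Step-By-Step Trace

multiply.toy (each row shows the register value written by that instruction)

pc  inst  pseudocode                 R1    RA    RB    RC
10: 8A0A  RA <- mem[0A]                    0003
11: 8B0B  RB <- mem[0B]                          0009
12: 8C0D  RC <- mem[0D]                                0000
13: 810E  R1 <- mem[0E]              0001
14: CA18  if (RA == 0) pc <- 18
15: 1CCB  RC <- RC + RB                                0009
16: 2AA1  RA <- RA - R1                    0002
17: C014  pc <- 14
14: CA18  if (RA == 0) pc <- 18
15: 1CCB  RC <- RC + RB                                0012
16: 2AA1  RA <- RA - R1                    0001
17: C014  pc <- 14
14: CA18  if (RA == 0) pc <- 18
15: 1CCB  RC <- RC + RB                                001B
16: 2AA1  RA <- RA - R1                    0000
17: C014  pc <- 14
14: CA18  if (RA == 0) pc <- 18      (RA is 0000, branch taken)
18: 9CFF  mem[FF] <- RC
19: 0000  halt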
An Efficient Multiplication Algorithm

Inefficient multiply.
- Brute force multiplication algorithm loops a times.
- In worst case, 65,535 additions!

"Grade-school" multiplication.
- Always 16 additions to multiply 16-bit integers.

Decimal
         1234
       * 1512
       ------
         2468
        1234
       6170
      1234
      -------
      1865808

Binary
         1011
       * 1101
       ------
         1011
        0000
       1011
      1011
     --------
     10001111
Binary Multiplication

Grade school binary multiplication algorithm to compute c = a × b.
- Initialize c = 0.
- Loop over i bits of b. (bi = ith bit of b)
  - if bi = 0, do nothing
  - if bi = 1, shift a left i bits and add to c

         1011      a
       * 1101      b
       ------
         1011      a << 0
       1011        a << 2
      1011         a << 3
     --------
     10001111      c

Implement with built-in TOY shift instructions.

    int c = 0;
    for (int i = 15; i >= 0; i--)
        if (((b >> i) & 1) == 1)
            c = c + (a << i);
Shift Left

Shift left. (opcode 5)
- Move bits to the left, padding with zeros as needed.
- 1234₁₆ << 7₁₆ = 1A00₁₆

  0001 0010 0011 0100    1234₁₆
  << 7                   (leftmost 7 bits discarded, pad with 0's on the right)
  0001 1010 0000 0000    1A00₁₆
Shift Right

Shift right. (opcode 6)
- Move bits to the right, padding with sign bit as needed.
- 1234₁₆ >> 7₁₆ = 0024₁₆

  0001 0010 0011 0100    1234₁₆  (sign bit is 0)
  >> 7                   (rightmost 7 bits discarded, pad with 0's on the left)
  0000 0000 0010 0100    0024₁₆
Shift Right (Sign Extension)

Shift right. (opcode 6)
- Move bits to the right, padding with sign bit as needed.
- FFCA₁₆ >> 2₁₆ = FFF2₁₆
- -54₁₀ >> 2₁₀ = -14₁₀

  1111 1111 1100 1010    FFCA₁₆  (sign bit is 1)
  >> 2                   (rightmost 2 bits discarded, pad with 1's on the left)
  1111 1111 1111 0010    FFF2₁₆
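The three shift examples above can be checked in Java, whose `>>` operator also pads with the sign bit. The sketch below is illustrative only: `ToyShift` is a hypothetical name, words are assumed to be held in an `int` between 0000 and FFFF, and shift counts are assumed to be between 0 and 15.

```java
// Hypothetical helper (not from the slides): 16-bit shifts matching the slide examples.
final class ToyShift {
    // Shift left, discarding bits that fall off the top of the 16-bit word.
    static int shiftLeft(int word, int n) {
        return (word << n) & 0xFFFF;
    }

    // Shift right with sign extension: reinterpret the word as 16-bit two's complement first.
    static int shiftRight(int word, int n) {
        short signed = (short) word;      // FFCA becomes -54
        return (signed >> n) & 0xFFFF;    // >> pads with the sign bit; mask back to 16 bits
    }

    public static void main(String[] args) {
        System.out.printf("%04X%n", shiftLeft(0x1234, 7));    // prints 1A00
        System.out.printf("%04X%n", shiftRight(0x1234, 7));   // prints 0024
        System.out.printf("%04X%n", shiftRight(0xFFCA, 2));   // prints FFF2
        System.out.println((short) 0xFFCA >> 2);              // prints -14
    }
}
```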
Bitwise AND

Logical AND. (opcode 3)
- Logic operations are BITWISE.
- 0024₁₆ & 0001₁₆ = 0000₁₆

  x  y  AND
  0  0   0
  0  1   0
  1  0   0
  1  1   1

  0000 0000 0010 0100    0024₁₆
& 0000 0000 0000 0001    0001₁₆
= 0000 0000 0000 0000    0000₁₆
Shifting and Masking

Shift and mask: get the 7th bit of 1234.
- Compute 1234₁₆ >> 7₁₆ = 0024₁₆.
- Compute 0024₁₆ & 1₁₆ = 0₁₆.

  0001 0010 0011 0100    1234₁₆
  >> 7
  0000 0000 0010 0100    0024₁₆
& 0000 0000 0000 0001    0001₁₆
= 0000 0000 0000 0000    0000₁₆
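The shift-and-mask idiom generalizes to any bit position. As a small sketch (the class and method names are hypothetical, and bit 0 is taken to be the rightmost bit, as in the slide's example):

```java
// Hypothetical helper (not from the slides): extract bit i of a 16-bit word by shift and mask.
final class ToyBits {
    static int bit(int word, int i) {
        return (word >> i) & 1;   // move bit i to position 0, then mask off everything else
    }

    // Builds the 16-character binary string by extracting bits 15 down to 0.
    static String toBinary(int word) {
        StringBuilder sb = new StringBuilder();
        for (int i = 15; i >= 0; i--) sb.append(bit(word, i));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(bit(0x1234, 7));     // prints 0
        System.out.println(toBinary(0x1234));   // prints 0001001000110100
    }
}
```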
Binary Multiplication

multiply-fast.toy
0A: 0003    3                           inputs
0B: 0009    9
0C: 0000    0                           output
0D: 0000    0                           constants
0E: 0001    1
0F: 0010    16

10: 8A0A    RA <- mem[0A]               a
11: 8B0B    RB <- mem[0B]               b
12: 8C0D    RC <- mem[0D]               c = 0
13: 810E    R1 <- mem[0E]               always 1
14: 820F    R2 <- mem[0F]               i = 16 (16-bit words)

15: 2221    R2 <- R2 - R1               do { i--               (loop)
16: 53A2    R3 <- RA << R2                   a << i
17: 64B2    R4 <- RB >> R2                   b >> i
18: 3441    R4 <- R4 & R1                    bi = ith bit of b
19: C41B    if (R4 == 0) goto 1B             if bi is 1       (branch)
1A: 1CC3    RC <- RC + R3                        add a << i to sum
1B: D215    if (R2 > 0) goto 15         } while (i > 0);

1C: 9CFF    mem[FF] <- RC
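For comparison, the do-while structure of multiply-fast.toy can be mirrored line for line in Java. This is a sketch under assumptions: the class name `ToyMultiply` is hypothetical, both operands are assumed to be 16-bit words held in an `int` between 0000 and FFFF, and results are truncated to 16 bits the way TOY registers would be.

```java
// Hypothetical sketch (not from the slides): shift-and-add multiply on 16-bit words.
final class ToyMultiply {
    static int multiply(int a, int b) {
        int c = 0;                                // RC
        int i = 16;                               // R2, one pass per bit of a 16-bit word
        do {
            i--;                                  // 15: R2 <- R2 - R1
            int shifted = (a << i) & 0xFFFF;      // 16: R3 <- RA << R2 (kept to 16 bits)
            int bi = (b >> i) & 1;                // 17, 18: ith bit of b
            if (bi != 0) {                        // 19: skip the add when the bit is 0
                c = (c + shifted) & 0xFFFF;       // 1A: RC <- RC + R3
            }
        } while (i > 0);                          // 1B: loop while R2 > 0
        return c;
    }

    public static void main(String[] args) {
        System.out.printf("%04X%n", multiply(0x0003, 0x0009));   // prints 001B
        System.out.printf("%04X%n", multiply(0x000B, 0x000D));   // prints 008F
    }
}
```

The loop always runs 16 times regardless of the operands, in contrast to the brute-force version, whose running time grows with the value of a.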
Useful TOY "Idioms"

Jump absolute.
- Jump to a fixed memory address.
  - branch if zero with destination
  - register 0 is always 0

  17: C014    pc <- 14

Register assignment.
- No instruction that transfers contents of one register into another.
- Pseudo-instruction that simulates assignment:
  - add with register 0 as one of two source registers

  17: 1230    R[2] <- R[3]

No-op.
- Instruction that does nothing.
- Plays the role of whitespace in C programs.
  - numerous other possibilities!

  17: 1000    no-op
Standard Input and Output: Implications

Standard input and output enable you to:
- Process more information than fits in memory.
- Interact with the computer while it is running.

Standard output.
- Writing to memory location FF sends one word to TOY stdout.
- 9AFF writes the integer in register A to stdout.

Standard input.
- Loading from memory address FF loads one word from TOY stdin.
- 8AFF reads in an integer from stdin and stores it in register A.
Fibonacci Numbers

Fibonacci sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...

Reference: http://www.mcs.surrey.ac.uk/Personal/R.Knott/Fibonacci/fibnat.html
Standard Output

fibonacci.toy
00: 0000    0
01: 0001    1

10: 8A00    RA <- mem[00]              a = 0
11: 8B01    RB <- mem[01]              b = 1
                                       do {
12: 9AFF    print RA                       print a
13: 1AAB    RA <- RA + RB                  a = a + b
14: 2BAB    RB <- RA - RB                  b = a - b
15: DA12    if (RA > 0) goto 12        } while (a > 0)
16: 0000    halt

Output:
0000 0001 0001 0002 0003 0005 0008 000D
0015 0022 0037 0059 0090 00E9 0179 0262
03DB 063D 0A18 1055 1A6D 2AC2 452F 6FF1
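The loop stops at 6FF1 because the next Fibonacci number no longer fits in a positive 16-bit two's complement value, so the branch-positive test fails. A Java `short` has the same width and wraparound, so the behavior can be sketched as follows (the class name `ToyFibonacci` is hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not from the slides): fibonacci.toy using 16-bit signed arithmetic.
final class ToyFibonacci {
    static List<String> run() {
        List<String> out = new ArrayList<>();
        short a = 0;                                       // RA
        short b = 1;                                       // RB
        do {
            out.add(String.format("%04X", a & 0xFFFF));    // print RA as a 4-digit hex word
            a = (short) (a + b);                           // RA <- RA + RB, wraps at 16 bits
            b = (short) (a - b);                           // RB <- RA - RB (the old value of a)
        } while (a > 0);                                   // branch positive; fails once a goes negative
        return out;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", run()));
    }
}
```

The run produces 24 words, from 0000 through 6FF1; the next sum, B520, has its sign bit set and ends the loop.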
Standard Input

Ex: read in a sequence of integers and print their sum.
- In Java, stop reading when EOF.
- In TOY, stop reading when user enters 0000.

    while (!StdIn.isEmpty()) {
        a = StdIn.readInt();
        sum = sum + a;
    }
    System.out.println(sum);

00: 0000    0

10: 8C00    RC <- mem[00]
11: 8AFF    read RA
12: CA15    if (RA == 0) pc <- 15
13: 1CCA    RC <- RC + RA
14: C011    pc <- 11
15: 9CFF    write RC
16: 0000    halt

Sample input:  00AE 0046 0003 0000
Sample output: 00F7
Load Address (a.k.a. Load Constant)

Load address. (opcode 7)
- Loads an 8-bit integer into a register.
- 7A30 means load the value 30 into register A.

Applications.
- Load a small constant into a register. (Java code: a = 30;)
- Load an 8-bit memory address into a register.
  - register stores "pointer" to a memory cell

Encoding of 7A30 (Format 2):
  0111   | 1010   | 0011 0000
  7      | A      | 30
  opcode | dest d | addr
Arrays in TOY

TOY main memory is a giant array.
- Can access memory cell 30 using load and store.
- 8C30 means load mem[30] into register C.
- Goal: access memory cell i where i is a variable.

Load indirect. (opcode A)
- AC06 means load mem[R6] into register C. (R6 holds a variable index, like a pointer.)

Store indirect. (opcode B)
- BC06 means store contents of register C into mem[R6]. (R6 holds a variable index.)

Reverse.java

    for (int i = 0; i < N; i++)
        a[i] = StdIn.readInt();
    for (int i = 0; i < N; i++)
        System.out.println(a[N-i-1]);
TOY Implementation of Reverse

TOY implementation of reverse.
- Read in a sequence of integers and store in memory 30, 31, 32, ...
- Stop reading if 0000.
- Print sequence in reverse order.
TOY Implementation of Reverse: read in the data

10: 7101    R1 <- 0001                 constant 1
11: 7A30    RA <- 0030                 a[]
12: 7B00    RB <- 0000                 n

                                       while (true) {
13: 8CFF    read RC                        c = StdIn.readInt();
14: CC19    if (RC == 0) goto 19           if (c == 0) break;
15: 16AB    R6 <- RA + RB                  address of a[n]
16: BC06    mem[R6] <- RC                  a[n] = c;
17: 1BB1    RB <- RB + R1                  n++;
18: C013    goto 13                    }
TOY Implementation of Reverse: print in reverse order

19: CB20    if (RB == 0) goto 20       while (n > 0) {
1A: 16AB    R6 <- RA + RB                  address of a[n]
1B: 2661    R6 <- R6 - R1                  address of a[n-1]
1C: AC06    RC <- mem[R6]                  c = a[n-1];
1D: 9CFF    write RC                       System.out.println(c);
1E: 2BB1    RB <- RB - R1                  n--;
1F: C019    goto 19                    }
20: 0000    halt
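The address arithmetic in the two loops (base address in RA, count in RB, computed address in R6) can be mirrored in Java with an explicit memory array. This is only a sketch: `ToyReverse` is a hypothetical name, input is passed as an array instead of being read from stdin, and, like the TOY program, it does no bounds checking on how many values are stored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not from the slides): reverse via base-plus-index addressing.
final class ToyReverse {
    static List<Integer> reverse(int... input) {
        int[] mem = new int[256];        // TOY main memory
        int base = 0x30;                 // RA: address of a[0]
        int n = 0;                       // RB: number of values stored so far

        for (int c : input) {            // read RC
            if (c == 0) break;           // stop reading at 0000
            int address = base + n;      // R6 <- RA + RB
            mem[address] = c;            // store indirect: mem[R6] <- RC
            n++;
        }

        List<Integer> out = new ArrayList<>();
        while (n > 0) {
            int address = base + n - 1;  // R6 <- RA + RB, then R6 <- R6 - R1
            out.add(mem[address]);       // load indirect: RC <- mem[R6], then write RC
            n--;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(reverse(0x00AE, 0x0046, 0x0003, 0x0000));   // prints [3, 70, 174]
    }
}
```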
Unsafe Code at any Speed

What happens if we make array start at 00 instead of 30?
- Self modifying program.
- Exploit buffer overrun and run arbitrary code!

10: 7101    R1 <- 0001                 constant 1
11: 7A00    RA <- 0000                 a[]
12: 7B00    RB <- 0000                 n

                                       while (true) {
13: 8CFF    read RC                        c = StdIn.readInt();
14: CC19    if (RC == 0) goto 19           if (c == 0) break;
15: 16AB    R6 <- RA + RB                  address of a[n]
16: BC06    mem[R6] <- RC                  a[n] = c;
17: 1BB1    RB <- RB + R1                  n++;
18: C013    goto 13                    }

Crazy 8s Input
1 1 1 1
8888 8810
98FF C011
What Can Happen When We Lose Control?

Buffer overrun.
- Array buffer[] has size 100.
- User might enter 200 characters.
- Might lose control of machine behavior.
- Majority of viruses and worms caused by similar errors.

unsafe C program

    #include <stdio.h>
    int main(void) {
        char buffer[100];
        scanf("%s", buffer);
        printf("%s\n", buffer);
        return 0;
    }

Robert Morris Internet Worm.
- Cornell grad student injected worm into Internet in 1988.
- Exploited buffer overrun in finger daemon fingerd.
Function Call: A Failed Attempt

Goal: x × y × z.
- Need two multiplications: x × y, (x × y) × z.
  - Solution 1: write multiply code 2 times.
  - Solution 2: write a TOY function.

A failed attempt:
- Write multiply loop at 30-36.
- Calling program agrees to store arguments in registers A and B.
- Function agrees to leave result in register C.
- Call function with jump absolute to 30.
- Return from function with jump absolute.

Reason for failure.
- Need to return to a VARIABLE memory address.

10: 8AFF
11: 8BFF
12: C030
13: 1AC0
14: 8BFF
15: C030
16: 9CFF
17: 0000

function?
30: 7C00
31: 7101
32: CA36
33: 1CCB
34: 2AA1
35: C032
36: C013?
Multiplication Function

Calling convention.
- Jump to line 30.
- Store a and b in registers A and B.
- Return address in register F.
- Put result c = a × b in register C.
- Register 1 is scratch.
- Overwrites registers A and B.

function.toy (function)
30: 7C00    R[C] <- 00
31: 7101    R[1] <- 01
32: CA36    if (R[A] == 0) goto 36
33: 1CCB    R[C] += R[B]
34: 2AA1    R[A]--
35: C032    goto 32
36: EF00    pc <- R[F]                 return (opcode E, jump register)
Multiplication Function Call

Client program to compute x × y × z.
- Read x, y, z from standard input.
- Note: PC is incremented before instruction is executed.
  - value stored in register F is correct return address

function.toy (cont)
10: 8AFF    read R[A]                  x
11: 8BFF    read R[B]                  y
12: FF30    R[F] <- pc; goto 30        x * y           (opcode F, jump and link; R[F] = 13)
13: 1AC0    R[A] <- R[C]               (x * y)
14: 8BFF    read R[B]                  z
15: FF30    R[F] <- pc; goto 30        (x * y) * z     (R[F] = 16)
16: 9CFF    write R[C]
17: 0000    halt
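To see why the saved PC is the right return address, function.toy can be run through a small interpreter. The sketch below is an assumption-laden illustration, not the course simulator: `ToyCallDemo` is a hypothetical name, only the opcodes the program uses (0, 1, 2, 7, 8, 9, C, E, F) are modeled, stdin is an array of words, and jump register is taken to mean pc <- R[d], which is what the slide's use of EF00 to return through R[F] implies.

```java
// Hypothetical sketch (not from the slides): runs function.toy to show jump and link / jump register.
final class ToyCallDemo {
    static final int[] CLIENT   = {0x8AFF, 0x8BFF, 0xFF30, 0x1AC0, 0x8BFF, 0xFF30, 0x9CFF, 0x0000};
    static final int[] FUNCTION = {0x7C00, 0x7101, 0xCA36, 0x1CCB, 0x2AA1, 0xC032, 0xEF00};

    // Feeds the given words to stdin and returns the last word written to stdout.
    static int run(int... input) {
        int[] mem = new int[256];
        int[] reg = new int[16];
        System.arraycopy(CLIENT, 0, mem, 0x10, CLIENT.length);
        System.arraycopy(FUNCTION, 0, mem, 0x30, FUNCTION.length);
        int next = 0;       // index of the next stdin word
        int output = -1;    // stays -1 if nothing is written
        int pc = 0x10;
        while (true) {
            int inst = mem[pc];
            pc = (pc + 1) & 0xFF;   // incremented BEFORE execution, so opcode F saves the return address
            int op = (inst >> 12) & 0xF, d = (inst >> 8) & 0xF;
            int s = (inst >> 4) & 0xF, t = inst & 0xF, addr = inst & 0xFF;
            int value = reg[d];     // default: leave R[d] unchanged
            switch (op) {
                case 0x0: return output;                                              // halt
                case 0x1: value = reg[s] + reg[t]; break;                             // add
                case 0x2: value = reg[s] - reg[t]; break;                             // subtract
                case 0x7: value = addr; break;                                        // load address
                case 0x8: value = (addr == 0xFF) ? input[next++] : mem[addr]; break;  // load / read
                case 0x9: if (addr == 0xFF) output = reg[d]; else mem[addr] = reg[d]; break;
                case 0xC: if (reg[d] == 0) pc = addr; break;                          // branch zero
                case 0xE: pc = reg[d] & 0xFF; break;                                  // jump register (return)
                case 0xF: value = pc; pc = addr; break;                               // jump and link (call)
                default:
                    throw new IllegalStateException("opcode not modeled: " + Integer.toHexString(op));
            }
            if (d != 0) reg[d] = value & 0xFFFF;   // R0 always stays 0
        }
    }

    public static void main(String[] args) {
        System.out.printf("%04X%n", run(2, 3, 4));   // prints 0018
    }
}
```

The same function body at 30 returns to 13 after the first call and to 16 after the second, because each FF30 stores a different PC in R[F].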
Function Call: One Solution

Contract between calling program and function:
- Calling program stores function parameters in specific registers.
- Calling program stores return address in a specific register.
  - jump-and-link
- Calling program sets PC to address of function.
- Function stores return value in specific register.
- Function sets PC to return address when finished.
  - jump register

What if you want a function to call another function?
- Use a different register for return address.
- More general: store return addresses on a stack.
Virtual machines
Abstractions for computers
Problems with programming using machine code
- Difficult to remember instructions
- Difficult to remember variables
- Hard to calculate addresses/relocate variables or functions
- Need to handle instruction encoding