BASIC STRUCTURE OF COMPUTERS

What is a computer?
• Simply put, a computer is a sophisticated electronic calculating machine that:
  – Accepts input information,
  – Processes the information according to a list of internally stored instructions, and
  – Produces the resulting output information.
• Functions performed by a computer are:
  – Accepting information to be processed as input.
  – Storing a list of instructions to process the information.
  – Processing the information according to the list of instructions.
  – Providing the results of the processing as output.
• What are the functional units of a computer?

Functional units of a computer
• Input unit accepts information from:
  – Human operators,
  – Electromechanical devices,
  – Other computers.
• Memory stores information:
  – Instructions,
  – Data.
• Arithmetic and logic unit (ALU):
  – Performs the desired operations on the input information as determined by instructions in the memory.
• Output unit sends results of processing:
  – To a monitor display,
  – To a printer.
• Control unit coordinates various actions:
  – Input,
  – Output,
  – Processing.
[Figure: block diagram of the Input and Output units (I/O), the Memory (holding Instr 1, Instr 2, Instr 3, Data 1, Data 2), and the Processor (Arithmetic & Logic plus Control).]

Information in a computer -- Instructions
• Instructions specify commands to:
  – Transfer information within a computer (e.g., from memory to ALU).
  – Transfer information between the computer and I/O devices (e.g., from keyboard to computer, or computer to printer).
  – Perform arithmetic and logic operations (e.g., add two numbers, perform a logical AND).
• A sequence of instructions to perform a task is called a program, which is stored in the memory.
• Processor fetches instructions that make up a program from the memory and performs the operations stated in those instructions.
• What do the instructions operate upon?

Information in a computer -- Data
• Data are the “operands” upon which instructions operate.
• Data could be:
  – Numbers,
  – Encoded characters.
• Data, in a broad sense, means any digital information.
• Computers use data that is encoded as a string of binary digits called bits.

Input unit
Binary information must be presented to a computer in a specific format. This task is performed by the input unit, which:
- Interfaces with input devices.
- Accepts binary information from the input devices.
- Presents this binary information in a format expected by the computer.
- Transfers this information to the memory or processor.
[Figure: real-world devices (keyboard, audio input, ...) feed the Input Unit, which passes information to the Memory and Processor inside the computer.]

Memory unit
• Memory unit stores instructions and data.
  – Recall, data is represented as a series of bits.
  – To store data, memory unit thus stores bits.
• Processor reads instructions and reads/writes data from/to the memory during the execution of a program.
  – In theory, instructions and data could be fetched one bit at a time.
  – In practice, a group of bits is fetched at a time.
  – The group of bits stored or retrieved at a time is termed a “word”.
  – The number of bits in a word is termed the “word length” of a computer.
• In order to read/write to and from memory, a processor should know where to look:
  – An “address” is associated with each word location.

Memory unit (contd.)
• Processor reads/writes to/from memory based on the memory address:
  – Access any word location in a short and fixed amount of time based on the address.
  – Random Access Memory (RAM) provides fixed access time independent of the location of the word.
  – Access time is known as “Memory Access Time”.
• Memory and processor have to “communicate” with each other in order to read/write information.
  – In order to reduce “communication time”, a small amount of RAM (known as Cache) is tightly coupled with the processor.
• Modern computers have three to four levels of RAM units with different speeds and sizes:
  – Fastest, smallest known as Cache.
  – Slowest, largest known as Main memory.

Memory unit (contd.)
• Primary storage of the computer consists of RAM units.
  – Fastest, smallest unit is Cache.
  – Slowest, largest unit is Main Memory.
• Primary storage is insufficient to store large amounts of data and programs.
  – Primary storage can be added, but it is expensive.
• Store large amounts of data on secondary storage devices:
  – Magnetic disks and tapes,
  – Optical disks (CD-ROMs).
  – Access to the data stored in secondary storage is slower, but this takes advantage of the fact that some information may be accessed infrequently.
• Cost of a memory unit depends on its access time: a shorter access time implies a higher cost.

Arithmetic and logic unit (ALU)
• Operations are executed in the Arithmetic and Logic Unit (ALU).
  – Arithmetic operations such as addition, subtraction.
  – Logic operations such as comparison of numbers.
• In order to execute an instruction, operands need to be brought into the ALU from the memory.
  – Operands are stored in general purpose registers available in the ALU.
  – Access times of general purpose registers are shorter than that of the cache.
• Results of the operations are stored back in the memory or retained in the processor for immediate use.

Output unit
• Computers represent information in a specific binary form. Output units:
- Interface with output devices.
- Accept processed results provided by the computer in specific binary form.
- Convert the information in binary form to a form understood by an output device.
[Figure: the Memory and Processor inside the computer feed the Output Unit, which drives real-world devices (printer, graphics display, speakers, ...).]

Control unit
• Operation of a computer can be summarized as:
  – Accepts information from the input units (Input unit).
  – Stores the information (Memory).
  – Processes the information (ALU).
  – Provides processed results through the output units (Output unit).
• Operations of Input unit, Memory, ALU and Output unit are coordinated by the Control unit.
• Instructions control “what” operations take place (e.g., data transfer, processing).
• Control unit generates timing signals which determine “when” a particular operation takes place.

Execution of an instruction
• Recall the steps involved in the execution of an instruction by a processor:
  – Fetch an instruction from the memory.
  – Fetch the operands.
  – Execute the instruction.
  – Store the results.
• Several issues:
  – Where is the address of the memory location from which the present instruction is to be fetched?
  – Where is the present instruction stored while it is executed?
  – Where and what is the address of the memory location from which the data is fetched?
  – ...
• Basic processor architecture has several registers to assist in the execution of the instructions.

Basic processor architecture (Fig A)
[Figure A: the Processor connected to the Memory. The processor contains MAR, MDR, Control, PC, IR, the ALU, and n general purpose registers R0, R1, ..., R(n-1).]
• MAR: address of the memory location to be accessed.
• MDR: data to be read into or read out of the current location.
• PC: address of the next instruction to be fetched and executed.
• IR: instruction that is currently being executed.
• R0 to R(n-1): n general purpose registers.

How are the functional units connected?
• For a computer to achieve its operation, the functional units need to communicate with each other.
• In order to communicate, they need to be connected.
• Functional units may be connected by a group of parallel wires.
• The group of parallel wires is called a bus.
• Each wire in a bus can transfer one bit of information.
• The number of parallel wires in a bus is equal to the word length of a computer.
[Figure: the Input, Output, Memory and Processor units all attached to a single Bus.]

Basic processor architecture (contd.)
Control path is responsible for:
• Instruction fetch and execution sequencing
• Operand fetch
• Saving results
Data path:
• Contains general purpose registers
• Contains ALU
• Executes instructions
[Figure: the Processor split into a Control Path and a Data Path, connected to the Memory through MAR and MDR.]

Registers in the control path
• Instruction Register (IR):
  – Instruction that is currently being executed.
• Program Counter (PC):
  – Address of the next instruction to be fetched and executed.
• Memory Address Register (MAR):
  – Address of the memory location to be accessed.
• Memory Data Register (MDR):
  – Data to be read into or read out of the current memory location, whose address is in the Memory Address Register (MAR).

Fetch/Execute cycle
• Execution of an instruction takes place in two phases:
  – Instruction fetch.
  – Instruction execute.
• Instruction fetch:
  – Fetch the instruction from the memory location whose address is in the Program Counter (PC).
  – Place the instruction in the Instruction Register (IR).
• Instruction execute:
  – Instruction in the IR is examined (decoded) to determine which operation is to be performed.
  – Fetch the operands from the memory or registers.
  – Execute the operation.
  – Store the results in the destination location.
• Basic fetch/execute cycle repeats indefinitely.

Memory organization
• Recall:
  – Information is stored in the memory as a collection of bits.
  – A collection of bits that is stored or retrieved simultaneously is called a word.
  – The number of bits in a word is called the word length.
  – Word length can be 16 to 64 bits.
• Another collection which is more basic than a word:
  – A collection of 8 bits is known as a “byte”.
• Bytes are grouped into words, so word length can also be expressed as a number of bytes instead of a number of bits:
  – A word length of 16 bits is equivalent to a word length of 2 bytes.
  – Words may be 2 bytes (older architectures), 4 bytes (current architectures), or 8+ bytes (modern architectures).

Memory organization (contd.)
• Accessing the memory to obtain information requires specifying the “address” of the memory location.
• Recall that a memory has a sequence of bits:
  – Assigning addresses to each bit is impractical and unnecessary.
  – Typically, addresses are assigned to a single byte.
  – This is called “byte addressable memory”.
• Suppose k bits are used to hold the address of a memory location:
  – Size of the memory in bytes is given by 2^k, where k is the number of bits used to hold a memory address.
  – E.g., for a 16-bit address, size of the memory is 2^16 = 65536 bytes.
  – What is the size of the memory for a 24-bit address?

Memory organization (contd.)
• Memory is viewed as a sequence of bytes.
• Address of the first byte is 0.
• Address of the last byte is 2^k - 1, where k is the number of bits used to hold a memory address.
• E.g., when k = 16:
  – Address of the first byte is 0.
  – Address of the last byte is 65535.
• E.g., when k = 2:
  – Address of the first byte is ?
  – Address of the last byte is ?
[Figure: memory drawn as a column of bytes from Byte 0 to Byte 2^k - 1.]

Memory organization (contd.)
Consider a memory organization with:
• 16-bit memory addresses: size of the memory is ?
• Word length of 4 bytes.
• Number of words = Memory size (bytes) / Word length (bytes) = ?
• Word #0 starts at Byte #0.
• Word #1 starts at Byte #4.
• Last word (Word #?) starts at Byte #?
[Figure: memory drawn as bytes grouped into 4-byte words, starting with Word #0 and Word #1 and ending with Bytes 65532 to 65535.]

Memory organization (contd.)
• MAR register contains the address of the memory location addressed.
• MDR contains either the data to be written to that address or the data read from that address.
[Figure: Word #0 occupies Bytes 0 to 3, Word #1 starts at Byte 4, and Word #16383 occupies Bytes 65532 to 65535; MAR supplies the address (Addr) and MDR carries the data.]
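The size and word-count questions on the preceding memory organization slides can be checked with a few lines of Python. This is a minimal sketch assuming a byte-addressable memory; the function name `memory_layout` and its parameters are illustrative and do not come from the slides.

```python
def memory_layout(address_bits: int, word_bytes: int):
    """Return (size in bytes, number of words, byte address of the last word)."""
    size_bytes = 2 ** address_bits                  # one address per byte
    num_words = size_bytes // word_bytes            # each word is word_bytes consecutive bytes
    last_word_start = (num_words - 1) * word_bytes  # Word #0 starts at Byte 0
    return size_bytes, num_words, last_word_start


print(memory_layout(16, 4))   # 16-bit addresses, 4-byte words
print(memory_layout(24, 4))   # 24-bit addresses, 4-byte words
```

With 16-bit addresses and 4-byte words the words are numbered 0 to 16383, which matches the last word shown in the figure.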

Byte Addressability
• Byte addressable memory.
• Memory is organized in 16-bit words.
• Two consecutive 16-bit words constitute one 32-bit long word.
• Word address must be an even number, that is, words must be aligned on an even boundary.
• The byte in the high-order position has the same address as the word; the byte in the low-order position has the next higher address. This is the big endian address assignment.

Big Endian (e.g., in IBM, Motorola, Sun, HP)
• Lower byte addresses are used for the more significant bytes of the word.
• The high-order byte is stored at the lowest address.
• Byte order from the lowest address upward: byte 3, byte 2, byte 1, byte 0.
• Programmers/protocols should be careful when transferring binary data between Big Endian and Little Endian machines.
E.g., the 32-bit data 46 78 96 54 (high byte 46, low byte 54):
  Address:  8000  8001  8002  8003
  Content:   46    78    96    54

Little Endian (e.g., in DEC, Intel)
• The low-order byte is stored at the lowest address.
• Byte order from the lowest address upward: byte 0, byte 1, byte 2, byte 3.
E.g., the 32-bit data 46 78 96 54 (high byte 46, low byte 54):
  Address:  8000  8001  8002  8003
  Content:   54    96    78    46
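The two byte orderings can be reproduced with Python's built-in `int.to_bytes`. The sketch below treats the example data 46 78 96 54 as the hexadecimal value 0x46789654 and uses 8000 as the base address, both taken from the slides; the helper name `layout` is illustrative.

```python
value = 0x46789654  # high byte 0x46, low byte 0x54

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first


def layout(data: bytes, base: int = 8000) -> dict:
    """Map each byte address (starting at base) to the byte stored there, in hex."""
    return {base + i: f"{b:02X}" for i, b in enumerate(data)}


print("big endian:   ", layout(big))
print("little endian:", layout(little))

# Reading the bytes back with the matching byte order recovers the value;
# reading them with the wrong order does not.
assert int.from_bytes(little, byteorder="little") == value
assert int.from_bytes(little, byteorder="big") != value
```

The last two lines show why data exchanged between machines of different endianness has to be converted explicitly.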

Word alignment
• In case of 16-bit data, aligned words begin at byte addresses 0, 2, 4, ...
• In case of 32-bit data, aligned words begin at byte addresses 0, 4, 8, ...
• In case of 64-bit data, aligned words begin at byte addresses 0, 8, 16, ...
• In some cases words can start at an arbitrary byte address; we then say that the word locations are unaligned.
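The alignment rule amounts to requiring that the byte address be a multiple of the word size in bytes. A minimal sketch, with the illustrative helper name `is_aligned`:

```python
def is_aligned(address: int, word_bytes: int) -> bool:
    """A word is aligned when its byte address is a multiple of its size in bytes."""
    return address % word_bytes == 0


# First few aligned addresses for 16-, 32- and 64-bit words.
for word_bytes in (2, 4, 8):
    aligned = [a for a in range(0, 24) if is_aligned(a, word_bytes)]
    print(f"{word_bytes * 8}-bit words: {aligned}")
```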

Memory operations
• Memory read or load:
  – Place address of the memory location to be read from into MAR.
  – Issue a Memory_read command to the memory.
  – Data read from the memory is placed into MDR automatically (by control logic).
• Memory write or store:
  – Place address of the memory location to be written to into MAR.
  – Place data to be written into MDR.
  – Issue a Memory_write command to the memory.
  – Data in MDR is written to the memory automatically (by control logic).
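The read and write sequences can be mimicked with a small model in which MAR and MDR are plain attributes. This is only a sketch: the class `SimpleMemory` and the helpers `load` and `store` are hypothetical names, and the memory is word-indexed for brevity rather than byte-addressable.

```python
class SimpleMemory:
    """Toy word-indexed memory with a MAR and an MDR (illustrative model only)."""

    def __init__(self, size_words: int):
        self.cells = [0] * size_words
        self.mar = 0   # address of the location to be accessed
        self.mdr = 0   # data read from, or to be written to, that location

    def memory_read(self):
        self.mdr = self.cells[self.mar]   # control logic places the data in MDR

    def memory_write(self):
        self.cells[self.mar] = self.mdr   # control logic writes MDR to memory


def load(mem: SimpleMemory, address: int) -> int:
    mem.mar = address       # 1. place the address in MAR
    mem.memory_read()       # 2. issue Memory_read
    return mem.mdr          # 3. data is now available in MDR


def store(mem: SimpleMemory, address: int, value: int) -> None:
    mem.mar = address       # 1. place the address in MAR
    mem.mdr = value         # 2. place the data in MDR
    mem.memory_write()      # 3. issue Memory_write


mem = SimpleMemory(8)
store(mem, 3, 42)
print(load(mem, 3))
```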

Instruction types
• Computer instructions must be capable of performing 4 types of operations.
• Data transfer/movement between memory and processor registers.
  – E.g., memory read, memory write.
• Arithmetic and logic operations:
  – E.g., addition, subtraction, comparison between two numbers.
• Program sequencing and flow of control:
  – Branch instructions.
• Input/output transfers to transfer data to and from the real world.

Instruction types (contd.)
• Examples of different types of instructions in assembly language notation.
• Data transfers between processor and memory:
  – Move A, B (B = A).
  – Move A, R1 (R1 = A).
• Arithmetic and logic operation:
  – Add A, B, C (C = A + B).
• Sequencing:
  – Jump Label (Jump to the subroutine which starts at Label).
• Input/output data transfer:
  – Input PORT, R5 (Read from I/O port “PORT” to register R5).

Specifying operands in instructions
• Operands are the entities operated upon by the instructions.
• Recall that operands may have to be fetched from a memory location to execute an operation.
  – Memory locations have addresses using which they can be accessed.
• Operands may also be stored in the general purpose registers.
  – E.g., an intermediate value of some computation which will be required immediately for subsequent computations.
  – Registers also have addresses.
• Specifying the operands on which the instruction is to operate involves specifying the addresses of the operands.
  – An address can be that of a memory location or a register.

Source and destination operands
• Operation may be specified as:
  – Operation source1, source2, destination
• An operand is called a source operand if:
  – It appears on the right-hand side of an expression.
  – E.g., Add A, B, C (C = A + B): A and B are source operands.
• An operand is called a destination operand if:
  – It appears on the left-hand side of an expression.
  – E.g., Add A, B, C (C = A + B): C is a destination operand.

Source and destination operands (contd.)
• In case of some instructions, the same operand serves as both the source and the destination.
  – The same operand appears on the right and left side of an expression.
  – E.g., Add A, B (B = A + B): B is both the source and the destination operand.
• Another classification of instructions is based on the number of operand addresses in the instruction.

Instruction types
• Instructions can also be classified based on the number of operand addresses they include.
  – 3, 2, 1, 0 operand addresses.
• 3-address instructions are almost always instructions that implement binary operations.
  – E.g., Add A, B, C (C = A + B).
  – If k bits are used to specify the address of a memory location, then 3-address instructions need 3*k bits to specify the operand addresses.
  – 3-address instructions whose operand addresses are memory locations are too big to fit in one word.

Instruction types (contd.)
• In 2-address instructions, one operand serves as both a source and the destination:
  – E.g., Add A, B (B = A + B).
  – 2-address instructions need 2*k bits to specify the operand addresses.
  – This may also be too big to fit into a word.
• 2-address instructions where at least one operand is a processor register:
  – E.g., Add A, R1 (R1 = A + R1).
• 1-address instructions require only one operand.
  – E.g., Clear A (A = 0).
• 0-address instructions do not operate on operands.
  – E.g., Halt (Halt the computer).
• How are addresses of operands specified in the instructions?

Problems (Q 1.1 in textbook)
1. List the steps needed to execute the machine instruction Add LOCA, R0 in terms of transfers between the components shown in Fig A.
• Transfer the contents of register PC to register MAR.
• Issue a Read command to memory, and then wait until it has transferred the requested word into register MDR.
• Transfer the instruction from MDR into IR and decode it.
• Transfer the address LOCA from IR to MAR.
• Issue a Read command and wait until MDR is loaded.
• Transfer contents of MDR to the ALU.
• Transfer contents of R0 to the ALU.
• Perform addition of the two operands in the ALU and transfer the result into R0.
• Transfer contents of PC to the ALU.
• Add 1 to the operand in the ALU and transfer the incremented address to PC.

Problems (Q 1.2 in textbook)
Repeat the above problem for the machine instruction Add R1, R2, R3:
• Transfer the contents of register PC to register MAR.
• Issue a Read command to memory, and then wait until it has transferred the requested word into register MDR.
• Transfer the instruction from MDR into IR and decode it.
• Transfer contents of R1 and R2 to the ALU.
• Perform addition of the two operands in the ALU and transfer the result into R3.
• Transfer contents of PC to the ALU.
• Add 1 to the operand in the ALU and transfer the incremented address to PC.

List the steps needed to execute the machine instruction Add LOCA, LOCB:
• Transfer the contents of register PC to register MAR.
• Issue a Read command to memory, and then wait until it has transferred the requested word into register MDR.
• Transfer the instruction from MDR into IR and decode it.
• Transfer the address LOCA from IR to MAR.
• Issue a Read command and wait until MDR is loaded.
• Transfer contents of MDR to the ALU.
• Transfer the address LOCB from IR to MAR.
• Issue a Read command and wait until MDR is loaded.
• Transfer contents of MDR to the ALU.
• Perform addition of the two operands in the ALU and transfer the result into LOCB.
• Transfer contents of PC to the ALU.
• Add 1 to the operand in the ALU and transfer the incremented address to PC.

Problems (Q 1.3 in textbook)
Give a short sequence of machine instructions for the task: “Add the contents of memory location A to those of location B and place the answer in location C”.
a) Only Load LOC, Ri and Store Ri, LOC are available for transferring data between memory and registers.
b) Suppose that Move and Add instructions are available with the format Move/Add Location1, Location2. Is it possible to use fewer instructions to accomplish the task in Part a? If yes, give the sequence.
Solution:
(a) Load A, R0
    Load B, R1
    Add R0, R1
    Store R1, C
(b) Yes:
    Move B, C
    Add A, C

Performance
• The speed with which a computer executes programs is affected by the design of its hardware and its machine language instructions.
• Because programs are usually written in a high-level language, performance is also affected by the compiler that translates programs into machine language.
• For best performance, it is necessary to design the compiler, the machine instruction set and the hardware in a coordinated way.
• Just as the elapsed time for the execution of a program depends on all units in a computer system, the processor time depends on the hardware involved in the execution of individual machine instructions.
• This hardware comprises the processor and the memory, which are usually connected by a bus as shown in the figure.

Organization of cache and main memory
[Figure: the Main memory connected over the Bus to the Cache memory and Processor.]
Why is the access time of the cache memory shorter than the access time of the main memory?

Processor Clock
• Processor circuits are controlled by a timing signal called a clock.
• The clock defines regular time intervals, called clock cycles.
• To execute a machine instruction, the processor divides the action to be performed into a sequence of basic steps, such that each step can be completed in one clock cycle.
• The length P of one clock cycle is an important parameter that affects processor performance.
• Its inverse is the clock rate, R = 1/P, which is measured in cycles per second.

Basic Performance Equation
• Let T be the processor time required to execute a program that has been prepared in some high-level language.
• The compiler generates a machine language object program that corresponds to the source program.
• Assume that complete execution of the program requires the execution of N machine language instructions.
• The number N is the actual number of instruction executions and is not necessarily equal to the number of machine instructions in the object program.
• Suppose that the average number of basic steps needed to execute one machine instruction is S, where each basic step is completed in one clock cycle.
• If the clock rate is R cycles per second, the program execution time is given by:
    T = (N × S) / R
• This is referred to as the basic performance equation.
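The basic performance equation states that the execution time is the total number of clock cycles (N instructions times S steps each) divided by the clock rate R. A minimal sketch follows; the numbers in the example call are made up for illustration and do not describe any real machine.

```python
def execution_time(n_instructions: float, steps_per_instruction: float,
                   clock_rate_hz: float) -> float:
    """Basic performance equation: T = (N * S) / R, in seconds."""
    return n_instructions * steps_per_instruction / clock_rate_hz


# Hypothetical program: 1 million executed instructions, 4 basic steps each,
# on a 500 MHz clock.
t = execution_time(1_000_000, 4, 500e6)
print(f"T = {t * 1e3:.1f} ms")
```

Reducing N or S, or raising R, lowers T; the three are not independent, which is why the compiler, instruction set and hardware have to be designed together.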

Performance Measurement
• Computer performance can be measured using benchmark programs.
• To make comparisons possible, standardized programs must be used.
• The performance measure is the time it takes a computer to execute a given benchmark.
• A nonprofit organization called the Standard Performance Evaluation Corporation (SPEC) selects and publishes representative application programs for different application domains, together with test results for many commercially available computers.
• The selected program is compiled for the computer under test and the running time on a real computer is measured.
• The same program is also compiled and run on one computer selected as a reference. The SPEC rating is computed as follows:
    SPEC rating = (Running time on the reference computer) / (Running time on the computer under test)

The test is repeated for all the programs in the SPEC suite, and the geometric mean of the results is computed. Let SPEC_i be the rating for program i in the suite. The overall SPEC rating for the computer is given by:
    SPEC rating = (SPEC_1 × SPEC_2 × ... × SPEC_n)^(1/n)
where n is the number of programs in the suite.
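The overall rating is the geometric mean of the per-program ratings, where each rating compares the running time on the reference computer with the running time on the computer under test. The sketch below uses the hypothetical function name `spec_rating` and made-up running times; it is not SPEC's own tooling.

```python
import math


def spec_rating(reference_times, test_times):
    """Geometric mean of per-program ratings (reference time / time under test)."""
    ratings = [ref / test for ref, test in zip(reference_times, test_times)]
    return math.prod(ratings) ** (1 / len(ratings))


# Made-up running times in seconds for a two-program suite.
reference = [100.0, 400.0]
under_test = [50.0, 100.0]      # per-program ratings: 2 and 4
print(round(spec_rating(reference, under_test), 3))
```

A geometric mean is used rather than an arithmetic mean so that no single program with a very large rating dominates the overall figure.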

Instruction Set: RISC and CISC
• RISC (Reduced Instruction Set Computer) Architectures
  – Memory accesses are restricted to load and store instructions, and data manipulation instructions are register to register.
  – Addressing modes are limited in number.
  – Instruction formats are all of the same length.
  – Instructions perform elementary operations.
• CISC (Complex Instruction Set Computer) Architectures
  – Memory access is directly available to most types of instruction.
  – Addressing modes are substantial in number.
  – Instruction formats are of different lengths.
  – Instructions perform both elementary and complex operations.

Instruction Set Architecture
• RISC (Reduced Instruction Set Computer) Architectures
  – Large register file
  – Control unit: simple and hardwired
  – Pipelining
• CISC (Complex Instruction Set Computer) Architectures
  – Register file: smaller than in a RISC
  – Control unit: often micro-programmed
  – Current trend: a CISC operation is carried out as a sequence of RISC-like operations

CISC Examples
• Examples of CISC processors are the:
  – System/360 (excluding the 'scientific' Model 44),
  – VAX,
  – PDP-11,
  – Motorola 68000 family,
  – Intel x86 architecture based processors.

RISC Examples
• Apple iPods (custom ARM7TDMI SoC)
• Apple iPhone (Samsung ARM1176JZF)
• Palm and PocketPC PDAs and smartphones (Intel XScale family, Samsung SC32442 - ARM9)
• Nintendo Game Boy Advance (ARM7)
• Nintendo DS (ARM7, ARM9)
• Sony Network Walkman (Sony in-house ARM based chip)
• Some Nokia and Sony Ericsson mobile phones

Characteristics of RISC vs CISC processors
1. RISC: Simple instructions taking one cycle. CISC: Complex instructions taking multiple cycles.
2. RISC: Instructions are executed by a hardwired control unit. CISC: Instructions are executed by a microprogrammed control unit.
3. RISC: Few instructions. CISC: Many instructions.
4. RISC: Fixed format instructions. CISC: Variable format instructions.
5. RISC: Few addressing modes, and most instructions use register to register addressing. CISC: Many addressing modes.
6. RISC: Multiple register sets. CISC: Single register set.
7. RISC: Highly pipelined. CISC: Not pipelined or less pipelined.

Problem Q 1.5
a) Program execution time T (performance equation) is to be examined for a certain high-level language program. The program can be run on a RISC or a CISC computer. Both computers use pipelined instruction execution, but pipelining in the RISC machine is more effective than in the CISC machine. Specifically, the effective value of S in the T expression for the RISC machine is 1.2, but it is only 1.5 for the CISC machine. Both machines have the same clock rate, R. What is the largest allowable value for N, the number of instructions executed on the CISC machine, expressed as a percentage of the N value for the RISC machine, if the time for execution on the CISC machine is to be no longer than that on the RISC machine?
b) Repeat part (a) if the clock rate R for the RISC machine is 15 percent higher than that for the CISC machine.

Solution:
(a) Let T_R = (N_R × S_R) / R_R and T_C = (N_C × S_C) / R_C be the execution times on the RISC and CISC processors, respectively. Equating execution times and clock rates, we have
    1.2 N_R = 1.5 N_C
Then N_C / N_R = 1.2 / 1.5 = 0.8
Therefore, the largest allowable value for N_C is 80% of N_R.
(b) With the RISC clock rate 1.15 times the CISC clock rate, equating execution times gives
    1.2 N_R / 1.15 = 1.5 N_C / 1.00
Then N_C / N_R = 1.2 / (1.15 × 1.5) = 0.696
Therefore, the largest allowable value for N_C is 69.6% of N_R.
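Both parts of the solution follow from requiring T_C ≤ T_R in the basic performance equation, which rearranges to N_C / N_R ≤ (S_R / S_C) × (R_C / R_R). A short sketch of that calculation, using the hypothetical function name `max_cisc_ratio`:

```python
def max_cisc_ratio(s_risc: float, s_cisc: float, risc_clock_factor: float = 1.0) -> float:
    """Largest N_C / N_R for which the CISC machine is no slower than the RISC machine.

    risc_clock_factor is R_RISC / R_CISC (1.0 means equal clock rates).
    """
    return s_risc / (s_cisc * risc_clock_factor)


print(f"(a) {max_cisc_ratio(1.2, 1.5):.1%}")        # equal clock rates
print(f"(b) {max_cisc_ratio(1.2, 1.5, 1.15):.1%}")  # RISC clock 15% higher
```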

Problem
a) A processor cache is as shown in the figure. Suppose that the execution time for a program is directly proportional to instruction access time and that access to an instruction in the cache is 20 times faster than access to an instruction in the main memory. Assume that a requested instruction is found in the cache with probability 0.96, and also assume that if an instruction is not found in the cache, it must first be fetched from main memory to the cache and then fetched from the cache to be executed. Compute the ratio of program execution time without the cache to program execution time with the cache. This ratio is usually defined as the speedup factor resulting from the presence of the cache.
b) If the size of the cache is doubled, assume that the probability of not finding a requested instruction there is cut in half. Repeat Part a for the doubled cache size.

Solution:
a) Let cache access time be 1 unit of time and main memory access time be 20 units of time.
   Cache hit = 0.96 and cache miss = 0.04
   Program execution time without cache = 1 × 20 = 20 time units
   With the cache, every instruction is fetched from the cache (1 unit); on a miss it must first be brought from main memory into the cache (20 additional units).
   Program execution time with cache = (1 × 1) + (0.04 × 20) = 1.8 time units
   Hence speedup = 20 / 1.8 = 11.1
b) Cache hit = 0.98 and cache miss = 0.02
   Program execution time without cache = 1 × 20 = 20 time units
   Program execution time with cache = (1 × 1) + (0.02 × 20) = 1.4 time units
   Hence speedup = 20 / 1.4 = 14.3

Instruction execution and sequencing
• Recall the fetch/execute cycle of instruction execution.
• In order to complete a meaningful task, a number of instructions need to be executed.
• During the fetch/execute cycle of one instruction, the Program Counter (PC) is updated with the address of the next instruction:
  – PC contains the address of the memory location from which the next instruction is to be fetched.
• When the instruction completes its fetch/execute cycle, the contents of the PC point to the next instruction.
• Thus, a sequence of instructions can be executed to complete a task.

Instruction execution and sequencing (contd.)
• Simple processor model:
  – Processor has a number of general purpose registers.
  – Word length is 32 bits (4 bytes).
  – Memory is byte addressable.
  – Each instruction is one word long.
  – Instructions allow one memory operand per instruction.
  – One register operand is allowed in addition to one memory operand.
• Simple task:
  – Add two numbers stored in memory locations A and B.
  – Store the result of the addition in memory location C.
    Move A, R0   (Move the contents of location A to register R0)
    Add B, R0    (Add the contents of location B to register R0)
    Move R0, C   (Move the contents of register R0 to location C)

Instruction execution and sequencing (contd.)
Program in memory (address: instruction), followed by the data locations A, B and C:
    0: Move A, R0
    4: Add B, R0
    8: Move R0, C
Execution steps:
Step I:
- PC holds address 0.
- Fetches instruction at address 0.
- Fetches operand A.
- Executes the instruction.
- Increments PC to 4.
Step II:
- PC holds address 4.
- Fetches instruction at address 4.
- Fetches operand B.
- Executes the instruction.
- Increments PC to 8.
Step III:
- PC holds address 8.
- Fetches instruction at address 8.
- Executes the instruction.
- Stores the result in location C.
Instructions are executed one at a time in order of increasing addresses. This is called “straight line sequencing”.
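Straight line sequencing can be traced with a toy fetch/execute loop. The sketch below is an illustrative model, not a real instruction set: instructions are stored as tuples at addresses 0, 4 and 8, memory operands are looked up by name, the values 5 and 7 for A and B are made up, and Add is assumed to have a register destination.

```python
def run(program, data, word_bytes=4):
    """Fetch and execute instructions in order of increasing address."""
    regs = {"R0": 0}
    pc = 0
    trace = []                       # addresses of the instructions fetched
    while pc in program:
        ir = program[pc]             # fetch: the word addressed by PC goes into IR
        trace.append(pc)
        pc += word_bytes             # PC now holds the address of the next instruction
        op, src, dst = ir            # decode
        value = regs[src] if src in regs else data[src]   # fetch the source operand
        if op == "Move":
            if dst in regs:
                regs[dst] = value
            else:
                data[dst] = value    # store the result in memory
        elif op == "Add":
            regs[dst] += value       # destination is a register in this sketch
        else:
            raise ValueError(f"unknown operation: {op}")
    return regs, data, trace


program = {0: ("Move", "A", "R0"), 4: ("Add", "B", "R0"), 8: ("Move", "R0", "C")}
regs, data, trace = run(program, {"A": 5, "B": 7, "C": 0})
print(trace, data["C"])
```

The trace lists the addresses in the order they were fetched, which for a program without branches is simply increasing order.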

Instruction execution and sequencing (contd.)
• Consider the following task:
  – Add 10 numbers.
  – Number of numbers to be added (in this case 10) is stored in location N.
  – Numbers are located in the memory at NUM1, ..., NUM10.
  – Store the result in SUM.
    Move NUM1, R0   (Move the contents of location NUM1 to register R0)
    Add NUM2, R0    (Add the contents of location NUM2 to register R0)
    Add NUM3, R0    (Add the contents of location NUM3 to register R0)
    Add NUM4, R0    (Add the contents of location NUM4 to register R0)
    Add NUM5, R0    (Add the contents of location NUM5 to register R0)
    Add NUM6, R0    (Add the contents of location NUM6 to register R0)
    Add NUM7, R0
    Add NUM8, R0
    Add NUM9, R0
    Add NUM10, R0
    Move R0, SUM    (Move the contents of register R0 to location SUM)

Instruction sequencing and execution (contd.)
• Using a separate Add instruction for each number in a list leads to a long list of Add instructions.
• The task can be accomplished in a compact way by using the Branch instruction.
          Move N, R1      (Move the contents of location N, which is the number of numbers to be added, to register R1)
          Clear R0        (This register holds the sum as the numbers are added)
    LOOP  Determine the address of the next number.
          Add the next number to R0.
          Decrement R1    (Counter which indicates how many numbers remain to be added)
          Branch>0 LOOP   (If all the numbers haven’t been added, go to LOOP)
          Move R0, SUM

Instruction execution and sequencing (contd.)
• Decrement R1:
  – R1 initially holds the number of numbers that are to be added (Move N, R1).
  – Decrements the count each time a new number is added (Decrement R1).
  – Keeps a count of the numbers that remain to be added.
• Branch>0 LOOP:
  – Checks if the count in register R1 is 0 (Branch>0).
  – If it is 0, then store the sum held in register R0 at memory location SUM (Move R0, SUM).
  – If not, then get the next number, and repeat (go to LOOP). The “go to” is specified implicitly.
• Note that the instruction (Branch>0 LOOP) has no explicit reference to register R1.

Instruction execution and sequencing (contd.)
• Processor keeps track of information about the results of the previous operation.
• Information is recorded in individual bits called “condition code flags”. Common flags are:
  – N (negative, set to 1 if result is negative, else cleared to 0)
  – Z (zero, set to 1 if result is zero, else cleared to 0)
  – V (overflow, set to 1 if arithmetic overflow occurs, else cleared)
  – C (carry, set to 1 if a carry-out results, else cleared)
• Flags are grouped together in a special purpose register called the “condition code register” or “status register”.
• If the result of Decrement R1 is 0, then flag Z is set.
• Branch>0 tests the Z flag. If Z is 1, then the sum is stored. Else the next number is added.
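How the four flags follow from the result of an addition can be sketched for a fixed word length. The function name `add_with_flags` and the 32-bit default are assumptions made for illustration; real processors differ in exactly which instructions update which flags.

```python
def add_with_flags(a: int, b: int, bits: int = 32):
    """Add two words and report the N, Z, V, C condition code flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask                       # interpret operands as bits-wide words
    b &= mask
    raw = a + b
    result = raw & mask
    flags = {
        "N": int(bool(result & sign)),                     # sign bit of the result
        "Z": int(result == 0),                             # result is zero
        "V": int(bool(~(a ^ b) & (a ^ result) & sign)),    # same-sign operands, different-sign result
        "C": int(raw > mask),                              # carry out of the top bit
    }
    return result, flags


# Decrementing a counter that holds 1 (adding -1) gives zero, so Z is set.
print(add_with_flags(1, -1))
# Adding 1 to the largest positive 32-bit value overflows, so V and N are set.
print(add_with_flags(0x7FFFFFFF, 1))
```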

Instruction execution and sequencing (contd.)
• Branch instructions alter the sequence of program execution.
  – Recall that the PC holds the address of the next instruction to be executed.
  – They do so by loading a new value into the PC.
  – Processor fetches and executes the instruction at this new address, instead of the instruction at the location that follows the branch.
  – The new address is called a “branch target”.
• Conditional branch instructions cause a branch only if a specified condition is satisfied.
  – Otherwise the PC is incremented in the normal way, and the next sequential instruction is fetched and executed.
• Conditional branch instructions use condition code flags to check if the various conditions are satisfied.

Instruction sequencing and execution (contd.)
• How to determine the address of the next number?
• Recall the addressing modes:
  – Initialize register R2 with the address of the first number using Immediate addressing.
  – Use Indirect addressing mode to add the first number.
  – Increment register R2 by 4, so that it points to the next number.
          Move N, R1
          Move #NUM1, R2   (Initialize R2 with address of NUM1)
          Clear R0
    LOOP  Add (R2), R0     (Indirect addressing)
          Add #4, R2       (Increment R2 to point to the next number)
          Decrement R1
          Branch>0 LOOP
          Move R0, SUM

Instruction execution and sequencing (contd.)
• Note that the same can be accomplished using “autoincrement mode”:
          Move N, R1
          Move #NUM1, R2   (Initialize R2 with address of NUM1)
          Clear R0
    LOOP  Add (R2)+, R0    (Autoincrement)
          Decrement R1
          Branch>0 LOOP
          Move R0, SUM
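The loop can be followed line by line in a Python transcription in which each variable stands for a register. This is a sketch under stated assumptions: memory is modelled as a dictionary from byte address to word, the addresses 100 and 200 and the list contents are made up, and `sum_list` is an illustrative name.

```python
def sum_list(memory: dict, n_addr: int, first_addr: int, word_bytes: int = 4) -> int:
    """Sum a list of words the way the autoincrement loop does."""
    r1 = memory[n_addr]       # Move N, R1: count of numbers to add
    r2 = first_addr           # Move #NUM1, R2: address of the first number
    r0 = 0                    # Clear R0: running sum
    while True:               # LOOP
        r0 += memory[r2]      # Add (R2)+, R0: add the word that R2 points to ...
        r2 += word_bytes      # ... then R2 is incremented to the next word
        r1 -= 1               # Decrement R1 (sets Z when the count reaches 0)
        if not r1 > 0:        # Branch>0 LOOP: fall through once the count is 0
            break
    return r0                 # the sum accumulated in R0 is what gets stored at SUM


# N = 3 stored at address 100; the numbers 5, 7, 9 stored from address 200.
memory = {100: 3, 200: 5, 204: 7, 208: 9}
print(sum_list(memory, 100, 200))
```

Like the assembly loop, this version adds before testing the count, so it assumes the list holds at least one number.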