
Chapter Goals
• Describe the system bus and bus protocol
• Describe how the CPU and bus interact with peripheral devices
• Describe the purpose and function of device controllers
• Describe how interrupt processing coordinates the CPU with secondary storage and I/O devices
Systems Architecture, Fifth Edition

Chapter Goals (continued)
• Describe how buffers, caches, and data compression improve computer system performance


System Bus
• Connects CPU with main memory and peripheral devices
• Set of data lines, control lines, and status lines
• Bus protocol
  – Number and use of lines
  – Procedures for controlling access to the bus
• Subsets of bus lines: data bus, address bus, control bus


Bus Clock and Data Transfer Rate
• Bus clock pulse
  – Common timing reference for all attached devices
  – Frequency measured in MHz
• Bus cycle
  – Time interval from one clock pulse to the next
• Data transfer rate
  – Measure of communication capacity
  – Bus capacity = data transfer unit × clock rate
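Worked example of the bus capacity formula. This is a minimal sketch: the function name and the sample bus (32 bits wide, 33 MHz) are chosen for illustration, and the result is a theoretical peak that ignores protocol overhead.

```python
def bus_capacity_bytes_per_sec(width_bits: int, clock_hz: float) -> float:
    """Peak capacity = data transfer unit (bytes per cycle) x clock rate."""
    return (width_bits / 8) * clock_hz

# Illustrative values: a 32-bit bus clocked at 33 MHz.
peak = bus_capacity_bytes_per_sec(32, 33e6)
print(f"{peak / 1e6:.0f} MB/s")  # 132 MB/s
```

Doubling either the width or the clock rate doubles the peak capacity.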

Bus Protocol
• Governs format, content, and timing of data, memory addresses, and control messages sent across the bus
• Two devices cannot put data on the bus at the same time, so access control is needed
• Approaches for access control
  – Master-slave approach (traditional): CPU is bus master and all other devices are slaves

Bus Protocol (continued)
• Approaches for access control (transferring data without CPU):
  – Direct memory access (DMA): DMA controller gets data from device and stores it in RAM
  – Peer-to-peer buses: any device can become master via bus arbitration protocol

Local Bus vs External Bus
• Traditionally, the local bus connects the CPU to cache, RAM, and other internal devices
• External bus connects the main processing unit to I/O devices
• The difference between local and external buses is blurring: new bus protocols can support both

Parallel vs Serial Bus
• Parallel bus is the older technology, in which the bus is a set of wires that a device “plugs into”
• Serial bus interconnects one device after another and creates a daisy chain of devices
• Timing skew has become a problem with parallel bus design

Serial Bus vs Parallel Bus
[Figure: parallel bus compared with serial (daisy-chain) bus]

Example System Buses
• IBM PC Bus: 8-bit data, 20-bit address; used in all early IBM PCs and clones
• PC-AT bus (ISA): compatible with the PC bus, but has a second strip of connectors with an additional 36 lines. These lines give a 16-bit data bus for the 80286 chip
• VESA Local bus (VL-bus or VLB): found alongside the ISA bus in PCs; acted as a high-speed bus for DMA and memory-mapped I/O; nicknamed the Very Long Bus

Example System Buses (continued)
• IBM Microchannel: bus for the IBM PS/2 computer; closed architecture with high licensing costs
• EISA (Extended Industry Standard Architecture): several non-IBM companies reacted to Microchannel and designed EISA. Provides a 32-bit data bus

VME Bus
• Used in SGI systems
• Begun by Motorola, became an IEEE standard (IEEE 1014)
• 32-bit bus, asynchronous design (see next slide)
• No circuitry on motherboard
• Hundreds of companies design boards for VME; 300-page set of VME definitions; very stable
• Bus lines provide automatic self-testing and status reporting
• Now VME64 with a 64-bit bus


PCI Bus
• Peripheral Component Interconnect: used in PC and Mac systems
• Well defined and fast
• Is the local bus in a machine with other buses
• Intel based; CPU bus and peripherals plug directly into the PCI bus
• Allows devices to talk to each other without CPU intervention

Older architecture

Newer architecture
[Figure labels: CPU, backside bus, cache, front side bus (system bus), northbridge chip, RAM, video card, PCI bus, southbridge chip, real-time clock, USB, power management, other devices]

PCI Bus (continued)
• Plug-in boards have software settings, not DIP switches
• About 533 MB/s peak transfer rate (64-bit at 66 MHz; PCI v. 3.0)
• Synchronous bus (see figure on next slide)
• Initiator and target design (master/slave)
• Address and data lines multiplexed


PCI Bus (continued)
• OS queries all PCI buses at boot time to find out what devices are present and what system resources (interrupt lines, memory, etc.) each needs. It then allocates the resources and tells each device what its allocation is.
• Each device can request up to six areas of memory space or I/O port space
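The boot-time allocation step can be sketched as a toy allocator. Everything here is hypothetical: the device names, region sizes, and starting address are made up, and the sketch assumes each requested region is a power of two in size and is placed at an address aligned to that size. A real OS discovers requests by probing each device's configuration registers, which is not modeled.

```python
from dataclasses import dataclass, field
from typing import List

MAX_REGIONS = 6  # each device may request up to six regions (from the slide)

@dataclass
class Device:
    name: str
    region_sizes: List[int]                     # requested sizes in bytes
    bases: List[int] = field(default_factory=list)

def allocate(devices: List[Device], start: int = 0x1000_0000) -> None:
    """Give every requested region a non-overlapping, size-aligned base address."""
    next_free = start
    for dev in devices:
        if len(dev.region_sizes) > MAX_REGIONS:
            raise ValueError(f"{dev.name} requests more than {MAX_REGIONS} regions")
        for size in dev.region_sizes:
            base = (next_free + size - 1) // size * size   # round up to alignment
            dev.bases.append(base)                         # "tell" the device
            next_free = base + size

# Hypothetical devices and request sizes.
devices = [Device("nic", [0x4000, 0x100]), Device("video", [0x100_0000])]
allocate(devices)
for dev in devices:
    print(dev.name, [hex(b) for b in dev.bases])
```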

PCI Versions
• 32-bit, 33 MHz (5 V, added in Rev. 2.0)
• 64-bit, 33 MHz (5 V, added in Rev. 2.0)
• 32-bit, 66 MHz (3.3 V only, added in Rev. 2.1)
• 64-bit, 66 MHz (3.3 V only, added in Rev. 2.1)

PCI-X
• PCI-extended
• Twice as fast as PCI: 1.06 GB/s
• Designed for servers to support Gigabit Ethernet cards, Fibre Channel, and Ultra320 SCSI controllers
• PCI-X is backwards compatible with older PCI standards (except the 5 V ones)
• PCI-X only runs as fast as the slowest device
• In 2003 the PCI SIG ratified PCI-X 2.0, which added 266 MHz and 533 MHz options, or roughly 2.15 GB/s and 4.3 GB/s throughput (but it is losing ground to PCIe)
• PCI-X 3.0 is in development, but its prospects are uncertain given the popularity of PCIe

PCI-Express
• PCIe or PCI-E
• Not the same as PCI: PCI is a parallel bus, whereas PCIe is a serial bus (like USB)
• Hub on motherboard acts as a crossbar switch, allowing multiple simultaneous full-duplex connections
• Serial format is starting to win out over parallel format, due in part to timing skew
• PCIe is a layered protocol, consisting of a Transaction Layer, a Data Link Layer, and a Physical Layer (fairly complex, like USB)

[Figure, from Wikipedia: from top to bottom, PCIe x4, PCIe x16, and an older PCI connector]
A PCIe card will fit in any slot that is at least as wide as the card's connector.

SCSI (Small Computer System Interface)
• Family of standard buses designed primarily for secondary storage devices
• Most often used for disk drives but can interface pretty much any device
• Implements both a low-level physical I/O protocol and a high-level logical device control protocol

SCSI Interfaces – Parallel
• Still common is the older parallel SCSI (aka SPI)
• Popular forms include
  – SCSI-1
  – Fast SCSI
  – Fast-Wide SCSI
  – Ultra Wide SCSI
• See handout on parallel SCSI specs

SCSI Interfaces – Serial
• Serial SCSI: modern addition to the SCSI family
• Advantages include faster data rates, hot swapping, and improved fault isolation
• Once again, the clock skew of high-speed parallel interfaces is driving the change from parallel to serial

SCSI Interfaces – iSCSI
• SCSI command set stays the same, but SCSI's own physical specifications essentially no longer apply
• The physical transport is TCP/IP
• SCSI-3 implemented over a network
• iSCSI is competing with Fibre Channel
• Many felt iSCSI would not be as fast as Fibre Channel due to TCP/IP overhead, but systems now use TCP Offload Engines and 10 Gigabit Ethernet


Desirable Characteristics of a SCSI Bus
• Non-proprietary standard
• High data transfer rate
• Peer-to-peer capability
• High-level (logical) data access commands
• Multiple command execution
• Interleaved command execution
But typically quite a bit more expensive.

I/O Ports
• I/O ports are the pathways between the CPU and a peripheral device
• Logical and Physical Access
  – Usually a memory address that can be read/written by the CPU and a single peripheral device
  – Also a logical abstraction that enables CPU and bus to interact with each peripheral device as if the device were a storage device with linear address space

Physical access: System bus is usually physically implemented on a large printed circuit board with attachment points for devices.

Logical access: The device, or its controller, translates a linear sector address into the corresponding physical sector location on a specific track and platter.
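The translation from a linear sector address to a physical location can be sketched with integer division. This is an illustration only: the function name and the disk geometry (4 heads, 63 sectors per track) are assumptions, and it presumes every track holds the same number of sectors, which real drives generally do not guarantee.

```python
def linear_to_physical(sector: int, heads: int, sectors_per_track: int):
    """Map a linear sector address to (cylinder, head, sector-in-track), all 0-based."""
    cylinder, rest = divmod(sector, heads * sectors_per_track)
    head, sector_in_track = divmod(rest, sectors_per_track)
    return cylinder, head, sector_in_track

# Hypothetical geometry: 4 read/write heads (platter surfaces), 63 sectors per track.
print(linear_to_physical(1000, heads=4, sectors_per_track=63))  # (3, 3, 55)
```

The CPU and bus only ever see the linear address; the controller hides the geometry.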

Device Controllers
• Implement the bus interface and access protocols
• Translate logical addresses into physical addresses
• Enable several devices to share access to a bus connection


Mainframe Channels
• Advanced type of device controller used in mainframe computers
• Compared with device controllers:
  – Greater data transfer capacity
  – Larger maximum number of attached peripheral devices
  – Greater variability in types of devices that can be controlled

Interrupt Processing
• Used by application programs to coordinate data transfers to/from peripherals, notify CPU of errors, and call operating system service programs
• When an interrupt is detected, the executing program is suspended; the CPU pushes current register values onto the stack and transfers control to an interrupt handler
• When the interrupt handler finishes executing, the stack is popped and the suspended process resumes from the point of interruption
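The push, handle, pop sequence can be sketched with a toy CPU model. The class, register names, and handler below are hypothetical and exist only to show the order of operations; real hardware performs these steps in microcode or in the handler's prologue.

```python
class ToyCPU:
    """Toy model: registers are saved on a stack while a handler runs."""

    def __init__(self):
        self.registers = {"pc": 0, "acc": 0}
        self.stack = []
        self.log = []

    def interrupt(self, handler):
        self.stack.append(dict(self.registers))   # push current register values
        handler(self)                             # transfer control to the handler
        self.registers = self.stack.pop()         # pop the stack; program resumes

def keyboard_handler(cpu):
    cpu.registers["acc"] = 99                     # handler may overwrite registers
    cpu.log.append("key press handled")

cpu = ToyCPU()
cpu.registers.update(pc=120, acc=7)               # state of the running program
cpu.interrupt(keyboard_handler)
print(cpu.registers, cpu.log)  # {'pc': 120, 'acc': 7} ['key press handled']
```

Because the saved values are restored, the suspended program sees the same registers it had before the interrupt.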

Interrupt Processing (continued)
• Secondary storage and I/O devices are much slower than RAM, ROM, cache memory, and the CPU (see table on next slide)
• When the CPU asks for data from an I/O device, what should the CPU do?
  – Sit in a wait cycle?
  – Go do something else?

Interrupt Processing

Multiple Types of Interrupts
• Categories of interrupts
  – I/O event
  – Error condition
  – Service request
  – Processor-to-processor communication
• Can one interrupt be interrupted by another type of interrupt?


Buffers and Caches
• Improve overall computer system performance by employing RAM to overcome mismatches in data transfer rate and data transfer unit size

Buffers
• Small storage areas (usually DRAM or SRAM) that hold data in transit from one device to another
• Use interrupts to enable devices with different data transfer rates and unit sizes to efficiently coordinate data transfer
• Buffer overflow

Classic example of a buffer: a print buffer

Computer system performance improves dramatically with a larger buffer.

Assumes a 32-bit bus.

2 interrupts each time we fill up the buffer. The buffer will be filled (64 KB / buffer size) times.

Total bus transfers: sum of bus data transfers and bus interrupts.

Assumes 100 CPU cycles to handle an interrupt.

Diminishing Returns
• When multiple resources are required to produce something useful, adding more and more of a single resource produces fewer and fewer benefits
• Applicable to buffer size

Similar chart to the last one, but now the amount to transfer is 64 B instead of 64 KB. Note how improvement stops once the buffer size equals the transfer amount. The law of diminishing returns affects both bus and CPU performance.
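The counts behind the buffer-size charts can be approximated with a short calculation. This sketch is one reading of the slide annotations (32-bit bus, 2 interrupts per buffer fill, 100 CPU cycles per interrupt, each interrupt counted as one bus transfer); the function name is hypothetical and the chapter's own tables may account for some items differently.

```python
import math

def transfer_cost(total_bytes, buffer_bytes, bus_bytes=4,
                  interrupts_per_fill=2, cycles_per_interrupt=100):
    """Return (data transfers, interrupts, total bus transfers, CPU cycles)."""
    fills = math.ceil(total_bytes / buffer_bytes)
    # A bus transfer carries at most bus_bytes, and never more than the buffer holds.
    data_transfers = math.ceil(total_bytes / min(buffer_bytes, bus_bytes))
    interrupts = interrupts_per_fill * fills
    total_bus_transfers = data_transfers + interrupts
    cpu_cycles = interrupts * cycles_per_interrupt
    return data_transfers, interrupts, total_bus_transfers, cpu_cycles

for total in (64 * 1024, 64):                  # 64 KB, then 64 B
    print(f"transfer of {total} bytes")
    for buf in (1, 4, 64, 1024, 65536):
        print(f"  buffer {buf:>6}: {transfer_cost(total, buf)}")
```

For the 64 B transfer, every buffer of 64 bytes or more gives identical counts, which is the diminishing-returns effect described above.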

Cache
• Differs from buffer:
  – Data content not automatically removed as used
  – Used for bidirectional data
  – Used only for storage device accesses
  – Usually much larger
  – Content must be managed intelligently
• Achieves performance improvements differently for read and write accesses

Write access: Sending confirmation (2) before data is written to secondary storage device (3) can improve program performance; program can immediately proceed with other processing tasks.

Read accesses are routed to cache (1). If data is already in cache, it is accessed from there (2). If data is not in cache, it must be read from the storage device (3). Performance improvement realized only if requested data is already waiting in cache.
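The read and write paths described above can be sketched as a toy cache. The class and method names are hypothetical, a Python dict stands in for the slow storage device, and eviction is not modeled.

```python
class WriteBehindCache:
    """Toy storage cache: reads hit or miss; writes are confirmed early."""

    def __init__(self, storage: dict):
        self.storage = storage      # stands in for the slow storage device
        self.cache = {}
        self.dirty = set()          # blocks confirmed but not yet on the device
        self.hits = self.misses = 0

    def read(self, block):
        if block in self.cache:     # already in cache: served from there
            self.hits += 1
        else:                       # not in cache: must read the storage device
            self.misses += 1
            self.cache[block] = self.storage[block]
        return self.cache[block]

    def write(self, block, data):
        self.cache[block] = data
        self.dirty.add(block)
        return "confirmed"          # sent before the device write happens

    def flush(self):
        for block in sorted(self.dirty):
            self.storage[block] = self.cache[block]
        self.dirty.clear()

disk = {0: b"boot", 1: b"data"}
cache = WriteBehindCache(disk)
cache.read(1)
cache.read(1)                       # first read misses, second hits
cache.write(2, b"new")              # confirmed while the disk still lacks block 2
pending = 2 not in disk
cache.flush()
print(cache.hits, cache.misses, pending, disk[2])  # 1 1 True b'new'
```

The early confirmation is what lets the program proceed, at the cost that confirmed data is not yet on the device until the flush.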

Cache Controller
• Processor that manages cache content
• Guesses what data will be requested; loads it from storage device into cache before it is requested
• Can be implemented in
  – A storage device controller or communication channel
  – Operating system

Cache
Primary storage cache
• Can limit wait states by using SRAM cached between CPU and SDRAM primary storage
• Level one (L1): within CPU
• Level two (L2): on-chip
• Level three (L3): off-chip
Secondary storage cache
• Gives frequently accessed files higher priority for cache retention
• Uses read-ahead caching for files that are read sequentially
• Gives files opened for random access lower priority for cache retention

Intel Itanium® 2 microprocessor uses three levels of primary storage caching.

Processing Parallelism
• Increases computer system computational capacity; breaks problems into pieces and solves each piece in parallel with separate CPUs
• Techniques
  – Multicore processors
  – Multi-CPU architecture
  – Clustering

Multicore Processors
• Include multiple CPUs and shared memory cache in a single microchip
• Typically share memory cache, memory interface, and off-chip I/O circuitry among the cores
• Reduce total transistor count and cost and provide synergistic benefits


Multi-CPU Architecture
• Employs multiple single or multicore processors sharing main memory and the system bus within a single motherboard or computer system
• Common in midrange computers, mainframe computers, and supercomputers
• Cost-effective for
  – Single system that executes many different application programs and services
  – Workstations

Scaling Up
• Increasing processing by using larger and more powerful computers
• Used to be most cost-effective
• Still cost-effective when maximal computer power is required and flexibility is not as important

Scaling Out
• Partitioning processing among multiple systems
• Faster communication networks have diminished the relative performance penalty
• Economies of scale have lowered costs
• Distributed organizational structures emphasize flexibility
• Improved software for managing multiprocessor configurations

High-Performance Clustering
• Connects separate computer systems with high-speed interconnections
• Used for the largest computational problems (e.g., modeling three-dimensional physical phenomena)

Partitioning the problem to match the cluster architecture ensures that most data exchange traverses high-speed paths.

Compression
• Reduces number of bits required to encode a data set or stream
• Effectively increases capacity of a communication channel or storage device
• Requires increased processing resources to implement compression/decompression algorithms while reducing resources needed for data storage and/or communication
• In short: trades data size against CPU time

Compression Algorithms
• Vary in:
  – Type(s) of data for which they are best suited
  – Whether information is lost during compression
  – Amount by which data is compressed
  – Computational complexity
• Lossless versus lossy compression
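The size-versus-CPU-time trade-off can be observed with a lossless compressor. This sketch uses Python's standard zlib module; the sample text and the choice of compression levels 1 and 9 are arbitrary, and the timings will vary by machine, so no expected output is shown.

```python
import time
import zlib

# Highly repetitive sample data, chosen so that it compresses well.
data = b"system bus, device controller, interrupt, buffer, cache. " * 2000

for level in (1, 9):                            # 1 = fastest, 9 = most thorough
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    assert zlib.decompress(packed) == data      # lossless: original is recovered
    print(f"level {level}: {len(data)} -> {len(packed)} bytes "
          f"in {elapsed * 1e3:.2f} ms")
```

A lossy algorithm would fail the round-trip check by design, since it discards information during compression.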

Compression can be used to reduce disk storage requirements (a) or to increase communication channel capacity (b).

MPEG standards address recording and encoding formats for both images and sound. MPEG audio compression exploits the ear's varying sensitivity to sounds to perform lossy compression.

Chip Interfacing
• You are working for Nokia on a new cellphone
• This phone will have a processor, one EPROM, one RAM, and an I/O chip to control display and keyboard
• The processor has a 16-bit address bus
• With 16 address bits, the processor can address 65,536 bytes (64 K) of storage

Chip Interfacing (continued)
• For the I/O chip, we could attach it as an I/O device, then connect the CS line on the PIO to the IORQ line on the CPU
• Or we could choose a particular address and have that address drive the CS line of the I/O chip
• The latter form is called memory-mapped I/O

Chip Interfacing (continued)
• The I/O chip needs 4 bytes of address space (3 I/O ports and 1 status register)
• The EPROM is an 8 K chip so it needs 8 K of address space (13 bits needed to select 8 K)
• Likewise, the RAM needs 8 K of address space

Chip Interfacing (continued)
• You don't want addresses of chips to overlap, so place the devices in memory as follows:
  – EPROM starts at address 0 (0000h) and is 8 K (8192, or 2000h) long, so it ends at 1FFFh
  – RAM starts at address 32 K (32,768, or 8000h) and is 8 K long, so it ends at 9FFFh
  – I/O starts at address 65532 (FFFCh) and is 4 bytes long, so it ends at 65535 (FFFFh)

Chip Interfacing (continued)
• So, hexadecimal address ranges for each chip are:
  – EPROM: 0000 – 1FFF
  – RAM: 8000 – 9FFF
  – I/O: FFFC – FFFF
• That would place the devices at the following binary addresses (x = either 0 or 1):
  – EPROM: 000x xxxx xxxx xxxx
  – RAM: 100x xxxx xxxx xxxx
  – I/O: 1111 1111 1111 11xx
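The address decoding implied by these ranges can be sketched in software. The function name is hypothetical, and in the actual design this logic would be built from gates on the upper address lines rather than written as code.

```python
def select_chip(address: int) -> str:
    """Decode a 16-bit address into the chip whose ~CS line would be asserted."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if address >> 13 == 0b000:                 # 000x xxxx xxxx xxxx: 0000h-1FFFh
        return "EPROM"
    if address >> 13 == 0b100:                 # 100x xxxx xxxx xxxx: 8000h-9FFFh
        return "RAM"
    if address >> 2 == 0b11_1111_1111_1111:    # top 14 bits all ones: FFFCh-FFFFh
        return "I/O"
    return "unmapped"

for addr in (0x0000, 0x1FFF, 0x2000, 0x8000, 0x9FFF, 0xFFFB, 0xFFFC, 0xFFFF):
    print(f"{addr:04X}h -> {select_chip(addr)}")
```

Only the top 3 bits are needed to tell EPROM from RAM, because each 8 K chip decodes the lower 13 bits itself.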

Memory Allocation
[Figure: memory map. EPROM: 0 K to 8 K-1; RAM: 32 K to 40 K-1; I/O: 65532 to 65535]

Interface
[Figure labels: address lines A0 to A12 and A13, A14, A15; EPROM, RAM, and I/O chips; ~CS (chip select) inputs]

Summary
• How the CPU uses the system bus and device controllers to communicate with secondary storage and input/output devices
• Hardware and software techniques for improving data efficiency, and thus, overall computer system performance: bus protocols, interrupt processing, buffering, caching, and compression