The Network Architecture of the Connection Machine CM-5

- Slides: 8
The Network Architecture of the Connection Machine CM-5
Charles E. Leiserson et al. (Thinking Machines Corporation)
Presented by Eric Carty-Fickes, 1/28/04

major themes of CM-5
- good performance (measured how?)
- ease of use (by programmers), flexibility
  - let programmers access nonprivileged functions
  - do not involve OS if possible
- availability, reliability
  - use commodity parts, and the same part wherever possible (economy of mechanism)
- split system into three separate networks (data, control, diagnostic)

network interface
- same interface for data and control networks
- provides context switching capability, makes processor save state
- interface appears as memory-mapped FIFO registers
  - protection enforced by processor
- users access relative processor addresses only; easy protection and error checking
- users unaware of network topology

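The relative-addressing point can be illustrated with a minimal sketch. It assumes a partition is a contiguous block of physical nodes described by a base and a size; that base-plus-bounds scheme, the `Partition` class, and `to_physical` are hypothetical names chosen for illustration, not the documented CM-5 mechanism.

```python
class Partition:
    """Hypothetical user partition: a contiguous block of physical nodes."""

    def __init__(self, base: int, size: int) -> None:
        self.base = base  # first physical node of the partition (assumed layout)
        self.size = size  # number of nodes the user may address

    def to_physical(self, relative: int) -> int:
        """Translate a user-visible relative address to a physical one.

        The bounds check is what makes protection cheap: a user can only
        name nodes 0 .. size-1, so traffic cannot leave the partition.
        """
        if not 0 <= relative < self.size:
            raise ValueError(f"relative address {relative} outside partition")
        return self.base + relative
```

Because user code only ever names relative addresses, a partition could in principle be placed anywhere in the machine without the program changing.
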
data network
- fat-tree architecture used
  - keeps local traffic separate
  - can be adapted to various bandwidth schemes
  - keeps traffic balanced
- claimed near-optimal data routing
- modified fat-tree uses two input and two output FIFOs to guarantee no deadlock
- variable-length packets (fixed for control)
- bandwidth scales linearly to 16,384 nodes

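Why a fat tree keeps local traffic local can be sketched as follows: a message climbs only as far as the smallest subtree containing both endpoints, then descends. The sketch assumes a 4-ary tree whose leaves are numbered so that nodes in the same subtree share high-order base-4 digits; the function names are hypothetical, and the choice among the several upward links a fat tree offers at each level is not modeled.

```python
from typing import List


def levels_to_climb(src: int, dst: int, arity: int = 4) -> int:
    """Levels a message climbs before it can turn downward.

    Repeatedly strips the low-order base-`arity` digit of both addresses;
    they become equal once both lie under the same subtree root.
    """
    levels = 0
    while src != dst:
        src //= arity
        dst //= arity
        levels += 1
    return levels


def down_path(dst: int, levels: int, arity: int = 4) -> List[int]:
    """Child index chosen at each level on the way down, topmost first."""
    digits = [(dst // arity**i) % arity for i in range(levels)]
    return digits[::-1]


# Neighbors under the same parent meet one level up;
# nodes 0 and 16 need three levels in a 4-ary tree.
near = levels_to_climb(0, 1)
far = levels_to_climb(0, 16)
```

Traffic between nearby nodes never touches the upper levels, which is the sense in which local traffic stays separate.
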
network protection
- flow control sent to message originator to protect buffers
- central clock synchronizes everything (good idea?)
- messages tagged with routing and processing info plus error check
- errors traced to origin (how many simultaneous errors detected/masked?)
- all-fall-down mode saves in-flight messages in random nodes

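The "plus error check" bullet can be illustrated with a small sketch of tagging a message with a check value and verifying it on receipt. CRC-32 from Python's `zlib` is used purely as a stand-in; the slides do not say which check the hardware computes, and `seal` and `check` are hypothetical names.

```python
import zlib


def seal(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload (stand-in for the real check)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check(packet: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored tag."""
    payload, tag = packet[:-4], packet[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == tag


packet = seal(b"hello")
# Flip one bit in the first byte to simulate a transmission error.
corrupted = bytes([packet[0] ^ 1]) + packet[1:]
```

A check like this detects corruption; tracing the error back to its origin, as the slide describes, would need additional information that is not modeled here.
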
control network
- synchronizes processing nodes
- checks contract between processors and data network, reports errors
- hybrid MIMD architecture
  - combines SIMD's broadcasting with ability to run different parts of code
- barrier synchronization = point in the code that all processors must reach before any continues
  - improved with split-phase barriers
- broadcast = individual processors send out mass interrupts, code, data, etc.
- combining = select sets of nodes (only certain functions available)
- Kirchhoff's law for messages ensures that at least no pair of messages is lost

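The idea behind a split-phase barrier can be sketched in a few lines: arriving at the barrier and waiting for it to complete are separate steps, so a processor can do unrelated work in between. This is a single-threaded simulation with hypothetical names (`SplitPhaseBarrier`, `arrive`, `done`), not the CM-5 control-network interface.

```python
class SplitPhaseBarrier:
    """Toy split-phase barrier: signal arrival now, test completion later."""

    def __init__(self, n: int) -> None:
        self.n = n            # number of participating processors
        self.arrived = set()  # ids of processors that have signalled

    def arrive(self, pid: int) -> None:
        """Phase 1: announce that processor `pid` reached the barrier."""
        self.arrived.add(pid)

    def done(self) -> bool:
        """Phase 2: poll whether every processor has arrived."""
        return len(self.arrived) == self.n


barrier = SplitPhaseBarrier(3)
barrier.arrive(0)
barrier.arrive(1)
early = barrier.done()   # processor 2 is still computing
barrier.arrive(2)
late = barrier.done()    # all three have arrived
```

With an ordinary barrier, processors 0 and 1 would sit idle until processor 2 arrived; splitting the two phases lets them overlap the wait with useful work.
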
diagnostic network
- goal of functionality independence, use JTAG
- individual chips and collections can be tested
- network tree inherently self-testing hierarchy

Questions
- are there any errors, glaring or minor, that you can see with CM-5?
- do you really agree with the authors that it is okay to allow a user to cause deadlock?
  - should there be a check in place to prevent it?
  - might it not prevent an error in the network from being detected?
- would CM-5 really work just as well as technology progressed?