









STARFIRE: Extending the SMP Envelope
Alan Charlesworth, Sun Microsystems
Presented by Mahmut Yilmaz, 01/30/2006
Introduction
• Cache-coherent symmetric multiprocessor (SMP) architecture: 24-64 UltraSPARC II microprocessors running at 250 MHz, sharing up to 64 GB of memory
• Goals:
  – Increase system memory bandwidth
  – Reduce memory latency as much as possible
  – Provide Unix server flexibility (Dynamic System Domains)
  – Improve system reliability, availability, and serviceability
• How?
  – Multiple outstanding cache misses, split-transaction buses, separate address and data paths, wider data paths, higher system clocks…
Figure: Starfire cabinet [1]
[1] http://www.filibeto.org/~aduritz/true/e10000/starfire-interconnect.html
Improvements in Bus Capacity
• Bus snooping rate increased from 2.5 M/s (1991) to 167 M/s (1997). How?
  – Bus clock rate: 40 MHz → 100 MHz
  – Circuit-switched protocol → packet-switched protocol (separates requests from replies)
  – Separate address and data buses
  – Interleaved multiple snoop buses (4 address buses)
  – Cache block size: 32 bytes → 64 bytes
  – Data bus width: 8 bytes → 16 bytes
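The 167 M/s figure can be reproduced with simple arithmetic. The sketch below assumes four address buses (from this slide), a 2-clock-cycle address transaction (from the interconnect slide), and an 83.3 MHz system clock. The 83.3 MHz value is an assumption that is not given in these slides, and the function name is illustrative only.

```python
def snoop_rate(num_address_buses: int, clock_hz: float, cycles_per_snoop: int) -> float:
    """Peak snoops per second if each bus starts one snoop every `cycles_per_snoop` clocks."""
    return num_address_buses * clock_hz / cycles_per_snoop


# Assumption: system clock of 83.3 MHz (not stated in the slides).
ASSUMED_CLOCK_HZ = 83.3e6

if __name__ == "__main__":
    rate = snoop_rate(num_address_buses=4, clock_hz=ASSUMED_CLOCK_HZ, cycles_per_snoop=2)
    print(f"peak snoop rate: {rate / 1e6:.0f} M/s")
```

Under these assumptions the peak rate comes to roughly 167 M/s. Plugging in the 100 MHz clock listed above would instead give 200 M/s, so the slide's figure appears to correspond to a clock below that maximum.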
System
STARFIRE Interconnect
• Ultra Port Architecture (UPA) interconnect with a writeback MOESI protocol (states: X-Modified, S-Modified, X-Clean, S-Clean, Invalid)
  – Packet-switched data transactions protected by ECC (16 bytes + 2 bytes)
• Two-level interconnect
  – On-board: processors, SBus cards, memory; address and data ports
  – Centerplane: transfers addresses and data between boards
• 2-clock-cycle address transaction
  – The 2 low-order cache-block address bits determine which address bus to use
• 8-clock-cycle board-to-board data transfer
• Buses vs. point-to-point routers?
  – Buses: faster
  – Point-to-point routers: bandwidth, partitioning, reliability, availability, and serviceability
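The address-bus selection rule can be sketched in a few lines. This is a minimal illustration, assuming 64-byte cache blocks and four address buses (both from the bus-capacity slide) and assuming the two bits used are the lowest two bits of the block number. The exact bit positions and the function name are assumptions, not taken from the slides.

```python
CACHE_BLOCK_BYTES = 64  # cache block size from the bus-capacity slide
NUM_ADDRESS_BUSES = 4   # interleaved snoop (address) buses


def address_bus_for(physical_address: int) -> int:
    """Return the index (0-3) of the address bus that would carry this address.

    Assumption: the bus is chosen by the two lowest bits of the cache-block
    number, i.e. the address with its 6 block-offset bits dropped.
    """
    block_number = physical_address // CACHE_BLOCK_BYTES
    return block_number % NUM_ADDRESS_BUSES


if __name__ == "__main__":
    for addr in (0x0000, 0x0040, 0x0080, 0x00C0, 0x0100):
        print(f"address {addr:#06x} -> address bus {address_bus_for(addr)}")
```

With this interleaving, consecutive cache blocks rotate across the four buses, so a sequential access stream spreads its snoop traffic evenly.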
Interconnect Reliability
• ECC on both the address and data buses (with ASIC support)
• Failed components: the system attempts to recover without any service interruption
• Redundant components: optional
• Crash recovery: Automatic System Recovery (ASR)
  – Requires redundant components
Dynamic System Domains
• Starfire can easily be subdivided into domains of boards using the System Service Processor
• Each domain is a totally isolated SMP
• Good for many applications: LAN consolidation; development, production, and test; software migration; special I/O or network functions…
Picture: http://www.filibeto.org/~aduritz/true/e10000/starfire-interconnect.html
Price/Performance
• Throughput is higher than that of other HPC computers of similar cost
• Easy to administer, reliable, and serviceable
Figure: Starfire cost [2]
[2] Alan Charlesworth, Nicholas Aneshansley, Mark Haakmeester, Dan Drogichen, Gary Gilbert, Ricki Williams, Andrew Phelps, “The Starfire SMP Interconnect”
Questions
• Is the price/performance comparison to other HPC computers fair?
• What if we have a cluster of Starfire servers?
  – Uniform latencies?
• What is the performance cost of adding redundancy?
• Centerplane: single point of failure? Can we replace the centerplane?
Fair Comparison?
Graph: Alan Charlesworth, Nicholas Aneshansley, Mark Haakmeester, Dan Drogichen, Gary Gilbert, Ricki Williams, Andrew Phelps, “The Starfire SMP Interconnect”