By Nitin Bahadur, Gokul Nadathur
Department of Computer Sciences
University of Wisconsin-Madison
Spring 2000
Talk Outline
• Motivation and Goals
• General Architecture of the middleware
• Components of the middleware
• Providing reliability - handling of node failures
• Applications developed using the middleware
• Performance
• Conclusions and possible extensions
Multicast / Reduction Trees Spring 2000
Motivation and Goals
• A middleware for applications that follow the Master - Worker paradigm
• Scalable framework for communication and for computing client responses (“Reduction”)
• Unicast does not scale - so use multicast
• Introducing reduction operations dynamically in clients
• A general framework for communication among clients
The Big Picture
• Master (App + ARTL): sends queries, reduces results, hands results back to the application
• Client (App + ARTL) with downstream nodes: executes responses to queries, forwards queries downstream, reduces incoming results, sends reduced results to master
• Client (App + ARTL) without downstream nodes: executes responses to queries, sends results back towards master
ART - Library Architecture
[Diagram labels: Application specific callbacks; Application API; ARTL specific message; Framework for processing messages; Event Handler; Reduction functions; Outgoing message; Incoming Packet; ARTL Communication Layer; Network]
ARTL messages:
1. Query from master
2. Response from downstream nodes
Communication Subsystem
• Connection Setup
  – Connect nodes as a Binomial tree
• Send and receive ARTL and application messages
• Detect node failure and act accordingly
• Integrate restarted node in current tree structure
Why use Binomial Tree
[Diagram: a Master App and Client Apps connected in two ways, with each link labelled by the step in which the query is sent]
• Binomial Tree: Query Propagation time = 2
• Unicast Mechanism: Query Propagation time = 3
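The propagation times above can be reproduced with a small simulation. The following is a minimal sketch, not ARTL code: the function names are hypothetical, and the 1-based numbering rule (a node adds powers of two smaller than the lowest set bit of its 0-based rank) is an assumption, since the slides do not spell out how nodes are numbered.

```python
def binomial_children(node, n):
    """Children of `node` (1-based) in a binomial tree over nodes 1..n rooted at 1.

    Assumed numbering: using 0-based ranks, a node may add any power of two
    smaller than its lowest set bit; the root may add any power of two below n.
    """
    rank = node - 1
    limit = (rank & -rank) if rank else n
    children = []
    step = 1
    while step < limit and rank + step < n:
        children.append(rank + step + 1)
        step *= 2
    return children[::-1]  # largest subtree first, so it is contacted first


def propagation_rounds(n):
    """Rounds needed when every informed node contacts one child per round."""
    informed = {1}
    pending = {1: binomial_children(1, n)}
    rounds = 0
    while len(informed) < n:
        rounds += 1
        for node in list(informed):  # snapshot: new nodes start sending next round
            if pending[node]:
                child = pending[node].pop(0)
                informed.add(child)
                pending[child] = binomial_children(child, n)
    return rounds


def unicast_rounds(n):
    """The master contacts each of the other n - 1 nodes in turn."""
    return n - 1
```

With four nodes (a master and three clients) this gives 2 rounds for the binomial tree and 3 for unicast; with eight nodes the gap widens to 3 versus 7.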
Reduction at 5 and 3
[Diagram: tree of nodes 1 to 8, with arrows labelled "Responses"]
Example Reduction operations: Min(), Max()
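To make the idea of reducing at interior nodes concrete, here is a minimal sketch of a Min/Max reduction flowing up a tree. The tree shape, the response values, and the function name `reduce_up` are all illustrative assumptions, and whether each node in ARTL also contributes its own response is not stated on the slide.

```python
def reduce_up(node, children, local, op):
    """Combine a node's own response with the reduced responses of its subtrees."""
    values = [local[node]]
    for child in children.get(node, []):
        values.append(reduce_up(child, children, local, op))
    return op(values)


# Assumed eight-node tree: 1 is the master; 5, 3 and 7 are interior nodes.
children = {1: [5, 3, 2], 5: [7, 6], 3: [4], 7: [8]}

# Made-up per-node responses, for illustration only.
local = {1: 0.4, 2: 1.9, 3: 0.7, 4: 0.2, 5: 1.1, 6: 0.3, 7: 2.5, 8: 0.9}

overall_min = reduce_up(1, children, local, min)
overall_max = reduce_up(1, children, local, max)
```

In this sketch node 5 forwards a single value for its whole subtree, so the master receives one message per child rather than one per client.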
Tree connection setup
[Diagram: nodes 1 to 8]
Tree Setup - Phase I
[Diagram: nodes 1 to 8; legend: TCP connection setup]
Tree Setup - Phase II
[Diagram: nodes 1 to 8; legend: TCP connection setup]
Tree Setup - Phase III
[Diagram: nodes 1 to 8; legend: TCP connection setup]
Inter node communication
[Diagram: message layout: ARTL Header | Data]
• Unicast and multicast data transmission
• ARTL receives application messages for which no receive has been posted
  – these are sent to a callback function registered by application
• ARTL receives data on behalf of application when application explicitly posts a receive
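The two receive paths above (posted receive versus registered callback) can be sketched as a small dispatcher. This is an illustration under assumptions rather than the ARTL API: the class and method names are hypothetical, and the `tag` header field is invented because the slide does not list the header's contents.

```python
from collections import deque


class Dispatcher:
    """Routes an incoming application message to a posted receive or a callback."""

    def __init__(self, on_unexpected):
        # Callback registered by the application for messages nobody is waiting on.
        self._on_unexpected = on_unexpected
        self._posted = {}  # tag -> queue of handlers from explicit receives

    def post_receive(self, tag, handler):
        """The application explicitly asks to receive the next message with `tag`."""
        self._posted.setdefault(tag, deque()).append(handler)

    def deliver(self, header, data):
        """Called by the communication layer for each incoming message."""
        waiting = self._posted.get(header["tag"])
        if waiting:
            waiting.popleft()(data)  # a receive was posted: hand over the data
        else:
            self._on_unexpected(header, data)  # otherwise use the callback
```

A posted receive is consumed by exactly one message here; later messages with the same tag fall through to the callback unless another receive is posted.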
ART - Library Architecture
[Architecture diagram repeated; the "ARTL specific message" label reads "ARTL Encapsulated message" here]
Reduction Functions
• Implemented as shared objects
• Sent to client during Setup phase
• Each reduction function is associated with the particular response it reduces
Event Handler
[Diagram: Network, Event Handler, Thread Pool, Application]
• Table containing Query id and Callback information for currently registered queries
• Responses for the shaded entry from downstream nodes
• Run Queue of reduction/response operations
• Response Callback
• Reduced response sent upstream
Multithreaded Architecture
• No prior knowledge about the behavior of a reduction function
• Exploit concurrency - multiple processors per node
• Static pool of threads - creating and destroying threads is costly (Firefly RPC)
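The event handler and its static thread pool can be sketched roughly as follows. This is an assumption-laden illustration, not the ARTL implementation: the class, its method names, and the shape of the table entries are hypothetical, and Python threads stand in for whatever threading library the original used.

```python
import queue
import threading


class EventHandler:
    """A query table, a run queue, and a fixed pool of worker threads."""

    def __init__(self, num_threads=4):
        self._table = {}  # query id -> (reduction function, response callback)
        self._run_queue = queue.Queue()
        # Threads are created once up front and never destroyed (static pool).
        self._threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(num_threads)
        ]
        for thread in self._threads:
            thread.start()

    def register_query(self, query_id, reduce_fn, callback):
        self._table[query_id] = (reduce_fn, callback)

    def submit_responses(self, query_id, responses):
        """Queue a reduction over the responses collected for one query."""
        self._run_queue.put((query_id, responses))

    def _worker(self):
        while True:
            query_id, responses = self._run_queue.get()
            try:
                reduce_fn, callback = self._table[query_id]
                callback(query_id, reduce_fn(responses))
            except Exception:
                pass  # a real handler would report the failure; the worker stays alive
            finally:
                self._run_queue.task_done()

    def wait_idle(self):
        """Block until every queued operation has finished."""
        self._run_queue.join()
```

Because nothing is known in advance about how long a reduction function runs, a slow reduction occupies only one worker while the others keep draining the run queue.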
Crash Reconfiguration
[Diagram: tree of nodes 1 to 8]
Crash Reconfiguration
[Diagram: nodes 1, 5, 7, 8, 3, 6, 4; caption: Crash Reconfiguration at depth 1]
Crash Reconfiguration
[Diagram: nodes 1, 5, 7, 8, 3, 4, 6; caption: Crash Reconfiguration at depth 2]
Crash Reconfiguration
[Diagram: nodes 1, 5, 7, 8, 3, 6, 2, 4; caption: Crash Reconfiguration at depth 1]
Crash Reconfiguration
[Diagram: nodes 1, 3, 7, 8, 6, 2, 4; caption: Crash Reconfiguration at depth 1]
Crash Detection
• Break in TCP connection with parent/child
  – a signal is received at the other end of the connection
• Use of periodic refresh messages to inform parent that child is up and running
  – useful in WAN environments
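The refresh-message scheme can be illustrated with a small timeout tracker kept by the parent. This is a minimal sketch with hypothetical names; the timeout value and the idea of passing the current time in explicitly (rather than reading a clock) are assumptions made to keep the example deterministic.

```python
class RefreshMonitor:
    """Tracks when each child last sent a refresh message."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence tolerated (assumed parameter)
        self.last_seen = {}  # child id -> time of last refresh

    def refresh(self, child, now):
        """Record a refresh message from `child` received at time `now`."""
        self.last_seen[child] = now

    def suspected_down(self, now):
        """Children whose last refresh is older than the timeout."""
        return sorted(
            child
            for child, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = RefreshMonitor(timeout=3.0)
monitor.refresh(5, now=0.0)
monitor.refresh(3, now=2.0)
late = monitor.suspected_down(now=4.0)  # only child 5 has been silent too long
```

In a running system `now` would typically come from a monotonic clock such as `time.monotonic()`.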
Crash Handling
• Parent of the failed node informs master
• All nodes are informed of a node failure
• Master recomputes tree
  – If a leaf node is down, no reconfiguration is needed
  – If an intermediate node is down, some reconfiguration is required
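One simple way to recompute the tree is to let the failed node's parent adopt its orphaned children. The sketch below shows that policy only as an illustration; the slides do not say which reconfiguration rule ARTL actually applies, and the function name and the parent-map representation are assumptions.

```python
def remove_node(parent, failed):
    """Return a new parent map without `failed`.

    Children of the failed node are re-attached to the failed node's parent.
    This adoption rule is an assumed policy, not necessarily the one ARTL uses.
    """
    if parent[failed] is None:
        raise ValueError("this sketch does not handle failure of the master")
    new_parent = {}
    for node, p in parent.items():
        if node == failed:
            continue
        new_parent[node] = parent[failed] if p == failed else p
    return new_parent


# Assumed eight-node tree rooted at the master, node 1.
parent = {1: None, 5: 1, 3: 1, 2: 1, 7: 5, 6: 5, 4: 3, 8: 7}

after_leaf_failure = remove_node(parent, 2)      # leaf: nobody else moves
after_interior_failure = remove_node(parent, 5)  # 7 and 6 move under node 1
```

The leaf case leaves every other link untouched, which matches the observation that only intermediate failures require reconfiguration.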
Node Restart
• Restarted node contacts master to tell it about the restart
• Master sends it the current state of the network and the shared object(s)
• All nodes are informed of a node restart
• Master recomputes tree and informs the new node’s parent about its new child
• Parent and child establish connections
SysMon - A System monitor
• Monitors the load average from /proc
• Displays Min, Max and average loads
• Per-node load is also displayed
• ARTL Reduction operations: Min, Max and Average
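Min and Max can be reduced pairwise at any node, but an average cannot be combined from averages alone; carrying a running sum and count up the tree is one way around that. The following is a minimal sketch with hypothetical function names, assuming the load is taken from the first field of `/proc/loadavg` (the 1-minute average); it is not SysMon's actual code.

```python
from functools import reduce


def parse_loadavg(text):
    """Return the 1-minute load average from the contents of /proc/loadavg."""
    return float(text.split()[0])


def summarise(load):
    """Partial result for a single node: (min, max, sum, count)."""
    return (load, load, load, 1)


def combine(a, b):
    """Merge two partial results; safe to apply at any interior node."""
    return (min(a[0], b[0]), max(a[1], b[1]), a[2] + b[2], a[3] + b[3])


def finish(summary):
    """Turn the final partial result into the values the monitor displays."""
    return {"min": summary[0], "max": summary[1], "avg": summary[2] / summary[3]}


# Made-up loads for three nodes, for illustration only.
partials = [summarise(load) for load in (0.5, 1.5, 1.0)]
display = finish(reduce(combine, partials))
```

Because `combine` is associative, the result does not depend on how the tree groups the nodes.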
SysMon - A System monitor
• Node failures are detected and SysMon pops up an alert
File Transfer Application
• Transfers a file from master to all clients
• File can be executed at clients (if required)
  – execution can start immediately on receiving the file
  – execution can be delayed until all nodes have received the file
File Transfer Performance
[Chart]
Total Startup Time vs Number of Nodes
[Chart]
Client processes started using ssh on different machines
Conclusions and Extensions
• A middleware for dynamic operations
• Support for crash detection, recovery and dynamic processes
• Demonstrated near optimal speedup using real applications
• Making response function dynamic - active services
• Differential scheduling in thread scheduler for QoS
• Making dynamic code secure