Tycho: A Resource Discovery and Messaging Framework for Distributed Applications
Matthew Grove
matthew.grove@port.ac.uk
Cluster 2006

Outline
• Introduction and motivation,
• The architecture of Tycho,
• Implementation details,
• Updated benchmarking results,
• Content distribution (Tycho swarm utility),
• Summary.

Introduction
• Tycho is a reference implementation of a combined extensible wide-area messaging framework with a built-in distributed registry:
  – The Tycho components are:
    • Mediators,
    • Clients (Producers and Consumers).
  – Tycho provides services that allow clients to discover each other using a Virtual Registry (VR) made up of a network of mediators; this also aids communication over both LAN and WAN.
• Tycho aims to simplify and speed up application development by freeing developers from the need to combine separate software packages to provide discovery and messaging services.
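
Sketch (not from the slides): the idea of clients discovering each other through a network of mediators can be illustrated with a toy in-memory registry. This is not Tycho's API. The class name `Mediator`, its methods, the producer name and the address are all hypothetical, and real mediators would talk over a network rather than call each other directly.

```python
# Hypothetical illustration only: none of these names come from Tycho.
class Mediator:
    """Keeps a local store of producer records and a list of peer mediators."""

    def __init__(self, name):
        self.name = name
        self.records = {}   # producer name -> address (assumed record shape)
        self.peers = []     # other mediators that make up the registry

    def register(self, producer, address):
        # A producer advertises itself to its local mediator.
        self.records[producer] = address

    def lookup(self, producer, _seen=None):
        # Answer from the local store if possible, otherwise ask peers.
        # The _seen set stops a query from looping around the peer graph.
        seen = _seen if _seen is not None else set()
        seen.add(self.name)
        if producer in self.records:
            return self.records[producer]
        for peer in self.peers:
            if peer.name not in seen:
                found = peer.lookup(producer, seen)
                if found is not None:
                    return found
        return None


lan = Mediator("lan-mediator")
wan = Mediator("wan-mediator")
lan.peers.append(wan)
wan.peers.append(lan)

# A producer registers with one mediator; a consumer asks a different one.
wan.register("webcam-7", "http://example.org:8080/webcam-7")
address = lan.lookup("webcam-7")
```

The consumer only needs to know its local mediator; the query is forwarded until some mediator holds the record or every peer has been asked.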

Tycho’s Architecture

General Design Philosophy
• Reuse existing software components, if possible, rather than reinvent services or functionality.
• Try to make use of existing software infrastructure.
• Make Tycho simple to install, configure and use.
• Provide a ‘basic release’ with the ability to extend functionality with further, more sophisticated components (Tycho utilities).
• Java was used for portability and interoperability with other distributed systems, plus rapid development.

The Two Parts of Tycho
• Messaging:
  – Secure asynchronous communications between Consumers, Producers and Mediators.
• Virtual Registry:
  – Bootstrapping: allows mediators to discover each other and form the VR with minimal hardwiring.
  – Communications: secure routing of queries between Virtual Registries.
  – Caching: keeps a temporary local copy of some information to reduce the amount of communication between peers.
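
Sketch (not from the slides): the caching bullet describes keeping a temporary local copy to avoid repeated traffic between peers. One common way to do that is a time-to-live cache, shown below. The slides do not say how Tycho expires cached entries, so the TTL policy, the class name `TTLCache` and the `ask_peer` function are assumptions made for illustration.

```python
import time


class TTLCache:
    """Hypothetical time-to-live cache for query results."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so expiry can be tested
        self._entries = {}      # query -> (expiry time, result)

    def get(self, query, fetch):
        now = self.clock()
        hit = self._entries.get(query)
        if hit is not None and hit[0] > now:
            return hit[1]                      # still fresh: no remote call
        result = fetch(query)                  # missing or stale: ask the peer
        self._entries[query] = (now + self.ttl, result)
        return result


remote_calls = []


def ask_peer(query):
    # Stand-in for a remote query to another mediator.
    remote_calls.append(query)
    return f"result for {query}"


cache = TTLCache(ttl_seconds=30)
first = cache.get("SELECT * FROM producers", ask_peer)
second = cache.get("SELECT * FROM producers", ask_peer)   # served locally
```

The second call returns the stored copy, so only one remote query is made until the entry expires.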

Tycho Mediator Implementation
• Tycho provides a choice of implementations for each core service.

Tycho Clients & Utilities
• The Tycho Connector provides the API for building producers and consumers.
• Extra functionality can be added as utilities.

Tycho Core Services
• Transport handler: allows different protocols to be used for communications (HTTP(S), Sockets, IRC).
• Local store: holds mediator and VR information (JDBC, Java simple store).
• Boot service: used by the VR within a mediator to locate and join the rest of the VR (HTTP(S), IRC).
• Query parser and result annotator: support different query and markup languages (SQL, LDIF).
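
Sketch (not from the slides): a service with interchangeable implementations, such as the transport handler, is often structured as a common interface plus a lookup keyed by protocol. The example below shows that pattern only. `TransportHandler`, `LoopbackHandler` and `Dispatcher` are hypothetical names, the handler just records what it is asked to send, and none of this reflects Tycho's actual Java classes.

```python
from abc import ABC, abstractmethod
from urllib.parse import urlsplit


class TransportHandler(ABC):
    """Hypothetical common interface for all transports."""

    @abstractmethod
    def send(self, address, payload):
        """Deliver payload to address and return the number of bytes sent."""


class LoopbackHandler(TransportHandler):
    """Stand-in transport that only records what it was asked to send."""

    def __init__(self):
        self.sent = []

    def send(self, address, payload):
        self.sent.append((address, payload))
        return len(payload)


class Dispatcher:
    """Chooses a handler from the URL scheme of the destination address."""

    def __init__(self):
        self._handlers = {}

    def register(self, scheme, handler):
        self._handlers[scheme] = handler

    def send(self, address, payload):
        scheme = urlsplit(address).scheme
        try:
            handler = self._handlers[scheme]
        except KeyError:
            raise ValueError(f"no transport handler for scheme {scheme!r}") from None
        return handler.send(address, payload)


http_like = LoopbackHandler()
irc_like = LoopbackHandler()
dispatcher = Dispatcher()
dispatcher.register("http", http_like)
dispatcher.register("irc", irc_like)

sent_bytes = dispatcher.send("irc://irc.example.org/tycho", b"hello")
```

Adding a protocol then means registering one more handler; callers keep using the same `send` call.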

Example Tycho Setup

Tycho Benchmarks
• Three rounds of benchmarking were performed:
  – Communications (A): measured the performance of inter-client and inter-mediator messaging for Tycho and NaradaBrokering.
  – Virtual Registry tests (B): measured and compared the performance of the Tycho VR with MDS4 and R-GMA.
  – Component tests (C): different components of the VR were tested in various configurations; these tests are discussed elsewhere.

Communications - Latency
• The latency of communication for LAN and simulated WAN messaging was measured.
• The tests used two clients with varying message sizes (ping-pong tests).
• An eight-node cluster was used to run the tests.
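
Sketch (not from the slides): a ping-pong test sends a message to a peer, waits for it to be echoed back, and halves the round-trip time to estimate one-way latency. The example below shows the method on a local socket pair inside one process. It is not the benchmark harness used for these results, and the function names, message sizes and round count are assumptions.

```python
import socket
import threading
import time


def recv_exact(sock, size):
    # recv() may return fewer bytes than requested, so loop until complete.
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo(sock, size, rounds):
    # The "pong" side: return every message unchanged.
    for _ in range(rounds):
        sock.sendall(recv_exact(sock, size))


def pingpong(size, rounds=100):
    a, b = socket.socketpair()
    with a, b:
        worker = threading.Thread(target=echo, args=(b, size, rounds))
        worker.start()
        payload = b"x" * size
        start = time.perf_counter()
        for _ in range(rounds):
            a.sendall(payload)      # ping
            recv_exact(a, size)     # pong
        elapsed = time.perf_counter() - start
        worker.join()
    # Each round is one round trip, so halve it for a one-way estimate.
    return elapsed / (2 * rounds)


# Assumed message sizes in bytes; the slides do not list the sizes used.
results = {size: pingpong(size) for size in (1, 1024, 65536)}
```

Repeating the loop over a range of message sizes gives latency at the small end and an indication of bandwidth at the large end.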

Communication Tests - Summary
• Tycho has lower latency and higher bandwidth than NaradaBrokering in all the tests.
• With respect to scalability of producers and consumers, when either system is saturated its performance remains stable under heavy load; however:
  – NaradaBrokering needs the JVM heap size to be increased as the number of clients increases (due to internal buffers):
    • Tycho used the default heap for all of the tests.

Virtual Registry Tests (B)
• Two tests were used to measure aspects of the performance of Tycho’s VR, MDS4 and R-GMA:
  1. Number of records in a registry (100,000 records),
  2. Number of simultaneous client queries (1000 clients).
• The tests were repeated with two different queries:
  – (S1) a single random record was selected,
  – (S2) all of the records were selected (worst-case scenario).
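
Sketch (not from the slides): the difference between the two query shapes can be seen with any SQL store. The example below uses Python's built-in `sqlite3` as a stand-in; it does not exercise Tycho, MDS4 or R-GMA. The table name, schema and reduced record count are assumptions, and the slides do not give the actual query text.

```python
import random
import sqlite3
import time

RECORDS = 10_000   # reduced from the 100,000 used in the slides, for a quick run

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT)")  # assumed schema
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    ((i, f"producer-{i}") for i in range(RECORDS)),
)


def timed(sql, params=()):
    # Run one query and return its rows plus the elapsed wall-clock time.
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start


# S1: select a single random record.
s1_rows, s1_seconds = timed(
    "SELECT * FROM records WHERE id = ?", (random.randrange(RECORDS),)
)

# S2: select every record (the worst case for response size).
s2_rows, s2_seconds = timed("SELECT * FROM records")
```

S1 mostly measures lookup overhead, while S2 forces the registry to build and return a response proportional to its size, which is why it is treated as the worst case.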

VR Tests - Records (S1)
[Plot not preserved; annotation: MDS4 out of memory]

VR Tests - Records (S2)
[Plot not preserved; annotation: MDS4 out of memory]

VR Tests - Clients (S1)
[Plot not preserved; annotation: R-GMA out of memory]

Summary of VR Tests
• Tycho has better performance and client scalability than both R-GMA and MDS4.
• The heap for R-GMA and MDS4 had to be set to 1.5 Gbytes (the maximum we could set) to carry out the tests.
• Memory management in Java is an issue:
  – Without bounded buffering or flow control, the Java heap can be exhausted.
• Storing information internally using XML seems to be a source of some of these memory problems:
  – Java database solutions such as HSQLDB can provide a high-performance solution for off-loading some of the storage requirements to disk.
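
Sketch (not from the slides): the buffering point applies to any runtime. If a sender can queue messages faster than the receiver drains them, an unbounded buffer grows until memory runs out; a bounded buffer makes the sender wait instead. The example below shows this with Python's `queue.Queue`. The buffer size and message count are arbitrary assumptions, and this is not how Tycho or the other systems implement buffering.

```python
import queue
import threading

BUFFER_SLOTS = 8                       # assumed limit, chosen for illustration
buffer = queue.Queue(maxsize=BUFFER_SLOTS)
received = []
high_water = 0                         # largest backlog observed by the sender


def consumer():
    while True:
        item = buffer.get()
        if item is None:               # sentinel: no more messages
            break
        received.append(item)


worker = threading.Thread(target=consumer)
worker.start()

for message in range(1000):
    # put() blocks while the buffer is full, which throttles the sender
    # to the receiver's pace instead of letting the backlog grow.
    buffer.put(message)
    high_water = max(high_water, buffer.qsize())

buffer.put(None)
worker.join()
```

However fast the sender runs, the backlog never exceeds `BUFFER_SLOTS`, so memory use stays bounded.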

Tycho Core – Future Work
• Some performance improvements:
  – Caching of local mediator queries to reduce response times,
  – Use of a hybrid VR interconnect that uses IRC for query routing and HTTP for transporting large responses.
• Additional functionality can be added to provide advanced services:
  – WS-based transport handlers for interoperability.

Tycho Applications
• We developed a number of applications to further validate the implementation.
• These include:
  – Demonstrations of publishing and discovering distributed webcams,
  – Remote resource discovery for the VOTechBroker project:
    • Part of the European Virtual Observatory project; Tycho provides automatic resource discovery for job submission.
  – Binding together the components of the Semantic Log Analyser (Slogger) project:
    • Here Tycho helps locate and gather distributed logs for analysis.

The Tycho Swarm Utility
• The swarm utility is a tool for distributed content distribution.
• The utility was developed to test the potential of Tycho utilities and to further stress-test the overall infrastructure by:
  – Simultaneously utilising the VR and messaging functions,
  – Storing and updating thousands of records in the VR,
  – Sending thousands of multi-megabyte messages between clients.
• Its potential uses include:
  – Distributing files for collaboration purposes,
  – Staging data for computation,
  – Mirroring and managing large data sets.

Swarm Utility Overview
• The swarm utility provides distributed content distribution similar to BitTorrent.
• Content is split into ‘chunks’ and the VR is used to store chunk availability.
• Peers use the VR to locate each other and decide which chunks to download.
• Tycho messages are used to transfer the chunks between peers, and peers cooperate to distribute the content throughout the swarm.
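
Sketch (not from the slides): the chunking and chunk-selection steps can be shown in a few lines. The slides do not state which selection policy the swarm utility uses, so this example substitutes rarest-first, a policy commonly associated with BitTorrent-style swarms. The availability table stands in for what would be stored in the VR, and the peer names and functions are hypothetical.

```python
from collections import Counter


def split_into_chunks(data, chunk_size):
    # Fixed-size chunks; the final chunk may be shorter.
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def next_chunk(availability, me):
    """Pick the rarest chunk that `me` lacks, plus a peer that holds it.

    availability maps peer name -> set of chunk indices (assumed record shape).
    Returns (chunk index, peer name), or None when nothing is left to fetch.
    """
    have = availability.get(me, set())
    counts = Counter(
        index
        for peer, chunks in availability.items()
        if peer != me
        for index in chunks
        if index not in have
    )
    if not counts:
        return None
    # Rarest first; ties broken by lowest index so the choice is deterministic.
    rarest = min(counts, key=lambda index: (counts[index], index))
    holder = min(
        peer
        for peer, chunks in availability.items()
        if peer != me and rarest in chunks
    )
    return rarest, holder


chunks = split_into_chunks(b"abcdefghij", 4)
availability = {
    "seed": {0, 1, 2},
    "peer-a": {0, 1},
    "peer-b": {0},
}
choice = next_chunk(availability, "peer-b")
```

Here `peer-b` is missing chunks 1 and 2; chunk 2 is held by only one other peer, so it is fetched first, from `seed`.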

Swarm Utility Architecture

Summary
• The initial reference implementation of Tycho has been completed.
• It can be downloaded from:
  – http://dsg.port.ac.uk/projects/tycho/
• Both the messaging code and the VR have been benchmarked and perform well.
• The focus now is on developing Tycho utilities to provide more feature-rich functionality.

Webcam Browser Demo
http://dsg.port.ac.uk/projects/tycho/demos/web/

Links
• Project Web page:
  – http://dsg.port.ac.uk/projects/tycho/
• The DSG Web page:
  – http://dsg.port.ac.uk/
• The ACET Web page:
  – http://acet.port.ac.uk/