Peer-to-Peer Architectures, Grid Infrastructures, and Service-oriented Architectures for Digital Libraries
Heiko Schuldt
Information & Software Engineering, UMIT, Hall in Tyrol, Austria
16.04.2005
Joint work with
• Donatella Castelli (CNR, Pisa)
• Gerhard Weikum (MPI, Saarbrücken)
• and other DELOS WP1 partners
[Footer on every slide: 16.04.2005, P2P, Grid, and SoA for Digital Libraries, Heiko Schuldt]

Digital Libraries – The Past
• Isolated and/or monolithic systems
• Proprietary interfaces
• Access limited to the content of a single provider
• …
[Figure: DL Management System, DL Content]

Digital Libraries – The Future
• Self-contained services rather than monolithic systems / DLs
• Flexible access to services and content across DLs / content providers
[Figure: DL Services, DL Management System, DL Content]

Outline
• Requirements of Next-Generation Digital Libraries (or Dynamic Ubiquitous Knowledge Environments)
• Service-oriented Architectures (SoA)
• Peer-to-Peer Infrastructures (P2P)
• Grid Architectures
• P2P, Grid, and SoA for DL: Towards putting everything together

Example 1: The Virtual Campus of a University
• Advanced querying: find similar image
• Insertion of new project description
• Coordination: DL Search Index
  – Replicate project description
  – Make index up-to-date
[Figure: sample document "Rope Bridges" with the text: "A special feature is that the cross beams, which are fitted with specially shaped cantilever arms of variable inclination, rest directly on the suspension cables. Most striking is probably the low sag of the suspension cables, which at 2.30 m over the 144 m main span … the traditional sag-to-span …"]
[Figure labels: Observe Changes, Update Index, Replicate Data, Extract Features; Inst. of Building Tech., Web Servers of Research Groups; Report Repository (University Research Reports); Image Database (University Library); ...]

Example 2: Health Monitoring and eHealth Digital Libraries …
[Figure labels: Patient with sensors and mobile device (blood pressure, ECG, oxygen saturation, activity, context); Preprocessing (ECG variability; has person fallen?; oven, light off); Home PC (correlation and processing); Healthcare provider (processing historical data, analysis & prognosis, electronic patient record); Call emergency physicians / neighbors; "eInclusion"]

… Example 2: Health Monitoring and eHealth Digital Libraries …
• Information is physically distributed
• Virtual integration
• Preservation of privacy
[Figure labels: Electronic Patient Record; Family doctor, Laboratory, Hospital, Former family doctor]

… Example 2: Health Monitoring and eHealth Digital Libraries
• Similarity search in virtual digital health records
  – e.g., multi-object multi-feature query
  – Query by example
  – Relevance feedback
[Figure labels: examination, "Similar?"]
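
A minimal sketch of what a multi-object, multi-feature query by example could look like. Everything here is an assumption made for illustration: the record names, the feature names, the Euclidean distance, the weighted sum over features, and taking the minimum over the example objects are not taken from the slides. Relevance feedback would correspond to adjusting the weights between rounds.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def score(candidate, query_objects, weights):
    """Distance of one candidate to a query made of several example objects.

    Per example object, the feature distances are combined as a weighted sum;
    the candidate is scored by its closest example (assumed aggregation).
    """
    per_object = [
        sum(w * euclidean(candidate[f], q[f]) for f, w in weights.items())
        for q in query_objects
    ]
    return min(per_object)

def search(collection, query_objects, weights, k=2):
    """Return the names of the k records closest to the query."""
    ranked = sorted(collection, key=lambda name: score(collection[name], query_objects, weights))
    return ranked[:k]

# Hypothetical records with two features each (values are made up).
RECORDS = {
    "a": {"ecg": [0.0, 0.0], "pressure": [0.0]},
    "b": {"ecg": [1.0, 0.0], "pressure": [0.0]},
    "c": {"ecg": [5.0, 5.0], "pressure": [3.0]},
}
# Two example objects form one query.
QUERY = [
    {"ecg": [0.0, 0.0], "pressure": [0.0]},
    {"ecg": [5.0, 5.0], "pressure": [3.0]},
]
WEIGHTS = {"ecg": 1.0, "pressure": 1.0}

print(search(RECORDS, QUERY, WEIGHTS, k=2))
```

Records "a" and "c" each coincide with one of the two examples, so they rank ahead of "b".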

Requirements …
• Specialized services, local to a content provider
  – Search
    • Different media types
    • Content-based
    • Multi-object, multi-feature
    • Relevance feedback
    • …
  – Indexing
  – Annotation
  – Metadata management
  – Content management
  – Resource management
  – …

… Requirements …
• Virtual DL (virtual collection)
• Management of services which are
  – Distributed
  – Heterogeneous
  – Autonomous

… Requirements …
• Specialized services, across different providers
  – Search
    • Different media types
    • Content-based
    • Multi-object, multi-feature
    • Relevance feedback
    • …
  – Metadata management
  – Content management
  – Resource management
  – Indexing without central control (no censorship)

… Requirements …
• Management of computationally intensive services
  – Scale-out
  – Load balancing
• Composition of services
  – Defining complex services / processes / workflows on the basis of existing services
  – Flexibility: automatic adaptation

… Requirements …
• Notification of changes
• Guaranteed consistency of derived data
• Personalization
• Visualization
• Access anywhere
  – Mobile devices
• Context- and location-aware services
• Authentication & authorization

… Requirements …
• High degree of availability: access anytime
  – Replication
• High degree of dependability / reliability
  – Systems their users can count on
• High degree of scalability

… Requirements
• Continuously generated data
  – Sensor networks
  – Sensor data streams (hardware / software sensors)
• Monitoring of users 24 hours a day
• Preservation of privacy

Service-oriented Architectures
• Self-contained functional units of applications encapsulated as services
• Interaction between services via well-defined interfaces
• Implementation details are hidden behind the service interface
• Examples of service-oriented architectures:
  – DCOM
  – CORBA
  – Message-oriented middleware (e.g., MQSeries, …)
  – Web services

Web Services
• Services to be invoked in a uniform way, independent of their implementation details (programming language, platform, etc.)
• Use common protocols (HTTP) for transport and XML for data representation
  – SOAP (Simple Object Access Protocol) for message exchange
  – WSDL (Web Services Description Language) for service description & discovery
  – UDDI (Universal Description, Discovery, and Integration) as service registry
  – BPEL4WS (Business Process Execution Language for Web Services) for the definition of composite services
  – And many other (more or less accepted) standards in the web service stack

Web Services: Overview
[Figure: web services accessed over HTTP]
• www.myDL.com (DL Services): accessMetaData, accessDLObjects
• www.stockquoteserver.com (Stock Data): GetListOfStocks, GetLastTradePrice
• www.pizza.at (Pizza Service): Order

SOAP: Architecture
• SOAP spec: www.w3.org/TR/soap/
• Unidirectional information exchange
• No common distributed object management infrastructure needed
• Encapsulation of remote calls
[Figure: the SOAP client sends a request document (XML document with the method call embedded, here GetLastTradePrice) via HTTP, ... to the SOAP server at www.stockquoteserver.com; the server unpacks the request document, performs the method call on the stock data, and returns a response document (XML document with the result)]
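
A rough sketch of the request/response exchange on this slide, using only the Python standard library. The envelope namespace is the SOAP 1.1 one; the application namespace, the `symbol` and `price` element names, and the price table are assumptions made for illustration. The HTTP transport is left out: the request string is handed to the server function directly.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.org/stockquote"               # hypothetical service namespace

PRICES = {"DIS": 34.5}  # made-up stock data held by the server

def build_request(symbol):
    """Client side: embed the method call in an XML request document."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}GetLastTradePrice")
    ET.SubElement(call, f"{{{APP_NS}}}symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")

def handle_request(request_xml):
    """Server side: unpack the request, perform the call, wrap the result."""
    envelope = ET.fromstring(request_xml)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]  # first child of Body is the call
    symbol = call.find(f"{{{APP_NS}}}symbol").text
    response = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(response, f"{{{SOAP_NS}}}Body")
    result = ET.SubElement(body, f"{{{APP_NS}}}GetLastTradePriceResponse")
    ET.SubElement(result, f"{{{APP_NS}}}price").text = str(PRICES[symbol])
    return ET.tostring(response, encoding="unicode")

def read_price(response_xml):
    """Client side: extract the result from the response document."""
    envelope = ET.fromstring(response_xml)
    path = (f"{{{SOAP_NS}}}Body/{{{APP_NS}}}GetLastTradePriceResponse/"
            f"{{{APP_NS}}}price")
    return float(envelope.find(path).text)

request = build_request("DIS")
print(read_price(handle_request(request)))
```

In a real deployment the request document would travel as the body of an HTTP POST; neither side needs to share an object infrastructure, only the XML message format.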

WSDL: Architecture
• WSDL spec: www.w3.org/TR/wsdl/
• Implementation-independent description of
  – methods of web services and
  – their interfaces
[Figure: the client sends a SOAP request document via HTTP, ... with the query "Services of the StockQuoteServer" to the WSDL server and receives a WSDL document (XML document with the method specification, e.g. GetLastTradePrice); the SOAP server at www.stockquoteserver.com provides the stock data]
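
To illustrate what "description of methods and their interfaces" means in practice, here is a heavily abbreviated WSDL 1.1 document and a small reader for it. The WSDL namespace is the standard WSDL 1.1 one; the target namespace, message names, and port type name are assumptions, and a complete WSDL document would also contain types, bindings, and a service element.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Abbreviated, hypothetical description of the stock quote service.
WSDL_DOC = """<definitions name="StockQuote"
    targetNamespace="http://example.org/stockquote.wsdl"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://example.org/stockquote.wsdl">
  <message name="GetLastTradePriceInput"/>
  <message name="GetLastTradePriceOutput"/>
  <portType name="StockQuotePortType">
    <operation name="GetLastTradePrice">
      <input message="tns:GetLastTradePriceInput"/>
      <output message="tns:GetLastTradePriceOutput"/>
    </operation>
  </portType>
</definitions>"""

def list_operations(wsdl_xml):
    """Map each operation name to its (input message, output message)."""
    root = ET.fromstring(wsdl_xml)
    operations = {}
    for port_type in root.findall(f"{{{WSDL_NS}}}portType"):
        for op in port_type.findall(f"{{{WSDL_NS}}}operation"):
            inp = op.find(f"{{{WSDL_NS}}}input")
            out = op.find(f"{{{WSDL_NS}}}output")
            operations[op.get("name")] = (inp.get("message"), out.get("message"))
    return operations

print(list_operations(WSDL_DOC))
```

A client can read such a description without knowing anything about how the server implements the operation.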

UDDI: Architecture & Usage
• UDDI: www.uddi.org
[Figure: the client queries a UDDI repository via HTTP for "stock" services and receives www.nyse.com and www.stockquoteserver.com; UDDI repositories exchange entries with each other; the service at www.stockquoteserver.com (WSDL server, SOAP server, stock data) registers itself with a repository ("Register Service")]
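
The register/query cycle can be sketched with a toy in-memory registry. This is only a stand-in to show the interaction pattern; it is not the UDDI API, and the class name, the keyword matching, and the WSDL URLs are assumptions.

```python
class ServiceRegistry:
    """Toy stand-in for a service registry (not the UDDI data model)."""

    def __init__(self):
        self._entries = []

    def register(self, provider, keywords, wsdl_url):
        """A provider announces its service together with a pointer to its description."""
        self._entries.append(
            {"provider": provider, "keywords": set(keywords), "wsdl": wsdl_url}
        )

    def find(self, keyword):
        """A client asks for all providers registered under a keyword."""
        return [e["provider"] for e in self._entries if keyword in e["keywords"]]

    def description_url(self, provider):
        """Return the registered description URL of a provider, or None."""
        for entry in self._entries:
            if entry["provider"] == provider:
                return entry["wsdl"]
        return None

registry = ServiceRegistry()
# Hypothetical description URLs; the slide names only the hosts.
registry.register("www.nyse.com", ["stock"], "http://www.nyse.com/service.wsdl")
registry.register("www.stockquoteserver.com", ["stock", "quote"],
                  "http://www.stockquoteserver.com/service.wsdl")
registry.register("www.pizza.at", ["pizza"], "http://www.pizza.at/service.wsdl")

print(registry.find("stock"))
```

After the lookup, the client would fetch the service description from the returned provider and then call the service itself.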

BPEL4WS
• BPEL4WS: Business Process Execution Language for Web Services
• Specification of business process behavior based on web services
  – Executable business processes: model the actual behavior of a participant in a business interaction
  – Business protocols (abstract processes): specify the mutually visible message exchange behavior of each of the parties involved in the protocol, without revealing their internal behavior
[Figure labels: Life Sciences Data; Probe, Mass spectrograph, Meta data, Analysis, Visualization; …]
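
The idea of a composite service built from existing services can be sketched as a simple sequential pipeline. This is plain Python, not BPEL (which is an XML language); the step names follow the labels of the life sciences figure, while the function bodies and all values are placeholders invented for illustration.

```python
from functools import reduce

def compose(*services):
    """Build a composite service that calls the given services in sequence."""
    def composite(data):
        return reduce(lambda value, service: service(value), services, data)
    return composite

# Placeholder services; in a real setting each would be a web service call.
def mass_spectrograph(probe):
    return {"probe": probe, "spectrum": [1.0, 3.0, 2.0]}  # made-up measurement

def add_metadata(measurement):
    return {**measurement, "meta": {"source": "demo"}}

def analyze(measurement):
    return {**measurement, "peak": max(measurement["spectrum"])}

def visualize(measurement):
    return f"{measurement['probe']}: peak={measurement['peak']}"

process = compose(mass_spectrograph, add_metadata, analyze, visualize)
print(process("probe-1"))
```

A process language adds what this sketch lacks: branching, parallel steps, fault handling, and a description of the message exchange visible to partners.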

The “Rest” of the Web Service Stack
• WS-Discovery
• WS-Topics
• WS-Interoperability
• WS-AtomicTransaction
• WS-Inspection Language
• WS-Coordination
• WS-Manageability
• WS-BusinessActivity
• WS-Addressing
• WS-TransactionManagement
• WS-MessageDelivery
• WS-Security
• WS-Routing
• WS-Trust
• WS-Reliability
• WS-Federation
• WS-Eventing
• WS-Choreography
• WS-CompositeApplicationFramework
• WS-ConversationLanguage
• WS-Notification (WS-BaseNotification / WS-BrokeredNotification)
• WS-Context
• …

SoA and Future Digital Libraries
• SoA is the core backbone of DUKEs
• Definition, invocation, description of services
• Service directories
• Basic service composition

Peer-to-Peer Architectures …
Main goals and features:
  – Decentralization
  – Sharing of distributed resources
  – Autonomy
  – Self-organization and autonomic behavior
• Example: Napster, featuring a central index
[Figure: the requesting peer sends a query to the Napster server and receives a list of serving peers; it then requests the file from a serving peer, which delivers it (resource sharing)]
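
A minimal sketch of the central-index pattern shown for Napster: the index only answers "who has the file", while the transfer itself happens between peers. Class names, method names, and the sample data are assumptions for illustration and do not reflect the actual Napster protocol.

```python
class CentralIndex:
    """Central server that maps file names to the peers serving them."""

    def __init__(self):
        self._index = {}

    def publish(self, peer_name, filenames):
        for name in filenames:
            self._index.setdefault(name, []).append(peer_name)

    def query(self, filename):
        """Return the list of serving peers (empty if the file is unknown)."""
        return list(self._index.get(filename, []))

class Peer:
    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)  # file name -> content

    def deliver(self, filename):
        return self.files[filename]

def download(index, peers, filename):
    """Ask the index first, then fetch directly from the first serving peer."""
    serving = index.query(filename)
    if not serving:
        return None
    return peers[serving[0]].deliver(filename)

peers = {
    "p1": Peer("p1", {"song.mp3": b"data-from-p1"}),
    "p2": Peer("p2", {"song.mp3": b"data-from-p2", "talk.pdf": b"slides"}),
}
index = CentralIndex()
for peer in peers.values():
    index.publish(peer.name, peer.files)

print(index.query("song.mp3"))
print(download(index, peers, "talk.pdf"))
```

The index is a single point of control and failure, which is the weakness that fully distributed systems try to avoid.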

… Peer-to-Peer Architectures …
• Example: Gnutella
  – Completely distributed
  – Uses flooding algorithm
[Figure labels: Query, Forwarding, Delivery, Serving Peer]
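
Flooding can be sketched as a breadth-first spread of the query that stops after a fixed number of hops. This is a simplified model, assuming a hop limit (TTL) and duplicate suppression; the network layout and file placement are invented, and the real Gnutella message format is not modelled.

```python
def flood(network, files, start, filename, ttl):
    """Return the peers holding `filename` that are reachable within `ttl` hops.

    network: peer -> list of neighbouring peers
    files:   peer -> set of file names stored at that peer
    """
    seen = {start}          # peers that already received the query
    frontier = [start]      # peers that forward the query in this round
    hits = set()
    while frontier and ttl > 0:
        next_frontier = []
        for peer in frontier:
            for neighbor in network[peer]:
                if neighbor in seen:
                    continue  # duplicate query is dropped
                seen.add(neighbor)
                if filename in files.get(neighbor, ()):
                    hits.add(neighbor)
                next_frontier.append(neighbor)
        frontier = next_frontier
        ttl -= 1  # one hop consumed
    return hits

# Hypothetical overlay network; only peer E stores the requested file.
NETWORK = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
FILES = {"E": {"song.mp3"}}

print(flood(NETWORK, FILES, "A", "song.mp3", ttl=2))
print(flood(NETWORK, FILES, "A", "song.mp3", ttl=3))
```

With a hop limit of 2 the query never reaches E; with 3 it does. The price of having no central index is that every query touches many peers.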

… Peer-to-Peer Architectures …
• 1st generation: focus on MP3 sharing. But also:
• Computer resources
  – Processors
  – Memory
  – Disks, …
• Intellectual resources like
  – User annotations
  – Recommendations
  – …
• Other applications
  – Collaborative authoring
  – Groupware
  – Publish-subscribe applications, …

… Peer-to-Peer Architectures
Next-generation P2P systems:
• Distribute index over a larger number of (super-)peers
• Reduce number of messages by
  – Appropriate routing protocols
  – Efficient localization of data objects
• Apply strategies for
  – Load balancing
  – Failure resilience
  – Replication
  – Self-organization
• Other aspects
  – Trust
  – Privacy
  – Anonymity

Examples for P2P Indexing
• Scalable & self-organizing distributed hash tables (DHTs), e.g.
  – Pastry
  – Tapestry
  – CAN
  – CHORD
  – …
• m-bit identifier assigned to each peer
• Peers organized on a ring (mod 2^m)
• Key k assigned to the first node whose identifier is equal to or follows k
• Reassignment (peer joins or leaves) just considers successor
• Each peer maintains a partial local routing table (finger table)
[Figure: ring with peers 1, 8, 14, 21, 32, 38, 42, 48, 51, 56 and keys 10, 24, 30, 38, 54. Finger table of peer 8: 8+1 → 14, 8+2 → 14, 8+4 → 14, 8+8 → 21, 8+16 → 32, 8+32 → 42]
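
The key assignment and the finger table from the ring figure can be reproduced with a few lines. This sketch uses the peer identifiers and keys visible on the slide and assumes m = 6 (identifiers 0 to 63), which is consistent with the largest finger offset 8 + 32; it uses global knowledge of all peers, which a real DHT peer does not have.

```python
from bisect import bisect_left

M = 6  # assumed identifier length in bits, so the ring has 2**6 = 64 positions
PEERS = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # peer identifiers from the figure

def successor(ident):
    """First peer whose identifier is equal to or follows `ident` on the ring."""
    ident %= 2 ** M
    i = bisect_left(PEERS, ident)
    return PEERS[i % len(PEERS)]  # wrap around to the smallest peer

def finger_table(peer):
    """Entry i points to the successor of peer + 2**i (mod 2**M)."""
    return [successor(peer + 2 ** i) for i in range(M)]

# Keys from the figure and the peers responsible for them.
for key in (10, 24, 30, 38, 54):
    print(f"key {key} -> peer {successor(key)}")

print(finger_table(8))
```

The finger table of peer 8 comes out as 14, 14, 14, 21, 32, 42, matching the table on the slide. Because the entries cover exponentially growing distances, a lookup can roughly halve the remaining distance to the target with each hop.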

P2P and Future Digital Libraries
• Complete decentralization
  – No global control
  – No censorship
• Data and service management
• Indexing
• Self-organization …
  – … in the presence of many failures
  – … to address the high dynamics of large-scale DLs

Grid Infrastructure …
• Hardware and software infrastructure for coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations
• Generations of grid infrastructures
  – Computational grid
  – Data grid
  – Service grid

… Grid Infrastructure
• Key concepts: sharing of
  – Computing and storage devices (heterogeneous)
  – Data
  – Software and services
  – More generally: any networked resource that can be used remotely
• Goals:
  – Provide a high degree of scalability, even if components are geographically distributed (latency, etc.)
  – Provide a high degree of adaptability (failure handling, accounting for dynamics)
  – Support load balancing between components

Service Grid
• Open Grid Service Architecture (OGSA)
  – Encapsulation of resources as services
  – Providing grid technology to dynamically create, manage, discover, etc. these services
• WS-Resource Framework (WSRF)
  – Bringing web service technology to the grid
  – Creation, addressing, inspection, and lifetime management of stateful resources (WS-Resources)

Grid Infrastructures in Practice
• Globus Toolkit (GT): basic grid environment
  – GT4 to support OGSA and WSRF
• Condor (resource management)
• gLite
• … more to come in the afternoon …

Grid and Future Digital Libraries
• Efficient resource management
• Load balancing and scheduling
• Self-adaptability
  – Automatic service installation and deployment
• Availability
• Replication
• Authentication & authorization

SoA, P2P & Grid: How Does It Fit Together?
• These are far from orthogonal technologies
• Differences are diminishing more and more
  – e.g., grid services
• There is no unique recipe for …
  – how to construct a DL/DUKE
  – what base technology to use
• Strongly dependent on particular requirements
[Figure labels: Grid, P2P, SoA]

SoA, P2P & Grid: All DL-related Problems Solved?
• Several of the challenges can be addressed by SoA, P2P and/or the Grid
• But there is still some work left, e.g.,
  – Context- and location-aware services
  – Support for dependable and reliable systems
  – Continuous sensor data, monitoring of users
  – Guaranteed consistency of derived data
  – Support for mobile devices (switching between connected / disconnected mode), advanced replication and synchronization mechanisms
  – Complex queries across media types
  – …

Summary & Outlook
• SoA, P2P, and Grid are well suited as base technologies for DLs / DUKEs
• The trend to integrate and merge functionality will increase
• Work in DELOS will be continued by defining a Reference Model for Digital Library Management Systems