Services Paths
Fault-tolerant Paths
ISRG Retreat
Z. Morley Mao
zmao@cs.berkeley.edu
1/11/2000

Example path application: Jukebox/cell-phone application
• Ninja Jukebox:
  – service providing real-time streaming audio data from a collection of CDs in the network
• GSM cell-phone:
  – 12 kbps data, 13 kbps voice
  – communicates with BTS
[Figure: path diagram; labels: Jukebox, Path; legend: operator, connector]

What is a path?
• A way to compose services to create customizable complex services
• Goals:
  – composability
  – accessibility
  – availability, fault-tolerance
  – scalability
  – security

Overall path construction process
– a continuous optimization process with feedback:
• Logical Path Creation
• Physical Path Creation
• Path Instantiation, Execution, Maintenance
• Path Tear-down

Logical path creation: Path matching algorithm
• Formulated as shortest path graph search
  – Operators ===> edges
  – Data format/type ===> nodes
• Dijkstra's shortest path algorithm
  – O(V^2)
• Difficulty: expressing constraints and optimization variables
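
The graph formulation above can be sketched in a few lines. This is only an illustration, not the APC implementation: the operator names come from the transcoding examples elsewhere in this deck, while the edge costs, the `OPERATORS` table, and `find_logical_path` are hypothetical. The sketch uses a heap-based Dijkstra rather than the O(V^2) array variant the slide cites.

```python
import heapq

# Hypothetical operator catalog: (input format, output format, operator, cost).
# Data formats are graph nodes; operators are directed edges. Costs are made up.
OPERATORS = [
    ("text", "PCM", "text-to-speech", 5.0),
    ("mp3", "PCM", "mp3-to-PCM", 2.0),
    ("PCM", "GSM", "PCM-to-GSM", 1.0),
]


def find_logical_path(src_format, dst_format, operators=OPERATORS):
    """Return (total_cost, [operator names]) for the cheapest chain, or None."""
    graph = {}
    for src, dst, name, cost in operators:
        graph.setdefault(src, []).append((dst, name, cost))

    best = {src_format: 0.0}
    heap = [(0.0, src_format, [])]
    while heap:
        cost, fmt, chain = heapq.heappop(heap)
        if fmt == dst_format:
            return cost, chain
        if cost > best.get(fmt, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for nxt, name, edge_cost in graph.get(fmt, []):
            new_cost = cost + edge_cost
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt, chain + [name]))
    return None  # no chain of operators converts src_format to dst_format
```

For the Jukebox/cell-phone case, `find_logical_path("mp3", "GSM")` chains the two transcoders. Constraints such as delay bounds do not fit naturally into a single scalar edge cost, which is the difficulty the slide points out.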

Path maintenance: Partial Path Repair (PPR)
• APC (Automatic Path Creation) service guarantees robustness and fault-tolerance
• Two ways of monitoring:
  – active checking of operator status
  – operators notify APC of neighboring operators' failure
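
A minimal sketch of the repair idea, assuming a path is an ordered list of operator instances and that "partial" means only the failed operators are replaced. The class, field, and function names are hypothetical and do not come from the APC code; only the active-checking style of monitoring is shown.

```python
from dataclasses import dataclass


@dataclass
class OperatorInstance:
    name: str           # operator type, e.g. "mp3-to-PCM"
    host: str           # where this instance runs (hypothetical field)
    alive: bool = True  # result of the last status check


def active_check(path):
    """Active monitoring: return indices of operators that fail a status check."""
    return [i for i, op in enumerate(path) if not op.alive]


def partial_path_repair(path, failed_indices, spare_hosts):
    """Replace only the failed operators; healthy ones stay in place."""
    spares = list(spare_hosts)
    repaired = list(path)
    for i in failed_indices:
        if not spares:
            raise RuntimeError("no spare host available for " + path[i].name)
        # Start a fresh instance of the same operator type on a spare host.
        repaired[i] = OperatorInstance(path[i].name, spares.pop(0))
    return repaired
```

The notification-based style would call `partial_path_repair` directly with the index reported by a neighboring operator instead of polling with `active_check`.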

Performance measurements (4 operators, Jukebox/cell-phone app)
• Logical/Physical path creation time: 264 ms
• Path instantiation time: 215 ms
  – operator instantiation: 70 ms
  – connector creation: 64 ms
  – start operator running: 81 ms
• Path recovery: one operator fails
  – Time to detect failure of operator: 2 ms
  – Time to repair one failed operator: 400 ms
• Path tear-down time: 289 ms
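
The figures above can be combined into a rough comparison. The sketch below assumes the stages run sequentially and that their times simply add; the slide does not state this, and the "full rebuild" total is a hypothetical baseline, not a measured result.

```python
# Figures from the measurements above, in milliseconds.
creation = 264                 # logical/physical path creation
instantiation = 70 + 64 + 81   # operator instantiation + connectors + start
detect, repair = 2, 400        # failure detection, repair of one operator
tear_down = 289

setup_total = creation + instantiation   # time until a new path is running
recovery_total = detect + repair         # partial path repair of one operator
# Assumption: a full rebuild is a tear-down followed by a fresh setup.
rebuild_total = tear_down + setup_total

print(instantiation, setup_total, recovery_total, rebuild_total)
# prints: 215 479 402 768
```

Under these assumptions the three instantiation components account for the full 215 ms, and repairing one operator costs roughly half of tearing the path down and building it again.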

Open design issues
• Wide area considerations
• Improved path reliability model
• Path performance modeling
• Path resource management framework
• Flexible path control
  – control path, path migration, dynamic adaptation
• Applications for paths
• Metrics for evaluation

Wide area path design
Hierarchical APC Service
[Figure: hierarchical APC service; labels: APC service, WAN, SAN]

Step-by-step WAN path creation for Jukebox/Cell-phone application
• End-user using cell-phone requests access to Jukebox service
• QoS needs: delay-sensitive, reliable service
• APC uses a graph search algorithm to find the logical path
• APC searches for the physical path
  1. Finds relevant parameters affecting QoS, determines the reliability model

Step-by-step WAN path creation for Jukebox/Cell-phone application (continued)
  2. Obtains resource information from resource management framework
  3. Uses queuing model to evaluate choices
  4. APC selects the optimal choice
  5. APC dynamically adjusts the decision given feedback from the resource monitoring tools
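
Steps 3 and 4 might look like the sketch below. The slides do not say which queuing model is used; an M/M/1 queue per operator host is assumed here purely for illustration, with mean time in system 1/(mu - lambda). The candidate dictionaries, their keys, and all function names are hypothetical.

```python
def mm1_delay(arrival_rate, service_rate):
    """Mean time in system for an M/M/1 queue; infinite if overloaded."""
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)


def estimate_path_delay(candidate, arrival_rate):
    """Sum queuing delay at each operator host plus link latency (seconds)."""
    queuing = sum(mm1_delay(arrival_rate, mu) for mu in candidate["service_rates"])
    return queuing + sum(candidate["link_latencies"])


def select_physical_path(candidates, arrival_rate):
    """Pick the candidate with the lowest estimated end-to-end delay."""
    return min(candidates, key=lambda c: estimate_path_delay(c, arrival_rate))
```

Step 5 would then amount to calling `select_physical_path` again whenever the monitoring tools report new service rates or latencies.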

Operator placement decisions
• Depend on
  – operator computational requirement
  – software/hardware requirement
  – output/input properties
    • data location, data volume, delay-sensitivity, degradation properties
  – network characteristics
    • bandwidth, latency, jitter, packet loss
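
One way to turn the factors above into a placement filter is sketched below. It covers only a subset of them (computation, software/hardware features, delay-sensitivity, bandwidth, latency); the dataclasses, field names, units, and the 50 ms default threshold are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class OperatorSpec:
    name: str
    cpu_needed: float                            # computational requirement
    software: set = field(default_factory=set)   # required software/hardware
    delay_sensitive: bool = False


@dataclass
class Node:
    name: str
    cpu_free: float
    features: set
    latency_ms: float        # network latency to the previous hop
    bandwidth_kbps: float


def feasible_nodes(op, nodes, min_bandwidth_kbps, max_latency_ms=50.0):
    """Return nodes that can host `op`, lowest latency first."""
    ok = []
    for n in nodes:
        if n.cpu_free < op.cpu_needed:
            continue  # not enough compute
        if not op.software <= n.features:
            continue  # missing a required software/hardware feature
        if n.bandwidth_kbps < min_bandwidth_kbps:
            continue  # link too slow for the data volume
        if op.delay_sensitive and n.latency_ms > max_latency_ms:
            continue  # too far away for a delay-sensitive operator
        ok.append(n)
    return sorted(ok, key=lambda n: n.latency_ms)
```

For a GSM voice stream, `min_bandwidth_kbps` could be set to the 13 kbps voice rate quoted earlier in the deck.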

Path resource management framework
• Develop network monitoring tools
  – to obtain network statistics
• Available resources
  – computational, memory, network, etc.
• Make trade-offs due to interdependencies among resources
• Resources allocated on a per-path basis

Path resource management framework (continued)
• A high-level global hierarchical resource manager
• Local resource manager per SAN
• Runtime resource monitoring tools monitor/discover resource changes during the lifetime of paths
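
The two-level structure described above could be organized roughly as follows. This is a sketch under strong simplifications: a single resource type (CPU units), a "most free capacity" placement rule, and class and method names that are all hypothetical rather than part of the proposed framework.

```python
class LocalResourceManager:
    """Tracks free capacity inside one SAN and allocates it per path."""

    def __init__(self, san, cpu_free):
        self.san = san
        self.cpu_free = cpu_free
        self.allocations = {}   # path id -> cpu units held by that path

    def allocate(self, path_id, cpu):
        if cpu > self.cpu_free:
            return False
        self.cpu_free -= cpu
        self.allocations[path_id] = self.allocations.get(path_id, 0) + cpu
        return True

    def release(self, path_id):
        # Return everything the path held; a no-op for unknown paths.
        self.cpu_free += self.allocations.pop(path_id, 0)


class GlobalResourceManager:
    """Top of the hierarchy: delegates requests to the per-SAN managers."""

    def __init__(self, local_managers):
        self.locals = {m.san: m for m in local_managers}

    def allocate(self, path_id, cpu):
        """Try the SAN with the most free capacity; return its name or None."""
        best = max(self.locals.values(), key=lambda m: m.cpu_free)
        return best.san if best.allocate(path_id, cpu) else None

    def release(self, path_id):
        for m in self.locals.values():
            m.release(path_id)
```

Keying allocations by path id is what makes tear-down cheap: releasing a path returns all of its resources in one call.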

Applications for paths
• Operators:
  – content transcoding operators:
    • text-to-speech, mp3-to-PCM, PCM-to-GSM
    • web search tools, filtering, aggregation, personalization
    • Microsoft COM objects, existing Web services...
    • document conversion services
  – protocol translation operators:
    • serial socket, security transcoder, RMI Lite
[Figure labels: Any service, Any device]

Measurement metrics
• Path creation time
  – logical/physical path creation, instantiation, execution
• Scalability
  – number of paths created per amount of time
• Fault-recovery time
• Control, ease-of-use, programmability of paths
• Ease of transparent path migration, adaptation to resource changes

Conclusion
• Recent work:
  – APC prototype built for SAN with reasonable performance, Partial Path Repair
• Future work:
  – focus on WAN, scalable path design
  – WAN test plan: campus-wide millennium cluster
  – support for continuous path optimization and adaptation

For more information
• Please send comments/questions to Z. Morley Mao
  – zmao@cs.berkeley.edu
• Slides will be available at:
  – http://www.cs.berkeley.edu/~zmao/paths

Extra Slides

Flexible path control: control path
• Control path
  – Definition: the channel used to make changes to operators and connectors
    • independent of data path
    • highly-available, fault-tolerant
• Proposed design:
  – replicated control paths:
    • neighboring operators have control over each other
    • APC has complete control over localized operators

Flexible path control: path migration
• Two paths of different quality running
  – migrate from the fast-to-startup, lower-quality one to the slow-to-startup, higher-quality one
• Transparent migration of paths
  – dynamic fusion of operators
  – dynamic deletion, addition, replacement of operators/connectors
  – Goal: adapt to changes in resources and locations of end points

Wide area considerations
• Goal: scalable, network-partition-tolerant
• Proposed design:
  – replicated APC service instances
  – state of paths partitioned and replicated
  – operators are soft-state
  – continuous monitoring of operators and connectors by APC service instances
  – a few localized path components hooked together over wide area
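
The soft-state and continuous-monitoring points can be illustrated with a timer-based registry. This reads "soft-state" in the common networking sense (state that expires unless periodically refreshed), which is an interpretation rather than something the slides spell out; the class and its methods are hypothetical, and time is passed in explicitly to keep the sketch deterministic.

```python
class SoftStateRegistry:
    """Operator entries expire unless refreshed within `ttl` seconds."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.last_seen = {}   # operator id -> time of last refresh

    def refresh(self, operator_id, now):
        """Record a heartbeat (or a successful status check) for an operator."""
        self.last_seen[operator_id] = now

    def expire(self, now):
        """Drop and return operators whose state was not refreshed in time."""
        dead = [op for op, t in self.last_seen.items() if now - t > self.ttl]
        for op in dead:
            del self.last_seen[op]
        return dead
```

An APC service instance could call `expire` on each monitoring round and hand the returned operators to path repair; because entries simply time out, a replica that rejoins after a network partition needs no explicit cleanup.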