DATA STREAM QUERY PROCESSING THROUGH SERVICES COORDINATION
Víctor Cuevas Vicenttín
victor@imag.fr
http://optimacs.imag.fr

2 OUTLINE
• Context and motivation
• Stream data services and queries
• Service coordination for query evaluation
• Service-based query processor
• Experimentation

3 QUERYING STREAM DATA SERVICES
[Figure: a query (Q) and its answer (A) over the persons, products, and bids stream data services]

4 STREAM DATA SERVICES
• Timestamp value added at arrival time
  { "person_id": 0, "name": "Luitpold Martucci", "phone": "+56(52)3418151", "email": "Martucci@toronto.edu",
    "profile": { "interests": [{"category": 282}], "income": 59178.78, "age": 35, "gender": "male", "education": "High School" } }
  2/30 sec
  { "category": 202, "interval": { "start": 1886, "end": 53879 }, "seller_person": 21, "quantity": 9,
    "type": "Regular", "itemref": 9, "open_auction_id": 9 }
  3/30 sec
  { "person_ref": 7, "time": 3306, "bid": 221.00, "open_auction_id": 8 }
  15/30 sec
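Stamping a tuple on arrival can be sketched as follows. This is a minimal illustration, not the presented system: the function name `stamp_on_arrival` and the field name `arrival_ts` are assumptions, and the clock is injectable only so the example is reproducible.

```python
import json
import time
from typing import Callable


def stamp_on_arrival(raw: str, clock: Callable[[], float] = time.time) -> dict:
    """Parse one JSON tuple and attach the time at which it arrived."""
    record = json.loads(raw)
    # "arrival_ts" is a hypothetical field name; the slides do not name it.
    record["arrival_ts"] = clock()
    return record


# The bid tuple from the slide, stamped with a fixed clock for illustration.
bid = stamp_on_arrival(
    '{"person_ref": 7, "time": 3306, "bid": 221.00, "open_auction_id": 8}',
    clock=lambda: 1000.0,
)
```

Downstream operators such as time-based windows can then work from `arrival_ts` rather than from the application-level `time` field.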

5 DATA MODEL (JSON)
• Types: atomic values, nested tuples and lists
  τ ::= c | (A: τ, ..., B: τ′) | [τ, ..., τ′]
  { "product_id": "749437-37", "name": "Pac-man arcade", "base_price": 421.00,
    "tags": [ "games", "retro", "electronics" ],
    "details": { "seller_id": 9735, "auction_date": "12-01-2010" } }
  subscribe(t) → { tuple(att1: val1, att2: val2, …) }
  stream { tuple1, tuple2, tuple3, tuple4, … }
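The type grammar can be read as a recursive check. A minimal sketch, assuming tuples are held as Python dicts and lists as Python lists; `ATOMIC` and `conforms` are hypothetical names, not part of the original system.

```python
from typing import Any

# Atomic values (the "c" case of the grammar), as decoded from JSON.
ATOMIC = (str, int, float, bool, type(None))


def conforms(value: Any) -> bool:
    """Return True if value is an atomic value, a nested tuple, or a list."""
    if isinstance(value, ATOMIC):
        return True
    if isinstance(value, dict):  # tuple: (A: τ, ..., B: τ′)
        return all(isinstance(k, str) and conforms(v) for k, v in value.items())
    if isinstance(value, list):  # list: [τ, ..., τ′]
        return all(conforms(v) for v in value)
    return False


product = {
    "name": "Pac-man arcade",
    "base_price": 421.00,
    "tags": ["games", "retro", "electronics"],
    "details": {"seller_id": 9735, "auction_date": "12-01-2010"},
}
```

Here `conforms(product)` holds, while a value containing, say, a Python set would be rejected.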

6 EXAMPLE QUERY
• For the last 30 persons and 30 products offered, retrieve the bids of the last 20 seconds greater than 15 euros
• Data processing: correlation, temporality, filtering

7 DATA PROCESSING TASKS
  Name    Id      Bidder  Amount
  Bob     1       3       12
  Mike    2       1       29
  Alice   3       4       38
  Jane    4       2       10
• Correlation
• Filtering (> 15)
• Temporality (last n)

8 QUERY EXPRESSION
For the last 30 persons and 30 products offered, retrieve the bids of the last 20 seconds greater than 15 euros
  SELECT bidstream.person_ref, bidstream.open_auction_id, bidstream.bid
  FROM bidstream [RANGE 20], auctionstream [ROWS 30], personstream [ROWS 30]
  WHERE bidstream.open_auction_id = auctionstream.open_auction_id
    AND auctionstream.seller_person = personstream.person_id
    AND bidstream.bid > 15;
• Declarative query language
• SQL-like + streams (≈CQL)
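To make the window and join semantics concrete, the query can be evaluated once over the current stream contents. This is only a sketch under stated assumptions: `evaluate` is a hypothetical function, `ts` is an assumed arrival-timestamp field in seconds, and the inclusive lower bound of the RANGE window is a choice made here. The presented processor evaluates the query continuously through service coordination, not in one shot.

```python
def evaluate(bids, auctions, persons, now, range_s=20, rows=30, min_bid=15):
    """One-shot evaluation of the example query.

    Each stream is a list of dicts in arrival order; rows must be positive.
    """
    last_auctions = auctions[-rows:]  # auctionstream [ROWS 30]
    last_persons = persons[-rows:]    # personstream [ROWS 30]
    # bidstream [RANGE 20]: bids that arrived within the last range_s seconds.
    recent_bids = [b for b in bids if now - range_s <= b["ts"] <= now]
    return [
        (b["person_ref"], b["open_auction_id"], b["bid"])  # projection
        for b in recent_bids
        if b["bid"] > min_bid                               # selection
        for a in last_auctions
        if a["open_auction_id"] == b["open_auction_id"]     # bid-auction join
        for p in last_persons
        if p["person_id"] == a["seller_person"]             # auction-person join
    ]


persons = [{"person_id": 21}]
auctions = [{"open_auction_id": 8, "seller_person": 21}]
bids = [
    {"person_ref": 5, "open_auction_id": 8, "bid": 50.0, "ts": 70},   # too old
    {"person_ref": 7, "open_auction_id": 8, "bid": 221.0, "ts": 95},  # qualifies
    {"person_ref": 3, "open_auction_id": 8, "bid": 10.0, "ts": 99},   # too low
]
result = evaluate(bids, auctions, persons, now=100)
```

Only the second bid survives: the first falls outside the 20-second range and the third fails the `bid > 15` predicate.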

9 QUERY COORDINATION
For the last 30 persons and 30 products offered, retrieve the bids of the last 20 seconds greater than 15 euros
  SELECT bidstream.person_ref, bidstream.open_auction_id, bidstream.bid
  FROM bidstream [RANGE 20], auctionstream [ROWS 30], personstream [ROWS 30]
  WHERE bidstream.open_auction_id = auctionstream.open_auction_id
    AND auctionstream.seller_person = personstream.person_id
    AND bidstream.bid > 15;
Operator tree: π(σ((person [tuple win] ⋈ product [tuple win]) ⋈ bid [time win]))

10 OUTLINE
• Context and motivation
• Stream data services and queries
• Service coordination for query evaluation
• Service-based query processor
• Experimentation

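11 COORDINATION-BASED EVALUATION
[Figure: evaluation workflow. person1 [tuple win] (person1′) and product1 [tuple win] (product1′) are joined (⋈, joinPrBP); the result is joined (⋈, joinPr1B1) with bid [time win] (bid1′), followed by selection (σ, selPrBP) and projection (π, projPrBP). Legend: data access, communication, computation.]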

12 QUERY COORDINATION
• Workflow of activities
  • Data access
  • Data processing
  • Parallel and sequential composition
• Activity → subworkflow of activities → service coordination
  • Calls to computing services
  • Queue-based communication
  • Access/modify local data
• Computing services
  • Data processing (e.g. indexing)
  • Calculations (e.g. average)
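Sequential composition of activities with queue-based communication can be sketched with threads and in-memory queues. This is an illustrative stand-in, not the presented implementation (which coordinates services from Java): `activity`, `run_pipeline`, and the `STOP` sentinel are hypothetical names.

```python
import queue
import threading

STOP = object()  # sentinel marking the end of the stream (an assumption of this sketch)


def activity(fn, inbox, outbox):
    """Start an activity that applies fn to each item and forwards non-None results."""
    def run():
        while True:
            item = inbox.get()
            if item is STOP:
                outbox.put(STOP)  # propagate termination to the next activity
                return
            result = fn(item)
            if result is not None:
                outbox.put(result)

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def run_pipeline(bids):
    """Compose a selection activity and a projection activity sequentially."""
    q_in, q_sel, q_out = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        activity(lambda b: b if b["bid"] > 15 else None, q_in, q_sel),  # σ
        activity(lambda b: (b["person_ref"], b["bid"]), q_sel, q_out),  # π
    ]
    for b in bids:
        q_in.put(b)
    q_in.put(STOP)
    results = []
    while (item := q_out.get()) is not STOP:
        results.append(item)
    for worker in workers:
        worker.join()
    return results
```

Each activity only sees its inbox and outbox, so replacing the lambda with a call to a remote computing service would not change the composition.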

13 COORDINATION-BASED EVALUATION
[Figure: the join (⋈) of product1 [tuple win] and bid [time win] (bid1) expanded into a subworkflow of activities Act1 to Act4 that use Index service 1 and Index service 2 and produce prod_bid1. Legend: data access, communication, computation.]

14 OUTLINE
• Context and motivation
• Stream data services and queries
• Service coordination for query evaluation
• Service-based query processor
• Experimentation

15 DEMO

16 EXPERIMENT ARCHITECTURE
[Figure: the query processor reaches the computation services (CompServices) and the MultiStream Server (stream 1, stream 2, ..., stream n) through <<SOAP access>>]

17 EXPERIMENT ARCHITECTURE
[Figure: Query Processor (Stream Operator, buffer, Gateway) and MultiStream Server (person, product, and bid DataStreams); interactions labeled subscribe (SOAP), data (SOAP), and notify]

18 DATA AND COMPUTATION SERVICES
• NEXMark person, product (auction), and bid data stream services
• Query operators supported by computation services
  Query operator        Computation service
  Tuple-based window    Simple queue service
  Time-based window     Calendar queue service
  Join                  Hash index service
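The state kept by the two window operators can be sketched as follows. Note the substitution: the time-based window below uses a plain double-ended queue ordered by arrival time, not a calendar queue, and it assumes tuples arrive in nondecreasing timestamp order. `TupleWindow` and `TimeWindow` are hypothetical names.

```python
from collections import deque


class TupleWindow:
    """Keeps the last `rows` tuples, e.g. [ROWS 30]."""

    def __init__(self, rows):
        self._items = deque(maxlen=rows)  # oldest tuple is dropped automatically

    def insert(self, item):
        self._items.append(item)

    def contents(self):
        return list(self._items)


class TimeWindow:
    """Keeps the tuples of the last `range_s` seconds, e.g. [RANGE 20]."""

    def __init__(self, range_s):
        self._range = range_s
        self._items = deque()  # (timestamp, tuple) pairs, oldest first

    def insert(self, ts, item):
        self._items.append((ts, item))
        self.expire(ts)

    def expire(self, now):
        # Drop tuples that fell out of the range ending at `now`.
        while self._items and self._items[0][0] < now - self._range:
            self._items.popleft()

    def contents(self, now):
        self.expire(now)
        return [item for _, item in self._items]
```

For example, a `TupleWindow(2)` that receives 1, 2, 3 holds 2 and 3, and a `TimeWindow(20)` that receives tuples at times 0, 10, and 25 holds the last two at time 25.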

19 QUERY PROCESSOR IMPLEMENTATION
• Selection, projection, join, tuple and time-based windows
• Query language similar to CQL*
• Service coordination specified through standard Java code
• Domain specific language for service coordination under implementation
*Arasu, A., Babu, S., and Widom, J. 2006. The CQL continuous query language: semantic foundations and query execution. The VLDB Journal 15, 2 (Jun. 2006), 121-142.

20 QUERY PROCESSOR ARCHITECTURE
[Figure: components Query Parser, Evaluation Plan Constructor, Query Executor, and Scheduler; artifacts <<AST>> and <<EvalPlan>>; relation <<uses>>; operators StreamOp and [Join]Op with <<SOAP access>> to stream 1, stream 2, ..., stream n and to CompServices]

21 EXPERIMENT: SUMMARY
• Proof of concept for practical applications
  • Modification of NEXMark benchmark
• Allow control over data streams
  • Data rates modifiable through code
  • Synchronization mechanism
• Initial results
  • Created a testbed of 6 queries
  • Measured latency (time elapsed from arrival to output)
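Latency as defined here (time elapsed from arrival to output) can be recorded per tuple and then averaged. A minimal sketch: `LatencyRecorder` and its method names are hypothetical, and the injectable clock exists only to make the example deterministic; it is not a description of the actual measurement harness.

```python
import time
from statistics import mean


class LatencyRecorder:
    """Records per-tuple latency from arrival to output, in milliseconds."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._arrivals = {}      # tuple id -> arrival time in seconds
        self.latencies_ms = []

    def arrived(self, tuple_id):
        self._arrivals[tuple_id] = self._clock()

    def emitted(self, tuple_id):
        start = self._arrivals.pop(tuple_id)
        self.latencies_ms.append((self._clock() - start) * 1000.0)

    def average_ms(self):
        return mean(self.latencies_ms)


# Deterministic illustration with a scripted clock (values in seconds).
ticks = iter([0.0, 0.010, 0.020, 0.050])
recorder = LatencyRecorder(clock=lambda: next(ticks))
recorder.arrived("a")
recorder.arrived("b")
recorder.emitted("a")  # 20 ms after its arrival
recorder.emitted("b")  # 40 ms after its arrival
```

With the scripted clock the average tuple latency comes out at about 30 ms.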

22 LATENCY MEASUREMENTS
[Figure: average tuple latency (msec, axis 0 to 70) for NEXMark queries 1 to 6, comparing JAX-WS on Tomcat with a local Java VM]

23 LESSONS LEARNED
• Possible to implement a query processor largely through service coordination
• Interfaces respecting service-oriented architecture principles are essential
• Operators must be congruent to maintain query semantics
• Performance penalties can be significant

24 Thanks

25 COMPUTATION SERVICE
[Figure: a join (⋈) with S1, S2, S3; Computation, stateful (e.g. hash storage)]
• Data management and calculation tasks
• Operations with function-like interfaces (f: X → Y)

26 HASH INDEX SERVICE
[Figure: an entry (id: ‘2AF3D28’, key: ‘Alice’, obj: 10110101…01) stored in a hash table whose buckets 0, 1, 2 hold the keys ‘Bob’, ‘Don’, ‘Alice’, ‘Mike’, ‘Sarah’, ‘Mary’, ‘Alex’, ‘Alice’, …]
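A stateful hash index with a small, function-like interface can be sketched as below. The class name `HashIndex`, the `put`/`get` operations, and the default of three buckets (mirroring buckets 0, 1, 2 in the figure) are assumptions; the actual service interface is not given in the slides.

```python
class HashIndex:
    """In-memory hash index: several objects may be stored under the same key."""

    def __init__(self, n_buckets=3):
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # Choose a bucket from the key's hash value.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, obj):
        self._bucket(key).append((key, obj))

    def get(self, key):
        # Keys that collide in one bucket are told apart by equality.
        return [obj for k, obj in self._bucket(key) if k == key]


index = HashIndex()
index.put("Alice", "first object")
index.put("Bob", "other object")
index.put("Alice", "second object")
```

Looking up `"Alice"` returns both of her objects in insertion order, which matches the repeated ‘Alice’ key in the figure.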

27 SYMMETRIC HASH JOIN
HashIndex1 (person, stateful)
  (Alice, alice@hotmail)
  (Mike, mike7@gmail)
  (Bob, bob@msn.com)
HashIndex2 (bid, stateful)
  (Arcade, Bob, 175, …)
  (Camera, Mike, 36…)
  (Painting, Alice, 3570)
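The symmetric hash join keeps one hash index per input: each arriving tuple is stored in its own index and probed against the other, so matches are produced whichever side arrives first. A minimal sketch with hypothetical names (`SymmetricHashJoin`, `on_left`, `on_right`); it uses in-process dictionaries in place of the hash index services and omits window expiry.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Joins two streams incrementally on keys extracted by the given functions."""

    def __init__(self, left_key, right_key):
        self._left_key = left_key
        self._right_key = right_key
        self._left = defaultdict(list)   # stands in for HashIndex1
        self._right = defaultdict(list)  # stands in for HashIndex2

    def on_left(self, tup):
        key = self._left_key(tup)
        self._left[key].append(tup)                         # store in own index
        return [(tup, r) for r in self._right.get(key, [])]  # probe the other

    def on_right(self, tup):
        key = self._right_key(tup)
        self._right[key].append(tup)
        return [(l, tup) for l in self._left.get(key, [])]


# Persons are keyed by name; bids are keyed by the bidder's name.
join = SymmetricHashJoin(left_key=lambda p: p[0], right_key=lambda b: b[1])
first = join.on_right(("Painting", "Alice", 3570))  # no person stored yet
second = join.on_left(("Alice", "alice@hotmail"))   # matches the stored bid
third = join.on_left(("Bob", "bob@msn.com"))        # no bid by Bob stored yet
```

In the presented architecture the two dictionaries would be replaced by calls to the hash index services, and entries would be evicted as tuples leave their windows.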
