STREAM: The Stanford Data Stream Management System Rebuttal
Rebuttal Team: Mingzhu Wei, Di Yang
CS 525 s - Fall 2006
Rebuttal Areas
- Foundation
- Windows
- Joins
- Full Recalculation Strategy
- Language Issues
Foundation
- Rebuttal
  - No proof is provided to guarantee the correctness or completeness of query plans built by combining operators, queues, and synopses
- Analysis
  - STREAM is based on relational database theory
  - Relational databases have been around for a long time
  - Proofs exist that demonstrate their correctness
  - CQL is a minor extension to SQL
  - More effort could have been put into providing a formal proof for CQL
Windows
- Rebuttal
  - STREAM does not provide value-based windows
  - For example, without a value-based window the system cannot efficiently process a query such as:
  - "Give me the names of the students who have the top 10 exam scores"
- Analysis
  - This feature is not supported by STREAM
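
To make the example query concrete, here is a minimal sketch of how a top-10 result could be maintained when membership depends on an attribute value rather than on arrival time or tuple count. It is not STREAM code: the function name `top_k_by_score`, the `(name, score)` tuple layout, and the bounded min-heap are all assumptions made for illustration.

```python
import heapq

def top_k_by_score(stream, k=10):
    """Return the names of the k highest-scoring (name, score) tuples.

    Hypothetical helper: a bounded min-heap holds the current top k,
    so each arrival costs O(log k) instead of re-sorting everything.
    """
    heap = []  # entries are (score, arrival_index, name); lowest score on top
    for i, (name, score) in enumerate(stream):
        if len(heap) < k:
            heapq.heappush(heap, (score, i, name))
        elif score > heap[0][0]:
            # The new score beats the weakest member, so it takes its place.
            heapq.heapreplace(heap, (score, i, name))
    return [name for _, _, name in sorted(heap, reverse=True)]

exam_scores = [("a", 50), ("b", 90), ("c", 70), ("d", 60)]
top_two = top_k_by_score(exam_scores, k=2)
```

The point of the sketch is only that the retained state is selected by score, which a purely time-based or tuple-count-based window cannot express directly.
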
Joins
- Rebuttal
  - STREAM uses only the self-purge mechanism when performing a window-based join
- Analysis
  - STREAM decides that a tuple in the window state has expired by comparing its timestamp with those of new incoming tuples on the same stream
  - Cross-purge might be more efficient in some cases
    - Cross-purge = compare timestamps across the two streams
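
The difference between the two purge strategies can be sketched as follows. This is an illustrative toy, not STREAM's implementation: the class `WindowJoin`, its `probed` counter, and the `(timestamp, key)` tuple layout are assumptions, and timestamps are assumed to arrive in nondecreasing order across both streams.

```python
from collections import deque

class WindowJoin:
    """Symmetric time-window equijoin over two streams, numbered 0 and 1."""

    def __init__(self, window, cross_purge):
        self.window = window
        self.cross_purge = cross_purge
        self.state = (deque(), deque())  # per-stream (timestamp, key), oldest first
        self.probed = 0  # stored tuples examined while probing

    def _purge(self, side, now):
        buf = self.state[side]
        while buf and buf[0][0] <= now - self.window:
            buf.popleft()

    def insert(self, side, ts, key):
        other = 1 - side
        # Self-purge expires tuples in the arriving tuple's own window;
        # cross-purge expires tuples in the opposite window before probing it.
        self._purge(other if self.cross_purge else side, ts)
        results = []
        for other_ts, other_key in self.state[other]:
            self.probed += 1
            # The timestamp check keeps results correct under self-purge,
            # where the opposite window may still hold expired tuples.
            if other_key == key and other_ts > ts - self.window:
                results.append((ts, other_ts, key))
        self.state[side].append((ts, key))
        return results
```

When one stream goes quiet for a while, self-purge leaves its expired tuples in place, so every probe from the other stream still scans them. Cross-purge removes them before the probe, which is the case where it might be more efficient.
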
Full Recalculation Strategy
- Rebuttal
  - STREAM uses a full recalculation strategy for result updating
  - This could be very inefficient with large window sizes
  - Example:
    - We are trying to join two windows, each of size 1000
    - If both windows slide by only 10 tuples at a time, recalculating the whole result is much more expensive than updating it incrementally
- Analysis
  - An incremental result update strategy might be more efficient in some cases
    - Keep most of the joined result and compute only the results for newly arrived tuples
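
The cost gap in the 1000/10 example can be checked with a small sketch. The function names, the `(tuple_id, key)` layout, the nested-loop join, and the synthetic data are assumptions for illustration only. The comparison count covers join probes and ignores the cost of filtering the previous result.

```python
def join_pairs(left, right):
    """Nested-loop equijoin on the key field of (tuple_id, key) tuples."""
    return {(a, b) for a in left for b in right if a[1] == b[1]}

def full_recalculation(win_a, win_b):
    """Recompute the entire join; returns (result, probe comparisons)."""
    return join_pairs(win_a, win_b), len(win_a) * len(win_b)

def incremental_update(prev_result, kept_a, kept_b, added_a, added_b):
    """kept_* stay in the window after the slide; added_* are new arrivals."""
    live_a, live_b = set(kept_a), set(kept_b)
    # Keep earlier pairs whose tuples are both still inside their windows.
    result = {(a, b) for a, b in prev_result if a in live_a and b in live_b}
    # Join only what is new: added_a with all of B, kept_a with added_b.
    result |= join_pairs(added_a, kept_b + added_b)
    result |= join_pairs(kept_a, added_b)
    comparisons = (len(added_a) * (len(kept_b) + len(added_b))
                   + len(kept_a) * len(added_b))
    return result, comparisons

W, SLIDE = 1000, 10
stream_a = [(i, i % 50) for i in range(W + SLIDE)]  # synthetic data
stream_b = [(i, i % 70) for i in range(W + SLIDE)]

prev, _ = full_recalculation(stream_a[:W], stream_b[:W])
full, full_cost = full_recalculation(stream_a[SLIDE:], stream_b[SLIDE:])
inc, inc_cost = incremental_update(
    prev, stream_a[SLIDE:W], stream_b[SLIDE:W], stream_a[W:], stream_b[W:])
print(full_cost, inc_cost)  # 1000000 19900
```

Under these assumptions both strategies produce the same result, but the incremental update needs 10 × 1000 + 990 × 10 = 19,900 probe comparisons per slide against 1000 × 1000 = 1,000,000 for full recalculation.
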
Language Issues
- Rebuttal
  - STREAM does not provide a stream-to-stream operator
- Analysis
  - The absence of a stream-to-stream operator is not explicitly justified in the paper
  - Its absence is reasonable because STREAM operators treat all input as relations
  - STREAM does provide operators for converting streams to relations and for converting relations to streams
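
The point that a stream-to-stream transformation can be composed from the conversions STREAM does provide can be sketched as follows. The three functions are hypothetical stand-ins, not CQL syntax: `sliding_rows` plays the role of a row-based window (stream to relation), `select` a relational filter, and `istream` an insert-stream operator in the spirit of CQL's Istream (relation to stream). The sketch assumes one arrival per timestamp.

```python
from collections import Counter

def sliding_rows(stream, n):
    """Stream-to-relation: after each arrival, the relation is the last n values."""
    window = []
    for ts, value in stream:
        window = (window + [value])[-n:]
        yield ts, list(window)

def select(relations, predicate):
    """Relation-to-relation: filter every instantaneous relation."""
    for ts, rel in relations:
        yield ts, [v for v in rel if predicate(v)]

def istream(relations):
    """Relation-to-stream: emit values present now but not at the previous instant."""
    prev = Counter()
    for ts, rel in relations:
        cur = Counter(rel)
        for value in (cur - prev).elements():
            yield ts, value
        prev = cur

readings = [(1, 5), (2, 12), (3, 7), (4, 20)]
filtered = list(istream(select(sliding_rows(readings, 2), lambda v: v > 6)))
# filtered == [(2, 12), (3, 7), (4, 20)]
```

The composition takes a stream in and gives a stream out, even though every intermediate step operates on relations.
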
Language Issues (cont)
- Rebuttal
  - STREAM uses an append-only model
  - It does not provide an operator for updating data values in a stream
- Analysis
  - Although not perfect, this is a common assumption in current stream processing papers
Conclusion
- Foundation
- Windows
- Joins
- Full Recalculation Strategy
- Language Issues
- Slides: 9