Preference and Diversity-based Ranking in Network-Centric Information Management Systems
Ph.D. defense
Marina Drosou
Computer Science & Engineering Dept., University of Ioannina
Why diversify?
[Figure labels: Car, Animal, Sports Team, “Mr. Jaguar”]
Marina Drosou, Preference and Diversity-based Ranking in Network-Centric Information Management Systems 2
Thesis Goal
This Ph.D. thesis concerns the development, implementation and evaluation of models, algorithms and techniques for ranking the information presented to users of network-centric information management systems.
This ranking is based on the importance of each piece of information. We consider that importance is influenced by both relevance to user information needs and diversity:
§ Relevance is important so that users are presented only with the most useful results according to their needs
§ Diversity ensures that the received results do not all contain similar information
Outline
§ Search Result Diversification: Introduction & Related Work
§ Content Diversification using Indices
§ DisC Diversity: Diversification based on Dissimilarity and Coverage
§ POIKILO: Evaluating the Results of Diversification Models and Algorithms
§ Summary
Outline
§ Search Result Diversification: Introduction & Related Work
  § Problem Definition
  § Variations
  § Algorithms
§ Content Diversification using Indices
§ DisC Diversity: Diversification based on Dissimilarity and Coverage
§ POIKILO: Evaluating the Results of Diversification Models and Algorithms
§ Summary
Problem Definition
Given a set P of items and a number k, select a subset S* of P with the k most diverse items of P
Given:
1. $P = \{p_1, \dots, p_n\}$
2. $k \le n$
3. d: a distance metric
4. f: a diversity function
Find: $S^* = \arg\max_{S \subseteq P,\ |S| = k} f(S, d)$
What it means
Given a set P of query results, we want to select a representative diverse subset S* of P
What does diverse mean?
§ Content: dissimilar items
  § e.g., distant locations on a map, different attribute values in tuples
§ Coverage: different aspects, perspectives, concepts
  § e.g., different interpretations of a keyword in web search, different topics
§ Novelty: items not seen in the past
  § e.g., novel results in a notification service
Content-based diversity
Coverage-based diversity
Basic idea: find a set of results that cover different interpretations of the query
Common assumptions:
§ A taxonomy exists
§ Both queries and results may belong to many categories
§ Statistics on the distribution of user intents have been collected
§ Result independence
Probabilistic view of the problem
Novelty-based diversity
Novelty: the need to avoid redundancy (vs. Diversity: the need to resolve ambiguity)
§ Intuitively: an item should be returned in the i-th position of the list if it is relevant and the previous (i-1) items do not contain the same information
Information is partitioned into “nuggets”
§ Often, human judges decide what is relevant or not for each nugget (IR approach)
Adding relevance in the mix
We must not forget: relevance to the query is also important!
§ Results must be both relevant and diverse
Two alternatives:
§ Select the k most diverse results out of the top-m most relevant ones, m > k
§ Include diversity in the ranking criterion
  § Augmenting the diversity function with relevance
  § Adapting IR criteria, e.g., discounted cumulative gain (DCG) at position i
Adding relevance in the mix
Problem Complexity
The problem of choosing diverse items is NP-hard
§ This follows from the MAX COVERAGE/SET COVER problems
§ Intuitively: to find the most diverse subset S* of all items P, we have to compute all possible combinations of k items out of |P| and keep the one with the maximum diversity
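The exhaustive search described above can be sketched in a few lines. This is only an illustration: it assumes the MAXMIN objective (the minimum pairwise distance) as the diversity function f and Euclidean distance as d, and the names `min_pairwise_distance` and `brute_force_maxmin` are hypothetical.

```python
from itertools import combinations
from math import dist

def min_pairwise_distance(items):
    """Assumed diversity function f (MAXMIN): smallest distance between two items."""
    return min(dist(a, b) for a, b in combinations(items, 2))

def brute_force_maxmin(points, k):
    """Enumerate every k-subset of points (k >= 2) and keep the most diverse one."""
    return max(combinations(points, k), key=min_pairwise_distance)

points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
best = brute_force_maxmin(points, 3)
print(best, min_pairwise_distance(best))
```

The number of candidate subsets is the binomial coefficient "|P| choose k", which is why this approach is only usable for very small inputs.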
Solving the problem
Thus, we use heuristics for approximate solutions
§ Greedy heuristics: select items one by one until we have k of them
§ Interchange heuristics: start with a random solution and interchange items that improve the objective function
§ Also:
  § Neighborhood heuristics: disqualify items close to the ones already selected
  § Simulated Annealing: apply simulated annealing to avoid local maxima
  § and others
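A minimal sketch of one greedy heuristic follows. It assumes the MAXMIN objective and Euclidean distance, and it seeds the solution with the farthest pair, which is one common choice rather than necessarily the variant used in the thesis. The name `greedy_maxmin` is hypothetical.

```python
from itertools import combinations
from math import dist

def greedy_maxmin(points, k):
    """Greedy heuristic (assumed variant): seed with the farthest pair, then
    repeatedly add the item whose closest selected item is farthest away."""
    selected = list(max(combinations(points, 2), key=lambda pair: dist(*pair)))
    remaining = [p for p in points if p not in selected]
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda p: min(dist(p, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected[:k]

points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
print(greedy_maxmin(points, 3))
```

On these five points the greedy result has minimum pairwise distance 1, while the best 3-subset reaches √2, which illustrates that the heuristic trades optimality for speed.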
Related Work
Content; Greedy; Interchange; Others; Jain, PAKDD 2004; Ziegler, WWW 2005; Gollapudi, WWW 2009; Drosou, DEBS 2009; Tao, ICDE 2009; Haritsa, IEEE Data Eng. Bull. 2009; Vieira, ICDE 2011; Bozzon, ICWE 2012; Santos, SSDBM 2013; Abbar, WWW 2013; Valkanas, EDBT 2013; Yu, EDBT 2009; Vieira, ICDE 2011; Liu, PVLDB 2009; Minack, SIGIR 2011; Liu, TODS 2012; Coverage; Agrawal, WSDM 2009; Liu, SDM 2009; Zhu, WWW 2011; Li, CIKM 2012; Novelty; Zhang, SIGIR 2002; Clarke, SIGIR 2008; Souravlias, 2010; Lathia, SIGIR 2010; Szpektor, WWW 2013; Vee, ICDE 2008; Zhang, RecSys 2008; Angel, SIGMOD 2011; Fraternali, SIGMOD 2012; Li, PVLDB 2013
Outline
§ Search Result Diversification: Introduction & Related Work
§ Content Diversification using Indices
  § Model
  § Diverse set computation
  § Combining diversity & relevance
§ DisC Diversity: Diversification based on Dissimilarity and Coverage
§ POIKILO: Evaluating the Results of Diversification Models and Algorithms
§ Summary
Introduction
We focus on content-based diversification
§ MAXMIN
Basic idea: employ indices for the efficient computation of diverse items
§ Cover Trees
We also define the Continuous k-diversity problem
The Cover Tree
A leveled tree where each level is a “cover” for all levels beneath it
Items at higher levels are farther apart from each other than items at lower levels
Cover Tree Invariants - Nesting
[Figure: items p1, p2, p3 across tree levels]
Cover Tree Invariants - Covering
[Figure: items p1, p2, p3; label $b^{l-1}$]
b: the “base” of the tree
l: the level of $p_i$
Cover Tree Invariants - Separation
[Figure: items p1, p2, p3; labels $b^{l-1}$ and $> b^{l-2}$]
Example
Items indexed at the first ten levels of the same Cover Tree
Cover Tree Representations
§ Implicit Representation: space depending on P
§ Explicit Representation: O(n) space
[Figure: the same tree over items p1, p2, p3 in both representations]
Dynamic Construction
Items can be inserted into and deleted from a Cover Tree in a dynamic fashion
Insertion:
1. Starting from the root, descend towards the candidate nodes that can cover the new item p
2. Continue until a level $C_l$ is reached where p is separated from all other items
3. Select as parent a candidate node of $C_{l+1}$ that covers p
Deletion:
1. Descend the tree looking for p, keeping note of candidate nodes that can cover the children of p
2. Remove p and reassign its children to the candidate nodes
Level Family of Algorithms
Approximation Bound
Let P be a set of items, $k \ge 2$, $d_{OPT}(P, k)$ the optimal minimum distance for the MAXMIN problem and $d_{CT}(P, k)$ the minimum distance of the diverse set computed by the Level-Basic algorithm. It holds that:
$d_{CT}(P, k) \ge \alpha \, d_{OPT}(P, k)$, where $\alpha = \frac{b-1}{2b^2}$
(Proved by exploiting the covering invariant of the tree to bound the level where the least common ancestor of any two items of the optimal solution appears in the tree)
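To get a feeling for the bound, the factor (b-1)/(2b^2) can be evaluated for a few bases. The values below are plain arithmetic on that expression (written here as $\alpha$); the choice of bases simply mirrors the ones that appear in the experiments.

```latex
\alpha(b) = \frac{b-1}{2b^2}:\qquad
\alpha(1.3) = \frac{0.3}{3.38} \approx 0.089,\quad
\alpha(1.5) = \frac{0.5}{4.5} \approx 0.111,\quad
\alpha(1.7) = \frac{0.7}{5.78} \approx 0.121

\frac{d\alpha}{db} = \frac{2-b}{2b^3} = 0 \;\Longrightarrow\; b = 2,
\qquad \alpha(2) = \frac{1}{8}
```

So the guarantee is strongest at b = 2 and never exceeds 1/8; it is a worst-case bound, not a statement about typical quality.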
Cover Tree implementation of Greedy
Any Cover Tree can be employed for implementing the greedy heuristic
§ ½-approximation of the optimal solution
We perform k descents of the tree, using one of several pruning rules
Batch Construction
If all items of P are available, we can perform a batch construction of the Cover Tree
§ We call such trees “Batch Cover Trees” (BCTs)
§ As we descend a BCT, we get items in the order selected by Greedy
Algorithm:
1. The leaf level $C_l$ contains all items in P
2. We greedily select items from $C_l$ with distance larger than $b^{l+1}$ and promote them to $C_{l+1}$
3. The rest of the items in $C_l$ are distributed as children among the new nodes of $C_{l+1}$
4. Continue until we reach the root level
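The four steps can be sketched as a bottom-up loop. This is a simplified illustration under several assumptions: items are distinct points with Euclidean distance, promotion uses a plain scan instead of the Greedy selection order (so descending this structure does not necessarily reproduce the Greedy order), and each item is attached to its nearest promoted item. The names `build_levels`, `levels` and `parent` are hypothetical.

```python
from itertools import combinations
from math import dist

def build_levels(points, b=1.5):
    """Return (levels, parent): levels[l] lists the items of level l and
    parent[(l, p)] is the item of level l + 1 that covers p."""
    d_min = min(dist(p, q) for p, q in combinations(points, 2))
    if d_min == 0:
        raise ValueError("this sketch assumes distinct items")
    level = 0
    while b ** level >= d_min:          # step 1: find a leaf level that separates all items
        level -= 1
    levels = {level: list(points)}
    parent = {}
    while len(levels[level]) > 1:       # step 4: stop at the root level
        threshold = b ** (level + 1)
        promoted = []
        for p in levels[level]:         # step 2: promote items farther than b^(l+1) apart
            if all(dist(p, q) > threshold for q in promoted):
                promoted.append(p)
        for p in levels[level]:         # step 3: attach every item to its nearest promoted item
            parent[(level, p)] = min(promoted, key=lambda q: dist(p, q))
        level += 1
        levels[level] = promoted
    return levels, parent

points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5)]
levels, parent = build_levels(points, b=2)
for l in sorted(levels, reverse=True):
    print(l, levels[l])
```

Each promoted item is its own parent, so every level is contained in the level below it.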
Adding relevance
Continuous Model
We consider a streaming scenario, where new items arrive and older items expire
We want to provide users with a continuously updated subset of the top-k most diverse recent items in the stream
We consider a sliding-window model:
[Figure labels: Window $P_{i-1}$, Window $P_i$, jump step, w]
Continuous k-Diversity Problem
Continuity Requirements
Items in the tree are marked as valid or invalid:
§ Freshness: non-diverse items that are older than the newest diverse item from the previous window are marked as invalid in the cover tree and are not considered further
§ Durability: let r be the number of diverse items from previous windows that have not yet expired. We select k-r new valid diverse items from the new window
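The two requirements can be illustrated without a cover tree. The sketch below is an assumption-laden simplification: items are (timestamp, point) pairs, "newest diverse item" is taken over the whole previous diverse set, and the k-r new items are picked by a plain greedy MAXMIN step rather than by the tree-based algorithms of the thesis. The name `next_diverse_set` is hypothetical.

```python
from math import dist

def next_diverse_set(prev_diverse, window, k):
    """Select the diverse set of the current window from (timestamp, point) items."""
    # Durability: previously selected items that have not expired are kept.
    kept = [item for item in prev_diverse if item in window]
    # Freshness: only items newer than the newest previous diverse item stay valid.
    newest = max((t for t, _ in prev_diverse), default=float("-inf"))
    valid = [item for item in window if item[0] > newest]
    selected = list(kept)
    while len(selected) < k and valid:
        if selected:
            best = max(valid, key=lambda it: min(dist(it[1], s[1]) for s in selected))
        else:
            best = valid[0]
        selected.append(best)
        valid.remove(best)
    return selected

prev_diverse = [(1, (0, 0)), (3, (10, 0))]
window = [(2, (5, 5)), (3, (10, 0)), (4, (0, 10)), (5, (10, 1))]
print(next_diverse_set(prev_diverse, window, 2))
```

Here the item with timestamp 1 has expired, the item with timestamp 2 is invalid because it is older than the newest previous diverse item, and one new item is added next to the surviving one.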
Building Batch Cover Trees
We measure the extra cost of building a BCT as compared to executing the greedy heuristic (GR) for k = n
§ This extra cost corresponds to assigning nodes to suitable parents to form the tree levels
Extra Cost
        Clustered            Faces
b       non-np    np         non-np    np
1.3     0.42%     0.58%      1.49%     1.94%
1.5     0.42%     0.56%      1.47%     1.92%
1.7     0.41%     0.55%      1.47%     1.91%
np: nearest parent heuristic (choose the closest candidate parent)
The quality of the solution is the same for BCT and GR.
Building Incremental Cover Trees
Building ICTs requires a small fraction of the cost required for the corresponding BCTs
However, the quality of the solutions provided by ICTs is comparable to that of BCTs (and, thus, GR)
Extra Cost
b       Clustered    Faces
1.3     0.16%        0.79%
1.5     0.08%        0.41%
1.7     0.06%        0.28%
For trees with 10,000 items:
§ Insertion cost: ~2.6 msec
§ Deletion cost: ~10 msec
Inserting/removing items after a window jump depends on the size of the window and the jump step, but is much faster than re-building a BCT for the new set of items
Pruning is even better for non-uniform datasets, since each selection of a diverse item results in pruning a larger number of items around it
Also, pruning is better for large values of λ
Streaming Data
We compare ICTs against SGR, a streaming version of GR:
§ At each window, we keep any remaining diverse items from the previous window (durability) and let GR select items from the new window satisfying freshness
Comparable achieved diversity, while ICTs are much faster
§ Retrieving the top-100 items from an ICT with 1,000-10,000 items requires ~1.5 msec
§ Executing SGR requires 3.2 sec for 5,000 items and more than 15 sec for 10,000 items
Summary
§ We proposed an index-based diversification approach based on Cover Trees
§ We provided a new suite of algorithms along with theoretical results for the quality of our approach
§ We studied the diversification problem in a dynamic setting, where items change over time, and defined continuity requirements that the diversified items must satisfy
Related Publications
1. M. Drosou and E. Pitoura, Diverse Set Selection over Dynamic Data, in IEEE TKDE (to appear)
2. M. Drosou and E. Pitoura, Dynamic Diversification of Continuous Data, EDBT 2012
Outline
§ Search Result Diversification: Introduction & Related Work
§ Content Diversification using Indices
§ DisC Diversity: Diversification based on Dissimilarity and Coverage
  § DisC Diversity
  § Algorithms
  § Comparison with other models
  § Incremental DisC
§ POIKILO: Evaluating the Results of Diversification Models and Algorithms
§ Summary
DisC Diversity
What is the right size for the diverse subset S? What is a good k?
What if, instead of k, we use a radius r?
Given a result set P and a radius r, we select a representative subset S ⊆ P such that:
1. For each item in P, there is at least one similar item in S (coverage)
2. No two items in S are similar to each other (dissimilarity)
DisC Diversity
[Figure panels: Zoom-in, Zoom-out, Local zoom]
§ Small r: more, less dissimilar points (zoom in)
§ Large r: fewer, more dissimilar points (zoom out)
§ Local zooming at specific points
DisC Diversity
A DisC set for a set P is not unique
§ We seek a concise representation → the minimum DisC set
Formal definition: Let P be a set of objects and r a real number with r ≥ 0. A subset S ⊆ P is an r-Dissimilar-and-Covering diverse subset, or r-DisC diverse subset, of P, if the following two conditions hold:
1. coverage condition: $\forall p_i \in P$, $\exists p_j \in N_r^+(p_i)$ such that $p_j \in S$
2. dissimilarity condition: $\forall p_i, p_j \in S$ with $p_i \ne p_j$, it holds that $d(p_i, p_j) > r$
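The two conditions translate directly into a checker. A minimal sketch, assuming Euclidean distance, points given as tuples, and a neighborhood that contains every item within distance r plus the item itself; the name `is_disc_diverse` is hypothetical.

```python
from math import dist

def is_disc_diverse(P, S, r):
    """Check whether S is an r-DisC diverse subset of P."""
    if not set(S) <= set(P):
        return False
    # Coverage: every item of P has an item of S within distance r (possibly itself).
    covered = all(any(dist(p, s) <= r for s in S) for p in P)
    # Dissimilarity: any two distinct items of S are more than r apart.
    dissimilar = all(dist(a, b) > r for i, a in enumerate(S) for b in S[i + 1:])
    return covered and dissimilar

P = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(is_disc_diverse(P, [(1, 0), (3, 0)], 1))
print(is_disc_diverse(P, [(0, 0), (1, 0), (3, 0)], 1))
print(is_disc_diverse(P, [(0, 0)], 1))
```

Only the first call satisfies both conditions: the second set contains two items at distance exactly r, and the third leaves items uncovered.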
Graph model
We use a graph to model the problem:
§ Each item is a vertex
§ There exists an edge between two vertices if their distance is at most r
Graph model
Finding a minimum r-DisC diverse subset of a set P is equivalent to finding a minimum Independent Dominating set of the corresponding graph
§ Independent: no edge between any two vertices in the set
§ Dominating: all vertices outside the set are connected with at least one vertex inside it
This is an NP-hard problem
[Figure panels: Dominating, not independent; Dominating and independent]
Computing DisC subsets
A basic or greedy approach:
§ select random items or items with large neighborhoods
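A minimal sketch of the greedy variant is given below. It assumes distinct points, Euclidean distance and a white/grey coloring (white: not yet covered, grey: covered by a selected item); neighborhoods are computed by a quadratic scan rather than with an index. The name `greedy_disc` is hypothetical.

```python
from math import dist

def greedy_disc(P, r):
    """Repeatedly select the white item with the most white neighbors."""
    neighbors = {p: [q for q in P if q != p and dist(p, q) <= r] for p in P}
    white = set(P)
    selected = []
    while white:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(white), key=lambda p: sum(q in white for q in neighbors[p]))
        selected.append(best)
        white.discard(best)
        white.difference_update(neighbors[best])   # covered neighbors turn grey
    return selected

P = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
print(greedy_disc(P, 1))
```

Because only white items are ever selected, no two selected items are within r of each other, and the loop ends only when every item is covered, so the output satisfies both DisC conditions. Replacing the `max` with a random choice among white items gives the basic variant.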
How much smaller is the (optimal) minimum DisC set?
The size of any r-DisC diverse subset S of P is at most B times the size of any minimum r-DisC diverse subset S*, where B is the maximum number of independent neighbors of any item in P
§ i.e., each item has at most B neighbors that are independent from each other
B depends on the distance metric and data dimensionality
§ We have proved that:
  § for the Euclidean distance in the 2D plane: B = 5
  § for the Manhattan distance in the 2D plane: B = 7
  § for the Euclidean distance in 3D space: B = 24
Raising the dissimilarity condition
When we consider only coverage:
Let Δ be the maximum number of neighbors of any item in P; the size of any covering (but not dissimilar) diverse subset S of P is at most ln Δ times larger than any minimum covering subset S*
Adding weights
Multiple radii
We want to allow different areas of the data to contribute more or fewer items to the diverse set
The problem now loses its symmetry
Two interpretations:
1. $p_i$ can represent all items lying at a distance at most $r(p_i)$ around it (Covering problem)
2. $p_i$ can be represented only by items lying at a distance at most $r(p_i)$ around it (CoveredBy problem)
Multiple radii variations
[Figure panels: A set P, Covering, CoveredBy]
The problem is now modeled via a directed graph
§ Directed graphs do not always have an independent dominating set!
§ We provide heuristic algorithms that always locate a valid DisC set
  § Covering: start with items with larger radii
  § CoveredBy: start with items with smaller radii
Comparison with other models
[Figure panels: r-DisC, MAXSUM, MAXMIN, k-medoids]
Comparison with MAXMIN
For a set of items P, we have proved that:
1. Let S be an r-DisC set and S* be an optimal MAXMIN set. Let $\lambda$ and $\lambda^*$ be the MAXMIN distances of the two sets. Then, $\lambda^* \le 3\lambda$.
2. Let S* be the optimal MAXMIN set of size k with MAXMIN distance equal to $\lambda^*$. Let S be an r-DisC set with $r = \lambda^*$. Then, |S| < k′, where k′ is the first integer larger than k for which the corresponding optimal MAXMIN set S*′ of P has MAXMIN distance equal to $\lambda^{*\prime}$, with $\lambda^{*\prime} < \lambda^*$.
Zooming
We want to change the radius r to r′ interactively and compute a new diverse set
§ r′ < r: zoom in; r′ > r: zoom out
Two requirements:
1. Support an incremental mode of operation: the new set $S_{r'}$ should be as close as possible to the already seen result $S_r$. Ideally, $S_{r'} \supseteq S_r$ for r′ < r and $S_{r'} \subseteq S_r$ for r′ > r
2. The size of $S_{r'}$ should be as close as possible to the size of the minimum r′-DisC diverse subset
There is no monotonic property among the r-DisC diverse and the r′-DisC diverse subsets of a set of objects P (the two sets may be completely different)
Size when moving from r to r′
The change in size of the diverse set when moving from r to r′ depends on the number of independent neighbors (for r′) in the “ring” around an object between the two radii.
Zooming
Zooming-In
Zooming-Out
For zooming-out, we keep the independent items of $S_r$ and fill in the solution with items from uncovered areas. It holds that:
1. There are at most N items in $S_r \setminus S_{r'}$
2. For each item in $S_r \setminus S_{r'}$, at most (B-1) items are added to $S_{r'}$
(proof and various algorithms for keeping the size small in the thesis)
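Both zooming directions can be sketched with the same helper. This is a naive illustration, assuming Euclidean distance and a linear scan over P; it does not include the size-reducing strategies mentioned above, and the names `cover_remaining`, `zoom_in` and `zoom_out` are hypothetical.

```python
from math import dist

def cover_remaining(P, selected, r):
    """Add items of P until every item is within r of some selected item."""
    selected = list(selected)
    for p in P:
        if all(dist(p, s) > r for s in selected):   # p is uncovered, hence also dissimilar
            selected.append(p)
    return selected

def zoom_in(P, S_r, r_new):
    """r_new < r: S_r stays dissimilar for the smaller radius, so keep it whole."""
    return cover_remaining(P, S_r, r_new)

def zoom_out(P, S_r, r_new):
    """r_new > r: keep items of S_r that are still independent, then fill the gaps."""
    kept = []
    for s in S_r:
        if all(dist(s, t) > r_new for t in kept):
            kept.append(s)
    return cover_remaining(P, kept, r_new)

P = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
S_2 = [(2, 0)]                       # a 2-DisC diverse subset of P
S_1 = zoom_in(P, S_2, 1)
print(S_1, zoom_out(P, S_1, 2))
```

Zooming in returns a superset of the previous result, which matches the ideal incremental behavior; zooming out keeps a subset of the previous result and may add further items where coverage is lost.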
Implementation
We base our implementation on a spatial data structure (central operation: compute neighbors)
We use an M-tree
§ We link together all leaf nodes (we visit items in a single left-to-right traversal of the leaf level to exploit locality)
§ We build trees using splitting policies that minimize overlap
Implementation
Pruning Rule: A leaf node that contains no white objects is colored grey. When all its children become grey, an internal node is colored grey and becomes inactive.
We prune subtrees with only “grey nodes”.
Two implementations for our greedy approach
§ Grey-Greedy, White-Greedy
Lazy variations for updating neighborhoods
Performance
Many real and synthetic datasets
[Figure panels: Solution size, Cost]
General trade-off: larger r → smaller diverse set → higher cost
Lazy variations of our algorithms further reduce computational cost
The cost also depends on the characteristics of the M-tree (fat-factor)
Smaller sizes for clustered data
Diversity and Relevance
§ Similar diversity for the Basic and Greedy algorithms
§ Greedy considers relevance and produces subsets of larger average weight
§ Raising the dissimilarity condition improves average weight, but the minimum distance is decreased
§ Also, we get larger subsets than in the diversity-only case
Zooming
[Figure panels: Solution size, Jaccard distance among solutions, Cost]
Both requirements:
§ incremental (much smaller cost) and
§ small size (relative to computing it from scratch)
Larger overlap among $S_r$ and $S_{r'}$
Related Publications
1. M. Drosou and E. Pitoura, Multiple Radii DisC Diversity: Result Diversification based on Dissimilarity and Coverage (submitted)
2. M. Drosou and E. Pitoura, DisC Diversity: Result Diversification based on Dissimilarity and Coverage, in PVLDB, vol. 6, no. 1, pp. 13-24, 2012, VLDB Endowment (Best Paper Award)
Outline
§ Search Result Diversification: Introduction & Related Work
§ Content Diversification using Indices
§ DisC Diversity: Diversification based on Dissimilarity and Coverage
§ POIKILO: Evaluating the Results of Diversification Models and Algorithms
§ Summary
Visualizing Diverse Items
[Figure labels: Selecting diversification parameters, Zooming and Streaming, Result Statistics]
Visualizing Diverse Items
Related Publications
1. M. Drosou and E. Pitoura, POIKILO: A Tool for Evaluating the Results of Diversification Models and Algorithms, VLDB 2013
Outline
§ Search Result Diversification: Introduction & Related Work
§ Content Diversification using Indices
§ DisC Diversity: Diversification based on Dissimilarity and Coverage
§ POIKILO: Evaluating the Results of Diversification Models and Algorithms
§ Summary
  § Thesis contribution
  § Directions for future research
Thesis Contributions
Thesis Contributions
Diversification based on dissimilarity and coverage
§ We introduced a novel diversity definition, called DisC diversity, based on using a radius r rather than a size limit k to select diverse items
§ We presented both a spatial and a graph model for our definition
§ We studied the weighted and multiple radii cases
§ We introduced incremental diversification to a new radius through zooming-in and zooming-out
§ We presented algorithms for locating DisC diverse subsets and derived bounds concerning the size of such subsets
§ We provided efficient implementations of our algorithms based on spatial index structures, namely the M-tree
Thesis Contributions
Visualizing and comparing diversification algorithms
§ We developed a system prototype, called “Poikilo”, providing implementations of a wide variety of diversification approaches to assist users in locating, visualizing and comparing diverse results
Directions for future research
Short term plans
§ Diversification in Database Exploration
  § Interesting suggestions in database exploration are often similar
  § Also: exploit external sources
§ Diversification of Multiple Search Results
  § Exploit overlap among results of different queries
  § Use diversified results of past queries to answer new ones
§ Diversification of Keyword Search Results in Databases
  § Moving diversification to the ranking phase
  § Apply coverage-based definitions
Long term plans
§ Diversification in a distributed setting
  § Place “diversification filters” on the overlay network to reduce computational and communication costs
Thesis Publications
Journal Publications
1. M. Drosou and E. Pitoura, Multiple Radii DisC Diversity: Result Diversification based on Dissimilarity and Coverage (submitted)
2. M. Drosou and E. Pitoura, YmalDB: Exploring Relational Databases via Result Driven Recommendations, in VLDBJ (to appear)
3. M. Drosou and E. Pitoura, Diverse Set Selection over Dynamic Data, in IEEE TKDE (to appear)
4. M. Drosou and E. Pitoura, DisC Diversity: Result Diversification based on Dissimilarity and Coverage, in PVLDB, vol. 6, no. 1, pp. 13-24, 2012, VLDB Endowment (Best Paper Award)
5. M. Drosou and E. Pitoura, Search Result Diversification, in SIGMOD Record, vol. 39, no. 1, pp. 41-47, 2010, ACM
6. M. Drosou and E. Pitoura, Diversity over Continuous Data, in IEEE Data Engineering Bulletin, vol. 32, no. 4, pp. 49-56, 2009, IEEE
Thesis Publications
Conference Publications
1. M. Drosou and E. Pitoura, Dynamic Diversification of Continuous Data, EDBT 2012
2. M. Drosou and E. Pitoura, REDRIVE: Result Driven Database Exploration through Recommendations, CIKM 2011
3. K. Stefanidis, M. Drosou and E. Pitoura, PerK: Personalized Keyword Search in Relational Databases through Preferences, EDBT 2010
Workshop Publications
1. D. Souravlias, M. Drosou, K. Stefanidis and E. Pitoura, On Novelty in Publish/Subscribe Delivery, DBRank 2010
2. K. Stefanidis, M. Drosou and E. Pitoura, “You May Also Like” Results in Relational Databases, PersDB 2009
Demos
1. M. Drosou and E. Pitoura, POIKILO: A Tool for Evaluating the Results of Diversification Models and Algorithms, VLDB 2013
2. M. Drosou and E. Pitoura, YmalDB: A Result Driven Recommendation System for Databases, EDBT 2013
Thank you!