Dynamic Data Partitioning for Distributed Graph Databases

Xavier Martínez Palau, David Domínguez Sal, Josep Lluís Larriba Pey

Outline
- Introduction
- Contributions
- System Overview
- Experiments

Introduction: Databases
- Database
  - Software to store large amounts of data
  - High performance
- Several ways to store a graph
  - Graph database
  - Relational database
  - RDF
  - Key-value datastore
  - …

Distributed Databases
- Distributed databases store more data and improve throughput
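
As a rough illustration of how a distributed store can spread a graph over several machines, the sketch below assigns each vertex to a machine by hashing its identifier. The function names, the use of CRC32, and the modulo placement rule are assumptions made for illustration only; the slides do not say how data is placed.

```python
import zlib

def machine_for(vertex_id: int, num_machines: int) -> int:
    """Map a vertex to a machine with a stable hash (hypothetical placement rule)."""
    key = str(vertex_id).encode("utf-8")
    return zlib.crc32(key) % num_machines

def partition(vertices, num_machines):
    """Group vertex ids by the machine that would store them."""
    parts = {m: [] for m in range(num_machines)}
    for v in vertices:
        parts[machine_for(v, num_machines)].append(v)
    return parts

# Example: place 1000 vertices on 4 machines and look at the partition sizes.
parts = partition(range(1000), num_machines=4)
sizes = {m: len(vs) for m, vs in parts.items()}
print(sizes)
```

A fixed rule like this never changes once the data is loaded, which is the sense in which a partitioning is static.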

Contributions
- System design in two levels
  - Physical storage
  - Memory management
- Data access pattern monitoring
  - Specific data structure
- Load and network balancing
  - Increased throughput

System Overview
Diagram labels: Memory management, Storage

Partition Manager
- We propose a new data structure
  - Monitors data access patterns
  - Uses this information in a simple way to decide how to route queries
Diagram labels: Matrix of data access sequences, New compressed data structure
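
The slides do not give the details of the data structure, so the following is only a sketch of the general idea under stated assumptions: data is split into numbered blocks, the monitor counts how often block b is accessed right after block a (a plain dense matrix here, not the compressed structure the slides mention), and a query is routed to the machine that holds the blocks most often accessed after the block it starts from. All names (`AccessMonitor`, `record`, `route`, `owner`) are hypothetical.

```python
class AccessMonitor:
    """Toy monitor of data access sequences (illustrative, not the authors' design)."""

    def __init__(self, num_blocks: int, num_machines: int):
        self.num_machines = num_machines
        # counts[a][b] = number of times block b was accessed right after block a
        self.counts = [[0] * num_blocks for _ in range(num_blocks)]
        # owner[b] = machine assumed to hold block b in memory (round-robin assumption)
        self.owner = [b % num_machines for b in range(num_blocks)]

    def record(self, sequence):
        """Update the matrix with one observed sequence of block accesses."""
        for a, b in zip(sequence, sequence[1:]):
            self.counts[a][b] += 1

    def route(self, start_block: int) -> int:
        """Pick the machine whose blocks are most often accessed after start_block."""
        score = [0] * self.num_machines
        for b, c in enumerate(self.counts[start_block]):
            score[self.owner[b]] += c
        if not any(score):
            # No history yet: fall back to the machine holding the start block.
            return self.owner[start_block]
        return max(range(self.num_machines), key=score.__getitem__)

monitor = AccessMonitor(num_blocks=4, num_machines=2)
monitor.record([0, 1, 3])
monitor.record([0, 3])
monitor.record([2, 0, 1])
print(monitor.route(0), monitor.route(2), monitor.route(3))
```

A dense matrix grows quadratically with the number of blocks, which is one plausible reason a compressed structure is needed at scale.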

Experiments
- Scalability with cluster size
  - Tested up to 32 machines
- Systems compared
  - Static partitioning
  - Dynamic partitioning (ours)
- R-MAT graph
  - 37 M vertices
  - 1 B edges
- Queries: BFS and k-hops
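
The slides name the two query types but not how they are implemented. A minimal single-machine sketch follows, assuming an adjacency-list graph and taking "k-hops" to mean all vertices within k edges of a start vertex; the function name `bfs_levels` and the `max_hops` parameter are hypothetical.

```python
from collections import deque

def bfs_levels(adj, source, max_hops=None):
    """Return {vertex: hop distance} for vertices reachable from source.

    If max_hops is given, vertices farther than max_hops are not visited,
    which turns the traversal into a k-hops query.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if max_hops is not None and dist[u] == max_hops:
            continue  # do not expand beyond the hop limit
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

adj = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: []}
full = bfs_levels(adj, 0)                 # plain BFS from vertex 0
two_hops = bfs_levels(adj, 0, max_hops=2) # 2-hops query from vertex 0
print(full)
print(two_hops)
```

Both queries touch neighbors of neighbors, so in a distributed setting their cost depends heavily on which machine holds each visited vertex.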

Experiments
Plots: Throughput (higher is better), Imbalance (lower is better)