Introduction to GraphBLAS: Historical overview. Gábor Szárnyas, szarnyas@mit.bme.hu
Overview of graph processing
GRAPH PROCESSING CHALLENGES § Connectedness: the “curse of connectedness” § Computer architectures: contemporary computer architectures are good at processing linear and hierarchical data structures, such as lists, stacks, or trees § Caching and parallelization: a massive amount of random data access is required, the CPU has frequent cache misses, and implementing parallelism is difficult B. Shao, Y. Li, H. Wang, H. Xia (Microsoft Research): Trinity Graph Engine and its Applications, IEEE Data Engineering Bulletin 2017
VERTEX-CENTRIC PROGRAMMING MODEL § Pregel paper § SIGMOD 2020 “Test of Time” award § Enables distributed and fault-tolerant processing § TODO: edge-centric vs. vertex-centric G. Malewicz et al. (Google): Pregel: A System for Large-Scale Graph Processing, SIGMOD 2010
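To make the vertex-centric idea concrete, here is a minimal single-machine sketch of a Pregel-style computation that propagates the maximum vertex value through a graph. It only imitates the model (supersteps, message passing, vertices going quiet when nothing changes). The function name `pregel_max` and the dictionary-based graph encoding are illustrative assumptions, not part of any real Pregel API, and distribution and fault tolerance are left out entirely.

```python
def pregel_max(graph, values):
    """Propagate the maximum value to every reachable vertex.

    graph:  dict mapping each vertex to a list of its out-neighbours
    values: dict mapping each vertex to its initial integer value
    (Both encodings are assumptions made for this sketch.)
    """
    values = dict(values)  # do not modify the caller's dict

    # Superstep 0: every vertex is active and sends its value to its neighbours.
    inbox = {v: [] for v in graph}
    for v, neighbours in graph.items():
        for n in neighbours:
            inbox[n].append(values[v])

    # Later supersteps: a vertex does work only if it received messages.
    while any(inbox.values()):
        outbox = {v: [] for v in graph}
        for v, messages in inbox.items():
            if messages and max(messages) > values[v]:
                # The vertex learned a larger value: update and notify neighbours.
                values[v] = max(messages)
                for n in graph[v]:
                    outbox[n].append(values[v])
            # Otherwise the vertex sends nothing (it "votes to halt").
        inbox = outbox  # synchronisation barrier between supersteps

    return values


if __name__ == "__main__":
    # A small undirected path a - b - c - d, stored as directed edges both ways.
    g = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    print(pregel_max(g, {"a": 3, "b": 6, "c": 2, "d": 1}))
```

The computation stops when no messages are in flight, which mirrors the halting condition of the vertex-centric model.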
DISTRIBUTED GRAPH PROCESSING § Many recent works focused on computation models for distributed execution, allowing systems to scale out. o BSP (Bulk Synchronous Parallel) model by Leslie Valiant o MapReduce and Pregel models by Google o Vertex-centric, Scatter-Gather, Gather-Apply-Scatter models o Apache projects: Giraph, Spark GraphX, Flink Gelly, Hama § Lots of research, summarized in survey/experiments papers V. Kalavri et al.: High-Level Programming Abstractions for Distributed Graph Processing, TKDE 2018 O. Batarfi et al.: Large scale graph processing systems: survey and an experimental evaluation, Cluster 2015 K. Ammar, M. T. Özsu: Experimental Analysis of Distributed Graph Systems, VLDB 2018 M. T. Özsu: Graph Processing: A Panoramic View and Some Open Problems, VLDB 2019
SCALING OUT VS. SCALING UP § Distributed approaches are scalable but comparatively slow o large communication overhead o load balancing issues due to irregular distributions § Many systems struggle to outperform a single-threaded setup o COST = Configuration that Outperforms a Single Thread § Alternatives: o Partition-centric programming model (Blogel, etc.) o Linear algebra-based programming model F. McSherry et al.: Scalability! But at what COST? HotOS 2015 N. Satish et al.: Navigating the Maze of Graph Analytics Frameworks using Massive Graph Datasets, SIGMOD 2014
LINEAR ALGEBRA-BASED GRAPH PROCESSING § Graphs are encoded as sparse adjacency matrices. § Use vector/matrix operations to express graph algorithms. [Figure: a frontier vector and an adjacency matrix]
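As an illustration of expressing a graph algorithm with vector/matrix operations, the sketch below runs a breadth-first search where each step is a sparse vector-matrix product of the frontier with the adjacency matrix over the Boolean (OR, AND) semiring. The encodings are assumptions chosen for brevity: the sparse matrix is a dict from row index to the set of column indices holding a nonzero, and the frontier is a set of vertex indices. The names `vxm_bool` and `bfs_levels` are hypothetical and do not come from the GraphBLAS API.

```python
def vxm_bool(frontier, A):
    """Sparse Boolean vector-matrix product over the (OR, AND) semiring.

    frontier: set of indices where the vector is true
    A:        dict mapping row i to the set of columns j with A[i][j] = true
    Returns the set of columns reachable in one step from the frontier.
    """
    result = set()
    for i in frontier:
        result |= A.get(i, set())  # OR together the selected rows
    return result


def bfs_levels(A, source):
    """Return a dict vertex -> BFS level, starting from `source` at level 0."""
    levels = {source: 0}
    frontier = {source}
    level = 0
    while frontier:
        level += 1
        # Multiply, then mask out vertices that were already visited.
        frontier = {v for v in vxm_bool(frontier, A) if v not in levels}
        for v in frontier:
            levels[v] = level
    return levels


if __name__ == "__main__":
    # Directed graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3, 3 -> 4
    A = {0: {1, 2}, 1: {3}, 2: {3}, 3: {4}}
    print(bfs_levels(A, 0))
```

Masking with the visited set after each product is what keeps the traversal from revisiting vertices; a real sparse linear algebra library would typically fuse the mask into the multiplication.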
GRAPHBLAS TIMELINE § Legend: book, papers, GraphBLAS standards, SuiteSparse:GraphBLAS releases § Time axis: 2011, 2012, 2013, 2016, 2017, 2018, 2019, 2020, … § Graph Algorithms in the Language of Linear Algebra § Standards for graph algorithm primitives, HPEC § Mathematical foundations, HPEC § C API, GABB@IPDPS § LAGraph, GrAPL@IPDPS § SuiteSparse:GraphBLAS, TOMS
The duality between graphs and matrices
Dénes Kőnig’s work § 1916: On graphs and their application to determinant and set theory § 1931: Graphs and matrices
TEXTBOOKS ON SEMIRING-BASED GRAPH PROCESSING § 1974: Aho-Hopcroft-Ullman book o The Design and Analysis of Computer Algorithms • Sec. 5.6: Path-finding problems • Sec. 5.9: Path problems and matrix multiplication § 1990: Cormen-Leiserson-Rivest book o Introduction to Algorithms • Sec. 26.4: A general framework for solving path problems in directed graphs § 2011: GALLA book (edited by Kepner and Gilbert) o Graph Algorithms in the Language of Linear Algebra A lot of literature but few practical implementations.
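One well-known instance of the link between path problems and matrix multiplication is all-pairs shortest paths over the (min, +) semiring, where "addition" is `min` and "multiplication" is `+`. The sketch below uses dense lists of lists and repeated squaring for clarity. It is an illustration under these assumptions, not a transcription of any of the textbooks' pseudocode, and the names `min_plus` and `all_pairs_shortest_paths` are hypothetical.

```python
INF = float("inf")  # marks "no edge" in the weight matrix


def min_plus(A, B):
    """Matrix product over the (min, +) semiring for square matrices."""
    n = len(A)
    return [
        [min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def all_pairs_shortest_paths(W):
    """Shortest path lengths for a weight matrix W.

    Assumes W[i][i] == 0, W[i][j] == INF when there is no edge,
    and that the graph has no negative cycles.
    """
    n = len(W)
    D = W
    length = 1  # D currently covers paths of at most `length` edges
    while length < n - 1:
        D = min_plus(D, D)  # squaring doubles the covered path length
        length *= 2
    return D


if __name__ == "__main__":
    W = [
        [0, 3, INF],
        [INF, 0, 1],
        [INF, INF, 0],
    ]
    for row in all_pairs_shortest_paths(W):
        print(row)
```

Swapping the pair of operations (for example, OR and AND for reachability) changes the problem being solved without changing the structure of the code, which is the core idea behind semiring-based graph processing.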
Open questions
SOME OPEN QUESTIONS D. G. Spampinato et al.: Linear Algebraic Depth-First Search, ARRAYS@PLDI 2019 Y. Ahmad et al.: LA3: A Scalable Link- and Locality-Aware Linear Algebra-Based Graph Analytics System, VLDB 2018