INTRO TO GRAPHDBS Brief introduction to graph DB concepts
ABOUT ME
    CREATE p = (person:Person {name: 'Jen', email: 'jenparker1975@gmail.com', github: 'https://github.com/jenparker1975'})
        -[:WORKS_AT {since: 2013}]->
        (company:Company {name: 'HealthcareSource', tag: 'Leading provider of talent management solutions for Healthcare'})
    RETURN p

    MATCH (person:Person {name: 'Jen'})
    CREATE (person)-[:IS_LEARNING]->(technology:Technology {name: 'Neo4j'})
AGENDA
1. What's a graph DB anyway?
2. Core Concepts
3. DBs with Benefits…
4. Popular GraphDB Engines
5. Complex Use Cases
6. Diabook – Social Network
7. Building the Network
8. Questions/Links
WHAT’S A GRAPHDB ANYWAY?
GRAPHS ARE EVERYWHERE!
SO, WHAT IS A GRAPHDB?
- The data model is represented by nodes and relationships
- Uses graph structures to semantically represent objects and relationships
- Relationships are first-class citizens and can have properties of their own
- Allows simple and fast retrieval of complex hierarchical structures
- Directly relates data items in the store, so data can be linked together
TYPICAL USE CASES
- Social networks
- Recommendation engines
- Path finding (what is the shortest path from x to y?)
- Network topology diagrams
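The path-finding use case can be sketched in a few lines. This is a minimal breadth-first search over an unweighted, directed adjacency mapping, not how any particular graph engine implements it; the `network` data and the `shortest_path` helper are hypothetical names made up for illustration.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search: returns the first (fewest-hop) path found, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None

# Hypothetical topology: each key lists the nodes it links to.
network = {
    "x": ["a", "b"],
    "a": ["c"],
    "b": ["y"],
    "c": ["y"],
}
path = shortest_path(network, "x", "y")
print(path)
```

Because the search expands one hop at a time, the first path that reaches the goal has the fewest relationships.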
CORE CONCEPTS
BUILDING BLOCKS
- Nodes
- Relationships
- Properties
- Labels
NODES
- Nodes represent entities and complex types
- Nodes can contain properties
- Each node can have different properties
RELATIONSHIPS
- Every relationship has a name and a direction
- Relationships can contain properties, which further clarify the relationship
- Every relationship must have a start node and an end node
PROPERTIES
- Key-value pairs used on nodes and relationships
- Add metadata to your nodes and relationships
- On nodes: entity attributes
- On relationships: relationship qualities
LABELS
- Used to represent objects in your domain (e.g. user, person, movie)
- With labels, you can group nodes
- Labels let us create indexes and constraints on groups of nodes
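The four building blocks can be pictured as plain data structures. The sketch below is only an in-memory illustration of the property graph idea, not Neo4j's storage model; the `Node` and `Relationship` classes are hypothetical, and the sample values come from the "About me" slide.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    labels: set                                      # e.g. {"Person"}
    properties: dict = field(default_factory=dict)   # key-value pairs

@dataclass
class Relationship:
    name: str                                        # every relationship has a name
    start: Node                                      # ...a start node
    end: Node                                        # ...and an end node (direction)
    properties: dict = field(default_factory=dict)

jen = Node({"Person"}, {"name": "Jen"})
company = Node({"Company"}, {"name": "HealthcareSource"})
works_at = Relationship("WORKS_AT", jen, company, {"since": 2013})

# Labels group nodes: select every node carrying the Person label.
nodes = [jen, company]
people = [n for n in nodes if "Person" in n.labels]
```

Note that the two nodes carry different property keys, and that the relationship holds its own `since` property rather than borrowing a column from either endpoint.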
DBS WITH BENEFITS… GraphDBs focus on relationships over normalization
GRAPH DB VS RELATIONAL DB
[Figure: side-by-side panels labeled "Graph DB" and "Relational DB"]
GRAPH DATABASES: PROS AND CONS
Pros:
- Easy to query
- Ability to connect disparate data easily without needing a common data model
Cons:
- Requires a different way to think about data
- No single graph query language
POPULAR GRAPHDB ENGINES
Pros:
- Runs complex distributed queries
- Scales out through sharded storage
- Returns data natively in JSON, making it ideally suited for web development
- Written on top of GraphQL
Cons:
- No native Windows installation
- No support for Windows in a production environment
Pros:
- Multi-model DB: both graph and document DB
- Easily add users/roles
- Supports multiple databases
Cons:
- No native Windows service installation
- Requires more schema design up front
Pros:
- Runs on Windows natively, either in a console or as a service
- 24/7 production support since 2003 (mature)
- Large and active user community
Cons:
- Only one DB can be running on one port at a time
WHAT DOES NEO4J PROVIDE?
- Full ACID (atomicity, consistency, isolation, durability)
- REST API
- Property graph
- Lucene index
- High availability (with Enterprise Edition)
CONSIDER USING NEO4J IF YOU'VE EVER DONE ANY OF THE FOLLOWING:
- Written a recursive CTE
- Had a ParentId as a self-referencing foreign key in a table
- Joined more than 7 tables together
- Needed to relate disparate, non-uniform data
"Neo4j helps us to understand our online shoppers' behavior and the relationship between our customers and products, providing a perfect tool for real-time product recommendations. As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands. It suits our needs very well." – Marcos Wada, Software Developer, Walmart

"Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code. At the same time, Neo4j allowed us to add functionality that was previously not possible." – Volker Pacher, Senior Developer, eBay
MORE COMPLEX USE CASES
ORGANIZATION LEARNING
https://neo4j.com/graphgist/a123a6fcd881-4206-b42a-f864b7bfbbd3
What courses do I have to take to get my Certification?
    MATCH (c:Certification {name: "Certification"})-[:NEXT_LEARNING]->(learning:LearningItem)-[:FULFILLED_BY]->(course:Course)
    RETURN course.name
FRAUD DETECTION
https://neo4j.com/graphgist/9d627127-003b-411a-b3cef8d3970c2afa#listing_category=fraud-detection
How many account holders have duplicate contact information?
    MATCH (accountHolder:AccountHolder)-[]->(contactInformation)
    WITH contactInformation, count(accountHolder) AS RingSize
    MATCH (contactInformation)<-[]-(accountHolder)
    WITH collect(accountHolder.UniqueId) AS AccountHolders, contactInformation, RingSize
    WHERE RingSize > 1
    RETURN AccountHolders AS FraudRing, labels(contactInformation) AS ContactType, RingSize
    ORDER BY RingSize DESC
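The logic of the fraud-ring query can be traced by hand on a small data set. The sketch below mirrors it in plain Python: group account holders by the contact information they point at, keep groups with more than one holder, and sort by size. The `links` rows and the `find_fraud_rings` helper are hypothetical; nothing here comes from the graph gist's actual data.

```python
from collections import defaultdict

# Hypothetical (account holder id, contact type, contact value) links.
links = [
    ("holder-1", "Address", "123 Main St"),
    ("holder-2", "Address", "123 Main St"),
    ("holder-3", "Address", "123 Main St"),
    ("holder-1", "PhoneNumber", "555-0100"),
    ("holder-2", "PhoneNumber", "555-0100"),
    ("holder-3", "PhoneNumber", "555-0199"),
]

def find_fraud_rings(links):
    """Group holders by shared contact info; keep groups larger than one."""
    holders_by_contact = defaultdict(list)
    for holder, contact_type, value in links:
        holders_by_contact[(contact_type, value)].append(holder)
    rings = [
        {"fraud_ring": holders, "contact_type": ctype, "ring_size": len(holders)}
        for (ctype, _value), holders in holders_by_contact.items()
        if len(holders) > 1
    ]
    # Largest rings first, like ORDER BY RingSize DESC.
    return sorted(rings, key=lambda r: r["ring_size"], reverse=True)

rings = find_fraud_rings(links)
for ring in rings:
    print(ring)
```

In the graph version the grouping key is the contact-information node itself, so no join table or composite key is needed.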
DIABOOK – SOCIAL NETWORK
Example using Type 1 Diabetes
Disclaimer: all data presented is fictional
YOU CAN'T MODEL THAT (ISH) IN SQL
- The SQL becomes more complex as the relationship paths get longer
- Join performance quickly becomes an issue
- SQL is not well suited to modeling rich domains
- It's not easy to start at one row and follow relevant relationships along a path
SQL MODEL
FIND FRIENDS OF FRIENDS THAT HAVE TYPE 1 DIABETES
    SELECT Me.PersonId AS MeId, Me.Name,
           FriendOfFriend.RelatedPersonId AS SuggestedFriendId, FriendOfAFriend.Name
    FROM Person AS Me
    INNER JOIN PersonRelationship AS MyFriends
        ON MyFriends.PersonId = Me.PersonId
    INNER JOIN PersonRelationship AS FriendOfFriend
        ON MyFriends.RelatedPersonId = FriendOfFriend.PersonId
    INNER JOIN Person AS FriendOfAFriend
        ON FriendOfFriend.RelatedPersonId = FriendOfAFriend.PersonId
    LEFT JOIN PersonRelationship AS FriendsWithMe
        ON Me.PersonId = FriendsWithMe.PersonId
        AND FriendOfFriend.RelatedPersonId = FriendsWithMe.RelatedPersonId
    INNER JOIN PersonDisease
        ON PersonDisease.PersonId = FriendOfAFriend.PersonId
    WHERE FriendsWithMe.PersonId IS NULL
        AND Me.PersonId <> FriendOfFriend.RelatedPersonId
        AND Me.Name = 'Bill'
        AND PersonDisease.Id = 1
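To see what each join in that query contributes, it can be run against a tiny in-memory SQLite database. The table layout below is an assumption inferred from the column names in the query (the slide's schema diagram is not reproduced here), `PersonDisease.Id` is assumed to identify the disease, and the rows, including the person "Tom", are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (PersonId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE PersonRelationship (PersonId INTEGER, RelatedPersonId INTEGER);
CREATE TABLE PersonDisease (Id INTEGER, PersonId INTEGER);  -- assumed layout
INSERT INTO Person VALUES (1, 'Bill'), (2, 'Jan'), (3, 'Samantha'), (4, 'Tom');
-- Bill <-> Jan, Jan -> Samantha, Jan -> Tom
INSERT INTO PersonRelationship VALUES (1, 2), (2, 1), (2, 3), (2, 4);
-- Samantha has disease 1, Tom has disease 2
INSERT INTO PersonDisease VALUES (1, 3), (2, 4);
""")

QUERY = """
SELECT Me.PersonId AS MeId, Me.Name,
       FriendOfFriend.RelatedPersonId AS SuggestedFriendId, FriendOfAFriend.Name
FROM Person AS Me
INNER JOIN PersonRelationship AS MyFriends
    ON MyFriends.PersonId = Me.PersonId
INNER JOIN PersonRelationship AS FriendOfFriend
    ON MyFriends.RelatedPersonId = FriendOfFriend.PersonId
INNER JOIN Person AS FriendOfAFriend
    ON FriendOfFriend.RelatedPersonId = FriendOfAFriend.PersonId
LEFT JOIN PersonRelationship AS FriendsWithMe
    ON Me.PersonId = FriendsWithMe.PersonId
    AND FriendOfFriend.RelatedPersonId = FriendsWithMe.RelatedPersonId
INNER JOIN PersonDisease
    ON PersonDisease.PersonId = FriendOfAFriend.PersonId
WHERE FriendsWithMe.PersonId IS NULL
    AND Me.PersonId <> FriendOfFriend.RelatedPersonId
    AND Me.Name = 'Bill'
    AND PersonDisease.Id = 1
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

With this data the only suggestion for Bill is Samantha: Tom is filtered out by the disease condition, and Bill himself is excluded by the `<>` check. Each extra hop would need another pair of joins.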
NEO4J MODEL
NEO4J PROPERTY GRAPH
FIND FRIENDS OF FRIENDS THAT HAVE TYPE 1 DIABETES
    MATCH (user:Person {name: 'Bill'})-[:FRIENDS_WITH*2..5]->(fof)-[:DIAGNOSED_WITH]->(disease)
    RETURN fof
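The `*2..5` part of the pattern means "follow between 2 and 5 FRIENDS_WITH relationships". A rough Python approximation is sketched below. It differs from Cypher in two stated ways: it returns a set of distinct people rather than one row per path, and it never revisits a person on a path (Cypher only forbids reusing a relationship). The `friends_within` helper, the sample data, and the names "Tom" and "Ann" are hypothetical.

```python
def friends_within(friends, start, min_hops=2, max_hops=5):
    """People reachable from start by a simple path of min_hops..max_hops edges."""
    found = set()

    def walk(person, path):
        hops = len(path) - 1
        if hops >= min_hops:
            found.add(person)
        if hops == max_hops:
            return
        for friend in friends.get(person, []):
            if friend not in path:  # simple paths only (an assumption, see above)
                walk(friend, path + [friend])

    walk(start, [start])
    return found

# Directed FRIENDS_WITH edges and the set of people with a DIAGNOSED_WITH edge.
friends = {"Bill": ["Jan"], "Jan": ["Samantha", "Tom"], "Samantha": ["Ann"]}
diagnosed = {"Jan", "Samantha", "Ann"}

suggestions = friends_within(friends, "Bill") & diagnosed
print(sorted(suggestions))
```

Jan is diagnosed but only one hop away, so she is not suggested; widening or narrowing the search is a matter of changing the hop bounds, not adding joins.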
BUILDING THE NETWORK Creating our small social network
CREATING NODES
- Manually create nodes without relationships:
    CREATE (person:Person {name: 'Jan', age: '42'}) RETURN person
- Manually create nodes with relationships:
    CREATE p = (person:Person {name: 'Bill', age: '14'})-[:DIAGNOSED_WITH]->(disease:Disease {name: 'Type 1 Diabetes'})
    RETURN p
ADDING RELATIONSHIPS
- Add a relationship between two existing Person nodes:
    MATCH (p:Person {name: 'Jan'}), (f:Person {name: 'Samantha'})
    CREATE (p)-[:FRIENDS_WITH {since: 2009}]->(f)
UPDATING NODE PROPERTIES
- Set additional properties on a node:
    MATCH (person:Person {name: 'Jan'})
    SET person.profession = 'Software Engineer'
    RETURN person
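DELETING RELATIONSHIPS AND NODES
- Delete relationships (this pattern matches every FRIENDS_WITH relationship):
    MATCH ()-[r:FRIENDS_WITH]-() DELETE r
- Delete a node (it must have no relationships left; otherwise use DETACH DELETE):
    MATCH (a:Camp) WHERE a.name = 'Joselin Diabetes Camp' DELETE a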
REST API
- POST to http://localhost:7474/db/data/transaction/commit
    { "statements": [ { "statement": "CREATE (n) RETURN id(n)" } ] }
- Can be used to execute multiple statements, or to begin, roll back, or commit a transaction
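Building that request body by hand gets error-prone once several statements are involved. The sketch below assembles the same JSON shape with Python's standard library and prepares, but does not send, the POST. The URL is the one from the slide and newer Neo4j versions may expose a different path; the `build_request` helper and the second Cypher statement are hypothetical.

```python
import json
import urllib.request

# Endpoint as given on the slide; treat it as an assumption for your server version.
URL = "http://localhost:7474/db/data/transaction/commit"

def build_request(statements):
    """Wrap Cypher strings in the {"statements": [...]} body and prepare a POST."""
    body = json.dumps({"statements": [{"statement": s} for s in statements]})
    return urllib.request.Request(
        URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )

request = build_request([
    "CREATE (n) RETURN id(n)",
    "MATCH (n) RETURN count(n)",
])
print(request.data.decode("utf-8"))

# To send it to a running server (authentication may be required):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

All statements in one request are sent to the commit endpoint together, which matches the slide's note about executing multiple statements.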
HELPFUL LINKS
- https://neo4j.com/graphgists/ - Graph gists
- https://neo4j.com/developer/cypher/ - Cypher query language
- https://github.com/Readify/Neo4jClient/wiki - Neo4jClient documentation