IKT 437 Knowledge Engineering and Representation: NoSQL

IKT 437 Knowledge Engineering and Representation
NoSQL ~ No SQL or Not Only SQL
Jan Pettersen Nytun, UiA

Overview
• Introduction and Motivation
• Categories of NoSQL
• Examples of NoSQL systems
• Encodings
• Querying
• Examples
• Summary

NoSQL comes in many different variants
Some possible characteristics (a given system may not support all of them):
• Non-relational
• Flexible schema
• Query languages other than, or in addition to, SQL
• Distributed, with horizontal scaling
• Less structured data
• Supports big data

The Benefits of NoSQL [https://www.mongodb.com/nosql-explained]
When compared to relational databases, NoSQL databases are more scalable and provide superior performance, and their data model addresses several issues that the relational model is not designed to address:
• Geographically distributed architecture instead of expensive, monolithic architecture
• Large volumes of rapidly changing structured, semi-structured, and unstructured data
• Agile sprints, quick schema iteration, and frequent code pushes
• Object-oriented programming that is easy to use and flexible

[ref: http://www.cs.tut.fi/~tjm/seminars/nosql2012/NoSQL-Intro.pdf]

Overview
• Introduction and Motivation
• Categories of NoSQL
• Examples of NoSQL systems
• Encodings
• Querying
• Examples
• Summary

NoSQL Database Types [https://www.mongodb.com/nosql-explained]
• Graph stores are used to store information about networks of data, such as social connections. Graph stores include Neo4j and triple stores like Fuseki.
• Document databases pair each key with a complex data structure known as a document.
• Key-value stores are the simplest NoSQL databases. Every single item in the database is stored as an attribute name (or 'key'), together with its value. Examples of key-value stores are Riak and Berkeley DB.
• Wide-column stores such as Cassandra and HBase are optimized for queries over large datasets, and store columns of data together, instead of rows.

Document Store
• The central concept is the notion of a "document", which roughly corresponds to a row in an RDBMS.
• A document is encoded in a standard format such as JSON (or BSON).
• Documents are addressed in the database via a unique key that represents that document.
• The database offers an API or query language that retrieves documents based on their contents.
• Documents are schema free, i.e., different documents can have structures and schemas that differ from one another. (An RDBMS requires that each row contain the same columns.)
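The bullets above can be illustrated with a minimal in-memory sketch in Java. It is not the API of MongoDB or of any other real document database: the class `DocumentStoreSketch`, its methods, and the keys `doc-1` and `doc-2` are hypothetical names chosen for illustration. It only shows the three ideas named on the slide: documents addressed by a unique key, documents with differing fields, and retrieval based on content.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical in-memory stand-in for a document store; not a real database API.
class DocumentStoreSketch {
    // Each document is a free-form field map, addressed by a unique key.
    private final Map<String, Map<String, Object>> documents = new LinkedHashMap<>();

    void put(String id, Map<String, Object> document) {
        documents.put(id, document);
    }

    Map<String, Object> get(String id) {
        return documents.get(id);
    }

    // Content-based retrieval: keys of all documents whose field equals the value.
    List<String> findBy(String field, Object value) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, Map<String, Object>> entry : documents.entrySet()) {
            if (Objects.equals(entry.getValue().get(field), value)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        DocumentStoreSketch store = new DocumentStoreSketch();
        // The two documents have different fields: no shared schema is required.
        store.put("doc-1", Map.of("type", "Article", "author", "Derick Rethans"));
        store.put("doc-2", Map.of("type", "Book", "author", "Derick Rethans",
                                  "isbn", "978-0-9738621-5-7"));
        System.out.println(store.findBy("author", "Derick Rethans")); // prints [doc-1, doc-2]
        System.out.println(store.findBy("type", "Book"));             // prints [doc-2]
    }
}
```

A real document database would add persistence, indexes on fields, and a richer query language; the linear scan in `findBy` is only the simplest way to show content-based lookup.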

Two MongoDB documents (JSON):
    {
      _id: ObjectId("51156a1e056d6f966f268f81"),
      type: "Article",
      author: "Derick Rethans",
      title: "Introduction to Document Databases with MongoDB",
      date: ISODate("2013-04-24T16:26:31.911Z"),
      body: "This arti…"
    },
    {
      _id: ObjectId("51156a1e056d6f966f268f82"),
      type: "Book",
      author: "Derick Rethans",
      title: "php|architect's Guide to Date and Time Programming with PHP",
      isbn: "978-0-9738621-5-7"
    }

What's the most popular NoSQL database? [https://www.quora.com/Whats-the-most-popular-NoSQL-database]
Vadim Ismakaev, Co-Founder at Grace. Updated Apr 27, 2015
• Asking “what NoSQL database is the most popular” is a bit incorrect since different problems require different types of NoSQL solutions. …focus on solving very specific problems. While this allows to achieve the best possible results in those specific cases, it comes at a cost of some other functionalities.

So, what's the most popular NoSQL database?
Top NoSQL Database Engines by http://www.kdnuggets.com/2016/06/top-nosql-database-engines.html
Next two slides:

Method of calculating the scores of the DB-Engines Ranking [http://db-engines.com/en/ranking_definition]
We measure the popularity of a system by using the following parameters:
• Number of mentions of the system on websites, …
• General interest in the system. For this measurement, we use the frequency of searches in Google Trends.
• Frequency of technical discussions about the system... Stack Overflow …
• Number of job offers, in which the system is mentioned...
• Number of profiles in professional networks, in which the system is mentioned... LinkedIn …
• Relevance in social networks. We count the number of Twitter tweets, in which the system is mentioned.

[http://www.kdnuggets.com/2016/06/top-nosql-database-engines.html]
• Document databases: MongoDB
• Wide-column stores: Cassandra and HBase
• Key-value: Redis
• Graph database: Neo4j

[http://db-engines.com/en/ranking_trend]
The DB-Engines Ranking ranks database management systems according to their popularity (all kinds of systems, not only NoSQL databases).

Neo4j
• Graph-oriented
• Implemented in Java and accessible from software written in other languages using the Cypher query language through a transactional HTTP endpoint.
• ACID-compliant transactional database with native graph storage and processing.
• The most popular graph database in the rankings cited above.
• Everything is stored as an edge, a node or an attribute.
• Each node and edge can have any number of attributes.
• Both the nodes and edges can be labelled.
• Labels can be used to narrow searches.

The following slides are copied from a presentation made by Jim Webber, Neo4j.

[Figure: example graph of characters and episodes. Relationship labels: from, stole, loves, enemy, companion, appeared in. Episode nodes: "Victory of the Daleks", "A Good Man Goes to War".]

Property Graph Model
[Figure: example property graph.]
Nodes shown with their properties:
• first name: Rose, last name: Tyler
• name: the Doctor, age: 907, species: Time Lord
• vehicle: tardis, model: Type 40
Relationships shown: TRAVELS_WITH, LOVES, TRAVELS_IN, BORROWED (year: 1963)

Graphs are very whiteboard-friendly

What's Neo4j?
• It is a graph database
• Embeddable and server
• Full ACID transactions: don't mess around with durability, ever.
• Schema free

More on Neo4j
• Neo4j is stable
  – In 24/7 operation since 2003
• Neo4j is under active development
• High performance graph operations
  – Traverses 1,000+ relationships / second on commodity hardware

Neo4j Logical Architecture
[Figure: layered architecture diagram. Components shown: REST API; JVM Language Bindings (Java, Ruby, …, Clojure); Traversal Framework; Graph Matching; Core API; Caches; Memory-Mapped (N)IO; Filesystem.]

Data access is programmatic
• Through the Java APIs
  – JVM languages have bindings to the same APIs
    • JRuby, Jython, Clojure, Scala…
• Managing nodes and relationships
• Indexing
• Traversing
• Path finding
• Pattern matching

Core API
• Deals with graphs in terms of their fundamentals:
  – Nodes
    • Properties (KV pairs)
  – Relationships
    • Start node
    • End node
    • Properties (KV pairs)
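As a rough illustration of these fundamentals, the sketch below models nodes and relationships with plain Java collections. The types `PropertyGraphSketch.Node` and `PropertyGraphSketch.Relationship` are hypothetical stand-ins, not the Neo4j classes. The sample values come from the Property Graph Model slide, and the direction of the BORROWED relationship is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types for illustration; these are not the Neo4j classes.
class PropertyGraphSketch {

    // A node carries key-value properties and knows its outgoing relationships.
    static class Node {
        final Map<String, Object> properties = new HashMap<>();
        final List<Relationship> outgoing = new ArrayList<>();

        Node set(String key, Object value) {
            properties.put(key, value);
            return this;
        }

        Relationship relateTo(Node end, String type) {
            Relationship relationship = new Relationship(this, end, type);
            outgoing.add(relationship);
            return relationship;
        }
    }

    // A relationship has a start node, an end node, a type, and its own properties.
    static class Relationship {
        final Node start;
        final Node end;
        final String type;
        final Map<String, Object> properties = new HashMap<>();

        Relationship(Node start, Node end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        Relationship set(String key, Object value) {
            properties.put(key, value);
            return this;
        }
    }

    public static void main(String[] args) {
        Node doctor = new Node().set("name", "the Doctor").set("species", "Time Lord");
        Node tardis = new Node().set("vehicle", "tardis").set("model", "Type 40");
        // Assumption: BORROWED points from the Doctor to the tardis.
        Relationship borrowed = doctor.relateTo(tardis, "BORROWED").set("year", 1963);
        System.out.println(borrowed.start.properties.get("name")
                + " -[" + borrowed.type + "]-> "
                + borrowed.end.properties.get("vehicle"));
    }
}
```

Storing the relationship on its start node means a traversal can follow `outgoing` directly, without a separate lookup structure.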

Creating Nodes
    // Neo4j 1.x embedded Java API, as used on the slide
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.kernel.EmbeddedGraphDatabase;

    GraphDatabaseService db = new EmbeddedGraphDatabase("/tmp/neo");
    Transaction tx = db.beginTx();
    try {
        Node theDoctor = db.createNode();
        theDoctor.setProperty("character", "the Doctor");
        tx.success();   // mark the transaction as successful
    } finally {
        tx.finish();    // commits if success() was called, otherwise rolls back
    }

Creating Relationships
    // Reuses db from the previous slide; additionally needs
    // import org.neo4j.graphdb.DynamicRelationshipType;
    Transaction tx = db.beginTx();
    try {
        Node theDoctor = db.createNode();
        theDoctor.setProperty("character", "The Doctor");

        Node susan = db.createNode();
        susan.setProperty("firstname", "Susan");
        susan.setProperty("lastname", "Campbell");

        // Directed relationship: susan -[COMPANION_OF]-> theDoctor
        susan.createRelationshipTo(theDoctor,
            DynamicRelationshipType.withName("COMPANION_OF"));
        tx.success();
    } finally {
        tx.finish();
    }

Indexing a Graph?
• Graphs are their own indexes!
• But sometimes we want short-cuts to well-known nodes
• Can do this in our own code
  – Just keep a reference to any interesting nodes
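One simple way to "keep a reference" in application code is a map from a chosen name to the node object. The sketch below assumes this approach. `NodeShortcutSketch` and its nested `Node` class are hypothetical and stand in for whatever node type the application uses.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: shortcuts to well-known nodes kept in application code.
class NodeShortcutSketch {

    // Minimal stand-in for a graph node; a real application would use its own node type.
    static class Node {
        final String name;

        Node(String name) {
            this.name = name;
        }
    }

    private final Map<String, Node> wellKnownNodes = new HashMap<>();

    // Remember a node under a key chosen by the application.
    void remember(String key, Node node) {
        wellKnownNodes.put(key, node);
    }

    // Returns the remembered node, or null if nothing was stored under the key.
    Node lookup(String key) {
        return wellKnownNodes.get(key);
    }

    public static void main(String[] args) {
        NodeShortcutSketch shortcuts = new NodeShortcutSketch();
        Node doctor = new Node("the Doctor");
        shortcuts.remember("doctor", doctor);
        // Start a traversal from the remembered node instead of searching the graph.
        System.out.println(shortcuts.lookup("doctor").name);
    }
}
```

The map only holds entry points. Everything reachable from them is still found by following relationships, which is the sense in which the graph is its own index.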

Why graph matching?
• It's super-powerful for looking for patterns in a data set
  – E.g. retail analytics
• Higher-level abstraction than raw traversers
  – You do less work!
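To make "looking for patterns" concrete, here is a small sketch that finds every two-step chain (a)-[firstType]->(b)-[secondType]->(c) in a list of edges. It is a hand-written illustration in plain Java, not Neo4j's graph-matching component. `PatternMatchSketch`, `Edge`, and `matchChain` are hypothetical names, and the edge directions in the sample data are assumptions based on the Property Graph Model slide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not Neo4j's graph-matching component.
class PatternMatchSketch {

    // A directed, typed edge between two named nodes.
    static class Edge {
        final String from;
        final String type;
        final String to;

        Edge(String from, String type, String to) {
            this.from = from;
            this.type = type;
            this.to = to;
        }
    }

    // Finds every chain (a)-[firstType]->(b)-[secondType]->(c) in the edge list.
    static List<String> matchChain(List<Edge> edges, String firstType, String secondType) {
        List<String> matches = new ArrayList<>();
        for (Edge first : edges) {
            if (!first.type.equals(firstType)) {
                continue;
            }
            for (Edge second : edges) {
                // The second edge must start where the first one ends.
                if (second.type.equals(secondType) && second.from.equals(first.to)) {
                    matches.add(first.from + " -> " + first.to + " -> " + second.to);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("Rose", "TRAVELS_WITH", "the Doctor"));
        edges.add(new Edge("Rose", "LOVES", "the Doctor"));
        edges.add(new Edge("the Doctor", "TRAVELS_IN", "tardis"));
        // Who travels with someone who travels in a vehicle?
        System.out.println(matchChain(edges, "TRAVELS_WITH", "TRAVELS_IN"));
        // prints [Rose -> the Doctor -> tardis]
    }
}
```

The caller states the shape of the pattern and gets back the matches, rather than writing the traversal loop each time. That is the "higher-level abstraction" point of the slide, here in its most naive quadratic form.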