Loading a Database to SEMOSS Ingrid Narvaez Fall 2013 Dr. Mongi Abidi
Outline • Introduction – Overview of Key Elements – Additional Details • Creating a SEMOSS Database – Metamodel – Generating Loading Sheets – Building the Database – Questions Sheet Slide 2
Introduction • Objective – Demonstrate how a database can be uploaded into SEMOSS for later analysis. • Key elements that require basic knowledge – SPARQL – RDF data formatting – SEMOSS Slide 3
Overview of Key Elements • SPARQL – Query language for retrieving data stored in RDF format. • RDF data formatting – SPARQL queries are executed against the information contained in RDF files. • SEMOSS – Visualization tool for analyzing the results of a query. Slide 4
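To make the relationship between RDF data and a query concrete, here is a minimal Python sketch of pattern matching over triples. It is not SPARQL and not part of SEMOSS; the `match` helper and the sample workstation names and IP addresses are hypothetical and serve only as illustration.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# The names below are hypothetical sample data, not taken from the dataset.
TRIPLES = {
    ("Workstation1", "Accesses", "10.0.0.5"),
    ("Workstation2", "Accesses", "10.0.0.5"),
    ("Workstation1", "Accesses", "10.0.0.9"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every non-None position.

    A None position behaves like a query variable: it matches anything.
    """
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# "Which workstations accessed 10.0.0.5?"
hits = match(TRIPLES, predicate="Accesses", obj="10.0.0.5")
workstations = [s for s, _, _ in hits]
print(workstations)
```

A SPARQL engine does the same kind of matching, but over graph patterns with several linked variables rather than a single triple.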
Additional Details • Software Used – SEMOSS – Any text editor, such as Notepad, to write the RDF file – Excel • Database Description – The database defines the security network of a company. – It describes a network of employees, each with a specified workstation. – The features described in the dataset include IP address, employee name, and site. Slide 5
Creating a SEMOSS Database • To create a SEMOSS-ready database, follow these steps: 0. Understand the data and the basic relations within it. 1. Develop the metamodel. 2. Load the metamodel into the SEMOSS property sheet (Map sheet or RDF). 3. Generate the loading sheets. 4. Build the database. 5. Develop the Questions Sheet. Slide 6
Metamodel • The single most important step in creating a SEMOSS-ready database. • This step requires the user to be familiar with the core relations that exist within the data. • The metamodel should be based on those core relations, not on any specific question we want to answer later in the query stage. Slide 7
Example Metamodel • [Figure: metamodel spanning the Web Traffic Log, Asset Database, and HR Database] • Seemingly disparate data can be combined through a common relationship. Slide 8
Metamodel Translation • After the metamodel has been visualized, the next step is to translate it into a mapping sheet. • To load the triples into the RDF file, follow these steps: 1. Set the base objects 2. Set the base relationships 3. Set the class relationships • These steps can be completed in any text editor. Slide 9
Metamodel Translation • Set the base objects – Nodes or vertices of the graph – Metamodel includes: AssetID, IPAddress, Server, Username, Employee, Site – RDF syntax • Case sensitive • URI can be anything • One line per node – Example:
##Base Objects##
Site http://health.mil/ontologies/dbcm/Concept/Site
Employee http://health.mil/ontologies/dbcm/Concept/Employee
AssetID http://health.mil/ontologies/dbcm/Concept/AssetID
Slide 10
Metamodel Translation • Set the base relationships – Relations between nodes, represented by edges – Metamodel includes: accesses, assignedTo, has – RDF syntax • Case sensitive • One line per relationship – Example:
##Base Predicates##
AssetID_Accesses_IPAddress http://health.mil/ontologies/dbcm/Relation/Accesses
Employee_Has_UserName http://health.mil/ontologies/dbcm/Relation/Has
…
Slide 11
Metamodel Translation • Set the class relationships – Define the concept relationships for each new node type and verb. – All nodes of type Concept are added. – All predicates of type Relation are added. – Structure: • Node type in question • Standard base predicate to denote subclass • Class – Example:
http://health.mil/ontologies/dbcm/Concept/AssetID+http://www.w3.org/2000/01/rdf-schema#subClassOf+http://health.mil/ontologies/dbcm/Concept;
Slide 12
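The three translation steps can be sketched as a small Python generator that emits mapping-file lines from a list of node types and relationships. This is an illustrative sketch, not a SEMOSS tool: the function name `build_map_lines` is hypothetical, the base URI is the one used in the slide examples, and placing the class relationships directly after the predicate section is an assumption. Only the node subclass lines are generated, since the slides show no example line for predicates.

```python
BASE = "http://health.mil/ontologies/dbcm"  # base URI from the slide examples
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def build_map_lines(nodes, relations):
    """Return mapping-file lines for the given node types and relations.

    `relations` holds (start_node, verb, end_node) tuples.
    """
    # Step 1: base objects, one line per node type.
    lines = ["##Base Objects##"]
    for node in nodes:
        lines.append(f"{node} {BASE}/Concept/{node}")

    # Step 2: base relationships, one line per edge type.
    lines.append("##Base Predicates##")
    for start, verb, end in relations:
        lines.append(f"{start}_{verb}_{end} {BASE}/Relation/{verb}")

    # Step 3: class relationships, declaring each node type a subclass
    # of Concept. Their position in the file is an assumption.
    for node in nodes:
        lines.append(f"{BASE}/Concept/{node}+{SUBCLASS_OF}+{BASE}/Concept;")
    return lines


map_lines = build_map_lines(
    ["AssetID", "IPAddress"],
    [("AssetID", "Accesses", "IPAddress")],
)
print("\n".join(map_lines))
```

Generating the lines from one list of names helps keep the spelling and case of each node identical across all three sections, which matters because the syntax is case sensitive.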
Generate Loading Sheets • Tool used: Excel • Guidelines – The first tab must be called Loader. – The other tabs, also called relation tabs, contain the specific information for the individual triples that are loaded. Slide 13
Generate Loading Sheets • The Loader Sheet – Source: “Semoss.” [Online]. Available: http://semoss.org/userdocs.html#GenerateLoadingSheets. [Accessed: 21-Nov-2013]. Slide 14
Generate Loading Sheets • Relation Tabs – Cell A1 must contain the word Relation. – Cell B1 must contain the name of a node. This node is at the beginning of the edge. – Cell C1 must contain the name of a node. This node is at the end of the edge. – Cell A2 must contain a relation (from the metamodel). – Additional features or properties of the edges can be added in the following columns. Slide 15
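The relation tab layout can be sketched in Python as a list of rows, assuming the start node header goes in B1 and the end node header in C1. The helper `relation_tab_rows` and the sample instances are hypothetical, and leaving column A empty after row 2 is an assumption of this sketch. The rows are written as CSV text because it needs only the standard library; the result can be pasted into an Excel tab.

```python
import csv
import io


def relation_tab_rows(start_node, relation, end_node, pairs):
    """Lay out one relation tab as a list of rows (row 1 first).

    `pairs` holds (start_instance, end_instance) tuples.
    """
    rows = [["Relation", start_node, end_node]]  # cells A1, B1, C1
    for i, (start, end) in enumerate(pairs):
        # Cell A2 carries the relation name; leaving column A empty on
        # later rows is an assumption of this sketch.
        rows.append([relation if i == 0 else "", start, end])
    return rows


rows = relation_tab_rows(
    "AssetID", "Accesses", "IPAddress",
    [("WS-001", "10.0.0.5"), ("WS-002", "10.0.0.9")],  # hypothetical data
)

# Serialize the rows as CSV text.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```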
Generate Loading Sheets • Node Property Tabs – Additional node properties can be added in separate tabs. – Cell A1 must contain the word Node. – Cell A2 must contain the word Ignore. – Cell B1 must contain the name of the node whose properties follow. Slide 16
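The rules for a node property tab are simple enough to check before loading. Below is a minimal Python sketch of such a check; `check_node_property_tab` is a hypothetical helper, not part of SEMOSS, and the sample tab (including its Department column) is invented for illustration.

```python
def check_node_property_tab(rows, known_nodes):
    """Return a list of problems found in a node property tab.

    `rows` is the tab as a list of rows, each a list of cell values.
    `known_nodes` is the collection of node names from the metamodel.
    """
    if len(rows) < 2 or len(rows[0]) < 2 or len(rows[1]) < 1:
        return ["tab needs at least cells A1, B1 and A2"]

    problems = []
    if rows[0][0] != "Node":
        problems.append("cell A1 must be 'Node'")
    if rows[1][0] != "Ignore":
        problems.append("cell A2 must be 'Ignore'")
    if rows[0][1] not in known_nodes:
        problems.append(f"cell B1 names unknown node {rows[0][1]!r}")
    return problems


# Hypothetical tab: the Department property column is an invented example.
tab = [
    ["Node", "Employee", "Department"],
    ["Ignore", "Alice", "Finance"],
]
print(check_node_property_tab(tab, {"Employee", "Site"}))
```

The comparisons are exact because the slides describe the syntax as case sensitive.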
Building the Database • To import a database into SEMOSS, locate the DB Modification tab at the top of the window and either create a new database or add data to an existing one. • To add to an existing database, simply browse for the Excel file where the data is located. SEMOSS will not duplicate any existing triples. Slide 17
Building the Database • To create a new database, follow these guidelines: – The name of the database must not contain any spaces. – All object and predicate types must be listed in the Base Objects and Base Predicates sections of the MAP file. – The MAP file must not be open during import. • SEMOSS displays a prompt when the data import succeeds. Slide 18
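The first two guidelines can be checked ahead of an import. The following Python sketch assumes the node and relation names have already been collected from the MAP file and the loading sheets; `check_new_database` and its parameters are hypothetical names, not a SEMOSS API.

```python
def check_new_database(name, base_objects, base_predicates,
                       used_nodes, used_relations):
    """Return a list of guideline violations for a new database.

    `base_objects` / `base_predicates` are the names declared in the MAP
    file; `used_nodes` / `used_relations` are the names that appear in
    the loading sheets.
    """
    problems = []
    if any(ch.isspace() for ch in name):
        problems.append("database name must not contain spaces")

    missing_nodes = sorted(set(used_nodes) - set(base_objects))
    if missing_nodes:
        problems.append(
            "node types missing from Base Objects: " + ", ".join(missing_nodes)
        )

    missing_relations = sorted(set(used_relations) - set(base_predicates))
    if missing_relations:
        problems.append(
            "relations missing from Base Predicates: " + ", ".join(missing_relations)
        )
    return problems


report = check_new_database(
    "Security DB",                       # hypothetical name with a space
    base_objects=["AssetID"],
    base_predicates=["Accesses"],
    used_nodes=["AssetID", "IPAddress"],
    used_relations=["Accesses"],
)
print(report)
```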
Questions Sheet • Contains the base SPARQL queries. • Created automatically from a base subset of generic questions. • At the end, the resulting questions are shown in the left panel as a drop-down menu. Slide 19
Questions Sheet • Sheet Structure – The available perspectives are noted:
PERSPECTIVE Generic-Perspective; Security-Perspective
– The questions for each perspective are noted:
Generic-Perspective GQ1; GQ2; GQ3; GQ4
Security-Perspective SQ1; SQ2; SQ3; SQ4; SQ5; SQ6
– The questions are listed:
SQ1 What workstations have accessed this IP address?
SQ2 What IP addresses has this workstation accessed?
SQ3 What IP addresses has this Employee accessed?
…
Slide 20
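The sheet structure above is a set of key/value entries that reference one another, which can be sketched as a small Python parser. This is an illustration only: `parse_questions` and `split_list` are hypothetical helpers, the sample is abbreviated, and treating the first space as the key/value separator is an assumption, since the slide does not show the actual delimiter.

```python
def split_list(text):
    """Split a semicolon-separated value into trimmed, non-empty items."""
    return [item.strip() for item in text.split(";") if item.strip()]


def parse_questions(lines):
    """Map each perspective to {question_key: question_text}."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumption: the key ends at the first space on the line.
        key, _, value = line.partition(" ")
        entries[key] = value.strip()

    result = {}
    for perspective in split_list(entries.get("PERSPECTIVE", "")):
        keys = split_list(entries.get(perspective, ""))
        result[perspective] = {k: entries.get(k, "") for k in keys}
    return result


# Abbreviated sample based on the slide.
SAMPLE = [
    "PERSPECTIVE Generic-Perspective; Security-Perspective",
    "Generic-Perspective GQ1; GQ2",
    "Security-Perspective SQ1; SQ2",
    "SQ1 What workstations have accessed this IP address?",
    "SQ2 What IP addresses has this workstation accessed?",
]
questions = parse_questions(SAMPLE)
print(questions["Security-Perspective"]["SQ1"])
```

Question keys without a matching text entry, such as GQ1 and GQ2 in this sample, map to an empty string.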
Questions Sheet • The individual queries are entered. • A query has two components: – Layout specification (grid, graph, …) – Actual query text • Example:
SQ1_LAYOUT prerna.ui.components.GraphPlaySheet
SQ1_QUERY CONSTRUCT {?AssetID ?Accesses ?IPAddress} WHERE {{?IPAddress <http://www.w3.org/2000/01/rdf-schema#label> "@IPAddress@";}{?IPAddress <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://Sdb.com/ontologies/Concept/IPAddress>;}{?Accesses <http://www.w3.org/2000/01/rdf-schema#subPropertyOf> <http://Sdb.com/ontologies/Relation/Accesses>;}{?AssetID <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://Sdb.com/ontologies/Concept/AssetID>;}{?AssetID ?Accesses ?IPAddress}}
Slide 21
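The query text contains the token "@IPAddress@", which reads as a placeholder for a value chosen at question time. The slides do not describe how SEMOSS fills it, so the Python sketch below only illustrates the idea of substituting such tokens before a query is run; `fill_parameters` is a hypothetical helper and the IP address is sample data.

```python
import re

# Matches tokens of the form @Name@.
PLACEHOLDER = re.compile(r"@(\w+)@")


def fill_parameters(query, values):
    """Replace each @Name@ token in `query` with values[Name]."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for parameter {name!r}")
        return values[name]

    return PLACEHOLDER.sub(substitute, query)


# One clause of the SQ1 query, used as a template.
template = '{?IPAddress <http://www.w3.org/2000/01/rdf-schema#label> "@IPAddress@";}'
filled = fill_parameters(template, {"IPAddress": "10.0.0.5"})  # hypothetical value
print(filled)
```

Raising an error for a missing value, rather than leaving the token in place, avoids sending a query that silently matches nothing.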
References [1] “Semoss.” [Online]. Available: http://semoss.org/. [Accessed: 14-Nov-2013]. [2] “SEMOSS Security Demo - YouTube.” [Online]. Available: http://www.youtube.com/watch?v=1xRlCzI7yj. [Accessed: 21-Nov-2013]. Slide 22