Powering Filtration Process of Cyber Security Ecosystem Using Knowledge Graph by Claude Asamoah, Lixin Tao, Keke Gai, and Ning Jiang. Pace University Research Day Conference, May 6th, 2016
Introduction • Cyber security breaches and attacks are on the rise as corporations, governments, universities, and private individuals conduct their business and personal transactions on the web • This increasing participation on the web requires these entities to put robust and efficient cyber security systems in place to safeguard their cyber assets • Intelligent systems need to be employed to reinforce the cyber security protocols established in cloud computing and to support sound decision-making, which depends on effective knowledge representation • However, Web Ontology Language (OWL), one of the dominant industry standards for knowledge representation, has limitations, such as a lack of support for custom relations • Pace University has extended OWL into Knowledge Graph (KG) as a replacement that better supports knowledge representation and decision making • This paper examines using KG as the basis for a knowledge representation system that drives the filtration of the vast incoming cyber traffic of a company’s cyber security ecosystem in cloud computing. It employs a use case of cyber security communications in order to identify the entity relations of threats for the filtration process.
Knowledge Graph (KG) For Knowledge Representation (KR) Web Ontology Language (OWL) • is industry standard for KR • can be serialized in many formats • has two popular types of serialization • OWL/XML • RDF/XML
Limitations of OWL for Developing KR Applications • OWL supports only a single first-class relation, “is-a” • In OWL, custom relations must be emulated with complex object and data properties • Most KR needs custom relations • Pace KG extended the Stanford IDE to support custom relation declaration and application so that domain experts can easily create knowledge representation applications with custom relations
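To make the contrast above concrete, the sketch below compares one common OWL emulation pattern (an object property plus an existential restriction on the class) with a single direct triple that uses a custom relation. The class and relation names (`:Phishing`, `:mitigatedBy`, `:SpamFilter`) are hypothetical, and the direct form is only an illustration of the idea of a custom relation, not the Pace KG syntax.

```python
# Hypothetical names throughout; triples are plain (subject, predicate, object) tuples.

# OWL emulation: "every Phishing is mitigated by some SpamFilter" needs an
# object property declaration plus a restriction node (five triples).
EMULATED = [
    (":mitigatedBy", "rdf:type", "owl:ObjectProperty"),
    (":Phishing", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("_:r1", "owl:onProperty", ":mitigatedBy"),
    ("_:r1", "owl:someValuesFrom", ":SpamFilter"),
]

# Custom relation used directly between two classes (one triple).
DIRECT = [(":Phishing", ":mitigatedBy", ":SpamFilter")]


def emulated_targets(triples, cls, prop):
    """Recover the classes related to `cls` through `prop` from the restriction pattern."""
    targets = []
    for s, p, o in triples:
        if s == cls and p == "rdfs:subClassOf":
            # Gather the properties of the node that `cls` is a subclass of.
            node = {pp: oo for ss, pp, oo in triples if ss == o}
            if (node.get("rdf:type") == "owl:Restriction"
                    and node.get("owl:onProperty") == prop
                    and "owl:someValuesFrom" in node):
                targets.append(node["owl:someValuesFrom"])
    return targets


def direct_targets(triples, cls, prop):
    """Read the same relation when it is stated as a single triple."""
    return [o for s, p, o in triples if s == cls and p == prop]
```

Both lookups return the same related class; the difference is how much structure a domain expert has to write and read to express it.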
OWL/XML Serialization • Format proposed by University of Manchester UK • Represents OWL information in small XML structures similar to triples • Has standard XSD syntax definition • Not suitable for people to read or write because it scatters information into many small structures • Has no OWL/XML extension for supporting custom relations
RDF/XML Serialization • Was designed for RDF, but it can also be used to represent OWL ontologies with dedicated namespaces • Does not have a standard syntax definition because it also needs to support RDF, RDF Schema, NTriples, Turtle, and many others • Can represent multiple languages in the same document • Is more concise and used by most researchers
Challenges to be Surmounted • Syntax Validation • Visual Navigation • Develop a new algorithm to support KG syntax validation so that domain experts can build valid and robust Knowledge Graphs • Provide visual navigation of KGs to help domain experts check the completeness and logical accuracy of their KGs • Domain experts need straightforward custom relation declaration and application • Pace University is to provide a mechanism to declare and apply custom relations in the same document, making it easy to encode and verify knowledge intuitively • The existing syntax validator does not work for RDF/XML serialized documents • Syntax validation is needed for a subset of RDF/XML to support researchers • Knowledge base documents can be huge and overwhelming in size and logical structure • A visual navigation mechanism is needed, since a real-life Knowledge Graph can easily contain hundreds or thousands of classes with complex inter-relations, which poses a major challenge for the review and validation of knowledge representation • Provide both a Web model and an Application model of the visual navigation and visualization apps
Importance of Syntax Validation Although RDF/XML is popular among researchers, • there is no syntax validation schema for RDF/XML serialized documents • domain experts (not IT experts) can easily introduce bugs when designing KR systems using RDF/XML documents • typical KR documents are large and complicated, and document transmission can corrupt them The main objective of the syntax extension for custom relations is to • enable domain experts (not IT experts) to declare custom relations and use them in the same document • use custom relations directly and intuitively without object property emulation • declare and apply custom relations in IDEs like Protégé
Importance of KG Visual Navigation and Visualization • It is very hard for domain experts to get a global view of their KR • Since a real-life Knowledge Graph can easily contain hundreds or thousands of classes with complex inter-relations, it is a major challenge for domain experts to review and validate their knowledge representation • It is hard for application developers to fully understand the complex relationships among the classes, especially in very large KGs, so a visualization and navigation mechanism is needed • IDEs like Protégé with plugins such as OWLViz may not be suitable for displaying all class relations and the elements embedded in very large KGs in their entirety
KG Syntax Validation: RDF/XML Syntax Specification • RDF/XML Syntax Specification can be broken into three parts • Definition of Namespace • Definition of Custom Relation • Definition of Classes
RDF/XML Syntax Specification: Namespace Definition Example
RDF/XML Syntax Specification: Custom Relations Definition Example
RDF/XML Syntax Specification: Classes Definition Example
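The three parts named above (namespaces, custom relations, classes) can be illustrated with a small stand-in document. The sketch below is an assumption-laden illustration, not the examples from the original slides: the `rel` and `kg` namespace URIs under `example.org` are hypothetical, and the relation is declared with a standard `owl:ObjectProperty` element because the Pace declaration syntax is not shown in this text. Only the `rdf`, `rdfs`, and `owl` namespace URIs are the standard W3C ones.

```python
import io
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Stand-in document: example.org URIs and the relation name are hypothetical.
DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rel="http://example.org/rel#">
  <owl:ObjectProperty rdf:about="http://example.org/rel#mitigatedBy"/>
  <owl:Class rdf:about="http://example.org/kg#Threat"/>
  <owl:Class rdf:about="http://example.org/kg#Phishing">
    <rdfs:subClassOf rdf:resource="http://example.org/kg#Threat"/>
    <rel:mitigatedBy rdf:resource="http://example.org/kg#SpamFilter"/>
  </owl:Class>
</rdf:RDF>"""


def split_sections(doc):
    """Return (namespaces, relation IRIs, class IRIs) found in an RDF/XML string."""
    # Part 1: namespace declarations, reported by the parser as start-ns events.
    source = io.BytesIO(doc.encode("utf-8"))
    namespaces = dict(ns for _, ns in ET.iterparse(source, events=["start-ns"]))

    root = ET.fromstring(doc)
    about = f"{{{RDF}}}about"
    # Part 2: relation declarations (top-level owl:ObjectProperty elements here).
    relations = [e.get(about) for e in root.findall(f"{{{OWL}}}ObjectProperty")]
    # Part 3: class definitions (top-level owl:Class elements).
    classes = [e.get(about) for e in root.findall(f"{{{OWL}}}Class")]
    return namespaces, relations, classes


namespaces, relations, classes = split_sections(DOC)
```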
Syntax Validation Schema Design Highlights • An RDF/XML XSD schema is needed that can validate a subset of RDF/XML-centric documents • The six namespaces (rdfs, main (rdf), owl, rel, xml, and pace) had to be created as individual schemas that import one another so that all six namespaces are available in each schema • This is because an XSD schema can have only one target namespace
Pace Schema (six namespaces schemas working as one unit)
Main Schema showing imports Example
RDF/OWL Syntax Validation Algorithm
Pseudo Code for Multiple Pass Syntax Validation
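The pseudo code from the original slide is not reproduced in this text. As a rough illustration of what a multiple-pass check could look like, the sketch below first tests well-formedness, then collects declarations, and finally verifies that every use refers to something declared. This is not the authors' algorithm; the `rel` and `kg` URIs, the relation name, and the use of `owl:ObjectProperty` for declarations are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
REL = "http://example.org/rel#"  # hypothetical custom-relation namespace

GOOD = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rel="http://example.org/rel#">
  <owl:ObjectProperty rdf:about="http://example.org/rel#mitigatedBy"/>
  <owl:Class rdf:about="http://example.org/kg#SpamFilter"/>
  <owl:Class rdf:about="http://example.org/kg#Phishing">
    <rel:mitigatedBy rdf:resource="http://example.org/kg#SpamFilter"/>
  </owl:Class>
</rdf:RDF>"""

# Same document, but it uses a relation that was never declared.
BAD = GOOD.replace("<rel:mitigatedBy", "<rel:blockedBy")


def validate(doc):
    """Return a list of error strings; an empty list means no problem was found."""
    # Pass 0: XML well-formedness.
    try:
        root = ET.fromstring(doc)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    # Pass 1: collect every declared relation and class.
    about = f"{{{RDF}}}about"
    relations, classes = set(), set()
    for element in root:
        if element.tag == f"{{{OWL}}}ObjectProperty":
            relations.add(element.get(about))
        elif element.tag == f"{{{OWL}}}Class":
            classes.add(element.get(about))

    # Pass 2: every relation used and every class referenced must be declared.
    errors = []
    resource = f"{{{RDF}}}resource"
    for cls in root.findall(f"{{{OWL}}}Class"):
        for child in cls:
            namespace, _, local = child.tag[1:].partition("}")
            if namespace == REL and namespace + local not in relations:
                errors.append(f"undeclared relation: {local}")
            target = child.get(resource)
            if target is not None and target not in classes:
                errors.append(f"undeclared class: {target}")
    return errors
```

Separating declaration collection from reference checking means a relation or class may be used before the point where it is declared in the file, which a single top-to-bottom pass could not handle.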
KG Syntax Validation Flowchart
KG Visual Navigation Web-based Workflow
Visual Navigation Web Design Process Flow
Flow Chart KG Visual Navigation implementation (Web App)
KG Navigation Web Application Main Page
KG Visual Navigation Classes Page
Individual Class Relations Page
Relations Page
Individual Relations Page
Work Flow of Application based KG Navigation
Process Flow of Application Model
KG Visual Navigation Application Model
Web versus Application: Pros and Cons • Both models are robust and require no internet connection • Neither requires a database • The application-based model is much cleaner and easier to use than the web-based model because you simply locate the KG via a file explorer • The web-based model requires launching from the command line each time you want to input a new KG
Cyber Security Filtration Approach Use Case • This use case assumes that an engineer intends to build a knowledge graph of cyber security terminology, using a semantic approach to bridge the gap between different levels of technicality, from layman to technologically sophisticated. • The requirement is that people may refer to the same or nearly the same things using different terminologies, which need to be reconciled. • The benefit of reconciling these different terminologies is that the information collected can be used to build a powerful cyber security filtration process based on knowledge representation, in order to route cyber security threats to the appropriate channel for resolution. • The engineer posits that the same cyber threat can go by different terminologies, and that a good filtration system should recognize that these terms refer to the same thing and therefore pipe the threat to the appropriate channel for resolution.
Benefits of Using KG • Using a KG exposes threat relations that might otherwise be obscured, by building a hierarchical tree to develop the class relations. • People may refer to the same or nearly the same things using different terminologies; a KG can reconcile these by examining the different levels of terminology for each threat type.
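The reconciliation of terminologies described above can be sketched as grouping terms that are linked, directly or indirectly, by an equivalence relation. The term pairs below are hypothetical examples, not taken from the authors' KG, and the choice of the alphabetically smallest term as the group label is an arbitrary convention for this sketch.

```python
# Hypothetical pairs of terms that a KG might link with an "also called" relation.
ALSO_CALLED = [
    ("junk mail", "spam"),
    ("spam", "unsolicited bulk email"),
    ("crypto-locker", "ransomware"),
]


def build_groups(pairs):
    """Map every term to a group label using a small union-find structure."""
    parent = {}

    def find(term):
        parent.setdefault(term, term)
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path halving
            term = parent[term]
        return term

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            # Keep the alphabetically smaller root as the group label.
            parent[max(root_left, root_right)] = min(root_left, root_right)
    return {term: find(term) for term in list(parent)}


GROUPS = build_groups(ALSO_CALLED)


def same_threat(term_a, term_b):
    """True when both terms resolve to the same group; unknown terms only match themselves."""
    return GROUPS.get(term_a, term_a) == GROUPS.get(term_b, term_b)
```

With these pairs, "junk mail" and "unsolicited bulk email" resolve to the same group even though no pair links them directly.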
Fig. 1: Cyber Security hierarchical tree
Fig. 2: Cyber security concepts relationships
Fig. 3: Threat Vulnerabilities mitigation class relations
Fig. 4: Threat Mitigations and Threat Type relations
Fig. 5: Threat Types Ontology Class Relations
Cyber Security Filtration Process Some assumptions: • The company has some basic cyber security safeguards such as • Antivirus software • Firewalls • Policies • Mitigation programs • There are many cyber security event types, such as Email, File, Endpoint, External, URL, and many others. For brevity, this paper considers only Email and File events in the security filtration design.
Threat Mitigation Architecture Based on Fig. 3 of the cyber security communication KG, the main blocks of the threat vulnerability mitigation process have been identified as • Detection [know about it] • Deterrence [effective method of reducing the frequency of security compromises] • Defense [desire to protect yourself] These main blocks may have subdivisions with further specializations. For example, Detection, Deterrence, and Defense may have a Mitigation Classification, which can be subdivided into Events such as • Email • File Email and File events can be directed for disposition to a • Behavioral Agent or • Signature Matching Agent
Threat Mitigation Architecture • Filtered threats will be sent to these three blocks, which specialize in certain types of vulnerability mitigation and threat resolution. • Fig. 4 displays the threat types, some of which have different terminologies (both expert and layman terms). • These threat types need to be filtered, grouped appropriately, and sent to the proper channel for resolution. • The class relations provided by KG via custom relations will fetch all entities related to a threat type, facilitate the proper classification, filtration, and analysis of cyber security threats, and send these threats to the appropriate channel for mitigation or resolution.
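Fetching all entities related to a threat type and choosing a channel from them can be sketched as a graph traversal over class relations. The triples, the relation names (`arrives-as`, `handled-by`), and the pairing of Email threats with the Behavioral Agent and File threats with the Signature Matching Agent are all invented for illustration; the source does not specify which event type goes to which agent.

```python
from collections import deque

# Hypothetical class relations; only "is-a" is a standard relation.
TRIPLES = [
    ("Spear phishing", "is-a", "Phishing"),
    ("Phishing", "is-a", "EmailThreat"),
    ("EmailThreat", "arrives-as", "Email"),
    ("EmailThreat", "handled-by", "Behavioral Agent"),
    ("Virus", "is-a", "FileThreat"),
    ("FileThreat", "arrives-as", "File"),
    ("FileThreat", "handled-by", "Signature Matching Agent"),
]


def related(start):
    """Breadth-first walk over outgoing relations; returns (relation, entity) pairs."""
    seen = {start}
    queue = deque([start])
    found = []
    while queue:
        node = queue.popleft()
        for subject, relation, obj in TRIPLES:
            if subject == node:
                found.append((relation, obj))
                if obj not in seen:
                    seen.add(obj)
                    queue.append(obj)
    return found


def channel_for(threat):
    """Return the first agent reachable through a "handled-by" relation, or None."""
    for relation, entity in related(threat):
        if relation == "handled-by":
            return entity
    return None
```

Because the walk follows "is-a" links upward, a specific term such as "Spear phishing" inherits the channel attached to its broader class without needing its own routing rule.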
Cyber Security Threat Filtration System
Conclusions • This paper • proposed an approach that uses KG to improve the filtration process of the cyber security ecosystem • showed that extending ontology to KG simplifies relating classes through predicates in the triple format, since expressive custom relations can relate classes in a meaningful way • addressed the problem of syntax validation for RDF-serialized KGs and the constraint that an XSD schema can have only one target namespace, which prompted the design of the Pace Schema • provided visual navigation of KGs, since a real-life Knowledge Graph can easily contain hundreds or thousands of classes with complex inter-relations, which poses a major challenge for the review and validation of knowledge representation