A Lightweight Markup Language for Graph-Structured Threat Sharing
Mayo YAMASAKI
NTT-CERT, NTT Secure Platform Labs
Disclosure to 2019 FIRST CTI Symposium London
Copyright © 2019 NTT Corp. All Rights Reserved.

$ whoami
- Mayo YAMASAKI @ Tokyo, Japan
- Researcher at NTT R&D
- Member of NTT-CERT's OSINT Team
- R&D Topics
  - Search System
  - Knowledge Extraction from Text with NLP/ML
  - Knowledge Representation for Threat Sharing
www.ntt-cert.org

Agenda
- Background
- Overview of Proposed Lightweight Markup Language
- Demo
- Capability & Limitation
- Current Implementation
- Future Work

Security Reporting in NTT-CERT
- Publishing about 20 security (threat) reports every day
- Creating these reports together with associated STIX data
- Using a web-based internal collaboration system for security reporting
[Figure: a security report linked to STIX data: IoCs, CVEs, TTPs, Attributions]

Problem in Operation
- Difficulty of creating structured STIX data in our operations
  - Time constraints and/or inadequate analyst training
- Shortage of structured data in operations [1][2]
  - Unstructured expressions used by 60% of practitioners
  - Lack of context such as TTPs in structured data
How can structured data be created more easily?
[1] "The Value of Threat Intelligence: The Second Annual Study of North American & United Kingdom Companies", Anomali, 2017.
[2] "Exploring the opportunities and limitations of current Threat Intelligence Platforms", ENISA, 2018.

Our Approach
- Text-based representation, not GUI
- New language for STIX 2 compatible graph data
[Figure: STIX, RDF & DOT, Markdown, and the proposal placed on two axes: system language vs. human language, and high vs. low graph expressiveness]

RDF: Resource Description Framework
- Represents a graph as a set of triples of the form (subject, predicate, object)
- Multiple formats, including JSON, XML, N-Triples, Turtle, etc.
RDF N-Triples:
    <http://example.com/ipv4/192.168.0.0> <http://example.com/indicates> <http://example.com/malware/EvilRat> .
    <http://example.com/ipv4/192.168.0.0> <http://example.com/indicates> <http://example.com/malware/EvilTrojan> .
Graph representation:
    192.168.0.0 (ioc-ipv4) --indicates--> EvilRat (malware)
    192.168.0.0 (ioc-ipv4) --indicates--> EvilTrojan (malware)

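The triple view of the N-Triples example can be sketched in a few lines of Python. This is a minimal illustration, not a full N-Triples parser: it assumes every term is an IRI in angle brackets (no literals or blank nodes), and the name `parse_ntriples` is hypothetical.

```python
import re

# Matches one N-Triples statement whose three terms are all IRIs.
# Assumption: no literals, blank nodes, or comments appear in the input.
TRIPLE = re.compile(r"<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.")

def parse_ntriples(text):
    """Return the graph as a set of (subject, predicate, object) IRI triples."""
    return {match.groups() for match in TRIPLE.finditer(text)}

doc = """
<http://example.com/ipv4/192.168.0.0> <http://example.com/indicates> <http://example.com/malware/EvilRat> .
<http://example.com/ipv4/192.168.0.0> <http://example.com/indicates> <http://example.com/malware/EvilTrojan> .
"""
triples = parse_ntriples(doc)
for subject, predicate, obj in sorted(triples):
    print(subject, predicate, obj)
```

Storing the triples in a set reflects that an RDF graph is a set of statements, so a repeated statement does not add a second edge.
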
DOT Language
- A DSL for graph visualization tools such as Graphviz
DOT representation:
    digraph G {
      a [label="malware:EvilRat"];
      b [label="malware:EvilTrojan"];
      c [label="ioc-ipv4:192.168.0.0"];
      c -> a [label="indicates"];
      c -> b [label="indicates"];
    }
Graph representation:
    192.168.0.0 (ioc-ipv4) --indicates--> EvilRat (malware)
    192.168.0.0 (ioc-ipv4) --indicates--> EvilTrojan (malware)

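To show how much boilerplate DOT needs for the same two edges, the following sketch generates a DOT digraph from a list of (source, label, target) edges. The function name `to_dot` and the generated node identifiers (`n0`, `n1`, ...) are illustrative assumptions; node names containing double quotes are not escaped here.

```python
def to_dot(edges):
    """Render (source, label, target) edges as a DOT digraph string."""
    node_ids = {}
    lines = ["digraph G {"]
    # Declare each distinct node once, in order of first appearance.
    for src, _, dst in edges:
        for name in (src, dst):
            if name not in node_ids:
                node_ids[name] = f"n{len(node_ids)}"
                lines.append(f'  {node_ids[name]} [label="{name}"];')
    # Emit one labelled arrow per edge.
    for src, label, dst in edges:
        lines.append(f'  {node_ids[src]} -> {node_ids[dst]} [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)

edges = [
    ("ioc-ipv4:192.168.0.0", "indicates", "malware:EvilRat"),
    ("ioc-ipv4:192.168.0.0", "indicates", "malware:EvilTrojan"),
]
dot_text = to_dot(edges)
print(dot_text)
```

The output can be passed to Graphviz for rendering.
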
Overview
- Consists of a pre-shared graph schema and graph data
- Schema: domain-specific definition of edges between node types
  - Rarely updated, like an RDB schema
- Data (report): plain text with nodes marked up as "[NAME]{TYPE}"
  - Edges are extracted automatically according to the schema
[Figure: sharing a report between Team A and Team B; each team holds the schema and derives STIX from the shared report]

Example 1: IP with malware
Schema: a table of directed relationships (edge types) from source types (rows) to destination types (columns)
    Source \ Destination | malware   | ioc-ipv4
    malware              | -         | -
    ioc-ipv4             | indicates | -
Report:
    [EvilRat]{malware} sends system information to [192.168.0.0]{ioc-ipv4}. The IP address had also been used as a C2 server of [EvilTrojan]{malware}.
STIX-compatible graph:
    192.168.0.0 (ioc-ipv4) --indicates--> EvilRat (malware)
    192.168.0.0 (ioc-ipv4) --indicates--> EvilTrojan (malware)

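The extraction in this example can be sketched as follows. This is not the Raph implementation (which the slides describe as written in Go); it is a minimal Python illustration in which the regular expression, the dictionary form of the schema, and the name `extract_edges` are assumptions.

```python
import re
from itertools import permutations

# A node is written as [NAME]{TYPE} in the report text.
NODE = re.compile(r"\[([^\]]+)\]\{([^}]+)\}")

# Schema from the example table: (source type, destination type) -> edge type.
SCHEMA = {("ioc-ipv4", "malware"): "indicates"}

def extract_edges(paragraph, schema=SCHEMA):
    """Return (source node, edge type, destination node) triples for one paragraph."""
    nodes = set(NODE.findall(paragraph))  # each node is a (name, type) pair
    edges = set()
    # Try every ordered pair of distinct nodes against the schema.
    for src, dst in permutations(nodes, 2):
        edge_type = schema.get((src[1], dst[1]))
        if edge_type is not None:
            edges.add((src, edge_type, dst))
    return edges

report = (
    "[EvilRat]{malware} sends system information to [192.168.0.0]{ioc-ipv4}. "
    "The IP address had also been used as a C2 server of [EvilTrojan]{malware}."
)
edges = extract_edges(report)
for src, edge_type, dst in sorted(edges):
    print(src, edge_type, dst)
```

Because the schema maps each ordered pair of node types to a single edge type, the author never writes edges explicitly; marking up the two malware names and the IP address is enough to obtain both `indicates` edges.
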
Constraints
1. Any pair of node types must have at most one edge type.
   - To extract edges deterministically
2. Every pair of nodes in a paragraph gets an edge if an edge between their node types is defined in the schema.
   - To extract multiple graphs from one document

How to Make STIX Data?
1. Extract a graph from each paragraph in a report
2. Merge nodes by comparing node names and types
3. Convert to STIX data with rule-based conversion
Report -> Subgraphs -> Graph -> STIX

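Step 2 can be sketched as a set union, under the assumption that two nodes are the same when their (name, type) pairs are exactly equal. The subgraph representation and the name `merge_subgraphs` are hypothetical; the slides do not specify how the actual tool compares names.

```python
def merge_subgraphs(subgraphs):
    """Merge per-paragraph subgraphs into one graph.

    Each subgraph is a (nodes, edges) pair, where a node is a (name, type)
    tuple and an edge is a (source node, edge type, destination node) tuple.
    Nodes with an equal name and type collapse into a single node.
    """
    nodes, edges = set(), set()
    for sub_nodes, sub_edges in subgraphs:
        nodes |= set(sub_nodes)
        edges |= set(sub_edges)
    return nodes, edges

ip = ("192.168.0.0", "ioc-ipv4")
rat = ("EvilRat", "malware")
trojan = ("EvilTrojan", "malware")

paragraph_1 = ({ip, rat}, {(ip, "indicates", rat)})
paragraph_2 = ({ip, trojan}, {(ip, "indicates", trojan)})

nodes, edges = merge_subgraphs([paragraph_1, paragraph_2])
print(len(nodes), len(edges))  # 3 2
```

The IP address appears in both paragraphs but ends up as one node with two outgoing edges, which is what lets a multi-paragraph report produce a single connected graph.
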
Example 2: A Sample Scenario
[Figure: a sample report split into five paragraphs, Par. 1 to Par. 5]

Schema & Rules for STIX 2
- Schema definition including 52 node types and 200 edge labels
  - SRO relationship_type between two SDO types
  - Additional node types for usability and for the language's constraints
- Examples of conversion rules
    ioc-ipv4:
        label="indicator", pattern="[ipv4-addr:value=${NODE_NAME}]", labels=["unknown"]
    file-sha256:
        label="observed-data", objects={"0": {"type": "file", "hashes": {"SHA-256": "${NODE_NAME}"}}}, first_observed=${CURRENT_TIMESTAMP}, last_observed=${CURRENT_TIMESTAMP}, number_observed=1

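A rule-based conversion of this kind can be sketched with placeholder substitution. The following is an illustration only, with several assumptions: the rule's `label` is interpreted as the STIX `type` property, the node name is wrapped in single quotes because STIX patterns write string literals that way, and `RULES`, `fill`, and `convert` are hypothetical names. The result is a fragment, not a complete STIX object (properties such as `id`, `created`, and `modified` are omitted).

```python
from datetime import datetime, timezone
from string import Template

# Conversion rules keyed by node type; ${...} placeholders are filled per node.
RULES = {
    "ioc-ipv4": {
        "type": "indicator",
        "pattern": "[ipv4-addr:value = '${NODE_NAME}']",
        "labels": ["unknown"],
    },
    "file-sha256": {
        "type": "observed-data",
        "objects": {"0": {"type": "file", "hashes": {"SHA-256": "${NODE_NAME}"}}},
        "first_observed": "${CURRENT_TIMESTAMP}",
        "last_observed": "${CURRENT_TIMESTAMP}",
        "number_observed": 1,
    },
}

def fill(value, variables):
    """Recursively substitute placeholders in strings, dicts, and lists."""
    if isinstance(value, str):
        return Template(value).substitute(variables)
    if isinstance(value, dict):
        return {key: fill(item, variables) for key, item in value.items()}
    if isinstance(value, list):
        return [fill(item, variables) for item in value]
    return value  # numbers and other scalars are copied unchanged

def convert(name, node_type, now=None):
    """Build a STIX-like fragment for one node from its conversion rule."""
    now = now or datetime.now(timezone.utc)
    variables = {
        "NODE_NAME": name,
        "CURRENT_TIMESTAMP": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return fill(RULES[node_type], variables)

print(convert("192.168.0.0", "ioc-ipv4"))
```

Keeping the rules as data rather than code means a new node type only needs a new entry in the rule table.
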
Evaluation of Proposed Language
1. Comparing editing cost to evaluate usability
   - How easy is it to create structured data?
   - Comparing raw STIX, RDF, DOT, and the proposed language
2. Measuring STIX data coverage to understand the tradeoff
   - How much of the STIX data can the proposed language cover?
   - Comparing existing STIX data with data derived from the proposed language

Comparing Editing Cost: Method
- Comparing editing costs on existing text-based threat reports
  - Proposed language: Levenshtein distance*1 between the marked-up report and the original one
  - STIX, RDF, and DOT: cost equals the number of characters, because these expressions are written separately from the text-based report
Example (editing cost: 16):
    Original report: EvilRat malware sends stolen data to 192.168.0.0.
    Proposed lang:   [EvilRat]{malware} sends stolen data to [192.168.0.0]{ioc-ipv4}.
*1 The minimum number of single-character editing operations (insertion, deletion, or substitution) required to convert one sequence into the other.

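The editing cost of the example can be reproduced with a short dynamic-programming sketch. It uses the standard Levenshtein definition, which counts substitutions as well as insertions and deletions; with that definition the example pair yields the cost of 16 stated on the slide. The function name `levenshtein` is illustrative.

```python
def levenshtein(a, b):
    """Standard Levenshtein distance (insert, delete, substitute), row by row."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                       # delete char_a
                curr[j - 1] + 1,                   # insert char_b
                prev[j - 1] + (char_a != char_b),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]

original = "EvilRat malware sends stolen data to 192.168.0.0."
marked_up = "[EvilRat]{malware} sends stolen data to [192.168.0.0]{ioc-ipv4}."
print(levenshtein(original, marked_up))  # 16
```

The 16 operations are 15 inserted characters plus one substitution, where the space between "EvilRat" and "malware" becomes "]".
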
Comparing Editing Cost: Dataset
- Collecting 15 threat reports from the web and extracting their text
- Creating structured data in each of the following representations
  - STIX (JSON, YAML)
  - RDF (N-Triples, Turtle)
  - DOT
Stats of dataset:
    Statistic                                 | Count
    Avg. text length (number of characters)   | 13,548
    Avg. number of SDOs (≒ number of nodes)   | 13.9
    Avg. number of SROs (≒ number of edges)   | 14.5

Comparing Editing Cost: Result
    Representation  | Avg. cost | Percentage of avg. cost relative to STIX (JSON)
    STIX (JSON)     | 17509     | 100
    STIX (YAML)     | 14001     | 80
    RDF (N-Triples) | 3239      | 19
    RDF (Turtle)    | 2594      | 15
    DOT             | 1793      | 10
    Proposed lang   | 327       | 2

STIX Data Coverage: Method & Dataset
- Method: comparing data coverage against existing STIX data
  - Coverage: the share of STIX objects and object attributes in the existing STIX data that can also be extracted from a report written in the proposed language
- Dataset: 3 threat reports on the STIX official site*1
  - APT1, Poison Ivy, IMDDOS
*1 https://oasis-open.github.io/cti-documentation/stix/examples

STIX Data Coverage: Result
    Report     | SDO coverage | SRO coverage | Attribute coverage
    APT1       | 46/48        | 30/30        | 94/422
    Poison Ivy | 45/66        | 53/90        | 107/351
    IMDDOS     | 4/9          | 2/5          | 8/35
- Uncovered data
  - SDO: marking-definition, complex indicators containing AND/OR
  - SRO: those associated with uncovered SDOs
  - Attribute: created, modified, description, object_marking_refs, labels, and kill_chain_name

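The fractions in the result table can be turned into percentages with a small helper, which makes the gap between object coverage and attribute coverage easier to compare across reports. The data below is copied from the table; the names `COVERAGE` and `percent` are illustrative.

```python
# (covered, total) counts per report, copied from the result table.
COVERAGE = {
    "APT1":       {"SDO": (46, 48), "SRO": (30, 30), "Attribute": (94, 422)},
    "Poison Ivy": {"SDO": (45, 66), "SRO": (53, 90), "Attribute": (107, 351)},
    "IMDDOS":     {"SDO": (4, 9),   "SRO": (2, 5),   "Attribute": (8, 35)},
}

def percent(covered, total):
    """Coverage as a percentage, rounded to one decimal place."""
    return round(100 * covered / total, 1)

for report, rows in COVERAGE.items():
    summary = ", ".join(f"{kind} {percent(*counts)}%" for kind, counts in rows.items())
    print(f"{report}: {summary}")
```
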
Tools: Currently Internal Project :(
- Raph: a parser of a lightweight markup language for graph description
  - Golang implementation and API
  - Domain-independent graph description
- rCTI: Raph Language for Cyber Threat Intelligence
  - CLI tool, REST API server, and web editor app in Golang
  - CTI-domain graph description
  - Exporting STIX 2 data

rCTI REST API sample
[Figure: input text and the output STIX]

Integration in Our Team
- Our Team's Workflow
  - Collecting open source info
  - Analyzing selected events
  - Writing reports with the proposed language
  - Editing STIX data on the web UI
  - Publishing reports
[Figure: new and experimental flow connecting the browser, the collab system for security reporting (app, DBs), the REST API of the rCTI server, and TIPs]

Future Work
- Language
  - Representing attributes
- Schema
  - Fixing and expanding the schema for CTI
- Implementation
  - Resolving equivalence identification of nodes, including those in TIPs
- Next Step
  - Using marked-up reports for ML-based knowledge extraction

Summary
Proposed a lightweight markup language for graph-structured CTI