Architecting an RDF/OWL-based Enterprise Conformance and Compliance Registry at the National Cancer Institute Cecil O. Lynch, MD, MS UC Davis Pathology Informatics NCI Chief Semantic Architect 3/7/2021 7:46 PM
Outline • What will we cover in this talk? – NCI Semantic Infrastructure Version 2 • caGRID 2.0 – What’s new in 2? • Services Aware Interoperability Framework (SAIF) – What does it do for semantics? • Enterprise Conformance and Compliance Framework Registry (ECCF registry) – What is it and how do we use it? – Biomedical Research Integrated Domain Group Model (BRIDG) – How is OWL used in this model? What is the future for BRIDG OWL development? – What is the impact of the work at NCI for the Semantic Web and Ontology community? Page 2
Semantic Infrastructure V2 Overarching/Core Requirements • Lower the barrier-to-entry for participation in caBIG® • caBIG® 1.x is too heavily front-loaded. • Provide a “linear value proposition” to all stakeholders • Easy things should be easy to do. • Support legacy data and functionality • Next-generation caBIG® is evolution, not revolution • Leverage caBIG® 1.x Lessons Learned • Leverage technology and semantic progress in the larger scientific and commercial communities Page 3
Semantic Infrastructure 2.0 from 50,000 feet... • Design-time and run-time integration with caGrid 2.0 • Artifact management (design-time and run-time) • Meta-data-driven service discovery and governance • CBIIT SAIF IG • ECCF (including Conformance testing) • Forms definition (e.g. CRFs) • Decision support • Semantically-aware workflow • caBIG® Clinical Information Suite (caCIS) project • CDISC CSHARE project Page 4
caGRID 2.0 • Lower barriers to entry for all stakeholders (scientists, clinicians, technologists, and informaticists), creating a working environment in which “easy things are easy to do” • Leverage the increasingly mature collection of publicly available open source infrastructure and the expanding trends in user-friendly platforms • Continue to provide support and migration strategies for users of caGRID 1.x Page 5
caGRID 1.x Work Flow • Concepts representing the domain must exist in a terminology server (EVS). • Common Data Elements (ISO 11179), which use those concepts and controlled vocabulary along with other information, must exist for every class and attribute to be used in the object model. • An object model that has every class and attribute annotated with CDEs must exist which represents the data types to be used. • A schema must be generated that reflects how the object model will actually look when serialized to XML. • The annotated object model must be submitted to NCI CBIIT for review and acceptance. • The annotated model must have a corresponding physical data model that describes exactly which class and attributes go into which tables and rows. • Once the model is approved, the caCore and caGrid development tools can be used to create and expose the grid service. Page 6
caGRID 2.0 Interaction with SIV2 • Data Representation and Information Models – referenced in SIV2 • Data and metadata discovery – SPARQL endpoint to provide SIV2 access through REST • Service Discovery and Utilization – both caGRID registry and interaction with service metadata in SIV2 • Service Semantics – maintained in SIV2 • Data Semantics – SIV2 function • Data Discovery and Exploration – Query history acquired by caGRID, linkages instantiated in SIV2 • Service Interface Mediation – shared responsibility • High-Throughput Data and Computation – SIV2 to capture the metadata about the mapping of models to binary content data types, improving service choreography Page 7
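As a rough illustration of the SPARQL-over-REST access mentioned on this slide, the sketch below builds a SPARQL Protocol GET request URL using only the Python standard library. The endpoint URL, the choice of Dublin Core `dc:title` as the searched property, and the helper names `build_query_url` and `query_text` are assumptions made for the example; the slides do not describe the actual SIV2 endpoint or its vocabulary.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; the real SIV2 address is not given in the slides.
ENDPOINT = "https://example.org/siv2/sparql"


def build_query_url(endpoint: str, keyword: str) -> str:
    """Build a SPARQL Protocol GET URL that searches artifact titles."""
    # Escape backslashes and quotes so the keyword stays inside the literal.
    safe = keyword.lower().replace("\\", "\\\\").replace('"', '\\"')
    query = (
        "PREFIX dc: <http://purl.org/dc/elements/1.1/>\n"
        "SELECT ?artifact ?title WHERE {\n"
        "  ?artifact dc:title ?title .\n"
        f'  FILTER(CONTAINS(LCASE(STR(?title)), "{safe}"))\n'
        "}"
    )
    # The SPARQL Protocol passes the query text in the "query" parameter.
    return endpoint + "?" + urlencode({"query": query})


def query_text(url: str) -> str:
    """Recover the SPARQL text from a URL, e.g. for logging or tests."""
    return parse_qs(urlsplit(url).query)["query"][0]


url = build_query_url(ENDPOINT, "BRIDG")
print(query_text(url))
```

Sending the request is left out on purpose: any HTTP client can issue the GET, and the result format would depend on how the endpoint is configured.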
Services Aware Interoperability Framework (SAIF) Page 8
The Lens of SAIF (1): Contextualizing SAIF • SAIF sits at the intersection of SOA, MDA, CSI, Distributed Systems Architecture, and HL7 (e.g. HDF, Core Principles), which together provide goals, artifacts, portions of a methodology, and a framework for defining the HL7 EA, a robust, durable, business-oriented set of constructs that provide extensibility, reuse, and governance. [Figure labels: Service Oriented Architecture; Health Level 7 (Implementation Guide); Computable Semantic Interoperability; Reference Model for Open Distributed Processing; Model Driven Architecture; “You are here (Vous êtes ici)”] Page 9
The Lens of SAIF (2): Services-Oriented Architecture • SOA (Services-Oriented Architecture) – SAIF is “services-aware,” i.e., not “just about services” • Service awareness (at the architecture level) surfaces need for: – Behavioural Framework built around Contracts and Roles – Well-defined Conformance/Compliance Framework (ECCF) – Attention to “separation of concerns” (static vs behavioural) – Requirement for Governance (GF) – Technology-Independent specifications • Conformance certified for each technology binding Page 10
The Lens of SAIF (3): Model-Driven Architecture • MDA (Model-Driven Architecture) enables – Levels of abstraction that layer complexity from Conceptual through Logical to Implementation • Support/enforce SOA thinking • Support partitioning of artifacts to layers of Conformance/Compliance Framework – Solid tooling support Page 11
The Lens of SAIF (4): Computable Semantic Interoperability (CSI) • CSI (Computable Semantic Interoperability) – Pillar #1: Common Model across all domains-of-interest • Multiple domains, one or more domain analysis models, Common Model components universally applied • Static and Behavioural (“dynamic”) semantics – Pillar #2: Elements from Model(s) #1 bound to robust data type specification (e.g. ADT, ISO 21090) – Pillar #3: Methodology for binding terms from concept-based terminologies – Pillar #4: A formally-defined process for specifying the static and behavioural semantics for Working Interoperability (WI) scenarios Page 12
The Lens of SAIF (5): RM-ODP (1) ISO Standard (RM-ODP, ISO/IEC IS 10746 | ITU-T X.900) • SAIF uses the Reference Model for Open Distributed Processing (RM-ODP) as its lingua franca to categorize the various artifacts – Five non-orthogonal, non-hierarchical Viewpoints in which Conformance Assertions are made or validated • Conformance Statements made (Conformance asserted): – Enterprise/Business VP – Informational VP – Computational VP – Engineering VP – Technology VP – Conformance Statements validated via Conformance testing of Implementation-specific Conformance Assertions made against Conformance Statements. Page 13
RM-ODP (2) ISO Standard (RM-ODP, ISO/IEC IS 10746 | ITU-T X.900) Why? What? How? Where? True? SAIF Specification Stack is made up of Conformance Statements and Compliance Validations. In SAIF, the artifacts are constructed via Constraint Patterns sorted by RM-ODP Viewpoints. Page 14
RM-ODP (3) ISO Standard (RM-ODP, ISO/IEC IS 10746 | ITU-T X.900) • RM-ODP Viewpoints are – Non-hierarchical – Non-orthogonal – Each Viewpoint can (and often will) contain a hierarchy of layered information [Figure: the five viewpoints: Enterprise/Business, Information, Computational, Engineering, Technology] Page 15
The Lens of SAIF (6): Health Level 7 • SAIF takes a number of enterprise architecture best practices / approaches and contextualizes them to HL7 including – Use of existing HL7 artifacts • Core Principles • HDF • RMIMs • etc. – Awareness of HL7 business context – Dedication to HL7 Mission and Goals regarding Working Interoperability Page 16
Enterprise Conformance and Compliance Framework Page 17
caBIG® Compatibility Guidelines: Today Compatibility Guidelines Today • Platform Specific • Annotated Models (CDEs) • Service Interfaces Page 18
caBIG® Interoperability Specifications: Tomorrow Conformance Guidelines Tomorrow • Layered Specifications • Testable Conformance • Behavioral Semantics • Traceability • Binding to standard models and data types Page 19
Information Framework CIM DAM DIM PIM Vocabulary PSM Page 20
Artifact Management (ECCF Registry) • Manage lifecycle, governance and versioning of the models, content and derived forms (e.g. CRFs) – Establish and manage relationships and dependencies between models, content, and forms – Manage content provenance, jurisdiction, authority, and intellectual property – Tools to hide complexity of underlying semantic models – Support multiple representations/views of information • Provide access control and other security constraints for content • Define meta-data to enhance artifact/content discovery – Support usage scenarios and context for the information in the SI • Support appropriate value set content and binding management – Value set queries, etc. Page 21
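To make the versioning and dependency bullets above more concrete, here is a minimal in-memory sketch of a registry that records artifact versions and answers "what is affected if this artifact changes?". The `Registry` class, its method names, and the artifact identifiers are all hypothetical; the slides do not specify how the ECCF registry stores these relationships.

```python
from collections import defaultdict, deque


class Registry:
    """Toy artifact registry: versions plus a dependency graph."""

    def __init__(self):
        self.versions = defaultdict(list)    # artifact id -> registered versions
        self.dependents = defaultdict(set)   # artifact id -> ids that depend on it

    def register(self, artifact_id, version, depends_on=()):
        """Record a new version and the artifacts it was derived from."""
        self.versions[artifact_id].append(version)
        for dependency in depends_on:
            self.dependents[dependency].add(artifact_id)

    def impacted_by(self, artifact_id):
        """Return every artifact that directly or transitively depends on this one."""
        seen, queue = set(), deque([artifact_id])
        while queue:
            current = queue.popleft()
            for dependent in self.dependents.get(current, ()):
                if dependent not in seen:
                    seen.add(dependent)
                    queue.append(dependent)
        return sorted(seen)


registry = Registry()
registry.register("bridg-uml", "3.0")
registry.register("bridg-owl", "3.0", depends_on=["bridg-uml"])
registry.register("crf-adverse-event", "1.2", depends_on=["bridg-owl"])
print(registry.impacted_by("bridg-uml"))
```

A breadth-first walk is used so that derived forms several steps removed from a model are still reported when that model is revised.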
caBIG® Clinical Information Suite (caCIS) as an SIV2 consumer Page 22
caCIS Requirements • CBIIT implementation of Service-Aware Interoperability Framework – ISO 21090 Data Types – HL7 Development Framework (HDF) • Modeling/MDA Tooling – Clinical Document Architecture (CDA) • Publishing of templates • Evaluating semantics of CDA documents – ECCF-related Needs • Modeling constructs to facilitate complete and valid system specification. • Traceability across RM-ODP viewpoints and MDA layers. – RM-ODP: Reference Model for Open Distributed Processing – MDA: Model-Driven Architecture • Formal expression of conformance assertions. • Reasoning / Decision Support – Structured eligibility criteria – Adverse event reporting Page 23
caCIS Metrics • SVN Repository is 4.86 GB of data content around Analysis, Architecture, Development, QA and Deployment of the • 100,023 Files organized into 41,489 Folders • File types include Word documents, UML diagrams, XMI files, XML files, Visio diagrams, OWL files, JPEG images, HTML files, Java code, Excel files, PDF documents, Cmaps, text files and others • Increasingly difficult to find files of interest and contextual relevance • Makes reuse difficult to impossible • File relationships are limited to folder organization Page 24
Proposed ECCF Metadata Management • Apply DITA transforms to current SVN document artifacts to define high-level metadata – Pilot testing has been completed and the approach looks feasible • Convert DITA headings to RDF triples • Capture Dublin Core document metadata • Query for common RDF statements to link objects as an automated first pass at linking artifacts • Follow this with manual review for metrics Page 25
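The pipeline on this slide (DITA headings to RDF triples, Dublin Core metadata, then a query for shared statements) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the NCI pilot: the `BASE` namespace, the sample documents, and the choice to map the first DITA `<title>` to `dc:title` and later section titles to `dc:subject` are all invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from itertools import combinations

DC = "http://purl.org/dc/elements/1.1/"
BASE = "http://example.org/eccf/artifact/"  # hypothetical namespace


def dita_to_triples(artifact_id, dita_xml):
    """Turn the <title> elements of a DITA topic into Dublin Core triples."""
    root = ET.fromstring(dita_xml)
    subject = BASE + artifact_id
    triples = set()
    for index, title in enumerate(root.iter("title")):
        text = " ".join("".join(title.itertext()).split())
        if text:
            # Assumed mapping: topic title -> dc:title, section titles -> dc:subject.
            predicate = DC + ("title" if index == 0 else "subject")
            triples.add((subject, predicate, text))
    return triples


def link_candidates(triples):
    """Pair artifacts that share an identical (predicate, object) statement."""
    by_statement = defaultdict(set)
    for subject, predicate, obj in triples:
        by_statement[(predicate, obj)].add(subject)
    links = set()
    for subjects in by_statement.values():
        links.update(combinations(sorted(subjects), 2))
    return links


DOC_A = ("<topic id='a'><title>caCIS Architecture Guide</title><body>"
         "<section><title>Adverse Event Reporting</title></section></body></topic>")
DOC_B = ("<topic id='b'><title>caCIS QA Plan</title><body>"
         "<section><title>Adverse Event Reporting</title></section></body></topic>")

triples = dita_to_triples("doc-a", DOC_A) | dita_to_triples("doc-b", DOC_B)
links = link_candidates(triples)
print(links)
```

The two sample documents are linked because both carry the same `dc:subject` value; as the slide notes, such automated links are only a first pass and would still need manual review.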
Proposed ECCF Model Transforms • All conceptual modeling is done in UML and these UML artifacts will be converted to OWL using the Eclipse eCore-based EMF Ontology Definition Meta-model (EODM) that takes an EMF • HL7 models are developed using the RMIM designer plug-in for Visio – Artifacts include the vsd file, an XML schema model and a MIF file – NCI is currently building an HL7 MIF to OWL converter that allows any V3 model to be precisely defined in OWL 2, capturing all constraints on the model as well as pointers to vocabulary bindings Page 26
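The general idea of turning a class model into OWL 2 axioms can be shown with a small sketch that emits OWL 2 functional-style syntax as strings. It does not use EODM or read MIF, and it is not the NCI converter: the `UmlClass` and `Attribute` types, the `Specimen` example, and the rule that attribute multiplicities become data cardinality restrictions are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    name: str
    datatype: str                 # e.g. "xsd:string"
    lower: int = 0
    upper: Optional[int] = None   # None means unbounded (*)


@dataclass
class UmlClass:
    name: str
    parent: Optional[str] = None
    attributes: List[Attribute] = field(default_factory=list)


def to_owl_axioms(cls):
    """Emit OWL 2 functional-style axioms for one class (hypothetical mapping)."""
    axioms = [f"Declaration(Class(:{cls.name}))"]
    if cls.parent:
        axioms.append(f"SubClassOf(:{cls.name} :{cls.parent})")
    for attr in cls.attributes:
        axioms.append(f"Declaration(DataProperty(:{attr.name}))")
        target = f":{attr.name} {attr.datatype}"
        restrictions = []
        if attr.upper is not None and attr.lower == attr.upper:
            restrictions.append(f"DataExactCardinality({attr.lower} {target})")
        else:
            if attr.lower > 0:
                restrictions.append(f"DataMinCardinality({attr.lower} {target})")
            if attr.upper is not None:
                restrictions.append(f"DataMaxCardinality({attr.upper} {target})")
        for restriction in restrictions:
            axioms.append(f"SubClassOf(:{cls.name} {restriction})")
    return axioms


specimen = UmlClass(
    "Specimen",
    parent="Material",
    attributes=[
        Attribute("identifier", "xsd:string", lower=1, upper=1),
        Attribute("comment", "xsd:string"),  # 0..* adds no restriction
    ],
)
axioms = to_owl_axioms(specimen)
print("\n".join(axioms))
```

Expressing multiplicities as cardinality restrictions is what lets a reasoner flag a model that violates them; vocabulary bindings, which the slide also mentions, are not covered by this sketch.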
Proposed ECCF Conformance Testing • Develop the OWL ontology for SAIF matrix representation to provide the ECCF meta-model • Define the ECCF meta-model relations to the model artifact type • Use the HL7 RIM MIF to OWL transform to define all classes as a model classification profile for constraint checking • Use the standard OWL reasoners to classify new models according to the ECCF matrix meta-model • Identify failed classification errors to inform users of compliance level Page 27
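The classification step above can be approximated, very loosely, without a reasoner. In the toy sketch below each ECCF matrix cell is a set of required features and an artifact is placed in every cell whose requirements it satisfies; plain set inclusion stands in for OWL subsumption reasoning, which a real implementation would delegate to a standard reasoner. The cell profiles and feature names are hypothetical and are not taken from the ECCF meta-model.

```python
# Hypothetical ECCF cell profiles: (viewpoint, layer) -> required features.
PROFILES = {
    ("Informational", "CIM"): frozenset({"domain_analysis_model"}),
    ("Informational", "PIM"): frozenset({"domain_analysis_model",
                                         "datatype_binding"}),
    ("Informational", "PSM"): frozenset({"domain_analysis_model",
                                         "datatype_binding",
                                         "xml_schema"}),
}


def classify(features):
    """Return every cell whose required features are all present."""
    present = set(features)
    return [cell for cell, required in PROFILES.items() if required <= present]


def compliance_report(name, features):
    """Summarize the most specific cell reached, or report a failed classification."""
    cells = classify(features)
    if not cells:
        return f"{name}: failed classification, no ECCF cell satisfied"
    # The cell with the most requirements is treated as the most specific one.
    viewpoint, layer = max(cells, key=lambda cell: len(PROFILES[cell]))
    return f"{name}: conforms to {viewpoint} / {layer}"


print(compliance_report("example-model", {"domain_analysis_model",
                                          "datatype_binding"}))
print(compliance_report("loose-notes", {"xml_schema"}))
```

An artifact that lands in no cell is the analogue of the "failed classification errors" on the slide, and the report line is where a user would be told which compliance level was reached.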
Page 28
Biomedical Research Integrated Domain Group Model (BRIDG) Page 29
BRIDG Project Stakeholders • Clinical Data Interchange Standards Consortium (CDISC) • HL7 Regulated Clinical Research Information Management Technical Committee (HL7 RCRIM WG) • National Cancer Institute (NCI), including the Cancer Biomedical Informatics Grid (caBIG™) project • Food and Drug Administration (FDA) Page 30
BRIDG Project Goals • Produce a shared view of the dynamic and static semantics of a common domain-of-interest, specifically the domain of protocol-driven research and its associated regulatory artifacts. • Aid stakeholders and their communities to achieve computable semantic interoperability (CSI), i.e. the ability for information systems to exchange at a machine-to-machine level the meaning (rather than simply the structure) of data and/or to effectively combine functionality across machine/system boundaries • Provide a shared view for multiple audiences and for multiple purposes through layering of the model as: – an abstract UML model friendly to the general community – an HL7 V3 model that expresses the UML with extensions that further refine it, giving a more useful view to developers using the BRIDG model – an OWL intermediate layer that allows precise mapping between the UML and HL7 V3 layers and allows reasoning across the models for classification and error testing Page 31
BRIDG Models Structure Page 32
BRIDG UML View Page 33
BRIDG HL7 V3 View Page 34
BRIDG OWL View Page 35
BRIDG OWL Today and Tomorrow • Current OWL construction is complex, time-consuming and costly • There is always a lag between versions of BRIDG UML/RIM views and the OWL version due to the time to build the OWL model additions • Current task has been approved and funded to build a MIF to OWL automated transform tool • The tool will be generic to any MIF, so it will handle all RIM constructs • Reasoning occurs on the V3 side as far as equivalent class structures, so most of the work is here and can be completely automated Page 36
What is the impact of the work at NCI for the Semantic Web and Ontology community? Page 37
What is the impact of the work at NCI for the Semantic Web and Ontology community? • Impact is bidirectional: NCI is a consumer of the W3C and OMG specifications and feeds back to these communities by participating in the W3C Life Sciences group and consuming NCBO ontologies • All tooling and infrastructure built at NCI is open source and available without restrictions to all • The BRIDG model is one of the most complicated ontologies from a reasoning perspective and has been shared with Clark and Parsia to aid in the development of future Pellet versions for reasoning optimization • NCI Roadmaps for the Semantic Infrastructure and caGRID 2.0 are open and NCI would greatly appreciate feedback from the Semantic community Page 38
Questions? • clynch@ontoreason.com or colynch@ucdavis.edu • https://wiki.nci.nih.gov/display/CBIITseminfra/Semantic+Infrastructure+2.0+Roadmap+Wiki • https://wiki.nci.nih.gov/display/CBIITtech/caGrid+2.0+Roadmap+Wiki Page 39