Assured Cloud Computing for Assured Information Sharing
Dr. Bhavani Thuraisingham, The University of Texas at Dallas (UTD)
February 2014

Outline
• Objectives
• Assured Information Sharing
• Layered Framework for Secure Cloud-based Assured Information Sharing
• Cloud-based Secure Social Networking
• Other Topics
  – Secure Hybrid Cloud
  – Cloud Monitoring
  – Cloud for Malware Detection
  – Cloud for Secure Big Data
• Education
• Directions
• Related Books

Team Members
• Sponsor: Air Force Office of Scientific Research
• The University of Texas at Dallas
  – Dr. Murat Kantarcioglu, Dr. Latifur Khan, Dr. Kevin Hamlen, Dr. Zhiqiang Lin, Dr. Kamil Sarac
• Sub-contractors
  – Prof. Elisa Bertino (Purdue)
  – Ms. Anita Miller, the late Dr. Bob Johnson (North Texas Fusion Center)
• Collaborators
  – The late Dr. Steve Barker, Dr. Maribel Fernandez, King's College London, University of London (EOARD)
  – Dr. Barbara Carminati, Dr. Elena Ferrari, University of Insubria (EOARD)

Objectives
• Cloud computing is a form of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them.
• Our research on cloud computing is based on Hadoop, MapReduce, and Xen.
• Apache Hadoop is a Java software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers.
• Xen is a virtual machine monitor developed at the University of Cambridge, England.
• Our goal is to build a secure cloud infrastructure for assured information sharing and related applications (a minimal MapReduce example follows as background).
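As background on the MapReduce programming model mentioned above, here is a minimal word-count job written against the standard Hadoop Java API. It is a generic illustration rather than part of the secure infrastructure described later; the input and output paths are supplied on the command line.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emit (word, 1) for every whitespace-delimited token in the input split.
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    ctx.write(word, ONE);
                }
            }
        }
    }

    // Reducer: sum the counts emitted for each word.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}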

Information Operations Across Infospheres: Assured Information Sharing

Objectives
• Develop a framework for secure and timely data sharing across infospheres
• Investigate access control and usage control policies for secure data sharing
• Develop innovative techniques for extracting information from trustworthy, semi-trustworthy and untrustworthy partners

[Figure: coalition architecture in which the components of Agencies A, B and C publish their data/policies to the coalition]

Scientific/Technical Approach
• Conduct experiments as to how much information is lost as a result of enforcing security policies in the case of trustworthy partners
• Develop more sophisticated policies based on role-based and usage-control-based access control models
• Develop techniques based on game-theoretic strategies to handle partners who are semi-trustworthy
• Develop data mining techniques to carry out defensive and offensive information operations

Accomplishments
• Developed an experimental system for determining information loss due to security policy enforcement
• Developed a strategy for applying game theory to semi-trustworthy partners; simulation results
• Developed data mining techniques for conducting defensive operations against untrustworthy partners

Challenges
• Handling dynamically changing trust levels; scalability

Our Approach
• Policy-based information sharing
  – Integrate the Medicaid claims data and mine the data
  – Enforce policies and determine how much information has been lost (trustworthy partners)
  – Apply semantic web technologies
• Apply game theory and probing to extract information from semi-trustworthy partners
• Conduct active defence and determine the actions of an untrustworthy partner
  – Defend ourselves from our partners using data analytics techniques
  – Conduct active defence: find out what our partners are doing by monitoring them, so that we can defend ourselves in dynamic situations

Policy Enforcement Prototype (Coalition)

Layered Framework for Assured Cloud Computing
[Figure 2: Layered framework for assured cloud computing. Layers: Secure Virtual Network Monitor; Xen/Linux/VMM; Hadoop/MapReduce/Storage; HIVE/SPARQL/Query; Applications. Cross-cutting concerns: policies (XACML, RDF), QoS, resource allocation, risks/costs, cloud monitors.]

Secure Query Processing with Hadoop/MapReduce
• We have studied clouds based on Hadoop
• Query rewriting and optimization techniques designed and implemented for two types of data:
  – (i) Relational data: secure query processing with HIVE
  – (ii) RDF data: secure query processing with SPARQL
• Demonstrated with XACML policies (a simplified rewriting sketch follows)
• Joint demonstration with King's College and the University of Insubria
  – First demo (2011): each party submits its data and policies; our cloud manages the data and policies
  – Second demo (2012): multiple clouds
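As a much-simplified illustration of the policy-based query-rewriting idea (not the project's actual implementation), the sketch below appends a row-level filter to a HiveQL query based on the requesting agency. The table, column and agency names are hypothetical, and a real deployment would derive the filter from an XACML policy decision rather than hard-coding it.

public final class PolicyQueryRewriter {

    // Rewrite a HiveQL query so that only rows releasable to the requesting agency are returned.
    // This naive version assumes the query has no GROUP BY/ORDER BY clause after the WHERE clause.
    public static String rewrite(String hiveQuery, String requestingAgency) {
        String filter = "releasable_to = '" + requestingAgency + "'";
        if (hiveQuery.toLowerCase().contains(" where ")) {
            return hiveQuery + " AND " + filter;
        }
        return hiveQuery + " WHERE " + filter;
    }

    public static void main(String[] args) {
        String q = "SELECT claim_id, diagnosis FROM medicaid_claims";
        System.out.println(rewrite(q, "AgencyA"));
        // Prints: SELECT claim_id, diagnosis FROM medicaid_claims WHERE releasable_to = 'AgencyA'
    }
}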

Fine-grained Access Control with Hive: System Architecture
• Table/view definition and loading
  – Users can create tables and load data into them. They can also upload XACML policies for the tables they are creating, and can create XACML policies for existing tables/views.
  – Users can define views only if they have permissions for all tables referenced in the query used to create the view. They can also specify or create XACML policies for the views they are defining.
  – A sketch of table and view creation follows.
• CollaborateCom 2010
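A sketch of what table and view creation might look like through the Hive JDBC interface; the server address, table schema, HDFS path and view definition are placeholders, and the XACML policy attachment happens in the access control layer described above rather than in HiveQL itself.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HiveTableSetup {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Connect to HiveServer2; host, database and credentials are placeholders.
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "agency_user", "");
        try (Statement stmt = conn.createStatement()) {
            // Define a table and load data into it.
            stmt.execute("CREATE TABLE IF NOT EXISTS medicaid_claims "
                    + "(claim_id BIGINT, patient_id BIGINT, diagnosis STRING, agency STRING) "
                    + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");
            stmt.execute("LOAD DATA INPATH '/user/agency_a/claims.csv' INTO TABLE medicaid_claims");
            // Define a view; the creator must hold permissions on every table referenced in the
            // view query, and an XACML policy can then be associated with the view.
            stmt.execute("CREATE VIEW IF NOT EXISTS agency_a_claims AS "
                    + "SELECT claim_id, diagnosis FROM medicaid_claims WHERE agency = 'AgencyA'");
        } finally {
            conn.close();
        }
    }
}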

SPARQL Query Optimizer for Secure RDF Data Processing
[Architecture: web interface; data preprocessor (N-Triples converter, prefix generator, predicate-based splitter, predicate-object-based splitter); query server (parser, query validator and rewriter, XACML PDP, policy-based query rewriter, plan generator, plan executor); MapReduce framework backend; new data and queries enter through the web interface and answers are returned]
• Goals: build an efficient storage mechanism using Hadoop for large amounts of data (e.g., a billion triples); build an efficient query mechanism for data stored in Hadoop; integrate with Jena
• Developed a query optimizer and query rewriting techniques for RDF data with XACML policies, implemented on top of Jena (a basic Jena query sketch follows)
• IEEE Transactions on Knowledge and Data Engineering, 2011
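Since the optimizer is implemented on top of Jena, the following minimal sketch shows how RDF data can be loaded and queried with SPARQL through the Jena API. The file path, namespace prefix and predicate name are hypothetical; the actual system adds the preprocessing, policy rewriting and MapReduce execution layers shown above.

import org.apache.jena.query.Query;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.RDFDataMgr;

public class SparqlOverJena {
    public static void main(String[] args) {
        // Load an N-Triples file into an in-memory model; the path is a placeholder.
        Model model = RDFDataMgr.loadModel("data/agency_a.nt");

        // A simple SPARQL query; the prefix and predicate are hypothetical.
        String sparql = "PREFIX ex: <http://example.org/aisl#> "
                + "SELECT ?doc ?owner WHERE { ?doc ex:ownedBy ?owner }";
        Query query = QueryFactory.create(sparql);
        try (QueryExecution qexec = QueryExecutionFactory.create(query, model)) {
            ResultSet results = qexec.execSelect();
            while (results.hasNext()) {
                QuerySolution row = results.next();
                System.out.println(row.get("doc") + " is owned by " + row.get("owner"));
            }
        }
    }
}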

Demonstration: Concept of Operation
[Agencies 1 through n connect through a user interface layer; relational data is handled by the fine-grained access control with Hive component, and RDF data by the SPARQL query optimizer for secure RDF data processing]

RDF-Based Policy Engine (Technology by UT Dallas)
[Architecture: interface to the semantic web; inference engine/rules processor, e.g., Pellet; policies, ontologies and rules in RDF; Jena RDF engine; RDF documents]

RDF-based Policy Engine on the Cloud
A testbed for evaluating different policy sets over different data representations, supporting provenance as a directed graph and viewing policy outcomes graphically.
• Determine how access is granted to a resource as well as how a document is shared
• Users specify policies, e.g., access control, redaction, and release policies
• Parse a high-level policy into a low-level representation
• Support graph operations and visualization; policies are executed as graph operations
• Execute policies as SPARQL queries over large RDF graphs on Hadoop (see the sketch below)
• Support for policies over traditional data and its provenance
• IFIP Data and Applications Security, 2010; ACM SACMAT, 2011
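One way to execute an access control policy as a SPARQL query is to phrase it as an ASK query over the RDF graph, as in the hedged sketch below. The vocabulary, file path and URIs are hypothetical, and the real engine evaluates such policies at scale on Hadoop rather than in memory.

import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.RDFDataMgr;

public class PolicyAsSparqlAsk {

    // Hypothetical access control policy: grant access only if the requester belongs to an
    // agency that the resource is explicitly marked as releasable to.
    public static boolean isAccessAllowed(Model graph, String requesterUri, String resourceUri) {
        String ask = "PREFIX ex: <http://example.org/aisl#> "
                + "ASK { <" + resourceUri + "> ex:releasableTo ?agency . "
                + "      <" + requesterUri + "> ex:memberOf ?agency . }";
        try (QueryExecution qexec = QueryExecutionFactory.create(ask, graph)) {
            return qexec.execAsk();
        }
    }

    public static void main(String[] args) {
        Model graph = RDFDataMgr.loadModel("data/shared_resources.nt");  // placeholder path
        boolean allowed = isAccessAllowed(graph,
                "http://example.org/aisl#analystBob",
                "http://example.org/aisl#report42");
        System.out.println(allowed ? "grant" : "deny");
    }
}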

Integration with Assured Information Sharing
[Agencies 1 through n issue SPARQL queries through a user interface layer; RDF data and policies pass through a policy translation and transformation layer and an RDF data preprocessor into the MapReduce framework for query processing over Hadoop HDFS, which returns the result]

Architecture

Policy Reciprocity
• Agency 1 wishes to share its resources if Agency 2 also shares its resources with it
  – Use our combined policies
  – Allow agents to define policies based on reciprocity and mutual interest among cooperating agencies
• Schematic SPARQL query: SELECT ?B FROM NAMED <uri1> FROM NAMED <uri2> WHERE { P } (a concrete instantiation follows)
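A possible concrete instantiation of the schematic query, sketched with the Jena API over one named graph per agency. The graph URIs, file paths and vocabulary are hypothetical; the query returns Agency 1's shared resources only if Agency 2's graph shows that it shares something with Agency 1 in return.

import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.ResultSet;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.riot.RDFDataMgr;

public class ReciprocityQuery {
    public static void main(String[] args) {
        // Build a dataset with one named graph per agency; graph URIs and files are placeholders.
        Dataset ds = DatasetFactory.create();
        ds.addNamedModel("http://example.org/graphs/agency1", RDFDataMgr.loadModel("data/agency1.nt"));
        ds.addNamedModel("http://example.org/graphs/agency2", RDFDataMgr.loadModel("data/agency2.nt"));

        // Reciprocity: list Agency 1's shared resources only if Agency 2 also shares with Agency 1.
        // The named graphs play the role of the FROM NAMED <uri1>/<uri2> sources in the schematic query.
        String sparql = "PREFIX ex: <http://example.org/aisl#> "
                + "SELECT ?resource "
                + "WHERE { "
                + "  GRAPH <http://example.org/graphs/agency1> { ?resource ex:sharedWith ex:Agency2 } "
                + "  FILTER EXISTS { "
                + "    GRAPH <http://example.org/graphs/agency2> { ?other ex:sharedWith ex:Agency1 } "
                + "  } "
                + "}";
        try (QueryExecution qexec = QueryExecutionFactory.create(sparql, ds)) {
            ResultSet rs = qexec.execSelect();
            ResultSetFormatter.out(System.out, rs);
        }
    }
}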

Develop and Scale Policies
• Agency 1 wishes to extend its existing policies with support for constructing policies at a finer granularity
• The policy engine provides
  – A policy interface that should be implemented by all policies (a sketch follows)
  – The ability to add newer types of policies as needed
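A minimal sketch of what such a policy interface and a finer-grained policy might look like; the names are hypothetical and do not reflect the engine's actual API.

public interface Policy {
    /** Human-readable identifier of the policy. */
    String getName();

    /** Returns true if the policy permits the requester to perform the action on the resource. */
    boolean evaluate(String requester, String action, String resource);
}

// A finer-grained policy added later without changing the engine: share a resource only until a deadline.
class TimeBoundedSharingPolicy implements Policy {
    private final String resource;
    private final long expiresAtMillis;

    TimeBoundedSharingPolicy(String resource, long expiresAtMillis) {
        this.resource = resource;
        this.expiresAtMillis = expiresAtMillis;
    }

    @Override
    public String getName() {
        return "time-bounded sharing of " + resource;
    }

    @Override
    public boolean evaluate(String requester, String action, String res) {
        return res.equals(resource) && System.currentTimeMillis() < expiresAtMillis;
    }
}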

Justification of Resources
• Agency 1 asks Agency 2 for a justification of resource R2
• Policy engine
  – Allows agents to define policies over provenance
  – Agency 2 can provide the provenance to Agency 1, but protect it using access control or redaction policies (see the sketch below)
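As an illustration of a policy over provenance, the sketch below walks the derivation chain of resource R2 using the W3C PROV-O vocabulary; the file path and the ex: namespace are hypothetical, and Agency 2 would filter the answer through its access control or redaction policies before releasing it.

import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QuerySolution;
import org.apache.jena.query.ResultSet;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.RDFDataMgr;

public class ProvenanceJustification {
    public static void main(String[] args) {
        // Provenance graph maintained by Agency 2; the file path is a placeholder.
        Model prov = RDFDataMgr.loadModel("data/agency2_provenance.nt");

        // Follow the derivation chain of R2 via the transitive closure of prov:wasDerivedFrom.
        String sparql = "PREFIX prov: <http://www.w3.org/ns/prov#> "
                + "PREFIX ex: <http://example.org/aisl#> "
                + "SELECT ?ancestor WHERE { ex:R2 prov:wasDerivedFrom+ ?ancestor }";
        try (QueryExecution qexec = QueryExecutionFactory.create(sparql, prov)) {
            ResultSet rs = qexec.execSelect();
            while (rs.hasNext()) {
                QuerySolution row = rs.next();
                // A redaction policy could remove or generalize sensitive ancestors here.
                System.out.println("derived from: " + row.get("ancestor"));
            }
        }
    }
}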

Other Example Policies
• Agency 1 shares a resource with Agency 2 provided Agency 2 does not share it with Agency 3
• Agency 1 shares a resource with Agency 2 depending on the content of the resource, or only until a certain time
• Agency 1 shares a resource R with Agency 2 provided Agency 2 does not infer sensitive data S from R (the inference problem)
• Agency 1 shares a resource with Agency 2 provided Agency 2 shares the resource only with those in its organizational (or social) network

Analyzing and Securing Social Networks in the Cloud
• Analytics
  – Location mining from online social networks
  – Predicting threats from social network data; sentiment analysis
  – Cloud platform for implementation
• Security and Privacy
  – Preventing the inference of private attributes (e.g., liberal or conservative, gay or straight)
  – Access control in social networks
  – Cloud platform for implementation

Security Policies for Online Social Networks (OSNs)
• Security policies are expressed in SWRL (Semantic Web Rule Language); an example is sketched on the next slide

Security Policy Enforcement
• A reference monitor evaluates the requests
• An admin request for access control can be evaluated by rule rewriting
  – Example: assume Bob submits an admin request; the reference monitor rewrites it as a corresponding rule

Framework Architecture
[The social network application sends an access request to the reference monitor and receives an access decision; the reference monitor retrieves policies from the policy store, issues queries and modified access requests to the knowledge base, and obtains reasoning results from the semantic web reasoning engine over the SN knowledge base]

Secure Social Networking in the Cloud with Twitter Storm
[Social networks 1 through N connect through a user interface layer; relational data is handled by fine-grained access control with Hive, and RDF data by the SPARQL query optimizer for secure RDF data processing]

Secure Storage and Query Processing in a Hybrid Cloud
• The use of hybrid clouds is an emerging trend in cloud computing
  – Ability to exploit public resources for high throughput
  – Yet better able to control costs and data privacy
• Several key challenges (a toy data-design sketch follows this list)
  – Data design: how to store data in a hybrid cloud?
    • The solution must account for the data representation used (unencrypted/encrypted), public cloud monetary costs, and query workload characteristics
  – Query processing: how to execute a query over a hybrid cloud?
    • The solution must provide query rewrite rules that ensure the correctness of a generated query plan over the hybrid cloud
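A toy sketch of the data-design decision under invented assumptions (a flat per-GB public cloud storage price, a sensitivity flag and a query-frequency estimate). It is not the actual cost model; it only illustrates how representation, monetary cost and workload can jointly drive placement.

import java.util.ArrayList;
import java.util.List;

public class HybridCloudPlacement {

    enum Placement { PRIVATE_CLOUD, PUBLIC_CLOUD_ENCRYPTED, PUBLIC_CLOUD_PLAINTEXT }

    static final double PUBLIC_COST_PER_GB = 0.02;  // hypothetical monthly $/GB
    static final double MONTHLY_BUDGET = 50.0;      // hypothetical public cloud budget

    static class Partition {
        final String name;
        final double sizeGb;
        final boolean sensitive;
        final int monthlyQueries;

        Partition(String name, double sizeGb, boolean sensitive, int monthlyQueries) {
            this.name = name;
            this.sizeGb = sizeGb;
            this.sensitive = sensitive;
            this.monthlyQueries = monthlyQueries;
        }
    }

    // Hot, non-sensitive partitions go to the public cloud in plaintext for throughput;
    // hot sensitive partitions go encrypted if the budget allows; everything else stays private.
    static Placement place(Partition p, double budgetLeft) {
        double cost = p.sizeGb * PUBLIC_COST_PER_GB;
        if (cost > budgetLeft) return Placement.PRIVATE_CLOUD;
        if (p.monthlyQueries > 1000) {
            return p.sensitive ? Placement.PUBLIC_CLOUD_ENCRYPTED : Placement.PUBLIC_CLOUD_PLAINTEXT;
        }
        return Placement.PRIVATE_CLOUD;
    }

    public static void main(String[] args) {
        List<Partition> partitions = new ArrayList<>();
        partitions.add(new Partition("claims_2013", 120, true, 5000));
        partitions.add(new Partition("public_reference_codes", 2, false, 20000));

        double budgetLeft = MONTHLY_BUDGET;
        for (Partition p : partitions) {
            Placement placement = place(p, budgetLeft);
            if (placement != Placement.PRIVATE_CLOUD) budgetLeft -= p.sizeGb * PUBLIC_COST_PER_GB;
            System.out.println(p.name + " -> " + placement);
        }
    }
}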

Hypervisor Integrity and Forensics in the Cloud
[Stack: applications; operating systems (Linux, Solaris, XP, MacOS); virtualization layer (Xen, vSphere); hardware layer. Concerns: OS integrity, hypervisor integrity, cloud forensics]
• Secure control flow of hypervisor code
  – Integrity via in-lined reference monitor
• Forensics data extraction in the cloud
  – Multiple VMs
  – De-mapping (isolating) each VM's memory from physical memory

Cloud-based Malware Detection
[Pipeline: a stream of known malware or benign executables is buffered; feature extraction and selection run on the cloud; training and model updates maintain an ensemble of classification models; an unknown executable goes through feature extraction and is classified, with malware removed and benign executables kept]

Cloud-based Malware Detection
• Binary feature extraction involves
  – Enumerating binary n-grams from the binaries and selecting the best n-grams based on information gain
  – For a training set of 3,500 executables, the number of distinct 6-grams can exceed 200 million
  – On a single machine this may take hours, depending on available computing resources, which is not acceptable for training from a stream of binaries
  – We use the cloud to overcome this bottleneck
• A cloud MapReduce framework is used
  – To extract and select features from each chunk (a sketch of the n-gram counting step follows)
  – A 10-node cloud cluster is 10 times faster than a single node
  – Very effective in a dynamic framework, where malware characteristics change rapidly
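A minimal MapReduce sketch of the n-gram counting step, not the project's actual code: it assumes the executables are packed into a Hadoop SequenceFile of (file name, raw bytes) records, emits each byte 6-gram as a hex string, and counts its occurrences; the information-gain-based feature selection would run as a subsequent pass over these counts.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class NGramCounter {

    static final int N = 6;  // 6-grams, as in the experiments described above

    // Mapper: for each executable (key = file name, value = raw bytes), emit every byte 6-gram.
    public static class NGramMapper extends Mapper<Text, BytesWritable, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text gram = new Text();

        @Override
        protected void map(Text key, BytesWritable value, Context ctx)
                throws IOException, InterruptedException {
            byte[] bytes = value.copyBytes();
            StringBuilder hex = new StringBuilder(2 * N);
            for (int i = 0; i + N <= bytes.length; i++) {
                hex.setLength(0);
                for (int j = 0; j < N; j++) {
                    hex.append(String.format("%02x", bytes[i + j]));
                }
                gram.set(hex.toString());
                ctx.write(gram, ONE);
            }
        }
    }

    // Reducer (also used as combiner): sum the counts for each distinct 6-gram.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "ngram count");
        job.setJarByClass(NGramCounter.class);
        job.setInputFormatClass(SequenceFileInputFormat.class);  // SequenceFile of (file name, bytes)
        job.setMapperClass(NGramMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}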

Identity Management Considerations in a Cloud
• A trust model that handles
  – (i) various trust relationships, (ii) access control policies based on roles and attributes, (iii) real-time provisioning, (iv) authorization, and (v) auditing and accountability
• Several technologies are being examined to develop the trust model
  – Service-oriented technologies; standards such as SAML and XACML; and identity management technologies such as OpenID
• Does one size fit all?
  – Can we develop a trust model that is applicable to all types of clouds, such as private, public and hybrid clouds?
• The identity architecture has to be integrated into the cloud architecture

Big Data and the Cloud
• Big data describes large and complex data that cannot be managed by traditional data management tools
• From petabytes to exabytes to zettabytes of data
• Need tools for the capture, storage, search, sharing, analysis, and visualization of big data
• Examples include web logs, RFID and surveillance data, sensor networks, social network data (graphs), text and multimedia, data pertaining to astronomy, atmospheric science, genomics, biogeochemical and biological fields, and video archives
• Big data technologies
  – Hadoop/MapReduce platform, HIVE platform, Twitter Storm platform, Google App Engine, Amazon EC2 cloud, offerings from Oracle and IBM for big data management; others such as Cassandra, Mahout, Pig Latin
• Cloud computing is emerging as a critical tool for big data management
• It is critical to maintain security and privacy for big data

Security and Privacy for Big Data
• Secure storage and infrastructure
  – How can technologies such as Hadoop and MapReduce be secured?
• Secure data management
  – Techniques for secure query processing
  – Examples: securing HIVE, Cassandra
• Big data for security
  – Analysis of security data (e.g., malware analysis)
• Regulations, compliance, and governance
  – What are the regulations for storing, retaining, managing, transferring, and analyzing big data?
  – Are corporations compliant with the regulations?
  – The privacy of individuals has to be maintained, not just for raw data but also for data integration and analytics
  – Roles and responsibilities must be clearly defined

Security and Privacy for Big Data
• Regulations stifling innovation?
  – A major concern is that too many regulations will stifle innovation
  – Corporations must take advantage of big data technologies to improve business
  – But this could infringe on individual privacy
  – Regulations may also interfere with privacy, for example requirements to retain the data
• Challenge: how can one carry out analytics and still maintain privacy?
• A National Science Foundation workshop is planned for Spring 2014 at the University of Texas at Dallas

Education on Secure Cloud Computing and Related Technologies
• Secure cloud computing
  – NSF capacity building grant on assured cloud computing
  – Introduce cloud computing into several cyber security courses
  – Completed courses: Data and Applications Security; Data Storage; Digital Forensics; Secure Web Services; Computer and Information Security
  – Capstone course: one course that covers all aspects of assured cloud computing; a week-long course to be given at Texas Southern University
• Analyzing and Securing Social Networks
• Big Data Analytics and Security

Directions
• Secure VMM and VNM
  – Designing a secure Xen VMM
  – Developing automated techniques for VMM introspection
  – Determining a secure network infrastructure for the cloud
• Integrate secure storage algorithms into Hadoop
• Identity management in the cloud
• Secure cloud-based big data management/social networking

Related Books
• Developing and Securing the Cloud, CRC Press (Taylor and Francis), November 2013 (Thuraisingham)
• Secure Data Provenance and Inference Control with Semantic Web, CRC Press, 2014, in print (Cadenhead, Kantarcioglu, Khadilkar, Thuraisingham)
• Analyzing and Securing Social Media, CRC Press, 2014, in preparation (Abrol, Heatherly, Khan, Kantarcioglu, Khadilkar, Thuraisingham)