QUANTIFYING INFORMATION LOSS AFTER REDACTING DATA PROVENANCE
TEAM: AVINI SOGANI, VAISHNAVI SUNKU, VENUGOPAL BOPPA
INTERNET OF THINGS
SEMANTIC WEB AND PROVENANCE
• Semantics: the meaning behind anything you say
• The Semantic Web is a platform for secure sharing of heterogeneous data on the web.
• Provenance of data can be traced back to the data's origin, or simply to its immediate source.
• Provenance supports assessing authenticity, enables trust, and gives assurance of data quality, which in turn allows the resource to be reproduced.
ACCESS CONTROL
• Imposing restrictions on users' access to data
• Types: DAC (discretionary), MAC (mandatory), RBAC (role-based)
REDACTION
• Process of removing or hiding sensitive data
• Protects sensitive information from unauthorized users
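To make the link between role-based access control and redaction concrete, here is a minimal sketch. The roles, field names, and the `[REDACTED]` marker are all hypothetical and chosen only for illustration; they do not come from the slides.

```python
# Hypothetical role-to-field permissions (an assumption for illustration only).
ROLE_VISIBLE_FIELDS = {
    "physician": {"name", "diagnosis", "medication"},
    "billing": {"name", "insurance_id"},
    "researcher": {"diagnosis", "medication"},
}

REDACTED = "[REDACTED]"


def redact_record(record, role):
    """Return a copy of `record` with every field the role may not see masked."""
    visible = ROLE_VISIBLE_FIELDS.get(role, set())  # an unknown role sees nothing
    return {key: (value if key in visible else REDACTED)
            for key, value in record.items()}


record = {
    "name": "Jane Doe",
    "diagnosis": "asthma",
    "medication": "albuterol",
    "insurance_id": "X-1001",
}
researcher_view = redact_record(record, "researcher")
```

In this sketch the researcher keeps the clinical fields but loses the identifying ones, which is the kind of hiding the redaction bullets describe.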
RELATED WORK
PRIVACY CONTROL ACTS
• HIPAA – Health Insurance Portability and Accountability Act
• Regulates EMR/EPR
• PHI – Protected Health Information
• PII – Personally Identifiable Information
• HITECH Act – Health Information Technology for Economic and Clinical Health
• Minimum necessary for the stated purpose
W3C RECOMMENDATIONS
• A.C. model applications:
  • File systems
  • Database
  • Provenance?
• Data models:
  • RDF (triples: subject, predicate, object)
  • OPM
• Querying:
  • OPQL (From(e), to, from^-1(n), to^-1, prev(n), next)
  • SPARQL (regular expressions)
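The triple data model and regular-expression querying listed above can be sketched in plain Python. This is not SPARQL or OPQL: it is a simplified stand-in that follows any edge whose predicate matches a pattern, transitively. The `med:` and `opm:` resource names are hypothetical examples.

```python
import re
from collections import deque

# A provenance graph as (subject, predicate, object) triples.
# All resource names below are hypothetical.
TRIPLES = [
    ("med:Doc1_2", "opm:wasGeneratedBy", "med:Checkup_1"),
    ("med:Checkup_1", "opm:wasControlledBy", "med:Physician_1"),
    ("med:Checkup_1", "opm:used", "med:Doc1_1"),
    ("med:Doc1_1", "opm:wasGeneratedBy", "med:Notes_1"),
]


def reachable(triples, start, predicate_pattern):
    """Collect every node reachable from `start` over edges whose
    predicate fully matches the regular expression `predicate_pattern`."""
    pattern = re.compile(predicate_pattern)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for subject, predicate, obj in triples:
            if subject == node and pattern.fullmatch(predicate) and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return seen


# Lineage of a document: follow only "wasGeneratedBy" and "used" edges.
lineage = reachable(TRIPLES, "med:Doc1_2", r"opm:(wasGeneratedBy|used)")
```

Here `lineage` contains the processes and documents that led to `med:Doc1_2`, while the physician node is left out because its edge does not match the pattern.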
REDACTION POLICIES
• Medical scenario
REDACTION ON DATA PROVENANCE
• Why med:Doc1_2?
REDACTION BY GRAPH GRAMMAR AND R.E.
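As a rough illustration of redacting a provenance graph, the sketch below relabels every node whose name matches a regular expression with an opaque placeholder while keeping the edges intact. It is a deliberately simplified stand-in, not the graph grammar rewriting of the related work; the function name, placeholder format, and resource names are assumptions.

```python
import re


def redact_graph(triples, node_pattern, placeholder="redacted:Node"):
    """Return new triples in which nodes matching `node_pattern` are
    replaced by numbered placeholders. The same original node always
    maps to the same placeholder, so the graph shape is preserved."""
    pattern = re.compile(node_pattern)
    aliases = {}

    def mask(node):
        if not pattern.fullmatch(node):
            return node
        if node not in aliases:
            aliases[node] = f"{placeholder}_{len(aliases) + 1}"
        return aliases[node]

    return [(mask(s), p, mask(o)) for s, p, o in triples]


# Hypothetical provenance triples for a medical record.
graph = [
    ("med:Doc1_2", "opm:wasGeneratedBy", "med:Checkup_1"),
    ("med:Checkup_1", "opm:wasControlledBy", "med:Physician_1"),
    ("med:Surgery_1", "opm:wasControlledBy", "med:Physician_1"),
]
redacted = redact_graph(graph, r"med:Physician_\d+")
```

A user can still see that one agent controlled both processes, but not which agent, so part of the provenance survives the redaction and part is lost.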
ARCHITECTURE
LIMITATIONS
• No quantification of the information lost through the process of redaction
• Redacted information may still be available from other sources (the internet, knowledge of the context, ...)
OUR PROPOSALS
INFORMATION LOSS
• Relevance of the data to the user
• Vectorial model formula for calculating the relevance
• Terms:
  • True relevant data
  • Retrieved data
  • Relevant data
• F Measure (precision and recall)
• NMI (Normalized Mutual Information)
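The measures listed above can be sketched as follows. Relevance uses cosine similarity over raw term counts, which is one common form of the vector model (real systems usually weight terms, for example with tf-idf). Treating the drop in F-measure as an indicator of information loss is an assumption made for this example, and the document identifiers are hypothetical.

```python
import math
from collections import Counter


def cosine_relevance(document, query):
    """Cosine similarity between term-count vectors of a document and a query."""
    d = Counter(document.lower().split())
    q = Counter(query.lower().split())
    dot = sum(d[term] * q[term] for term in q)
    norm = (math.sqrt(sum(v * v for v in d.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0


def precision_recall_f(retrieved, relevant):
    """Precision, recall, and F-measure of a retrieved set against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_relevant = retrieved & relevant  # retrieved items that are actually relevant
    precision = len(true_relevant) / len(retrieved) if retrieved else 0.0
    recall = len(true_relevant) / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


relevant = {"d1", "d2", "d3", "d4"}
_, _, f_before = precision_recall_f({"d1", "d2", "d3", "d5"}, relevant)  # original data
_, _, f_after = precision_recall_f({"d1", "d5"}, relevant)               # redacted data
f_drop = f_before - f_after  # larger drop suggests more information lost
```

For instance, `cosine_relevance("patient heart surgery", "heart surgery")` is positive, while the same document with both query terms masked scores 0.0 against that query.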
INFORMATION LOSS
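Normalized Mutual Information can compare how items are grouped before and after redaction. The sketch below uses the normalization 2·I(U;V) / (H(U) + H(V)), which is one of several in common use; the slides do not say which variant is intended, so this choice is an assumption, as are the example labelings.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same items."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log((c / n) / ((count_a[x] / n) * (count_b[y] / n)))
        for (x, y), c in count_ab.items()
    )


def nmi(a, b):
    """NMI in [0, 1]; 1 means the two labelings carry the same grouping."""
    if len(a) != len(b):
        raise ValueError("labelings must cover the same items")
    denominator = entropy(a) + entropy(b)
    if denominator == 0:  # both labelings put everything in one group
        return 1.0
    return 2 * mutual_information(a, b) / denominator


# Hypothetical cluster assignments of four documents.
original_clusters = [0, 0, 1, 1]
same_grouping = nmi(original_clusters, [1, 1, 0, 0])    # labels renamed only
scrambled_grouping = nmi(original_clusters, [0, 1, 0, 1])
```

A value near 1 after redaction would suggest the grouping structure survived, and a value near 0 would suggest it was lost.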
CONCLUSION
REFERENCES:
• Murali Mani, Mohamad Alawa, and Arunlal Kalyanasundaram. Query Language Constructs for Provenance.
• Tyrone Cadenhead, Vaibhav Khadilkar, Murat Kantarcioglu, and Bhavani Thuraisingham. 2011. Transforming provenance using redaction. In Proceedings of the 16th ACM Symposium on Access Control Models and Technologies (SACMAT '11). ACM, New York, NY, USA, 93-102.
• Tyrone Cadenhead, Vaibhav Khadilkar, Murat Kantarcioglu, and Bhavani Thuraisingham. A Language for Provenance Access Control.
• David F. Nettleton and Daniel Abril. "An Information Retrieval Approach to Document Sanitization." Advanced Research in Data Privacy. Springer International Publishing, 2015. 151-166.
• R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval: The Concepts and Technology Behind Search, 2nd edn. ACM Press Books, England (2011).
THANK YOU