CCS 2019 Towards Continuous Access Control Validation and Forensics

























































- Slides: 59
CCS 2019 Towards Continuous Access Control Validation and Forensics Chengcheng Xiang 1, Yudong Wu 1, Bingyu Shen 1, Mingyao Shen 1, Haochen Huang 1, Tianyin Xu 2, Yuanyuan Zhou 1, Cindy Moore 1, Xinxin Jin 3, Tianwei Sheng 3 1 2 3 1
Let’s start with a recent data breach incident • Data breach impacts 100 million people (July 30, 2019) ▪ 140,000 U.S. Social Security numbers ▪ 1 million Canadian Social Insurance numbers ▪ 80,000 bank account numbers ▪ Undisclosed number of names, addresses, credit scores, etc. 2
Let’s start with a recent data breach incident • Data breach impacts 100 million people (July 30, 2019) ▪ How it happened: a hacker gained access through a broken firewall with access control misconfigurations (March 22, 2019). 3
Let’s start with a recent data breach incident • Data breach impacts 100 million people (July 30, 2019) ▪ How it was discovered: an external sender reported to Capital One that some data had been leaked on GitHub (July 18, 2019). 4
Let’s start with a recent data breach incident • Data breach impacts 100 million people (July 30, 2019) ▪ How it was discovered: KEEP CALM, ONLY 4 MONTHS TO GO 5
This is not the only data breach by access control misconfigurations Sep. 16, 2019 Jan. 24, 2019 Apr. 12, 2018 Jun. 19, 2017 6
A closer look at Access control configuration • Access control configuration is complicated 7
A closer look at Access control configuration • Access control configuration is complicated 8
A closer look at Access control configuration • Access control misconfiguration is hard to discover • System: no obvious symptom 9
A closer look at Access control configuration • Access control misconfiguration is hard to discover • System: no obvious symptom • End user (accesses the system): no incentive to report a breach 10
A closer look at Access control configuration • Access control misconfiguration is hard to discover • Sysadmin (manages the system): does not routinely validate system access logs • System: no obvious symptom • End user (accesses the system): no incentive to report a breach 11
Ways to solve the problem • Detecting access control misconfigurations • [Bauer 2008], [Das 2010], [Fisler 2005] • Can only detect certain types of misconfigurations • Fundamentally: hard to know sysadmins’ intentions Check configuration and report errors 12
Ways to solve the problem • Detecting access control misconfigurations • [Bauer 2008], [Das 2010], [Fisler 2005] • Can only detect certain types of errors • Fundamentally: hard to know sysadmins’ intentions Check configuration and report errors • Validating Access behavior • Lack of tooling support • Totally manual validation is not feasible Help admins validate system access behavior 13
Validating access behavior is challenging • Asking sysadmins to validate every access is not feasible

| Dataset | Time span | # of accesses | Average accesses |
| --- | --- | --- | --- |
| Wikipedia | 2 weeks | 369 M | 26 M/day |
| Center | 11 months | 5.9 M | 18 K/day |
| Course | 11 months | 3.8 M | 12 K/day |
| Whova | 3 hours | 100 K | 0.8 M/day |
| Group | 1 month | 32 K | 1.1 K/day |

Slide annotations: “Industry system”; “Least average access” (Group, 1.1 K/day) 14
Validating access behavior is challenging • Our solution • Only ask sysadmins to validate when there is an access control behavior (policy) change, e.g., accesses to “/project/detail.doc” were previously denied but are now allowed 15
How can we obtain policy changes? • From access control configurations Various formats and heterogeneous system architecture 16
How can we obtain policy changes? • From access control configurations: various formats and heterogeneous system architecture • From access logs (e.g., web server httpd access logs): end-to-end access behavior, simple and clear semantics • 2019-3-12 11:09:21 sysop (user) /index.php?title=Colour&action=edit 200 (ALLOW) • 2019-3-12 11:10:42 anonymous (user) /index.php?title=Colour&action=edit 403 (DENY) 17
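As a rough illustration of why access logs have "simple and clear semantics", the sketch below parses lines shaped like the two samples on this slide into structured records. The field order (date, time, user, URL, status) is inferred from those two lines only, the "(user)" annotation is dropped, and the mapping of 2xx to allow and 403 to deny is an assumption; `AccessRecord` and `parse_line` are hypothetical names, not part of P-DIFF.

```python
from dataclasses import dataclass
from datetime import datetime
from urllib.parse import parse_qs, urlsplit


@dataclass
class AccessRecord:
    timestamp: datetime
    user: str
    path: str
    params: dict
    decision: str  # "allow", "deny", or "other"


def parse_line(line: str) -> AccessRecord:
    """Parse 'DATE TIME USER URL STATUS' (layout assumed from the slide)."""
    date, time, user, url, status = line.split()
    timestamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    parts = urlsplit(url)
    # Keep only the first value of each query parameter.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    if status.startswith("2"):
        decision = "allow"   # assumption: 2xx means the access was allowed
    elif status == "403":
        decision = "deny"    # assumption: 403 means the access was denied
    else:
        decision = "other"   # anything else is not an access control outcome
    return AccessRecord(timestamp, user, parts.path, params, decision)


allowed = parse_line("2019-3-12 11:09:21 sysop /index.php?title=Colour&action=edit 200")
denied = parse_line("2019-3-12 11:10:42 anonymous /index.php?title=Colour&action=edit 403")
```

Both records describe the same resource and action, so the only attribute that separates the allowed access from the denied one is the user.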
Challenges for Inferring policies from logs 18
Challenges for Inferring policies from logs • Why do we need to infer policies? • Can we just compare old and new logs to find changes? • 2019-3-12 11:09:21 Alice exam/1.htm deny … 2019-3-12 11:10:42 Alice exam/1.htm allow (Change) 19
Challenges for Inferring policies from logs • Why do we need to infer policies? • Can we just compare old and new logs to find changes? • Access logs are sparse • Cannot find a direct comparison target • Policy: • Prof. */* allow • Stu. Homework/* allow • Stu. Exam/* deny 20
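To see why an inferred policy helps when logs are sparse, here is a minimal sketch that encodes the three rules from this slide and evaluates an access that never appeared in earlier logs. First-match rule ordering and glob-style matching are assumptions made for illustration; `RULES` and `evaluate` are hypothetical names.

```python
from fnmatch import fnmatchcase

# (role, resource pattern, decision), in the order listed on the slide.
RULES = [
    ("Prof.", "*/*", "allow"),
    ("Stu.", "Homework/*", "allow"),
    ("Stu.", "Exam/*", "deny"),
]


def evaluate(role: str, resource: str):
    """Return the decision of the first matching rule, or None if none match."""
    for rule_role, pattern, decision in RULES:
        if role == rule_role and fnmatchcase(resource, pattern):
            return decision
    return None


# No earlier log entry needs to exist for Exam/2.htm: the rule covers it.
unseen = evaluate("Stu.", "Exam/2.htm")
```

A direct log-to-log comparison would have nothing to compare `Exam/2.htm` against, while the rule gives an expected outcome for it.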
Challenges for Inferring policies from logs • Challenge 1: how to uniformly represent policies of various access control models IP url user Web request Unix file permissions Role-based access control Attribute-based access control 21
Challenges for Inferring policies from logs • Solution: use a decision tree model to represent them • Visualized access logs → decision-tree-represented policy: IF $role == “Prof.” → True: Allow; False: IF $dir == “Exam” → True: Deny; False: Allow 22
Challenges for Inferring policies from logs • Solution: use a decision tree model to represent them • Visualized access logs (highlighted: Prof., Exam/) → decision-tree-represented policy: IF $role == “Prof.” → True: Allow; False: IF $dir == “Exam” → True: Deny; False: Allow 23
Challenges for Inferring policies from logs • Solution: use a decision tree model to represent them • Visualized access logs → decision-tree-represented policy: IF $role == “Prof.” → True: Allow; False: IF $dir == “Exam” → True: Deny; False: Allow 24
Challenges for Inferring policies from logs • Solution: use a decision tree model to represent them • Flexibility to encode step-by-step decisions • Inherent hierarchical structure • Visualized access logs → decision-tree-represented policy: IF $role == “Prof.” → True: Allow; False: IF $dir == “Exam” → True: Deny; False: Allow 25
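The tree on these slides can be written down directly as a small data structure. The sketch below is illustrative only: the `Node`/`Leaf` classes and the `decide` function are hypothetical names, and the tree is hand-built rather than learned from logs.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    decision: str  # "allow" or "deny"


@dataclass
class Node:
    attribute: str      # access attribute tested at this node, e.g. "role"
    value: str          # value the attribute is compared against
    if_true: "Tree"     # subtree taken when attribute == value
    if_false: "Tree"    # subtree taken otherwise


Tree = Union[Leaf, Node]

# IF $role == "Prof." -> Allow, ELSE IF $dir == "Exam" -> Deny, ELSE Allow
POLICY = Node(
    "role", "Prof.",
    Leaf("allow"),
    Node("dir", "Exam", Leaf("deny"), Leaf("allow")),
)


def decide(tree: Tree, access: dict) -> str:
    """Walk the tree step by step until a leaf decision is reached."""
    while isinstance(tree, Node):
        matched = access.get(tree.attribute) == tree.value
        tree = tree.if_true if matched else tree.if_false
    return tree.decision


student_exam = decide(POLICY, {"role": "Stu.", "dir": "Exam"})
```

Each internal node tests one attribute, which is what gives the representation its step-by-step, hierarchical character regardless of whether the attributes come from file permissions, roles, or request fields.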
Challenges for Inferring policies from logs • Challenge 2: how to represent and infer policy changes • Access logs: T1, PUT, /proj/1.html, deny • T2, GET, /proj/2.html, deny • T3, GET, /proj/3.html, deny • T4, GET, /proj/4.html, allow • T5, GET, /proj/5.html, allow • T6, PUT, /proj/6.html, deny • Traditional decision-tree-represented policy: IF $prefix1 == “/proj” → True: IF $method == “GET” → True: 50% deny, 50% allow; False: ··· 26
Challenges for Inferring policies from logs • Solution: a new time-changing decision tree to encode policy changes • Traditional decision-tree-represented policy: IF $prefix1 == “/proj” → True: IF $method == “GET” → True: 50% deny, 50% allow; False: ··· • Time-changing decision-tree-represented policy: IF $prefix1 == “/proj” → True: IF $method == “GET” → True: status (deny/allow) as a time series over timestamps T2, T3, T4, T5; False: ··· 27
New Challenge: how to infer policies with changes • Traditional decision tree learning algorithms cannot recognize policies with changes 28
New Challenge: how to infer policies with changes • Traditional decision tree learning algorithms cannot recognize policies with changes • Set Gini impurity / entropy: measure the purity of a set • IF $prefix1 == “/proj” → True: IF $method == “GET” → True: 50% deny, 50% allow • versus IF $method == “GET” → True: 100% deny (allow) 29
New Challenge: how to infer policies with changes • Traditional decision tree learning algorithms cannot recognize policies with changes • Set Gini impurity / entropy: measure the purity of a set • IF $prefix1 == “/proj” → True: IF $method == “GET” → True: 50% deny, 50% allow (could be: NOT policy; policy with changes) 30
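The limitation can be checked with the standard Gini impurity formula, $1 - \sum_k p_k^2$. The sketch below is a generic implementation, not code from P-DIFF; the two example sequences are illustrative.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a collection of labels: 1 - sum(p_k ** 2)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


# GET /proj outcomes in timestamp order: deny, deny, then allow, allow.
changed_once = ["deny", "deny", "allow", "allow"]
# Same 50/50 mix, but the outcome flips back and forth.
flapping = ["deny", "allow", "deny", "allow"]

impurity_changed = gini(changed_once)
impurity_flapping = gini(flapping)
impurity_pure = gini(["deny"] * 4)
```

Both mixed sequences score the same impurity because the measure treats the labels as an unordered set, so it cannot tell a policy that changed once from a split that is not a policy at all.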
New Challenge: how to infer policies with changes • Traditional decision tree learning algorithms cannot recognize policies with changes • Recognize policy with changes: measure changes in the time series • IF $prefix1 == “/proj” → True: IF $method == “GET” → True: 50% deny, 50% allow • Status (deny/allow) over timestamps T2, T3, T4, T5 with Changes: 3 → NOT policy • Status (deny/allow) over timestamps T2, T3, T4, T5 with Changes: 1 → Policy with changes 31
New Challenge: how to infer policies with changes • Traditional decision tree learning algorithms cannot recognize policies with changes • Recognize policy with changes: measure changes in the time series • IF $prefix1 == “/proj” → True: IF $method == “GET” → True: 50% deny, 50% allow • Status (deny/allow) over timestamps T2, T3, T4, T5 with Changes: 3 → NOT policy • Status (deny/allow) over timestamps T2, T3, T4, T5 with Changes: 1 → Policy with changes • Hah, interesting? More details in our paper! 32
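Counting changes in the time-ordered outcomes can be sketched in a few lines. The slides give only the counts (3 and 1), so the two sequences below are assumed examples that produce those counts; `count_changes` is a hypothetical name and the actual TCDT splitting criterion is described in the paper, not here.

```python
def count_changes(decisions):
    """Number of adjacent pairs whose decision differs, in timestamp order."""
    return sum(1 for prev, cur in zip(decisions, decisions[1:]) if prev != cur)


# Assumed sequences over T2..T5 that match the counts shown on the slide.
not_policy = ["deny", "allow", "deny", "allow"]           # flips at every step
policy_with_change = ["deny", "deny", "allow", "allow"]   # flips once

changes_not_policy = count_changes(not_policy)
changes_policy = count_changes(policy_with_change)
```

Unlike set impurity, this measure depends on the order of the outcomes, which is what lets it separate a single policy change from noise.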
Overview of our solution: P-DIFF • Access logs: T1, PUT, /proj/1.html, deny • T2, GET, /proj/2.html, deny • T3, GET, /proj/3.html, deny • T4, GET, /proj/4.html, allow • T5, GET, /proj/5.html, allow • T6, PUT, /proj/6.html, deny • Policy inference → time-changing decision-tree-represented policy: IF $prefix1 == “/proj” → True: IF $method == “GET” → True: status (deny/allow) over timestamps T2, T3, T4, T5; False: ··· • Two use cases for the sysadmin: change validation and change forensics 33
Use case 1 • Change validation: detect policy changes and warn sysadmins to validate • IF $prefix1 == “/proj” → True: IF $method == “PUT” → True: status (deny/allow) over timestamps T1, T2, T3, T4; False: ··· 34
Use case 1 • Change validation: detect policy changes and warn sysadmins to validate • IF $prefix1 == “/proj” → True: IF $method == “PUT” → True: status (deny/allow) over timestamps T1, T2, T3, T4; False: ··· • New-coming access: T5, PUT, /proj/1.htm, allow 35
Use case 1 • Change validation: detect policy changes and warn sysadmins to validate • IF $prefix1 == “/proj” → True: IF $method == “PUT” → True: status (deny/allow) over timestamps T1, T2, T3, T4; False: ··· • New-coming access: T5, PUT, /proj/1.htm, allow (matches prefix /proj) 36
Use case 1 • Change validation: detect policy changes and warn sysadmins to validate • IF $prefix1 == “/proj” → True: IF $method == “PUT” → True: status (deny/allow) over timestamps T1, T2, T3, T4; False: ··· • New-coming access: T5, PUT, /proj/1.htm, allow (matches method PUT) 37
Use case 1 • Change validation: detect policy changes and warn sysadmins to validate • IF $prefix1 == “/proj” → True: IF $method == “PUT” → True: status (deny/allow) over timestamps T1, T2, T3, T4, T5; False: ··· • New-coming access: T5, PUT, /proj/1.htm, allow (added at T5) 38
Use case 1 • Change validation: detect policy changes and warn sysadmins to validate • IF $prefix1 == “/proj” → True: IF $method == “PUT” → True: status (deny/allow) over timestamps T1, T2, T3, T4, T5; False: ··· • New-coming access: T5, PUT, /proj/1.htm, allow (added at T5) • Warning: a new policy change, T4-T5, PUT, /proj, deny → allow 39
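The validation walkthrough above can be approximated with a much simpler stand-in for the learned tree: key each access by (method, first path prefix) and warn when the newest decision differs from the previous one for that key. This is a sketch under that simplifying assumption, not P-DIFF's algorithm; `ChangeValidator`, `observe`, and `first_prefix` are hypothetical names.

```python
def first_prefix(path: str) -> str:
    """First path component, e.g. '/proj/1.htm' -> '/proj'."""
    return "/" + path.strip("/").split("/")[0]


class ChangeValidator:
    def __init__(self):
        # (method, prefix) -> (timestamp label, decision) of the latest access
        self.latest = {}

    def observe(self, time: str, method: str, path: str, decision: str):
        """Record one access; return a warning string if the decision flipped."""
        rule = (method, first_prefix(path))
        previous = self.latest.get(rule)
        self.latest[rule] = (time, decision)
        if previous is not None and previous[1] != decision:
            return (f"Warning: a new policy change {previous[0]}-{time}, "
                    f"{method}, {rule[1]} {previous[1]} -> {decision}")
        return None


validator = ChangeValidator()
history = [("T1", "deny"), ("T2", "deny"), ("T3", "deny"), ("T4", "deny")]
earlier = [validator.observe(t, "PUT", "/proj/1.htm", d) for t, d in history]
warning = validator.observe("T5", "PUT", "/proj/1.htm", "allow")
```

Only the T5 access produces a warning, so the sysadmin is asked to validate one change rather than five accesses.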
Use case 2 • Change forensics: find historical policy change related to a risky access • IF $prefix1 == “/proj” → True: IF $method == “GET” → True: status (deny/allow) over timestamps T1, T2, T3, T5; False: ··· 40
Use case 2 • Change forensics: find historical policy change related to a risky access • IF $prefix1 == “/proj” → True: IF $method == “GET” → True: status (deny/allow) over timestamps T1, T2, T3, T5; False: ··· • Risky access: T4, GET, /proj/1.htm, allow 41
Use case 2 • Change forensics: find historical policy change related to a risky access • IF $prefix1 == “/proj” → True: IF $method == “GET” → True: status (deny/allow) over timestamps T1, T2, T3, T5; False: ··· • Risky access: T4, GET, /proj/1.htm, allow 42
Use case 2 • Change forensics: find historical policy change related to a risky access • IF $prefix1 == “/proj” → True: IF $method == “GET” → True: status (deny/allow) over timestamps T1, T2, T3, T5; False: ··· • Risky access: T4, GET, /proj/1.htm, allow (matches prefix /proj and method GET) 43
Use case 2 • Change forensics: find historical policy change related to a risky access • IF $prefix1 == “/proj” → True: IF $method == “GET” → True: status (deny/allow) over timestamps T1, T2, T3, T4, T5; False: ··· • Risky access: T4, GET, /proj/1.htm, allow (located at T4) 44
Use case 2 • Change forensics: find historical policy change related to a risky access • IF $prefix1 == “/proj” → True: IF $method == “GET” → True: status (deny/allow) over timestamps T1, T2, T3, T4, T5; False: ··· • Risky access: T4, GET, /proj/1.htm, allow (located at T4) 45
Use case 2 • Change forensics: find historical policy change related to a risky access • IF $prefix1 == “/proj” → True: IF $method == “GET” → True: status (deny/allow) over timestamps T1, T2, T3, T4, T5; False: ··· • Risky access: T4, GET, /proj/1.htm, allow (located at T4) • A related historical change: T2-T3, GET, /proj, deny → allow 46
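The forensic lookup can be sketched as a scan over one rule's time series for the latest change that produced the risky decision before the risky access happened. Integer timestamps stand in for T1 to T5, and `find_enabling_change` is a hypothetical name; the real tool first locates the matching rule in the time-changing decision tree.

```python
def find_enabling_change(history, risky_time, risky_decision):
    """Return (t_before, t_after, old, new) for the latest change that
    switched to `risky_decision` no later than `risky_time`, else None.

    `history` is a list of (timestamp, decision) pairs sorted by timestamp.
    """
    found = None
    for (t0, d0), (t1, d1) in zip(history, history[1:]):
        if t1 > risky_time:
            break  # later changes cannot explain the risky access
        if d0 != d1 and d1 == risky_decision:
            found = (t0, t1, d0, d1)
    return found


# GET /proj time series from the slides: deny at T1, T2; allow at T3, T5.
get_proj_history = [(1, "deny"), (2, "deny"), (3, "allow"), (5, "allow")]

# Risky access: T4, GET, /proj/1.htm, allow
related_change = find_enabling_change(get_proj_history, risky_time=4,
                                      risky_decision="allow")
```

For the slide's example this points at the deny-to-allow change between T2 and T3, which is the change a sysadmin would then review.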
Evaluation setup • Controlled experiments • Real-world access logs • Randomly injected policy changes • Access log dataset

| Dataset | Configuration | Time span | # of accesses |
| --- | --- | --- | --- |
| Wikipedia | Application logic | 2 weeks | 369 M |
| Center | Web server configuration | 11 months | 5.9 M |
| Course | File permission | 11 months | 3.8 M |
| Whova | Firewall | 3 hours | 100 K |
| Group | File permission | 1 month | 32 K |

Slide annotation: “Industry system” 47
Evaluation result • Accuracy of policy change detection (use case 1)

| Dataset | Total changes | Detected changes | Precision (false positives) | Recall (false negatives) |
| --- | --- | --- | --- | --- |
| Wikipedia | 25 | 25 (100%) | 1.0 (0) | 1.0 (0) |
| Center | 18 | 16 (89%) | 0.76 (5) | 0.89 (2) |
| Course | 18 | 17 (94%) | 0.85 (3) | 0.94 (1) |
| Whova | 21 | 18 (86%) | 0.81 (4) | 0.86 (3) |
| Group | 17 | 17 (100%) | 1.0 (0) | 1.0 (0) |
| Total | 99 | 93 (94%) | 0.89 (12) | 0.94 (6) |

Slide annotations: “best”, “worst” 48
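The precision and recall columns follow from the detected, false-positive, and false-negative counts using the standard definitions. The sketch below recomputes them for the rows where all three counts are given; the function names are illustrative.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of reported changes that were real changes."""
    return true_positives / (true_positives + false_positives)


def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of real changes that were reported."""
    return true_positives / (true_positives + false_negatives)


# dataset -> (detected changes, false positives, false negatives)
rows = {
    "Center": (16, 5, 2),
    "Course": (17, 3, 1),
    "Whova": (18, 4, 3),
    "Total": (93, 12, 6),
}

scores = {
    name: (round(precision(tp, fp), 2), round(recall(tp, fn), 2))
    for name, (tp, fp, fn) in rows.items()
}
```

The recomputed values agree with the table to two decimals, except that Whova's precision is 18/22 ≈ 0.818, which the slide reports as 0.81 rather than the rounded 0.82.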
Evaluation result • Accuracy of forensic diagnosis (use case 2)

| Dataset | Accesses diagnosed | Root-cause changes pinpointed |
| --- | --- | --- |
| Wikipedia | 63 | 61 (97%) |
| Center | 60 | 53 (88%) |
| Course | 60 | 59 (98%) |
| Whova | 60 | 51 (85%) |
| Group | 60 | 59 (98%) |
| Total | 303 | 283 (93%) |

Slide annotations: “best”, “worst” 49
Accuracy analysis: access result prediction • Using Spark-DT: average precision 0.83, recall 0.86, and F-score 0.80. 50
Accuracy analysis: access result prediction • TCDT vs. Spark-DT • P-DIFF-TCDT: precision 0.997 (+0.167), recall 0.92 (+0.06), and F-score 0.94 (+0.14). 51
Conclusion • We present a tool, P-DIFF, to help sysadmins validate and diagnose accesses • Infers access control policies from access logs • Validation: detects policy changes and asks sysadmins to validate them • Forensics: helps sysadmins find out which historical change enabled a risky access 52
Conclusion and Takeaways • We present a tool, P-DIFF, to help sysadmins validate and diagnose accesses • Infers access control policies from access logs • Validation: detects policy changes and asks sysadmins to validate them • Forensics: helps sysadmins find out which change enabled a risky access • Takeaways • Decision trees are effective at representing various access control models • Using machine learning to infer system policies requires handling policy changes • Consider our Time-Changing Decision Tree (TCDT)! 53
Thank you! Q&A 54
Backup: access logs 55
Backup: Different access control models 56
Backup: Inject policy changes 57
Backup: performance 58
Backup: scalability 59