Database Security EECS 710 Information Security Fall 2006
Presenter: Amit Dandekar
Instructor: Dr. Hossein Saiedian
Contents
• Database concepts
• Security requirements
• SQL security model
• Data sensitivity
• Security vs. precision
• Inference & aggregation problem
• Multilevel databases
• Future direction
Database Concepts
• Database
  – a collection of data and a set of rules that organize the data
  – the user works with a logical representation of the data
• Relational database
  – in the relational model, data is organized as a collection of RELATIONS, or tables
  – a relation has a set of ATTRIBUTES, or columns
  – each row (or record) of a relation is called a TUPLE
• Database management system (DBMS)
  – maintains the DB and controls read/write access
• Database administrator (DBA)
  – sets the organization of and access rules to the DB
Database Concepts
• Relationships between tables (relations) must be in the form of other relations
  – base (‘real’) relations: named, autonomous relations that are not derived from other relations (have stored data)
  – views: named, derived relations (no stored data)
  – snapshots: like views, they are named, derived relations, but they do have stored data
  – query results: the result of a query; may or may not have a name, and has no persistent existence
Database Concepts
• Within every relation, every tuple must be uniquely identifiable
  – a primary key of a relation is a unique and minimal identifier for that relation
  – it can be a single attribute, or there may be a choice of attributes to use
  – when the primary key of one relation is used as an attribute in another relation, it is a foreign key in that relation
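The key rules above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and rows are hypothetical, chosen only for illustration; SQLite enforces foreign keys only after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical schema: dept.deptno is a primary key, client.deptno a foreign key.
conn.execute("CREATE TABLE dept (deptno TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE client ("
    "clientno TEXT PRIMARY KEY, "
    "deptno TEXT NOT NULL REFERENCES dept(deptno))"
)
conn.execute("INSERT INTO dept VALUES ('K01', 'Life insurance')")
conn.execute("INSERT INTO client VALUES ('K108341', 'K01')")


def try_insert(sql):
    """Return True if the statement succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second tuple with the same primary key is rejected.
duplicate_key_ok = try_insert("INSERT INTO dept VALUES ('K01', 'Duplicate')")
# A foreign key must refer to an existing primary key value.
dangling_ref_ok = try_insert("INSERT INTO client VALUES ('S245987', 'S02')")
```

Both inserts are expected to fail: the first repeats a primary key, the second refers to a department that does not exist.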
Database Concepts
• Structured Query Language (SQL)
  – used to manipulate relations and data in a relational database
• Types of SQL commands
  – Data Definition Language (DDL): define, maintain, and drop schema objects
  – Data Manipulation Language (DML): SELECT, INSERT, UPDATE
  – Data Control Language (DCL): control security (GRANT, REVOKE) and concurrent access (COMMIT, ROLLBACK)
Security Requirements
• Physical database integrity
• Logical database integrity
• Element integrity
• Auditability
• Access control
• User authentication
• Availability
Security Requirements
• Physical database integrity
  – immunity to physical catastrophe, such as power failures and media failure
    • physically securing hardware, UPS
    • regular backups
• Logical database integrity
  – ability to reconstruct the database
    • maintain a log of transactions
    • replay the log to restore the system to a stable point
Security Requirements
• Element integrity
  – the integrity of specific database elements is their correctness or accuracy
    • field checks: allow only acceptable values
    • access controls: allow only authorized users to update elements
    • change log: used to undo changes made in error
  – referential integrity (key integrity concerns)
  – two-phase locking process
• Auditability
  – log reads and writes to the database
Security Requirements
• Access control (similar to OS)
  – logical separation by user access privileges
  – more complicated than in an OS due to the complexity of the DB (granularity, inference, aggregation)
• User authentication
  – may be separate from the OS
  – can be rigorous
• Availability
  – concurrent users
    • granularity of locking
  – reliability
SQL Security Model
• The SQL security model implements DAC based on
  – users: users of the database; user identity is checked during the login process
  – actions: including SELECT, UPDATE, DELETE, and INSERT
  – objects: tables (base relations), views, and columns (attributes) of tables and views
• Users can protect objects they own
  – when an object is created, a user is designated as the ‘owner’ of the object
  – the owner may grant access to others
  – users other than the owner have to be granted privileges to access the object
SQL Security Model
• Components of a privilege are
  – grantor, grantee, object, action, grantable
  – privileges are managed using the GRANT and REVOKE operations
  – the right to grant privileges can itself be granted
• Issues with privilege management
  – each grant of privileges is to an individual or to “Public”
  – this makes security administration in large organizations difficult
  – an individual with multiple roles may have too many privileges for one of the roles
  – SQL3 is moving more toward role-based privileges
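The five privilege components can be modelled as a small record. The following is a toy sketch, not how any real DBMS stores its grants; the `Catalog` class, its method names, and the users alice, bob, and carol are all hypothetical. REVOKE (and cascading revocation) is left out to keep it short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Privilege:
    grantor: str
    grantee: str
    obj: str
    action: str
    grantable: bool  # True if the grantee may pass the privilege on


class Catalog:
    """Toy privilege catalog for illustration only."""

    def __init__(self, owners):
        self.owners = dict(owners)  # object name -> owning user
        self.privileges = set()

    def can_grant(self, user, obj, action):
        # The owner can always grant; others need a grantable privilege.
        if self.owners.get(obj) == user:
            return True
        return any(
            p.grantee == user and p.obj == obj and p.action == action and p.grantable
            for p in self.privileges
        )

    def grant(self, grantor, grantee, obj, action, grantable=False):
        if not self.can_grant(grantor, obj, action):
            raise PermissionError(f"{grantor} may not grant {action} on {obj}")
        self.privileges.add(Privilege(grantor, grantee, obj, action, grantable))

    def allowed(self, user, obj, action):
        # A grant to "PUBLIC" applies to every user.
        if self.owners.get(obj) == user:
            return True
        return any(
            p.grantee in (user, "PUBLIC") and p.obj == obj and p.action == action
            for p in self.privileges
        )


catalog = Catalog({"EMP": "alice"})
catalog.grant("alice", "bob", "EMP", "SELECT", grantable=True)
catalog.grant("bob", "carol", "EMP", "SELECT")  # passed on without the grant option
```

Here carol can read EMP but cannot grant that right further, because her privilege was issued with `grantable=False`.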
SQL Security Model
• Authentication & identification mechanisms
  – CONNECT <user> USING <password>
  – the DBMS may choose OS authentication
  – or its own authentication mechanism
    • Kerberos
    • PAM
SQL Security Model
• Access control through views
  – many security policies are better expressed by granting privileges to views derived from base relations
  – example: access to this view can be granted to every department manager
      CREATE VIEW AVSAL(DEPT, AVG) AS
        SELECT DEPT, AVG(SALARY) FROM EMP GROUP BY DEPT
  – example: a view containing account information for the current user
      CREATE VIEW MYACCOUNT AS
        SELECT * FROM Account WHERE Customer = current_user()
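The AVSAL view can be exercised with a short sqlite3 sketch. The EMP rows below are invented for illustration, and SQLite has no GRANT statement, so only the filtering effect of the view is shown, not the privilege that would be granted on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (NAME TEXT, DEPT TEXT, SALARY REAL)")
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [  # hypothetical sample rows
        ("Ann", "K01", 50000.0),
        ("Bob", "K01", 70000.0),
        ("Cy", "S02", 40000.0),
    ],
)

# The view exposes only per-department averages, never individual salaries.
conn.execute(
    "CREATE VIEW AVSAL(DEPT, AVG) AS "
    "SELECT DEPT, AVG(SALARY) FROM EMP GROUP BY DEPT"
)

avg_by_dept = dict(conn.execute("SELECT * FROM AVSAL ORDER BY DEPT"))
```

A user who can query only AVSAL sees one row per department. Note that a department with a single employee would still reveal that employee's salary, which is the inference problem discussed later in these slides.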
SQL Security Model
• Advantages of views
  – views are flexible and allow access control to be defined at a description level appropriate to the application
  – views can enforce context-dependent and data-dependent policies
  – data can easily be reclassified
SQL Security Model
• Disadvantages of views
  – access checking may become complex
  – views need to be checked for correctness (do they properly capture the policy?)
  – completeness and consistency are not achieved automatically; views may overlap or miss parts of the database
  – the security-relevant part of the DBMS may become very large
SQL Security Model
• Inherent weakness of DAC
  – DAC allows information from an object that a subject can read to be written to any other object that the same subject can write
  – Trojan horses can exploit this to copy information from one object to another
SQL Security Model
• Mandatory access controls (MAC)
  – no read up, no write down
  – traditional MAC implementations in RDBMSs have focused solely on MLS
  – there have been three commercial MLS RDBMS offerings
    • Trusted Oracle, Informix OnLine/Secure, Sybase Secure SQL Server
SQL Security Model
• Enforce MAC using security labels
  – assign security levels to all data
    • a label is associated with each row
  – assign a security clearance to each user
    • a label is associated with the user
  – the DBMS enforces MAC
    • access to a row is based upon the label associated with that row and the label associated with the user accessing that row
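A minimal sketch of label-based checks, assuming a simple linear ordering of four levels (the level names and the sample rows are hypothetical). It applies the "no read up, no write down" rule from the MAC slide to row labels and user clearances.

```python
# Hypothetical linear ordering of security levels, lowest to highest.
LEVELS = {"UNCLASSIFIED": 0, "CONFIDENTIAL": 1, "SECRET": 2, "TOP SECRET": 3}


def can_read(user_clearance, row_label):
    """No read up: the user's clearance must dominate the row's label."""
    return LEVELS[user_clearance] >= LEVELS[row_label]


def can_write(user_clearance, row_label):
    """No write down: the row's label must dominate the user's clearance."""
    return LEVELS[user_clearance] <= LEVELS[row_label]


def visible_rows(rows, user_clearance):
    """Filter rows the way a label-enforcing DBMS would on SELECT."""
    return [row for row in rows if can_read(user_clearance, row["label"])]


ROWS = [
    {"id": 1, "label": "UNCLASSIFIED"},
    {"id": 2, "label": "SECRET"},
]
confidential_view = visible_rows(ROWS, "CONFIDENTIAL")
```

A user cleared to CONFIDENTIAL sees only row 1; the SECRET row is silently filtered out rather than reported as an error.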
Case Study
RECORDID  CLIENTNO  DEPTNO  ALLOCATION_DATE  LAST_UPDATE  MEDICAL_HISTORY      RISK_FACTOR
0010      K108341   K01     2006/01/05       2006/02/05   Diabetes             0
0020      K104546   K01     2006/10/20       2006/11/05   Arthritis            2
0030      S245987   S02     2006/09/01       2006/10/05   High Blood Pressure  3
0040      S245456   S02     2006/06/26       2006/07/05   Asthma               1
• Medical record analyst
  – READ all records
  – WRITE all records
• Managers
  – READ client records of their department
  – READ only non-confidential columns
  – no WRITE access
Case Study
• Columns
  – medical record analysts have READ/WRITE access to confidential columns
  – managers have READ access to non-confidential columns
• Rows
  – medical record analysts can read and update all the records
  – managers can read but not update client records for their department
Case Study: DAC Solution
Three approaches are used to provide row-level security using DAC (Discretionary Access Control):
• application views
• programming logic embedded in the application
• physical separation using one or more databases
Case Study: DAC Solution
• Application views
  – a widely used approach
  – views provide the ability to filter data
Case Study: DAC Solution
    Create view manager_K01 as
      select recordid, clientno, deptno, allocation_date, last_update, risk_factor
      from Med_records where deptno = 'K01';
    Create view manager_S01 as
      select recordid, clientno, deptno, allocation_date, last_update, risk_factor
      from Med_records where deptno = 'S01';
    Create view med_rec_analyst as
      select * from Med_records;
Case Study: DAC Solution
• Application views
  – the number of views required can become large as the application ages
  – directing application users to the correct view becomes a management burden
  – application complexity tends to increase due to unforeseen security requirements
Case Study: DAC Solution
• Application program logic approach
  – in this approach, the application controls the SQL statements it issues; statements issued outside the application are not covered
Case Study: DAC Solution
• Application program logic approach
  – SQL statements issued outside the application, using a utility such as SQL Plus, can’t be controlled
  – when the application rewrites SQL statements to restrict access based on data sensitivity, numerous additional tables typically must be built
  – those tables need to be maintained to manage the authorization information of application users
Case Study: DAC Solution
• Multiple database approach
  – the number of databases required is equal to the number of data sensitivities
  – data can be protected by using dedicated databases to manage each sensitivity
Case Study: DAC Solution
• Multiple database approach
  – the number of databases required is equal to the number of data sensitivities
  – the overhead of running multiple databases, in terms of memory, processing power, and physical storage, is substantially increased
  – the cost of managing a single database is multiplied by the number of databases
  – viewing information across multiple databases requires distributed queries and application logic
Case Study: MAC Solution
• Designing the security solution
  – row and column security labels that protect the columns and rows
  – user security labels that grant users the appropriate access
Case Study: MAC Solution
• Revisit the problem
Position                READ                                                                   WRITE
Medical record analyst  ALL                                                                    ALL
Managers                Client records for their department and only non-confidential columns  None
  – to restrict access to the column that is confidential, apply a confidential security label to the column
  – to restrict managers' access to only the records for their department, each row can be tagged with a security label that indicates the department
  – the write restriction for managers can be implemented by revoking their write privileges
Case Study: MAC Solution
• a column security label
• four security labels for row protection
• a user security label for medical record analysts
• grant security labels to users
SQL Security Model
• Issues with MAC
  – information tends to become over-classified
  – no protection against violations that produce illegal information flow through indirect means
    • inference channels: a user at a low security class uses the low data to infer information about a high security class
    • covert channels: require two active agents, one at a low level and the other at a high level, and an encoding scheme
Data Sensitivity
• Sensitive data is data that should not be made public
• Factors determining sensitivity
  – inherently sensitive: the value itself may be so revealing that it is sensitive
    • locations of defensive missiles
  – from a sensitive source
    • CIA informer whose identity may be compromised
  – part of a sensitive attribute or a sensitive record
    • salary attribute from an HR database
  – sensitive with respect to previously disclosed data
    • longitude of a secret army base if the latitude is known
Data Sensitivity
• Even metadata (data about data) may be sensitive
  – bounds: indicating that a sensitive value, y, is between two values, L and H
  – negative result: disclosing that z is not the value of y may be sensitive, especially when there is only a small set of possible values
  – existence: the existence of data may itself be a sensitive piece of data
  – probable value: the probability that a certain element has a certain value
Security vs. Precision
• Precision: revealing as much non-sensitive data as possible
  – disclose as much data as possible
  – issue: a user may put together pieces of disclosed data and infer other, more deeply hidden, data
• Security: reveal only those data that are not sensitive
  – reject any query that mentions a sensitive field
  – issue: may reject many reasonable and non-disclosing queries
• The ideal combination: perfect confidentiality with maximum precision
  – achieving this goal is not easy!
Statistical Databases
• A database limited to statistical measures (primarily counts and sums)
• Example: a medical record database where researchers access only statistical measures
• In a statistical database, information is retrieved by means of statistical (aggregate) queries on an attribute
Inference
• Security issue with statistical databases
• The inference problem exists when sensitive data can be deduced from non-sensitive data
  – the attacker combines information from outside the database with database responses
Inference
• Sensitive fields exist in the database
  – they are sensitive only when viewed row-wise
• The DBA must not allow names to be associated with sensitive attributes
• “n items over k percent” rule: do not respond if n items represent over k% of the result
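One way to read the “n items over k percent” rule for a SUM query is sketched below. The function names, the default thresholds n = 1 and k = 80, and the assumption of non-negative values are illustrative choices, not part of the slides.

```python
def dominates(values, n, k):
    """True if the n largest values make up more than k percent of the total.

    Assumes non-negative values, as with amounts or counts.
    """
    total = sum(values)
    if total == 0:
        return False
    top_n = sum(sorted(values, reverse=True)[:n])
    return top_n * 100 > k * total


def guarded_sum(values, n=1, k=80):
    """Return the sum, or None when the n-over-k rule says to withhold it."""
    if dominates(values, n, k):
        return None
    return sum(values)


single_record = guarded_sum([5000])          # one record is 100% of the result
balanced = guarded_sum([0, 1000, 2000])      # largest item is about 67%
skewed = guarded_sum([5000, 100, 100])       # largest item is about 96%
```

The query is refused whenever a few records dominate the reported result, because the response would then closely approximate those individual values.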
Inference
• Anonymous medical data (SSN and Name are blank):
SSN  Name  Race   DOB       Sex  Zip    Marital  Health
           Asian  09/07/64  F    22030  Married  Obesity
           Black  05/14/61  M    22030  Married  Obesity
           White  05/08/61  M    22030  Married  Chest pain
           White  09/15/61  F    22031  Widow    AIDS
• Publicly available voter list:
Name         Address         City     Zip    DOB       Sex  Party
….           ….
Sue Carlson  900 Market St.  Fairfax  22031  09/15/61  F    Democrat
• Sue Carlson has AIDS!
Inference
• Types of attack
  – direct attack: an aggregate is computed over a small sample, so individual data items are leaked
  – indirect attack: combines several aggregates
  – tracker attack: a type of indirect attack (very effective)
  – linear system vulnerability: takes tracker attacks further, using algebraic relations between query sets to construct equations yielding the desired information
Inference
NAME     SEX  RACE  AID   FINES  DRUGS  DORM
Adams    M    C     5000  45     1      Holmes
Bailey   M    B     0     0      0      Grey
Chin     F    A     3000  20     0      West
Dewitt   M    B     1000  35     3      Grey
Earhart  F    C     2000  95     1      Holmes
Fein     F    C     1000  15     0      West
Groff    M    C     4000  0      3      West
Hill     F    B     5000  10     2      Holmes
Koch     F    C     0     0      1      West
Liu      F    A     0     10     2      Grey
Majors   M    C     2000  0      2      Grey
Inference
• Direct attack
  – determine values of sensitive fields by seeking them directly with queries that yield few records
  – request a LIST that is a union of 3 sets
      LIST NAME where (SEX = M ∧ DRUGS = 1) ∨ (SEX ≠ M ∧ SEX ≠ F) ∨ (DORM = Ayres)
    • there is no dorm named Ayres, and sex is either M or F, so only the first set can match
  – the “n items over k percent” rule helps prevent this attack
Inference
• Indirect attack: combines several aggregates
Sums of Financial Aid by Dorm and Sex
        Holmes  Grey  West  Total
M       5000    3000  4000  12000
F       7000    0     4000  11000
Total   12000   3000  8000  23000
Students by Dorm and Sex
        Holmes  Grey  West  Total
M       1       3     1     5
F       2       1     3     6
Total   3       4     4     11
• 1 male in Holmes receives 5000
• 1 female in Grey receives no aid
  – request a list of names by dorm (non-sensitive)
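The indirect attack can be reproduced with a short sketch. The rows follow the sample student table used in these slides (name, sex, aid, dorm only); the function names are hypothetical. Any cell whose count is 1 exposes that single student's aid through the corresponding sum.

```python
from collections import defaultdict

# (name, sex, aid, dorm), following the sample student table.
STUDENTS = [
    ("Adams", "M", 5000, "Holmes"), ("Bailey", "M", 0, "Grey"),
    ("Chin", "F", 3000, "West"), ("Dewitt", "M", 1000, "Grey"),
    ("Earhart", "F", 2000, "Holmes"), ("Fein", "F", 1000, "West"),
    ("Groff", "M", 4000, "West"), ("Hill", "F", 5000, "Holmes"),
    ("Koch", "F", 0, "West"), ("Liu", "F", 0, "Grey"),
    ("Majors", "M", 2000, "Grey"),
]


def aggregates(rows):
    """Return (sums, counts) of aid keyed by (sex, dorm)."""
    sums, counts = defaultdict(int), defaultdict(int)
    for _name, sex, aid, dorm in rows:
        sums[(sex, dorm)] += aid
        counts[(sex, dorm)] += 1
    return dict(sums), dict(counts)


def leaked_cells(sums, counts):
    """Cells with exactly one student: the sum is that student's aid."""
    return {cell: sums[cell] for cell, count in counts.items() if count == 1}


sums, counts = aggregates(STUDENTS)
leaks = leaked_cells(sums, counts)
```

Combining the two harmless-looking tables is enough: once a list of names by dorm is obtained, each leaked cell can be tied to a person.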
Inference
• Databases are often protected against delivering small response sets to queries
• Trackers can identify a unique value
  – request (n) and (n - 1) values
  – given n and n - 1, we can easily compute the desired single element
Inference
• How many Caucasian females live in Holmes Hall?
  – count((SEX = F) ∧ (RACE = C) ∧ (DORM = Holmes))
  – result: refused because one record dominates the result
  – now issue two queries on the database
    • count(SEX = F): response = 6
    • count((SEX = F) ∧ ((RACE ≠ C) ∨ (DORM ≠ Holmes))): response = 5
  – thus 6 - 5 = 1 Caucasian female lives in Holmes Hall
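The tracker attack can be sketched as follows. The rows follow the sample student table (name, sex, race, dorm); the `count` function and its `min_size` threshold are a hypothetical stand-in for a query interface that refuses small response sets.

```python
# (name, sex, race, dorm), following the sample student table.
STUDENTS = [
    ("Adams", "M", "C", "Holmes"), ("Bailey", "M", "B", "Grey"),
    ("Chin", "F", "A", "West"), ("Dewitt", "M", "B", "Grey"),
    ("Earhart", "F", "C", "Holmes"), ("Fein", "F", "C", "West"),
    ("Groff", "M", "C", "West"), ("Hill", "F", "B", "Holmes"),
    ("Koch", "F", "C", "West"), ("Liu", "F", "A", "Grey"),
    ("Majors", "M", "C", "Grey"),
]


def count(predicate, min_size=2):
    """Answer a COUNT query, refusing (None) when the result set is too small."""
    n = sum(1 for row in STUDENTS if predicate(row))
    return n if n >= min_size else None


# Direct query: refused, since only one record matches.
direct = count(lambda r: r[1] == "F" and r[2] == "C" and r[3] == "Holmes")

# Tracker: two permitted queries whose difference is the refused answer.
all_females = count(lambda r: r[1] == "F")
females_elsewhere = count(
    lambda r: r[1] == "F" and (r[2] != "C" or r[3] != "Holmes")
)
inferred = all_females - females_elsewhere
```

Neither permitted query is small on its own, yet their difference is exactly the value the size check was meant to hide.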
Inference
• A tracker is a specific case of ‘linear system vulnerability’
  – the result of each query is a sum over a set of records
      q1 = c1 + c2 + c3 + c4 + c5
      q2 = c1 + c2 + c4
      q3 = c3 + c4
      q4 = c4 + c5
      q5 = c2 + c5
  – we can obtain c5 = ((q1 - q2) - (q3 - q4)) / 2
  – all other c values can then be derived
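The algebra can be checked numerically. In this sketch the hidden values c1..c5 are made up; the five query results are computed from them exactly as in the equations on the slide, and then the values are recovered from the query results alone.

```python
def recover_all(q1, q2, q3, q4, q5):
    """Solve the five query equations for c1..c5."""
    # q1 - q2 = c3 + c5 and q3 - q4 = c3 - c5, so their difference is 2 * c5.
    c5 = ((q1 - q2) - (q3 - q4)) / 2
    c2 = q5 - c5
    c4 = q4 - c5
    c3 = q3 - c4
    c1 = q2 - c2 - c4
    return [c1, c2, c3, c4, c5]


# Hypothetical hidden values, known only to the database.
c1, c2, c3, c4, c5 = 7, 3, 11, 5, 2

# What the attacker sees: the five aggregate responses.
q1 = c1 + c2 + c3 + c4 + c5
q2 = c1 + c2 + c4
q3 = c3 + c4
q4 = c4 + c5
q5 = c2 + c5

recovered = recover_all(q1, q2, q3, q4, q5)
```

Each query covers at least two records, so none would be refused as too small, yet together they determine every individual value.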
Inference
Protection techniques
• Only queries disclosing non-sensitive data are allowed
  – difficult to discriminate between queries
  – effective primarily against direct attacks
• Controls applied to individual items within the database
  – suppression: don’t provide sensitive data
  – concealing: provide a slightly modified value
Inference
• The “n items over k percent” rule is not sufficient in itself to prevent inference
Students by Dorm and Sex, with Low Count Suppression
        Holmes  Grey  West  Total
M       –       3     –     5
F       2       –     3     6
Total   3       4     4     11
• We must suppress one other value in each row and column to prevent the suppressed values from being computed from the totals
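A small sketch shows why suppressing only the low counts is not enough. It assumes a table whose last row and last column hold totals, with `None` marking a suppressed cell; the `recover` function is hypothetical and simply fills in any row or column that has exactly one missing entry, repeating until nothing changes.

```python
def recover(table):
    """Fill suppressed cells (None) using row and column totals.

    The last entry of every row and every column is assumed to be its total.
    """
    t = [row[:] for row in table]
    n_rows, n_cols = len(t), len(t[0])
    lines = [[(i, j) for j in range(n_cols)] for i in range(n_rows)]
    lines += [[(i, j) for i in range(n_rows)] for j in range(n_cols)]
    changed = True
    while changed:
        changed = False
        for line in lines:
            missing = [(i, j) for i, j in line if t[i][j] is None]
            if len(missing) != 1:
                continue
            i, j = missing[0]
            *parts, total = line
            if (i, j) == total:
                t[i][j] = sum(t[a][b] for a, b in parts)
            else:
                known = sum(t[a][b] for a, b in parts if (a, b) != (i, j))
                t[i][j] = t[total[0]][total[1]] - known
            changed = True
    return t


SUPPRESSED = [
    [None, 3, None, 5],   # M
    [2, None, 3, 6],      # F
    [3, 4, 4, 11],        # Total
]
recovered = recover(SUPPRESSED)
```

All three suppressed counts come back as 1, which is why additional cells have to be withheld along with the sensitive ones.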
Inference
• Suppression by combining results
  – combines rows or columns to protect sensitive values
Suppression by Combining Revealing Values
        Drug Use
Sex     0 or 1  2 or 3
M       2       3
F       4       2
Inference
• Random sample
  – partition the data and take a random sample from a partition
  – equivalent queries may or may not result in the same sample
• Random data perturbation
  – intentionally introduce error into the response
• Query analysis
  – history driven
  – difficult
Aggregation
• The aggregation problem exists when the aggregate of two or more data items is classified at a level higher than the least upper bound of the classification of the individual items that comprise the aggregate
  – the data items are multiple instances of the same entity
• Addressing the aggregation problem is difficult
  – it requires the DBMS to track what results each user has already received
  – aggregation can take place outside the system
  – there are relatively few proposals for countering aggregation
Aggregation
• Data association: a sub-problem of aggregation
  – data association: sensitive associations between instances of two or more distinct data items
  – (cardinal) aggregation: associations among multiple instances of the same entity
Inference vs. Aggregation
• They are similar but different
  – inference: sensitive data deduced from non-sensitive data
    • relatively easier problem
    • protection by means of control over queries, data, and other ways
  – aggregation: multiple instances of an entity result in sensitive data
    • difficult problem
    • protection requires the DBMS to track what results each user has already received
Multilevel Databases
• Data sensitivity is not black or white
  – there exist shades of sensitivity
  – grades of security may be needed
• So far we have seen sensitivity as a function of the attribute (column)
  – e.g. the ‘Drug use’ column is sensitive
• Actually, sensitivity is not a function of column or row
  – the security of one element may be different from that of other elements of the same row or column
  – security is implemented for each individual element
Multilevel Databases
Data and Attribute Sensitivity
Name     Department     Salary  Phone   Performance
Rogers   training       43,800  4-5067  A2
Jenkins  research       62,900  6-4281  D4
Poling   training       38,200  4-4501  B1
Garland  user services  54,600  6-6600  A4
Hilten   user services  44,500  4-5351  B1
Davis    admin          51,400  4-9505  A3
Multilevel Databases
• Leads to the multilevel security model
  – n levels of sensitivity
  – objects separated into compartments by category
  – sensitivity marked for each value in the database
  – every combination of elements can also have a distinct sensitivity
  – the access control policy dictates which users may have access to what data
Multilevel Databases
• To preserve integrity, the DBMS must enforce “no write down” (*-property)
  – a process that reads high-level data cannot write to a lower level
  – issue: the DBMS must read all records and write new records for backups, query processing, etc.
    • solution: trusted process
• Preserving confidentiality
  – issue: leads to redundancy
Multilevel Databases
• Polyinstantiation
  – different users operating at two different levels of security might get two different answers to the same query
  – one record can appear (be instantiated) many times, with a different level of confidentiality each time
Polyinstantiated Records
Name       Sensitivity  Assignment    Location
Hill, Bob  C            Program Mgr   London
Hill, Bob  TS           Secret Agent  South Bend
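Polyinstantiation can be illustrated with the two Hill, Bob records. This is a sketch under assumptions: the four-level ordering and the choice to return the highest-labelled instance visible to the user are illustrative, and `lookup` is a hypothetical function, not a DBMS API.

```python
# Assumed ordering of sensitivity levels, lowest to highest.
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}

RECORDS = [
    {"name": "Hill, Bob", "sensitivity": "C",
     "assignment": "Program Mgr", "location": "London"},
    {"name": "Hill, Bob", "sensitivity": "TS",
     "assignment": "Secret Agent", "location": "South Bend"},
]


def lookup(name, clearance):
    """Return the most sensitive instance of the record the user may see."""
    visible = [
        r for r in RECORDS
        if r["name"] == name and LEVELS[r["sensitivity"]] <= LEVELS[clearance]
    ]
    if not visible:
        return None
    return max(visible, key=lambda r: LEVELS[r["sensitivity"]])


low_answer = lookup("Hill, Bob", "C")
high_answer = lookup("Hill, Bob", "TS")
no_answer = lookup("Hill, Bob", "U")
```

The same query yields a program manager in London for a C-cleared user and a secret agent in South Bend for a TS-cleared user, so the low-level user never learns that a more sensitive instance exists.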
Future Direction
• Civilian users dislike the inflexibility of MLS databases
  – MLS databases are primarily of research interest
• Privacy concerns are fueling interest in database security
  – Hippocratic database: a database design that takes consumer privacy into account in the way it stores and retrieves information
References
• Pfleeger, “Security in Computing”, 3rd ed., 2003 (Chapter 8)
• Abrams, Jajodia, Podell, “Information Security: An Integrated Collection of Essays”, 1995
• NCSC Technical Report 005, Volume 1/5, “Inference and Aggregation Issues in Secure Database Management Systems”
• Oracle Corporation, “Trusted Label Security”, Redwood City, CA, USA, 2004
References
• Class notes from the Database Security class at George Mason University
  – http://classweb.gmu.edu/brodsky/isa765/
Thank you!
Case Study: MAC Solution
Example of steps to implement LBAC:
1. Defining the security policies and labels
a. Defining the security label components
    CREATE SECURITY LABEL COMPONENT SLC_LEVEL
      SET {'CONFIDENTIAL'}
    CREATE SECURITY LABEL COMPONENT SLC_LIFEINS_ORG
      TREE {'LIFE_INS_DEPT' ROOT,
            'K01' UNDER 'LIFE_INS_DEPT',
            'K02' UNDER 'LIFE_INS_DEPT',
            'S01' UNDER 'LIFE_INS_DEPT',
            'S02' UNDER 'LIFE_INS_DEPT'}
b. Defining the security policy
    CREATE SECURITY POLICY MEDICAL_RECORD_POLICY
      COMPONENTS SLC_LEVEL, SLC_LIFEINS_ORG
      WITH DB2LBACRULES
      RESTRICT NOT AUTHORIZED WRITE SECURITY LABEL
Case Study: MAC Solution
c. Defining the security labels
    CREATE SECURITY LABEL MEDICAL_RECORD_POLICY.MED_RECORD
      COMPONENT SLC_LEVEL 'CONFIDENTIAL'
   For each department:
    CREATE SECURITY LABEL MEDICAL_RECORD_POLICY.DEPT_K01
      COMPONENT SLC_LIFEINS_ORG 'K01'
   For the medical analyst:
    CREATE SECURITY LABEL MEDICAL_RECORD_POLICY.MEDICAL_ANALYST
      COMPONENT SLC_LEVEL 'CONFIDENTIAL',
      COMPONENT SLC_LIFEINS_ORG 'K01', 'K02', 'S01', 'S02'
Case Study: MAC Solution
2. Altering the MEDICAL_RECORD table by adding a security label column for row-level protection, marking the confidential column as protected, and attaching the security policy to the table.
    GRANT SECURITY LABEL MEDICAL_RECORD_POLICY.MEDICAL_ANALYST
      TO USER <administrator_auth_id> FOR ALL ACCESS
    ALTER TABLE MEDICAL_RECORD
      ALTER COLUMN MEDICAL_HISTORY SECURED WITH MEDICAL_RECORD_POLICY.MED_RECORD
      ADD COLUMN DEPARTMENT_TAG DB2SECURITYLABEL
      ADD SECURITY POLICY MEDICAL_RECORD_POLICY
Case Study: MAC Solution
3. Updating the MEDICAL_RECORD table security label column.
    GRANT EXEMPTION ON RULE DB2LBACWRITETREE FOR MEDICAL_RECORD_POLICY
      TO USER <administrator_auth_id>
   For each department:
    UPDATE MEDICAL_RECORD
      SET DEPARTMENT_TAG = SECLABEL_BY_NAME('MEDICAL_RECORD_POLICY', 'DEPT_K01')
      WHERE DEPTNO = 'K01'
LBAC (Label Based Access Control)
4. Granting the appropriate security labels to users.
    GRANT SECURITY LABEL MEDICAL_RECORD_POLICY.MEDICAL_ANALYST
      TO USER PETER FOR ALL ACCESS
    GRANT SECURITY LABEL MEDICAL_RECORD_POLICY.DEPT_K01
      TO USER Andrea FOR ALL ACCESS
    GRANT SECURITY LABEL MEDICAL_RECORD_POLICY.DEPT_S02
      TO USER Joseph FOR ALL ACCESS