Source Analysis for Security Trent Jaeger March 29, 2004

Example 1

Example 2

    get_free_buffer(struct stripe_head *sh, …)
    {
        struct buffer_head *bh;
        unsigned long flags;

        save_flags(flags);
        cli();
        if ((bh = sh->buffer_pool) == NULL)
            return NULL;
        sh->buffer_pool = bh->b_next;
        bh->b_size = b_size;
        restore_flags(flags);
        return bh;
    }

Example 3

Example 3 (cont'd)

Example 4

    int notify_change(struct dentry * dentry, struct iattr * attr)
    {
        struct inode *inode = dentry->d_inode;
        …
        if (inode->i_op && inode->i_op->setattr) {
            error = security_inode_setattr(dentry, attr);
            if (!error)
                error = inode->i_op->setattr(dentry, attr);
            …
        }

Find Software Bugs
  • Approaches: education, testing, 4GL, compiler checking, manual inspection, assurance
  • Limitations noted: context independent; tedious and error prone; misses many code paths, time consuming; difficult to know how code will be used; incomplete, and don't know how source code will be used
  • Assurance: extremely costly and complex. What do we do about existing code?

Limited Source Code Analysis
  • Source code is the level at which security is defined
    – Problems manifest in errors in code (although design can be a problem too)
  • Compilers can check for various properties
    – Rules on program source
  • Programmers can express some properties
    – Semantic properties
    – Must specify correctly (no/few false negatives)
    – Must not be too conservative (few false positives)
    – Like to be robust with code changes

Source Code Analysis
  • Convert source code into a model
  • Convert property into a computation on the model
  • Report positive cases (violate/meet property)
  • Determine if cases are true or false
  • Resolve true cases
  • Refine model or property and repeat

Some Properties
  • Never/always do X
    – Never use floating point in kernel
  • Do X rather than Y
  • Always do X before/after Y
    – LSM mediation (Example 1)
  • Never do X before/after Y
  • In situation X, do (not) Y
    – Re-enable disabled interrupts (Example 2)
  • In situation X, do Y rather than Z

Program Models
  • Abstract Syntax Tree
  • Control flow
  • Data flow
  • Def-use chain
  • Aliases
  • Type constraints
  • …
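
As a minimal sketch of one of these models, the Python below builds def-use chains for a straight-line fragment. The four-statement program and its tuple encoding are hypothetical, loosely modeled on the fget(fd) pattern that appears in the deck's diagrams; a real tool would compute reaching definitions over a full control flow graph rather than a single path.

```python
from collections import defaultdict

# Hypothetical straight-line program: (line, defined_var, used_vars).
PROGRAM = [
    (1, "filp", ["fd"]),    # filp = fget(fd)
    (2, "err", ["filp"]),   # err = security_op(filp)
    (3, "filp", ["fd"]),    # filp = fget(fd)   (re-definition)
    (4, None, ["filp"]),    # use(filp)
]

def def_use_chains(program):
    """Map each definition (var, line) to the lines that use that definition."""
    last_def = {}                  # var -> line of the definition currently in effect
    chains = defaultdict(list)
    for line, defined, used in program:
        for var in used:
            # Names never defined in the fragment (e.g. parameters) are skipped.
            if var in last_def:
                chains[(var, last_def[var])].append(line)
        if defined is not None:
            last_def[defined] = line
            chains.setdefault((defined, line), [])
    return dict(chains)

chains = def_use_chains(PROGRAM)
for definition, uses in sorted(chains.items(), key=lambda kv: kv[0][1]):
    print(definition, "->", uses)
```

The security check on line 2 reaches only the first definition of filp; the use on line 4 is chained to the second definition, which no check covers.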

Abstract Syntax Tree
[Figure: abstract syntax tree spanning sys_fcntl, do_fcntl, and fcntl_setlk. Node labels include func_decl, var_decl (struct file *filp, err), expr_stmt (=), call_decl (fget(fd), do_fcntl), call_stmt (fcntl_setlk(fd)), and cmpd_stmt (security_op, use filp).]

Control Flow (Interprocedural)
[Figure: the sys_fcntl / do_fcntl / fcntl_setlk tree from the Abstract Syntax Tree slide, shown for interprocedural control flow.]

Control Flow (Intraprocedural)
[Figure: the sys_fcntl / do_fcntl / fcntl_setlk tree from the Abstract Syntax Tree slide, shown for intraprocedural control flow.]

Data Flow
[Figure: the sys_fcntl / do_fcntl / fcntl_setlk tree from the Abstract Syntax Tree slide, shown for data flow.]

Def-Use
[Figure: the sys_fcntl / do_fcntl / fcntl_setlk tree from the Abstract Syntax Tree slide, shown for def-use chains.]

Property Models
  • Finite State Automata
    – Start operation
    – Disable interrupts
    – End operation
  • Type Constraints
    – Unchecked type
    – Checked type
    – Expect checked type
[Figure: interrupt FSA with transitions enable and disable, and error states double_enable, double_disable, and "End Op: Exit w/ disabled".]
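
The interrupt automaton can be sketched as a transition table. This is an illustration only: the state names follow the figure labels, while the "ok" state, the implicit "end" event, and the function name are assumptions made for the example.

```python
# (state, event) -> next state. Error states are taken from the figure labels;
# "ok" and the "end" event are hypothetical additions for this sketch.
TRANSITIONS = {
    ("enabled", "disable"): "disabled",
    ("enabled", "enable"): "double_enable",
    ("enabled", "end"): "ok",
    ("disabled", "enable"): "enabled",
    ("disabled", "disable"): "double_disable",
    ("disabled", "end"): "exit_with_disabled",
}
ERRORS = {"double_enable", "double_disable", "exit_with_disabled"}

def run_fsa(events, state="enabled"):
    """Feed the events, then an implicit 'end'; stop at the first error state."""
    for event in list(events) + ["end"]:
        state = TRANSITIONS[(state, event)]
        if state in ERRORS:
            return state
    return state

print(run_fsa(["disable", "enable"]))
print(run_fsa(["disable"]))
```

A path that disables and then re-enables interrupts ends in "ok"; a path that only disables them ends in "exit_with_disabled", which is the Example 2 bug.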

CQUAL Static Analysis
  • CQUAL is a type-based static analysis tool from UC Berkeley
  • Enables qualification of types, analogous to const
  • Enables verification that the type passed to a function is the type expected
  • Used previously for verification of format string vulnerabilities
    – Wagner's group at UC Berkeley in USENIX Security 2001

CQUAL Principles
  • Interprocedural control flow
    – do_fcntl calls fcntl_getlk
  • Def-Use data flow
    – Assignments tracked back to def where type is declared
    – Type inference
  • Variables have type restrictions
    – Cannot assign a variable to another of an incompatible type
    – Cannot send a variable as a parameter to a function unless its type is compatible
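
The type restrictions above can be approximated by propagating an "unchecked" qualifier along assignments and parameter passes. The sketch below is not CQUAL's algorithm or API; it is a simplified, flow-insensitive reachability check, and every variable name in it is hypothetical.

```python
from collections import defaultdict, deque

def find_qualifier_errors(unchecked_sources, flows, checked_sinks):
    """Return sinks that expect 'checked' but can receive an 'unchecked' value.

    flows is a list of (src, dst) pairs, one per assignment or parameter pass.
    Statement order is ignored, so the analysis is flow-insensitive.
    """
    graph = defaultdict(list)
    for src, dst in flows:
        graph[src].append(dst)
    unchecked = set(unchecked_sources)
    work = deque(unchecked_sources)
    while work:
        var = work.popleft()
        for nxt in graph[var]:
            if nxt not in unchecked:
                unchecked.add(nxt)
                work.append(nxt)
    return sorted(s for s in checked_sinks if s in unchecked)

# Hypothetical flows: one file pointer is re-fetched and never checked.
FLOWS = [
    ("fget_result", "filp_in_setlk"),
    ("filp_in_setlk", "lock_op_arg"),
    ("checked_filp", "other_op_arg"),
]
errors = find_qualifier_errors(["fget_result"], FLOWS, {"lock_op_arg", "other_op_arg"})
print(errors)
```

Only lock_op_arg is reported, because it is the only controlled parameter reachable from an unchecked definition.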

CQUAL Approach

Identify Declarations

Identify Controlled Params

Create “Checked” Variable

Verify Local Controlled Ops

Find Assignments to ‘Checked’

Verify Interprocedural Paths

Find Example 1 Error

Sensitivity: Flow and Context
  • Flow-sensitivity
    – The order of statements in a function matters
    – CQUAL is not flow-sensitive
        • Must create new ‘checked’ variable
        • Must use GCC to verify intraprocedural paths
        • Must use GCC to find reassignments after ‘checked’
  • Context-sensitivity
    – A function is treated differently depending on calling site
    – CQUAL is not context-sensitive
    – If two functions call the same descendant, they must have the same requirements in CQUAL
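
A small illustration of why statement order matters: the two checkers below look at the same statement list, one ignoring order and one respecting it. This is a generic sketch of the concept, not how CQUAL or GCC implement it; the ("op", "var") statement encoding is an assumption.

```python
def flow_insensitive_ok(stmts, var):
    """Ignore order: the variable counts as checked if a check appears anywhere."""
    checked = any(op == "check" and v == var for op, v in stmts)
    used = any(op == "use" and v == var for op, v in stmts)
    return checked or not used

def flow_sensitive_ok(stmts, var):
    """Respect order: every use must be preceded by a check."""
    checked = False
    for op, v in stmts:
        if v != var:
            continue
        if op == "check":
            checked = True
        elif op == "use" and not checked:
            return False
    return True

USE_THEN_CHECK = [("use", "filp"), ("check", "filp")]
CHECK_THEN_USE = [("check", "filp"), ("use", "filp")]

print(flow_insensitive_ok(USE_THEN_CHECK, "filp"), flow_sensitive_ok(USE_THEN_CHECK, "filp"))
```

The flow-insensitive checker accepts USE_THEN_CHECK because a check exists somewhere in the function; the flow-sensitive one rejects it because the use comes first.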

CQUAL Postscript
  • Flow-sensitive CQUAL
    – Initial performance was not good
  • Field level data flow
    – Extensions at UC Berkeley
  • We switched to new tool (JaBA)
    – Interprocedural control flow
    – Intraprocedural control flow (flow-sensitive)
    – Context-sensitive
    – Variable and field-level data flow
    – Replicated analyses of Example 1 and 3 while preventing false positives of Example 4

Meta-compilation
  • Compilers
    – Have program source
    – Can implement straightforward rules for source checking
    – Lack domain semantics of programs
  • Programmers
    – Have domain semantics of programs
    – Need a means to express these semantics such that they can be checked

Meta-compilation
  • Model
    – GCC abstract syntax tree
    – Compute interprocedural control flow graph
    – Compute intraprocedural control flow graph
  • Properties
    – Finite state automata
    – Generate extensions from specification
  • Computation
    – FSA state transitions are represented by patterns
    – Find syntactic patterns in code
    – Build intraprocedural paths with relevant state changes
    – For each path, compute resultant state transitions

Properties: Meta Language (metal)

    { #include "linux-includes.h" }
    sm check_interrupts {
        // Variables used in patterns
        decl { unsigned } flags;

        // Patterns to specify enable/disable fns
        pat enable = { sti(); } | { restore_flags(flags); } ;
        pat disable = { cli(); } ;

        // States – implicit initial state
        is_enabled:  disable ==> is_disabled
                   | enable  ==> { err("double enable"); }
                   ;
        is_disabled: enable  ==> is_enabled
                   | disable ==> { err("double disable"); }
                   | $end_of_path$ ==> { err("exiting w/ intr disabled"); }
                   ;
    }

[Figure: the interrupt FSA from the Property Models slide, with transitions enable and disable and error states double_enable, double_disable, and "End Op: Exit w/ disabled".]

Example 2 Processing

    get_free_buffer(struct stripe_head *sh, …)
    {
        struct buffer_head *bh;
        unsigned long flags;

        save_flags(flags);
        cli();                                /* disable */
        if ((bh = sh->buffer_pool) == NULL)
            return NULL;                      /* end of path: err */
        sh->buffer_pool = bh->b_next;
        bh->b_size = b_size;
        restore_flags(flags);                 /* enable */
        return bh;                            /* end of path */
    }
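
The processing of Example 2 can be mimicked by walking every path of a small control flow graph and stepping the interrupt state machine along the way. The CFG encoding, node names, and helper functions below are hypothetical; this is a sketch of the idea, not the xgcc implementation.

```python
# Hypothetical CFG for get_free_buffer: node -> (event or None, successors).
CFG = {
    "save_flags":    (None,      ["cli"]),
    "cli":           ("disable", ["test_pool"]),
    "test_pool":     (None,      ["return_null", "take_buffer"]),
    "return_null":   (None,      []),
    "take_buffer":   (None,      ["restore_flags"]),
    "restore_flags": ("enable",  ["return_bh"]),
    "return_bh":     (None,      []),
}

def step(state, event):
    """Interrupt state machine; returns (next_state, error_or_None)."""
    if event == "disable":
        return ("disabled", "double disable" if state == "disabled" else None)
    if event == "enable":
        return ("enabled", "double enable" if state == "enabled" else None)
    return (state, None)

def check_paths(cfg, entry, state="enabled"):
    """Depth-first walk of every path in an acyclic CFG; collect (path, error)."""
    errors = []

    def walk(node, state, path):
        event, succs = cfg[node]
        state, err = step(state, event)
        path = path + [node]
        if err:
            errors.append((path, err))
            return
        if not succs and state == "disabled":
            errors.append((path, "exiting w/ intr disabled"))
        for nxt in succs:
            walk(nxt, state, path)

    walk(entry, state, [])
    return errors

errors = check_paths(CFG, "save_flags")
for path, message in errors:
    print(" -> ".join(path), ":", message)
```

Only the path through return_null is reported; the path through restore_flags re-enables interrupts before returning.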

Meta-Compilation System
  • Compile Metal State Machine (SM) with mcc
  • Dynamically link SM into xg++
    – Compile-time, command line flag
  • Paths are built and checked against SM
    – It is “pushed down” “both paths”
  • All paths vs one pass (flow-sensitive vs. insensitive)
    – Prune paths that reach join in same state
    – Fixed point: loop until reach all possible paths

Prune Paths
[Figure: control flow paths annotated with disable and enable.]
Choice of paths does not matter, so only one needs to be kept
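
One way to realize this pruning is to remember which (node, state) pairs have already been explored and skip repeats. The sketch below counts node visits with and without that cache on a hypothetical CFG of two consecutive diamonds; the encoding and function name are assumptions for illustration.

```python
def count_visits(cfg, entry, state, prune):
    """Walk an acyclic CFG; with prune=True, skip (node, state) pairs already seen."""
    seen = set()
    visits = 0
    stack = [(entry, state)]
    while stack:
        node, st = stack.pop()
        if prune:
            if (node, st) in seen:
                continue
            seen.add((node, st))
        visits += 1
        event, succs = cfg[node]
        if event == "disable":
            st = "disabled"
        elif event == "enable":
            st = "enabled"
        for nxt in succs:
            stack.append((nxt, st))
    return visits

# Two diamonds in sequence; no branch changes the interrupt state.
DIAMONDS = {
    "a":  (None, ["b1", "b2"]),
    "b1": (None, ["c"]),
    "b2": (None, ["c"]),
    "c":  (None, ["d1", "d2"]),
    "d1": (None, ["e"]),
    "d2": (None, ["e"]),
    "e":  (None, []),
}

print(count_visits(DIAMONDS, "a", "enabled", prune=False))
print(count_visits(DIAMONDS, "a", "enabled", prune=True))
```

Without pruning, the work doubles at every diamond; with pruning, each node is visited once per distinct state that reaches it.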

Assertion Checking – Side Effects

    { #include "linux-includes.h" }
    sm Assert flow-insensitive {
        // Match expressions
        decl { any } expr, x, y, z;
        decl { any_call } any_fcall;
        decl { any_args } args;

        // States: find asserts and detect side effects
        start:     { assert(expr); } ==> { mgk_expr_recurse(expr, in_assert); }
                 ;
        in_assert: { any_fcall(args) } ==> { err("fn call"); }
                 | { x = y } ==> { err("assignment"); }
                 | { z++ }   ==> { err("post-increment"); }
                 | { z-- }   ==> { err("post-decrement"); }
                 ;
    }
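
The same idea, flagging calls and assignments inside an assert, can be tried out on Python source with the standard ast module. This is an analogue written for illustration and is not metal or xgcc: it checks Python rather than C, and the sample program and function name are hypothetical.

```python
import ast

def assert_side_effects(source):
    """Return sorted (line, kind) pairs for calls and assignment expressions inside asserts."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assert):
            continue
        # Recurse into the asserted expression, as mgk_expr_recurse does on the slide.
        for inner in ast.walk(node.test):
            if isinstance(inner, ast.Call):
                findings.append((inner.lineno, "fn call"))
            elif isinstance(inner, ast.NamedExpr):
                findings.append((inner.lineno, "assignment"))
    return sorted(findings)

SAMPLE = (
    "assert x > 0\n"
    "assert pop(queue) == 1\n"
    "assert (y := x + 1) > 0\n"
)
findings = assert_side_effects(SAMPLE)
print(findings)
```

Like the flow-insensitive metal extension, this is a purely syntactic match: it reports every call, including calls that have no side effects.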

xgcc Extension (PLDI 2002)
  • Match patterns to statements
    – Identify state transitions
  • Compute intraprocedural paths
    – Prune those that cannot matter (no state changes)
  • Combine intraprocedural paths into complete paths
    – Analysis instance based on a transition from a start state
    – Paths are generated for each instance
    – Assignments result in creating a new instance that is a copy
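
The last point, that an assignment creates a new instance as a copy, can be sketched with one state per variable. The statement encoding and the "allocated"/"freed" state names are assumptions for this example, not xgcc's representation.

```python
def track_instances(stmts):
    """stmts: ("alloc", p) | ("assign", dst, src) | ("free", p).

    Returns (errors, final per-variable state).
    """
    state = {}
    errors = []
    for stmt in stmts:
        if stmt[0] == "alloc":
            state[stmt[1]] = "allocated"
        elif stmt[0] == "assign":
            _, dst, src = stmt
            if src in state:
                state[dst] = state[src]   # new instance starts as a copy
        elif stmt[0] == "free":
            ptr = stmt[1]
            if state.get(ptr) == "freed":
                errors.append(("double free", ptr))
            state[ptr] = "freed"
    return errors, state

copied_after_free = [("alloc", "p"), ("free", "p"), ("assign", "q", "p"), ("free", "q")]
copied_before_free = [("alloc", "p"), ("assign", "q", "p"), ("free", "p"), ("free", "q")]

print(track_instances(copied_after_free)[0])
print(track_instances(copied_before_free)[0])
```

The first sequence is caught because q copies the "freed" state. The second is a real double free that goes unreported: once copied, the two instances evolve independently, which is what the absence of alias tracking implies.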

Checking memory management
[Figure: pointer-state automaton. States: unknown, null, not-null, freed, stop. Transition labels: allocation; conditional check on ptr implying null; conditional check on ptr implying not null; free, dereference (from null); end path, overwrite; free; free, dereference (from freed).]

Checking memory management
  • Intraprocedural control flow
    – Distinguish between paths with null and non-null pointers
  • Interprocedural control flow
    – “Global analysis” done in PLDI by combining intraprocedural paths
  • Data flow
    – None, pure syntactic comparison
    – Assignment does result in replication of state machine for assigned variable
    – No tracking of assignment to a structure field
    – No aliases
  • Finds bugs, but does not guarantee absence
  • False positives
    – Syntactic path-sensitivity keeps them moderate
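
A per-pointer state machine in the spirit of this checker can be sketched over a single path of events. The edges are an assumption reconstructed from the figure labels (in particular, treating a dereference of an unchecked allocation as an error), and the event names are hypothetical.

```python
def check_pointer_path(events):
    """Run one path of (op, ptr) events through a per-pointer state machine."""
    state = {}     # ptr -> "unknown" | "null" | "not_null" | "freed"
    errors = []
    for op, ptr in events:
        cur = state.get(ptr)
        if op == "alloc":
            state[ptr] = "unknown"        # result may be NULL until checked
        elif op == "assume_null":
            state[ptr] = "null"           # branch where the check implied NULL
        elif op == "assume_not_null":
            state[ptr] = "not_null"       # branch where the check implied non-NULL
        elif op in ("deref", "free"):
            bad = cur in ("null", "freed") or (op == "deref" and cur == "unknown")
            if bad:
                errors.append((op, ptr, cur))
            if op == "free":
                state[ptr] = "freed"
        elif op == "overwrite":
            state.pop(ptr, None)          # stop tracking this pointer
    return errors

use_after_free = [("alloc", "p"), ("assume_not_null", "p"), ("deref", "p"),
                  ("free", "p"), ("deref", "p")]
print(check_pointer_path(use_after_free))
```

Because the machine runs per path, the null and not-null branches of a conditional are checked separately, which matches the intraprocedural point above.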

Other Example Analyses
  • Example 3 (check fcntl and set_fowner)
    – If we know the required authorizations for each operation, we can define the states of these ops
    – Don’t know this (tedious to specify)
    – We use a consistency analysis (ACM TISSEC, May 2004)
  • Example 4 (distinguish between dentry->inode and inode)
    – Specify that { inode = dentry->inode } links inode state with dentry state
    – Note that this does not compute from first principles, so manual effort is required to ensure it is correct

xgcc Postscript
  • Lots of papers on finding bugs using these techniques
    – Lots of simple errors in code
  • Other aspects
    – Automating annotation
    – Statistical analysis
  • Coverity, Inc.

GCC Architecture
  • Compilers for C, C++, Java
  • Consists of a sequence of compilation steps, all of which can be hooked (3.0 and greater)
  • Eventually, has a single representation of all (GIMPLE)
  • Then converts to Register Transfer Language (RTL), at which point all typing is lost

MOPS
  • Aim to provide a ‘sound’ analysis architecture
    – That is, no false negatives for their model
  • Program model
    – Pushdown automata of program
  • Property model
    – Finite state automata of security property
    – Temporal properties
  • Like xgcc, there is no real data flow analysis
  • Unlike xgcc, language for properties is not defined

Formal Basis
  • FSA M accepts a language of security property violations B
    – All operation sequences that obey M violate security property
  • PDA P accepts all feasible program traces T
    – Traces are interprocedural combinations of intraprocedural control flow paths
    – Note that traces are a control flow representation
  • Problem: Decide if any trace violates security property
    – Ask whether $T \cap B = \emptyset$
    – Represented by $L(M) \cap L(P) = \emptyset$
    – Intersection of PDA and FSA can be computed efficiently
    – Note that $T \subseteq L(P)$, so some infeasible traces are in $L(P)$
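
The emptiness question can be illustrated with a product construction. The sketch below deliberately simplifies: it models the program as a finite-state graph with no call stack, whereas MOPS uses a pushdown automaton, so it shows the intersection idea only. The program model, operation names, and function name are hypothetical.

```python
from collections import deque

def violating_trace(program, prog_start, prog_final, fsa, fsa_start, fsa_accept):
    """Breadth-first search of the product of a program model and a property FSA.

    program: {node: [(op, next_node), ...]}  finite-state program model
    fsa:     {(state, op): next_state}       ops it does not mention keep the state
    Returns one accepted trace as a list of ops, or None if the intersection is empty.
    """
    start = (prog_start, fsa_start)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node, state = queue.popleft()
        if node in prog_final and state in fsa_accept:
            ops = []
            cur = (node, state)
            while parent[cur] is not None:
                cur, op = parent[cur]
                ops.append(op)
            return ops[::-1]
        for op, nxt in program.get(node, []):
            pair = (nxt, fsa.get((state, op), state))
            if pair not in parent:
                parent[pair] = ((node, state), op)
                queue.append(pair)
    return None

# Property FSA M: accepts traces that end with interrupts disabled.
M = {("enabled", "cli"): "disabled", ("disabled", "restore_flags"): "enabled"}

# Program model for Example 2 and for a hypothetical repaired version.
BUGGY = {"entry": [("cli", "test")],
         "test": [("nop", "ret_null"), ("nop", "take")],
         "take": [("restore_flags", "ret_bh")]}
FIXED = {"entry": [("cli", "test")],
         "test": [("restore_flags", "ret_null"), ("nop", "take")],
         "take": [("restore_flags", "ret_bh")]}
FINAL = {"ret_null", "ret_bh"}

print(violating_trace(BUGGY, "entry", FINAL, M, "enabled", {"disabled"}))
print(violating_trace(FIXED, "entry", FINAL, M, "enabled", {"disabled"}))
```

A returned trace is a witness that the intersection is non-empty; None means no trace of the model violates the property.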

Example 2

    get_free_buffer(struct stripe_head *sh, …)
    {
        struct buffer_head *bh;
        unsigned long flags;

        save_flags(flags);
        cli();
        if ((bh = sh->buffer_pool) == NULL)
            return NULL;
        sh->buffer_pool = bh->b_next;
        bh->b_size = b_size;
        restore_flags(flags);
        return bh;
    }

[Figure: interrupt FSA shown beside the code, with transitions enable and disable and error states double_enable, double_disable, and "End Op: Exit w/ disabled".]

Example 1
[Figure: FSA for Example 1. Transition labels: assign, check, use, and zero/free. Labeled outcomes include "Unassigned", "Use", and "unmediated".]

MOPS Distinguishing Features
  • Modularity
    – Can create a hierarchy of FSAs
    – Haven’t seen this used…
  • Pattern variables
    – “bound to any expression that satisfies context constraints”
    – Difference from xgcc patterns?
  • Modeling
    – PDA and FSA are combined into a composite PDA that accepts $L(M) \cap L(P)$
    – Can determine all the FSA states that an instruction can be executed in

Modeling OS for MOPS
  • Find all kernel variables that affect security
    – Done manually
  • Determine the states in the FSA for each
  • Determine transitions between states
    – Transition in FSA
    – Automated state space explorer
    – Execute all paths and create transitions automatically

Setuid
  • Variable euid determines privilege
  • euid can be modified by several functions:
    – setuid, seteuid, setresuid
  • Value of euid depends on value of other variables on input to these system calls
    – ruid, suid
    – cap_effective, cap_permitted
    – Are found manually
  • Transitions indicate system calls that lead to changes in variables

Impact of Soundness
  • Control flow sound
    – Combination of PDA and FSA is sound
    – Context-sensitive
    – Different than xgcc?
  • Construction of FSA has manual steps
    – Identification of variables
    – Identification of system calls that impact variables
    – Could these be automated? Data flow…
  • FSA states are defined manually
  • Support for finding transitions automatically once we know the system calls that matter: different than xgcc?
  • Construction of PDA is automated
    – Different from xgcc?

Dataflow
  • Find variables
    – Manually determine syntactic matches
        • Definitions
  • Dependencies/States
    – Manually determine syntactic dependencies
        • Assignments
        • Parameters
        • Structure members
        • Operations that change state
  • Values associated with variables
    – Assume different for each variable; same for struct
        • fget(fd) returns a different fd
        • dentry->inode is same for dentry
  • Ignore aliases
    – Could detect in thread; not usually there
        • Multiple possible assignments

Classification of Analysis Tools
  • Specialized checkers (syntactic bugs)
    – Lint, ITS4, JTest
  • Annotation checkers (buffer overflows, parse errors)
    – LCLint, CQual
  • Control and data flow for custom analyses (temporal ++)
    – PREfix (C/C++), JaBA (Java)
  • Automata checkers (temporal bugs, no data flow)
    – xgcc/metal, MOPS
  • Predicate refinement (driver, i.e. small-program, verification)
    – SLAM, MAGIC

More Analysis
  • Runtime Analysis
    – Consistency, buffer overflows
  • Policy Analysis
    – SELinux
  • Intrusion Detection
    – Represent feasible paths through a program