Input Validation
James Walden
Northern Kentucky University

  • Slides: 48

Topics
  1. The Nature of Trust
  2. Validating Input
  3. Entry Points
  4. Web Application Input

Trust Relationships
Relationship between multiple entities.
  § Assumptions that certain properties are true.
    - example: input has a certain format
  § Assumptions that other properties are false.
    - example: input never longer than X bytes
Trustworthy entities satisfy assumptions.

Who do you trust?
Client users
  § example: encryption key embedded in client
Operating system
  § example: dynamically loaded libraries
Calling program
  § example: environment variables
Vendor
  § example: Borland Interbase backdoor, 1994-2001, only discovered when the program was made open source

Trust is Transitive
If you call another program, you are trusting the entities that it trusts.
  § Processes you spawn run with your privileges.
  § Did you run the program you think you did?
    - PATH and IFS environment variables
  § What input format does it use?
    - Shell escapes in editors and mailers
  § What output does it send you?

Validate All Input
Never trust input.
  § Assume dangerous until proven safe.
Prefer rejecting data to filtering data.
  § Difficult to filter out all dangerous input.
Every component should validate data.
  § Trust is transitive.
  § Don't trust calling component.
  § Don't trust called component: shell, SQL

Validation Techniques
Indirect Selection
  § Allow user to supply index into a list of legitimate values.
  § Application never directly uses user input.
Whitelist
  § List of valid patterns or strings.
  § Input rejected unless it matches list.
Blacklist
  § List of invalid patterns or strings.
  § Input rejected if it matches list.
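
The first two techniques can be sketched in C as follows. This is only an illustration: the list of report files, the function names, and the choice of allowed characters (ASCII letters, digits, underscore) are assumptions made for the example, not part of the slides.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of legitimate values; the user supplies only an index. */
static const char *const REPORT_FILES[] = { "daily.txt", "weekly.txt", "monthly.txt" };
#define REPORT_COUNT (sizeof REPORT_FILES / sizeof REPORT_FILES[0])

/* Indirect selection: map a user-supplied index to a known-good value.
 * Returns NULL when the index is out of range. */
const char *select_report(long index)
{
    if (index < 0 || (size_t)index >= REPORT_COUNT)
        return NULL;
    return REPORT_FILES[index];
}

/* Whitelist: accept only 1..max_len letters, digits, or underscores.
 * Returns 1 if the input matches, 0 otherwise (reject, do not repair). */
int is_valid_identifier(const char *s, size_t max_len)
{
    size_t len;

    if (s == NULL)
        return 0;
    len = strlen(s);
    if (len == 0 || len > max_len)
        return 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (!isalnum(c) && c != '_')
            return 0;
    }
    return 1;
}
```

With indirect selection the user's bytes never reach a file name or command; with the whitelist, anything outside the allowed set is rejected rather than filtered.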

Trust Boundaries
[Figure: input crossing trust boundaries; labels: Raw Input, Syntax Validation, Safe Syntax, Semantic Validation, App Logic]

Wrap Dangerous Functions
Input is context sensitive.
  § Need more context than is available at front end.
Solution: create secure API
  § Apply context-sensitive input validation to all input.
  § Maintain input validation logic in one place.
  § Ensure validation always applied.
  § Use static analysis to check for use of dangerous functions replaced by API.
[Figure: a Custom Enterprise Web Application calls the OWASP ESAPI (Enterprise Security API); components: Authenticator, User, AccessController, AccessReferenceMap, Validator, Encoder, HTTPUtilities, Encryptor, EncryptedProperties, Randomizer, Exception Handling, Logger, IntrusionDetector, SecurityConfiguration]

Usability Validation ≠ Security Validation
Usability Validation helps legitimate users
  § Catch common errors.
  § Provide easy to understand feedback.
  § Client-side feedback is helpful for speed.
Security Validation mitigates vulnerabilities
  § Catches potential attacks, including unusual, unfriendly types of input.
  § Provide little to no feedback on reasons for blocking input.
  § Cannot trust client. Always server side.

Check Input Length
Long input can result in buffer overflows.
  § Can also cause DoS due to low memory.
Truncation vulnerabilities
  § 8-character long username column in DB.
  § User tries to enter 'admin   x' ('admin', three spaces, 'x') as username.
  § DB returns no match since name is 9 chars.
  § App inserts data into DB, which truncates it to 'admin   '.
  § Later SQL queries will return both names, since MySQL ignores trailing spaces on string comparisons.
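
One way to avoid the truncation problem is to reject over-long names before they reach the database instead of letting the database shorten them. The sketch below assumes the 8-character column from the slide; the function name and the extra rule against leading or trailing spaces are illustrative choices.

```c
#include <stddef.h>
#include <string.h>

#define USERNAME_COLUMN_WIDTH 8   /* matches the 8-character column above */

/* Returns 1 if the name fits the column without truncation and has no
 * leading or trailing spaces; returns 0 (reject) otherwise. */
int username_fits_column(const char *name)
{
    size_t len;

    if (name == NULL)
        return 0;
    len = strlen(name);
    if (len == 0 || len > USERNAME_COLUMN_WIDTH)
        return 0;
    if (name[0] == ' ' || name[len - 1] == ' ')
        return 0;
    return 1;
}
```

Because the check runs before any query, the application and the database never disagree about what the stored name is.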

Entry Points
  1. Command line arguments
  2. Environment variables
  3. File descriptors
  4. Signal handlers
  5. Format strings
  6. Paths
  7. Shell input
  8. Web application input
  9. Database input
  10. Other input types

Command Line Arguments
Available to program as **argv.
execve() allows user to specify arguments.
May be of any length
  § even program name, argv[0]
  § argv[0] may even be NULL
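
A defensive check on the argument vector might look like the sketch below. The limit MAX_ARG_LEN and the function name are hypothetical; a real program would pick a limit that suits its own arguments.

```c
#include <stddef.h>

#define MAX_ARG_LEN 256   /* hypothetical per-argument limit */

/* Returns 1 if argv[0] is present and every argument is at most
 * MAX_ARG_LEN bytes long; returns 0 otherwise. */
int args_are_sane(int argc, char *const argv[])
{
    if (argc < 1 || argv == NULL || argv[0] == NULL)
        return 0;
    for (int i = 0; i < argc; i++) {
        size_t len = 0;

        if (argv[i] == NULL)
            return 0;
        while (argv[i][len] != '\0') {
            if (++len > MAX_ARG_LEN)
                return 0;
        }
    }
    return 1;
}
```

Calling this at the top of main() before touching argv[0] covers both cases from the slide: a missing program name and arguments of arbitrary length.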

Environment Variables
Default: inherit parent's environment.
execve() allows you to specify environment variables for exec'd process.
  § Environment variables can be of any length.
Telnet environment propagation to server
  § Server receives client shell's environment.
  § Server runs setuid program login.
  § ssh may use the user's ~/.ssh/environment file.

Dangerous Environment Variables
LD_PRELOAD
  § Program loads functions from the library specified in LD_PRELOAD before searching for system libraries.
  § Can replace any library function.
  § setuid root programs don't honor this variable.
LD_LIBRARY_PATH
  § Specify list of paths to search for shared libs.
  § Store hacked version of library in first directory.
  § Modern libc implementations disallow it for setuid/setgid programs.

Dangerous Environment Variables
PATH
  § Search path for binaries
  § Attacker puts directory with hacked binary first in PATH so his ls is used instead of system ls
  § Avoid "." in PATH, as attacker may place hacked binaries in the directory the program sets its CWD to
IFS
  § Internal field separator for shell
  § Used to separate command line into arguments
  § Attacker sets to "/": /bin/ls becomes "bin" and "ls"

Environment Storage Format
Access Functions
  § setenv(), getenv()
Internal Storage Format
  § array of character pointers, NULL terminated
  § string format: "NAME=value", NUL terminated
Multiple env variables can have same name.
  § Did you check the same variable that you fetched?
  § First or last variable that matches?
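
Because the environment is just an array of "NAME=value" strings, duplicates can be detected by scanning it directly. The helper below is a sketch; its name is made up for this example, and it takes the array as a parameter (pass `environ` in a real program) so it can be exercised on any NULL-terminated array.

```c
#include <stddef.h>
#include <string.h>

/* Counts the entries of a NULL-terminated "NAME=value" array whose
 * name is exactly `name`. A result greater than 1 means the variable
 * is defined more than once and lookups may disagree. */
size_t count_env_entries(char *const envp[], const char *name)
{
    size_t name_len, count = 0;

    if (envp == NULL || name == NULL)
        return 0;
    name_len = strlen(name);
    for (size_t i = 0; envp[i] != NULL; i++) {
        if (strncmp(envp[i], name, name_len) == 0 && envp[i][name_len] == '=')
            count++;
    }
    return count;
}
```

A program that validates PATH and then spawns a child could refuse to continue when this returns more than 1, since the child may read a different entry than the one that was checked.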

Securing Your Environment
    /* BSS, pp. 318-319 */
    #include <stdlib.h>

    extern char **environ;

    static char *def_env[] = {
        "PATH=/bin:/usr/bin",
        "IFS= \t\n",
        0
    };

    /* Erase every inherited variable, then install the safe defaults. */
    static void clean_environment(void)
    {
        int i = -1;
        while (environ[++i] != 0)
            ;                        /* count the entries */
        while (i--)
            environ[i] = 0;          /* clear last to first; leaves i == -1 */
        while (def_env[++i] != 0)    /* ++i restarts at index 0 */
            putenv(def_env[i]);
    }

Securing Your Environment
Secure Environment in Shell
    /usr/bin/env - PATH=/bin:/usr/bin IFS=" \t\n" cmd
Secure Environment in Perl
    %ENV = ( PATH => "/bin:/usr/bin", IFS => " \t\n" );

File Descriptors
  § Default: inherited from parent process
  § stdin, stdout, stderr usually fd's 0, 1, and 2
  § Parent process may have closed or redirected standard file descriptors
  § Parent may have left some fd's open
  § Cannot assume first file opened will have fd 3
  § Parent process may not have left enough file descriptors for your program
  § Check using code from BSS, p. 315
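
The slide points to code in BSS that is not reproduced here. As a rough sketch of the same idea (not the book's code), the function below checks that descriptors 0, 1 and 2 are open and attaches /dev/null to any that are closed, so later open() calls cannot land on a standard descriptor. The function name is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Ensures fds 0, 1 and 2 are open; closed ones are pointed at /dev/null.
 * Returns 0 on success, -1 on any failure. */
int ensure_standard_fds(void)
{
    for (int fd = 0; fd <= 2; fd++) {
        struct stat st;

        if (fstat(fd, &st) == -1) {
            int newfd;

            if (errno != EBADF)
                return -1;
            /* open() returns the lowest free descriptor, which should be fd */
            newfd = open("/dev/null", O_RDWR);
            if (newfd == -1)
                return -1;
            if (newfd != fd) {
                close(newfd);
                return -1;
            }
        }
    }
    return 0;
}
```

A privileged program would call this once at startup, before opening any file of its own.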

Signal Handlers
Default: inherited from parent process.
    /* BSS, p. 316 */
    #include <signal.h>

    int main( int argc, char **argv )
    {
        int i;
        /* Reset every signal to its default action. */
        for (i = 0; i < NSIG; i++)
            signal(i, SIG_DFL);
        return 0;
    }

Format Strings
Formatted output functions use a format language.
  § Percent (%) symbols in string indicate substitutions.
  § %[flags][width][.precision][length]specifier
Example format specifiers
  § "%010d", 2009: 0000002009
  § "%4.2f", 3.1415926: 3.14
Example functions
  § printf()
  § scanf()
  § syslog()

printf() family dangers
User-specified format strings
  § userstring = "foo %x";
  § printf( userstring );
  § Where can it find arguments to replace %x?
    - The stack: each %x reads the next 4 bytes higher in the stack
Solution: Use printf( "%s", userstring ) or fputs( userstring, stdout )

printf() family dangers
Buffer overflows
    char buf[256];
    sprintf( buf, "The data is %s\n", userstring );
Specify "precision" of string substitution
    sprintf( buf, "The data is %.32s\n", userstring );
Use snprintf (C99 standard function)
    snprintf( buf, sizeof(buf), "The data is %s\n", userstring );
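
Both printf() dangers can be handled in one place: keep the format string constant and check snprintf()'s return value for truncation. The wrapper below is a minimal sketch; the function name is made up for the example.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats "The data is <userstring>\n" into buf using a constant format,
 * so '%' characters in userstring are copied literally.
 * Returns 0 on success, -1 if the output was truncated or an error occurred. */
int format_message(char *buf, size_t size, const char *userstring)
{
    int n = snprintf(buf, size, "The data is %s\n", userstring);

    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

snprintf() returns the length the full output would have had, so a result greater than or equal to the buffer size signals truncation, which the caller can treat as invalid input.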

%n format command
Number of characters written so far is stored into the integer indicated by the int * pointer argument.
    char buf[] = "0123456789";
    int n;
    printf("buf=%s%n\n", buf, &n);
    printf("n=%d\n", n);
Output:
  § buf=0123456789
  § n=14

%n format attack
Plan of Attack
  § Find address of variable to overwrite
  § Place address of variable on stack (as part of format string) so %n will write to that address
  § Write # of characters equal to value to insert into variable (use precision, e.g., %.64x)
Use %n to write anywhere in memory
  § Address on stack can point to any location

Paths
If attacker controls paths used by program
  § Can read files accessible by program.
  § Can write files accessible by program.
Vulnerable if the program's access differs from the attacker's
  § Privileged (SETUID) local programs.
  § Remote server applications, including web.
Directory traversal
  § Use "../.." to climb out of application's directory and access files.

Canonicalization
  § How to make correct access control decisions when there are many names?
    - config
    - ./config
    - /etc/program/config
    - ../program/config
    - /tmp/../etc/program/config
  § Canonical Name: standard form of a name
    - Generally simplest form.
  § Canonicalize name then apply access control.
  § Use realpath() in C to canonicalize.
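
A sketch of "canonicalize, then apply access control" using realpath() follows. The function name and the policy (the path must resolve to a location inside a given base directory) are assumptions for illustration. Passing NULL as the second argument asks realpath() to allocate the result, as specified in POSIX.1-2008.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/* Returns 1 if `path`, once canonicalized, lies inside `base_dir`
 * (also canonicalized); returns 0 otherwise or if either cannot be
 * resolved. Both paths must exist for realpath() to succeed. */
int path_is_within(const char *base_dir, const char *path)
{
    int inside = 0;
    char *base = realpath(base_dir, NULL);
    char *resolved = realpath(path, NULL);

    if (base != NULL && resolved != NULL) {
        size_t base_len = strlen(base);

        if (base_len == 1) {
            inside = 1;   /* base is "/", which contains everything */
        } else if (strncmp(resolved, base, base_len) == 0) {
            /* require a component boundary so /etc/programX does not match /etc/program */
            inside = (resolved[base_len] == '/' || resolved[base_len] == '\0');
        }
    }
    free(base);
    free(resolved);
    return inside;
}
```

The comparison happens only on canonical names, so "..", symbolic links, and redundant "./" components cannot be used to slip past the prefix check. There is still a window between the check and the later use of the file, which this sketch does not address.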

Common Naming Issues
  § . represents current directory
  § .. represents parent directory
  § Case sensitivity
  § Windows allows both / and \ in URLs.
  § Windows 8.3 representation of long names
    - Two names for each file for backwards compat.
  § Trailing dot in DNS names
    - www.nku.edu. == www.nku.edu
  § URL encoding

Win/Apache Directory Traversal
Found in Apache 2.0.39 and earlier.
To view the file \winnt\win.ini, use:
    http://127.0.0.1/error/%5c%2e%2e%5c%2e%2e%5cwinnt%5cwin.ini
which is the escaped form of
    http://127.0.0.1/error/\..\..\winnt\win.ini

Command Injection
Find program that invokes a subshell command with user input
  § UNIX C: system(), popen(), …
  § Windows C: CreateProcess(), ShellExecute()
  § Java: java.lang.Runtime.exec()
  § Perl: system(), ``, open()
Use shell meta-characters to insert user-defined code into the command.

UNIX Shell Metacharacters
  § `command` will execute command
  § ';' separates commands
  § '|' creates a pipe between two commands
  § '&&' and '||' logical operators which may execute following command
  § '!' logical negation: reverses truth value of test
  § '-' could convert filename into an argument
  § '*' and '?' glob, matching files, which may be interpreted as args: what if "-rf" is a file?
  § '#' comments to end of line

Command Injection in C
    /* Mail to root with user-defined subject.
       VULNERABLE: argv[1] reaches the shell unvalidated. */
    #include <stdio.h>
    #include <stdlib.h>

    int main( int argc, char **argv )
    {
        char buf[1024];
        sprintf( buf, "/bin/mail -s %s root </tmp/message", argv[1] );
        system( buf );
        return 0;
    }

Command Injection in C
How to exploit?
  § ./mailprog '`/path/to/hacked_bin`'
  § /path/to/hacked_bin will be run by mailprog
How to fix?
  § Verify input matches list of safe strings.
  § Run /bin/mail using fork/exec w/o a subshell.
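
The second fix, running a program with fork/exec and no subshell, could look roughly like this. The function name and the fixed replacement environment are assumptions; redirecting standard input from /tmp/message, as the mail example would need, is left out to keep the sketch short.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program at `path` with the given argument vector and a
 * minimal environment. No shell is involved, so metacharacters in the
 * arguments are passed through as literal text.
 * Returns the child's exit status, 127 if exec failed, or -1 on error. */
int run_without_shell(const char *path, char *const argv[])
{
    static char *const clean_env[] = { "PATH=/bin:/usr/bin", NULL };
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        execve(path, argv, clean_env);
        _exit(127);            /* reached only if execve() failed */
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the mail example the argument vector would be { "mail", "-s", subject, "root", NULL }, so a subject containing backticks or semicolons arrives at /bin/mail as a single literal argument. A subject beginning with '-' could still be read as an option, which is why the whitelist check remains useful.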

Command Injection in Java
    String btype = request.getParameter("backuptype");
    String cmd = new String("cmd.exe /K \"c:\\util\\rmanDB.bat "
                            + btype + "&&c:\\utl\\cleanup.bat\"");
    Runtime.getRuntime().exec(cmd);

Command Injection in Java
How to exploit?
  § Edit HTTP parameter via web browser.
  § Set btype to be "&& del c:\dbms\*.*"
How to defend?
  § Verify input matches list of safe strings.
  § Run commands separately w/o cmd.exe.

Web-based Input
Sources of Input:
  § URLs, including paths + parameters
  § POST form parameters
  § HTTP headers
  § Cookies
Common Types of Input:
  § HTML
  § JavaScript
  § URL-encoded parameters
  § XML/JSON

Different Perspectives
Client Dangers
  § Dangerous code
    - ActiveX
    - ActionScript
    - JavaScript
    - Java
  § Client-side storage
    - Cookies
    - Flash LSOs
    - DOM storage
Server Dangers
  § No data sent to client is secret:
    - Hidden fields
    - Cookies
  § User controls client.
    - Can bypass validation.
    - Can access URLs in any order.
    - Can alter client-side storage.

URL Parameters
<proto>://<user>@<host>:<port>/<path>?<qstr>
  § Whitespace marks end of URL
  § "@" separates userinfo from host
  § "?" marks beginning of query string
  § "&" separates query parameters
  § %HH represents character with hex value HH
    - ex: %20 represents a space
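
Since %HH escapes can hide characters such as "\" and ".", input should be decoded before it is validated. The decoder below is a sketch with a made-up name; rejecting %00 and malformed escapes outright is a design choice for this example, not something the slides prescribe.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns the value of a hex digit, or -1 if c is not a hex digit. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = (unsigned char)tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Decodes %HH escapes from src into dst (capacity dst_size, including
 * the terminating NUL). Returns the decoded length, or -1 on a
 * malformed escape, an encoded NUL (%00), or insufficient space. */
long url_decode(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    while (*src != '\0') {
        unsigned char c = (unsigned char)*src++;

        if (c == '%') {
            int hi = hex_value((unsigned char)src[0]);
            int lo = (hi < 0) ? -1 : hex_value((unsigned char)src[1]);

            if (hi < 0 || lo < 0)
                return -1;
            c = (unsigned char)(hi * 16 + lo);
            if (c == '\0')
                return -1;
            src += 2;
        }
        if (out + 1 >= dst_size)
            return -1;
        dst[out++] = (char)c;
    }
    if (dst_size == 0)
        return -1;
    dst[out] = '\0';
    return (long)out;
}
```

Validation (for example a whitelist or a canonical-path check) should then run on the decoded bytes, and decoding should happen exactly once so that doubly encoded input is not decoded again after the check.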

HTML Special Characters
  § "<" begins a tag
  § ">" ends a tag
    - some browsers will auto-insert matching "<"
  § "&" begins a character entity
    - ex: &lt; represents literal "<" character
  § Quotes (' and ") used to enclose attribute values, but don't have to be used.
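
When data must be echoed into a page, these special characters can be replaced with character entities so the browser treats them as text. The encoder below is a sketch only (the function name is hypothetical, and it covers element content and quoted attribute values, not every HTML context).

```c
#include <stddef.h>
#include <string.h>

/* Writes src into dst with <, >, &, " and ' replaced by entities.
 * dst_size includes room for the terminating NUL.
 * Returns the escaped length, or -1 if dst is too small. */
long html_escape(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    for (; *src != '\0'; src++) {
        char single[2] = { *src, '\0' };
        const char *rep;
        size_t rep_len;

        switch (*src) {
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '&':  rep = "&amp;";  break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&#39;";  break;
        default:   rep = single;   break;
        }
        rep_len = strlen(rep);
        if (out + rep_len >= dst_size)
            return -1;
        memcpy(dst + out, rep, rep_len);
        out += rep_len;
    }
    if (out >= dst_size)
        return -1;
    dst[out] = '\0';
    return (long)out;
}
```

Encoding on output complements input validation; it does not replace it.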

Character Set Encoding
  § Default: ISO-8859-1 (Latin-1)
  § Char sets dictate which chars are special
  § UTF-8 allows multiple representations
  § Force Latin-1 encoding of web page with:
    - <META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">

Cookies
Parameters
  § Name
  § Value
  § Expiration Date
  § Domain
  § Path
  § Secure Connections Only

Cookies
Server to Client
    Content-type: text/html
    Set-Cookie: foo=bar; path=/; expires=Fri, 20-Feb-2004 23:59:00 GMT
Client to Server
    Content-type: text/html
    Cookie: foo=bar

Secure Cookie Authentication
  § Encrypt cookie so user cannot read
  § Include expiration time inside cookie
  § Include client IP address to avoid hijacking
  § Use another cookie with MAC of first cookie to detect tampering
  § Use secret key as part of MAC so client does not have necessary information to forge

Database Input
SQL Injection
  § Most common flaw in database input parsing.
  § Don't pass unvalidated data to database.
  § Whitelist for known safe character set.
    - Alphanumerics
    - How many symbols do you need to accept?
Don't trust input from database.
  § Check that you receive expected # of rows.
  § Check for safe data to avoid stored XSS and second order SQL injection attacks.

Other Inputs
Default file permissions
  § umask(066);
Resource Limits
  § May suffer DoS if parent imposes strict limits on CPU time, # processes, file size, stack size.
  § Use setrlimit() to limit core dump size to zero if program ever contains confidential data in memory, e.g., unencrypted passwords.
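
Both settings take only a few lines of C. The sketch below wraps them in two helper functions whose names are made up for this example; the calls themselves, umask() and setrlimit() with RLIMIT_CORE, are the standard POSIX interfaces.

```c
#define _XOPEN_SOURCE 700
#include <sys/resource.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Sets the core dump size limit to zero so memory holding secrets is
 * never written to a core file. Returns 0 on success, -1 on failure. */
int disable_core_dumps(void)
{
    struct rlimit rl;

    rl.rlim_cur = 0;
    rl.rlim_max = 0;
    return setrlimit(RLIMIT_CORE, &rl);
}

/* Replaces the inherited umask with 066, so new files are not readable
 * or writable by group or others. Returns the previous mask. */
mode_t restrict_umask(void)
{
    return umask(066);
}
```

Both calls belong at program startup, before any file is created or any secret is read into memory.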

Key Points
  1. Validate input from all sources.
     § CLI args, env vars, config files, database, etc.
  2. Use the strongest possible technique.
     1. Indirect Selection
     2. Whitelist
     3. Blacklist
  3. Reject bad input, don't attempt to fix it.
  4. Trust is transitive.
  5. Architect for validation: establish trust boundaries, wrap dangerous functions.

References
  1. Brian Chess and Jacob West, Secure Programming with Static Analysis, Addison-Wesley, 2007.
  2. Steve McConnell, Code Complete, 2/e, Microsoft Press, 2004.
  3. Gary McGraw, Software Security, Addison-Wesley, 2006.
  4. PCI Security Standards Council, PCI DSS Requirements and Security Assessment Procedures, v1.2, 2008.
  5. Mark Graff and Kenneth van Wyk, Secure Coding: Principles & Practices, O'Reilly, 2003.
  6. Michael Howard and David LeBlanc, Writing Secure Code, 2nd edition, Microsoft Press, 2003.
  7. Michael Howard, David LeBlanc, and John Viega, 19 Deadly Sins of Software Security, McGraw-Hill Osborne, 2005.
  8. John Viega and Gary McGraw, Building Secure Software, Addison-Wesley, 2002.
  9. David Wheeler, Secure Programming for UNIX and Linux HOWTO, http://www.dwheeler.com/secure-programs/Secure-Programs-HOWTO/index.html, 2003.