Software fault isolation with API integrity and multi-principal modules
Yandong Mao, Haogang Chen (MIT CSAIL), Dong Zhou (Tsinghua University IIIS), Xi Wang, Nickolai Zeldovich, Frans Kaashoek (MIT CSAIL)

Kernel security is important
• Kernel is fully privileged
• Kernel compromises are devastating
  • Remote attacker takes control over the whole machine
  • Local user gains root privilege

Linux kernel is vulnerable
• Vulnerabilities in Linux are routinely discovered
  • CVE 2010: 145 vulnerabilities in Linux kernel
• Many exploits attack kernel modules
  • 67% of Linux kernel vulnerabilities (CVE 2010)
• This talk focuses on vulnerabilities in kernel modules

Threat
• Module programmer makes mistake
• Attacker exploits mistake to mount attacks
  • Example: buffer overflow, set current UID to root
• Privilege escalation!
[Diagram: module and its module memory; kernel memory holding UID]

One approach: type safe languages
• Write kernel and modules in Java, C#
  • No reference to UID object => cannot directly change UID
  • Attacker cannot synthesize references
• Most kernels are not written in type safe language!
[Diagram: module, UID]

Software Fault Isolation (SFI [SOSP 93])
• Module cannot bypass the SFI check
Module:
    char *p = (char *)0xf7;
    sfi_check_memory(p);
    *p = 0;
SFI Runtime (pseudocode):
    void sfi_check_memory(p) { if p not in "Module memory" stop_module(); }
[Diagram: module memory, UID]
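
The slide's pseudocode can be made concrete as a bounds check in front of every store. The following is a minimal sketch, not the SFI implementation: the fixed `module_memory` array, the `module_stopped` flag, and the `checked_store` helper are assumptions introduced so the example is self-contained and testable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical module memory region; a real SFI system derives the
 * bounds from where the module was loaded. */
static uint8_t module_memory[4096];

/* Assumption: record the stop in a flag instead of halting anything. */
static bool module_stopped = false;

static void stop_module(void) { module_stopped = true; }

/* Returns true if the byte at p lies inside module memory;
 * otherwise stops the module and returns false. */
static bool sfi_check_memory(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)module_memory;
    uintptr_t hi = lo + sizeof(module_memory);

    if (addr < lo || addr >= hi) {
        stop_module();
        return false;
    }
    return true;
}

/* A store as a rewritten module would perform it: check first, then write. */
static bool checked_store(uint8_t *p, uint8_t value)
{
    if (!sfi_check_memory(p))
        return false;
    *p = value;
    return true;
}
```

A store into `module_memory` goes through, while a store to any other address leaves the target untouched and sets `module_stopped`.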

Memory safety is insufficient for stopping attacks!
• Challenge: module needs to call kernel functions
Core Kernel:
    void spin_lock_init(spinlock_t *lock) { lock->v = 0; }
Spin_module:
    spinlock_t mylock;
    spin_lock_init(&mylock);
[Diagram: module memory, UID]

Problem: API abuse
• Attacker tricks fully-privileged kernel code to overwrite UID
Core Kernel:
    void spin_lock_init(spinlock_t *lock) { lock->v = 0; }
Spin_module:
    spin_lock_init(&cur_proc->uid);
• Privilege escalation!
[Diagram: module memory, UID]

Challenge: lack of API integrity
• Kernel APIs are not written defensively
  • Assume the calling module obeys implicit rules
  • Do not check arguments, permissions, etc.
• Problem: modules cannot be trusted to follow rules
  • Module can trick kernel into performing unexpected actions
• Ideal system would enforce rules for kernel API
  • Analogy: system call code assumes nothing about caller, checks every assumption

State of the art for protecting APIs
• SFI [SOSP 93]: memory safety
• XFI [OSDI 06]: no argument checks
• BGI [SOSP 09]: manually wrap functions, make kernel defensive when kernel code invokes callbacks
  • Error-prone and time-consuming
  • Works if kernel code is well-structured (not Linux)

Our approach: annotation language
• Helps enforce two types of API integrity:
  • Argument integrity: programmer controls what arguments a module can pass to functions
  • Callback integrity: kernel invokes callback only if the module could have invoked callback directly
• Allows programmers to specify principals for privilege separation within a module
• Less error-prone than manual wrapping, applicable to complex APIs such as those in Linux

Contributions
• LXFI: software fault isolation system for Linux kernel modules
• Annotation language for
  • Argument integrity
  • Callback integrity
  • Privilege separation within a module
• Evaluation
  • Few annotations for 10 Linux kernel modules
  • Stop three real exploits
  • 2-4x CPU overhead for netperf

Goals for annotation language
• Enforce argument integrity, callback integrity and privilege separation within a module
• Minimize programmer effort, e.g.:
  • Few annotations
  • Avoid data structure and API changes
  • Compatible with C

Preventing module exploits
• Programmer annotates core kernel
• Compile time: LXFI translates annotations to runtime checks
  • Using compiler plugins
  • Provide safe default: reject a module if it calls an unannotated API
• Runtime: LXFI performs checks
  • Consulting a dynamic table of capabilities for each module
• If annotations capture all implicit rules, compromised module cannot violate rules to gain additional privileges.

Design of annotation language
• Argument integrity annotations
  • Using the spin_lock_init example
• Callback integrity annotations
  • Not discussed; see paper
• Privilege separation annotations
  • Using dm_crypt (real Linux kernel module)

Enforce argument integrity
• spin_lock_init: three annotations are required
    Part        Syntax            Description
    Capability  write(ptr, size)  Write [ptr, ptr+size]
    Action      check(cap)        Checks cap
    Location    pre(action)       Perform action before function call

Example: enforce argument integrity for spin_lock_init
Core Kernel:
    void spin_lock_init(spinlock_t *lock)
        pre(check(write(lock, sizeof(spinlock_t))))
Spin_module capability table (LXFI Runtime):
    write(mylock, 8)
Spin_module:
    ……
    lxfi_check_write(mylock, 8);
    spin_lock_init(mylock);
    ……
    lxfi_check_write(&cur_proc->uid, 8);
    spin_lock_init(&cur_proc->uid);
• Privilege escalation prevented
[Diagram: module memory, UID]
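
The check on this slide amounts to a range lookup in the module's capability table before the kernel function runs. The sketch below illustrates that idea under stated assumptions: the flat-array `cap_table`, the extra table argument to `lxfi_check_write`, the stand-in `spinlock_t`, and the `checked_spin_lock_init` wrapper are all hypothetical and only mirror the names shown on the slide.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CAPS 16

/* write(ptr, size): permission to write `size` bytes starting at ptr. */
struct write_cap { uintptr_t ptr; size_t size; };

/* Hypothetical per-module capability table (flat array for brevity). */
struct cap_table { struct write_cap caps[MAX_CAPS]; size_t n; };

/* check(write(ptr, size)): true if one capability covers the whole range. */
static bool lxfi_check_write(const struct cap_table *t, const void *ptr, size_t size)
{
    uintptr_t p = (uintptr_t)ptr;
    for (size_t i = 0; i < t->n; i++) {
        const struct write_cap *c = &t->caps[i];
        if (p >= c->ptr && size <= c->size && p - c->ptr <= c->size - size)
            return true;
    }
    return false;
}

/* Stand-in for the kernel type; 8 bytes, as in the slide's example. */
typedef struct { uint64_t v; } spinlock_t;

static void spin_lock_init(spinlock_t *lock) { lock->v = 0; }

/* What a rewritten call site does: run the pre() check, then call. */
static bool checked_spin_lock_init(const struct cap_table *t, spinlock_t *lock)
{
    if (!lxfi_check_write(t, lock, sizeof(spinlock_t)))
        return false;   /* assumption: report failure instead of stopping the module */
    spin_lock_init(lock);
    return true;
}
```

With only `write(mylock, 8)` in the table, initializing `mylock` succeeds, while passing the address of any other object is rejected before `spin_lock_init` can write to it.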

Where does the capability come from?
• Granted on allocation
• Two more annotations are required
    Part        Syntax            Description
    Capability  write(ptr, size)  Write [ptr, ptr+size]
    Action      check(cap)        Check cap
                copy(cap)         Grant a copy of cap
    Location    pre(action)       Perform action before function call
                post(action)      Perform action after function return

Example: grant spinlock
Core Kernel:
    void *kmalloc(size)
        post(copy(write(return, size)))
Spin_module:
    ……
    spinlock_t *mylock = kmalloc(8);
    lxfi_copy_write(mylock, 8);
Spin_module capability table (LXFI Runtime):
    write(mylock, 8)

What happens when memory is freed?
• Need to revoke capability to safely reuse memory
• Strawman: revoke capability from caller
  • Insufficient! Other modules may have copies of capability
    Part        Syntax            Description
    Capability  write(ptr, size)  Write [ptr, ptr+size]
    Action      copy(cap)         Grant a copy of cap
                check(cap)        Check cap
                transfer(cap)     Revoke cap from all modules, and grant cap
    Location    pre(action)       Perform action before function call
                post(action)      Perform action after function return
• After transfer, no other copies of the capability remain

Example: safely free a spinlock
Core Kernel:
    void kfree(void *p)
        pre(transfer(write(p, no_size)))
Spin_module:
    ……
    lxfi_transfer_write(mylock, -1);
    kfree(mylock);
Spin_module capability table (LXFI Runtime):
    write(mylock, 8)
other_module capability table (LXFI Runtime):
    write(mylock, 8)
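
The grant-on-allocation and revoke-on-free steps can be sketched together. This is an illustration under assumptions, not LXFI's runtime: the two fixed tables, the integer module ids, the bump allocator standing in for kmalloc, the ownership check inside `checked_kfree`, and the simplified signatures are all hypothetical. The kernel side of the transfer is not modeled because the core kernel is fully privileged.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MODULES 2
#define MAX_CAPS 8

struct write_cap { uintptr_t ptr; size_t size; bool used; };

/* One hypothetical capability table per module, indexed by module id. */
static struct write_cap tables[MAX_MODULES][MAX_CAPS];

/* copy(write(ptr, size)): grant module m a copy of the capability. */
static bool lxfi_copy_write(int m, const void *ptr, size_t size)
{
    for (int i = 0; i < MAX_CAPS; i++) {
        if (!tables[m][i].used) {
            tables[m][i] = (struct write_cap){ (uintptr_t)ptr, size, true };
            return true;
        }
    }
    return false;   /* table full */
}

/* check(write(ptr, size)): does module m hold a covering capability? */
static bool lxfi_check_write(int m, const void *ptr, size_t size)
{
    uintptr_t p = (uintptr_t)ptr;
    for (int i = 0; i < MAX_CAPS; i++) {
        const struct write_cap *c = &tables[m][i];
        if (c->used && p >= c->ptr && size <= c->size && p - c->ptr <= c->size - size)
            return true;
    }
    return false;
}

/* transfer(write(ptr, no_size)): revoke every capability starting at ptr
 * from ALL modules, not only from the caller. */
static void lxfi_transfer_write(const void *ptr)
{
    for (int m = 0; m < MAX_MODULES; m++)
        for (int i = 0; i < MAX_CAPS; i++)
            if (tables[m][i].used && tables[m][i].ptr == (uintptr_t)ptr)
                tables[m][i].used = false;
}

/* Stand-in allocator: a bump allocator that never reuses memory. */
static unsigned char slab[256];
static size_t slab_used;

/* kmalloc with post(copy(write(return, size))) applied for module m. */
static void *checked_kmalloc(int m, size_t size)
{
    if (size > sizeof(slab) - slab_used)
        return NULL;
    void *p = &slab[slab_used];
    if (!lxfi_copy_write(m, p, size))
        return NULL;
    slab_used += size;
    return p;
}

/* kfree with pre(transfer(write(p, no_size))) applied for module m.
 * Assumption: the caller must hold the capability it gives up. */
static bool checked_kfree(int m, void *p)
{
    if (!lxfi_check_write(m, p, 1))
        return false;
    lxfi_transfer_write(p);
    return true;    /* the stand-in allocator has nothing to release */
}
```

Revoking from every table is what distinguishes transfer from the strawman: a second module that obtained a copy of `write(mylock, 8)` loses it as soon as the first module frees the memory.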

Why is spin_module able to call spin_lock_init, kmalloc, kfree?
• Call capability
  • Granted initially according to the module's symbol table
    • Trust module author not to call unnecessary functions
  • Dynamically granted when a callback function is passed
    Part        Syntax            Description
    Capability  write(ptr, size)  Write [ptr, ptr+size]
                call(a)           Call a
    Action      copy(cap)         Grant a copy of cap
                check(cap)        Check cap
                transfer(cap)     Revoke cap from all modules, and grant cap
    Location    pre(action)       Perform action before function call
                post(action)      Perform action after function return

Core Kernel:
    void *kmalloc(size)
        post(copy(write(return, size)))
    void spin_lock_init(spinlock_t *lock)
        pre(check(write(lock, sizeof(spinlock_t))))
    void kfree(void *p)
        pre(transfer(write(p, no_size)))
Spin_module capability table (LXFI Runtime):
    call(kmalloc)
    call(spin_lock_init)
    call(kfree)
Spin_module:
    ……
    spinlock_t *mylock = kmalloc(8);
    lxfi_copy_write(mylock, 8);
    ……
    lxfi_check_write(mylock, 8);
    spin_lock_init(mylock);
    ……
    lxfi_check_write(&cur_proc->uid, 8);
    spin_lock_init(&cur_proc->uid);
    ……
    lxfi_transfer_write(mylock, -1);
    kfree(mylock);
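
Call capabilities can be pictured as a set of permitted call targets that is filled from the module's imports at load time and consulted before each call into the kernel. The sketch below assumes that model; the `call_table` layout, the `*_stub` functions, and `lxfi_check_call` are hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*kfunc_t)(void);

/* Stand-ins for kernel functions; each only counts that it ran. */
static int calls_made;
static void kmalloc_stub(void)        { calls_made++; }
static void spin_lock_init_stub(void) { calls_made++; }
static void kfree_stub(void)          { calls_made++; }
static void set_uid_stub(void)        { calls_made++; }  /* never imported by the module */

#define MAX_CALL_CAPS 8

/* Hypothetical table of call(a) capabilities held by one module. */
struct call_table { kfunc_t targets[MAX_CALL_CAPS]; size_t n; };

/* Grant call(a) for every imported symbol, as a loader could at load time. */
static void grant_from_symbol_table(struct call_table *t,
                                    const kfunc_t *imports, size_t count)
{
    t->n = 0;
    for (size_t i = 0; i < count && t->n < MAX_CALL_CAPS; i++)
        t->targets[t->n++] = imports[i];
}

/* check(call(target)): is the target among the granted capabilities? */
static bool lxfi_check_call(const struct call_table *t, kfunc_t target)
{
    for (size_t i = 0; i < t->n; i++)
        if (t->targets[i] == target)
            return true;
    return false;
}

/* A rewritten call site: check the capability, then make the call. */
static bool checked_call(const struct call_table *t, kfunc_t target)
{
    if (!lxfi_check_call(t, target))
        return false;
    target();
    return true;
}
```

A module that imports only the three functions on the slide can reach them, but a call to any other kernel function fails the check and never executes.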

No way for compromised spin_module to gain root privilege
• SFI ensures memory safety
• Call capabilities ensure only 3 functions are allowed
• None of the functions can modify UID because:
  • kmalloc never modifies allocated memory
  • spin_lock_init can only be called with writable memory (from kmalloc)
  • kfree ensures no capabilities remain after free
• spin_module cannot modify UID!

Privilege separation within a module
• dm_crypt: transparent encryption service for block devices
• This example requires a third type of capability
    Part        Syntax            Description
    Capability  write(ptr, size)  Write [ptr, ptr+size]
                call(a)           Call a
                ref(a, t)         Pass a as t (pass argument a as type t)
    Action      copy(cap)         Grant a copy of cap
                check(cap)        Check cap
                transfer(cap)     Revoke cap from all principals, and grant cap
    Location    pre(action)       Perform action before function call
                post(action)      Perform action after function return

Privilege separation
User space:
    write("/etc/secret.txt", "foo")
Kernel space, Core Kernel:
    int bdev_write(block_device *dev, const char *data, …)
        pre(check(ref(block_device), dev))
    write(enc_disk, "foo", …)
dm_crypt capability table (LXFI Runtime):
    ref(block_device, enc_disk->bdev)
dm_crypt:
    lxfi_check_ref(block_device, enc_disk->bdev)
    bdev_write(enc_disk->bdev, E("foo"), …)
• Writing block device does not require writing to memory of enc_disk->bdev.

Privilege separation
User space:
    read(…)
Kernel space, Core Kernel:
    int bdev_write(block_device *dev, const char *data, …)
        pre(check(ref(block_device), dev))
dm_crypt capability table (LXFI Runtime):
    ref(block_device, enc_disk->bdev)
    ref(block_device, enc_usb->bdev)
dm_crypt (Decrypt):
    lxfi_check_ref(block_device, enc_disk->bdev)
    bdev_write(enc_disk->bdev, "/etc/pwd", "foo")
• Result: /etc/pwd: rootpwd=foo
• The check passes because the module's single capability table holds the ref capabilities of both devices.

How to define principals
• Associate a principal with every instance a module supports (e.g. block device in dm_crypt)
• Problem: how to specify and name principals?
  • Recall goal: minimize changes to existing data structures
• Idea: re-use address of data structure as the name of the principal
  • Can typically identify principal from one of the function arguments

Specifying principals
    Part        Syntax            Description
    Capability  write(ptr, size)  Write [ptr, ptr+size]
                ref(a, t)         Pass a as t
                call(a)           Call a
    Action      copy(cap)         Grant a copy of cap
                check(cap)        Check cap
                transfer(cap)     Revoke cap from all principals, and grant cap
    Location    pre(action)       Perform action before function call
                post(action)      Perform action after function return
    Principal   principal(ptr)    Run with privileges of principal ptr

Privilege separation
Kernel space, Core Kernel:
    struct dm_type {
        int (*map)(struct dm_target *di);
        principal(di)
    };
    lxfi_set_princ(enc_usb)
    dm_crypt.map(enc_usb)
dm_crypt capability table (LXFI Runtime):
    write(enc_disk->bdev, 100)
    write(enc_usb->bdev, 100)
dm_crypt (Decrypt):
    lxfi_check_write(enc_disk->bdev, 100)
    bdev_write(enc_disk->bdev, "/etc/pwd", "foo")
• Attempted write: /etc/pwd: rootpwd=foo
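
One way to read this slide is that each principal gets its own capability table, and checks consult only the table of the principal selected by `lxfi_set_princ`. The sketch below assumes that reading and simplifies it: it tracks a single kind of capability (a reference to a block device), drops the type argument from `lxfi_check_ref`, and uses hypothetical fixed-size tables and a hypothetical `lxfi_copy_ref` helper.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PRINCIPALS 4
#define MAX_REFS 4

struct block_device { int id; };
struct dm_target { struct block_device *bdev; };

/* One capability table per principal. The principal's name is the address
 * of the instance's data structure (here, a dm_target). */
struct principal {
    const void *name;
    const void *refs[MAX_REFS];   /* ref capabilities for block devices */
    size_t nrefs;
};

static struct principal principals[MAX_PRINCIPALS];
static size_t nprincipals;
static struct principal *current;   /* principal the module currently runs as */

static struct principal *find_or_create(const void *name)
{
    for (size_t i = 0; i < nprincipals; i++)
        if (principals[i].name == name)
            return &principals[i];
    if (nprincipals == MAX_PRINCIPALS)
        return NULL;
    principals[nprincipals].name = name;
    principals[nprincipals].nrefs = 0;
    return &principals[nprincipals++];
}

/* principal(ptr): run the module with the privileges of principal ptr. */
static void lxfi_set_princ(const void *name) { current = find_or_create(name); }

/* Grant a ref capability for obj to the named principal only. */
static bool lxfi_copy_ref(const void *name, const void *obj)
{
    struct principal *p = find_or_create(name);
    if (!p || p->nrefs == MAX_REFS)
        return false;
    p->refs[p->nrefs++] = obj;
    return true;
}

/* Check against the CURRENT principal's table, not the whole module's. */
static bool lxfi_check_ref(const void *obj)
{
    if (!current)
        return false;
    for (size_t i = 0; i < current->nrefs; i++)
        if (current->refs[i] == obj)
            return true;
    return false;
}

/* bdev_write guarded by its pre(check(ref(...))) annotation. */
static bool checked_bdev_write(struct block_device *dev)
{
    return lxfi_check_ref(dev);
}
```

Under this model, code running as the `enc_usb` principal can write to the USB device but fails the check for `enc_disk->bdev`, even though both instances belong to the same module.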

Principal name aliasing
• Problem: Kernel identifies an LXFI principal by multiple addresses
    int e1000_probe(struct pci_dev *pcidev) {
        struct net_device *ndev = alloc_etherdev(...);
        ndev->pcidev = pcidev;
        lxfi_princ_alias(pcidev, ndev);
        ...
    }
    int e1000_xmit(struct net_device *dev) { … }
• Insert code into module to create alias
• The same principal now has multiple names
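
Aliasing can be modeled as a name table in which several addresses map to the same principal id. The following sketch assumes that model; only the name `lxfi_princ_alias` comes from the slide, while its behavior here, the `lxfi_princ_get` helper, and the fixed-size table are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NAMES 8

/* Maps a name (the address of a kernel object) to a principal id. */
struct name_entry { const void *name; int principal_id; };

static struct name_entry names[MAX_NAMES];
static size_t nnames;
static int next_id;

/* Returns the principal id for name, or -1 if the name is unknown. */
static int lookup(const void *name)
{
    for (size_t i = 0; i < nnames; i++)
        if (names[i].name == name)
            return names[i].principal_id;
    return -1;
}

static bool add_name(const void *name, int id)
{
    if (nnames == MAX_NAMES)
        return false;
    names[nnames++] = (struct name_entry){ name, id };
    return true;
}

/* Look up the principal for name, creating a fresh one on first use. */
static int lxfi_princ_get(const void *name)
{
    int id = lookup(name);
    if (id >= 0)
        return id;
    if (!add_name(name, next_id))
        return -1;
    return next_id++;
}

/* Make `alias` a second name for the principal already named `existing`. */
static bool lxfi_princ_alias(const void *existing, const void *alias)
{
    int id = lxfi_princ_get(existing);
    if (id < 0)
        return false;
    int cur = lookup(alias);
    if (cur >= 0)
        return cur == id;   /* already named: succeeds only if it is the same principal */
    return add_name(alias, id);
}
```

After `lxfi_princ_alias(pcidev, ndev)`, a function that receives only the `net_device` pointer, such as `e1000_xmit`, resolves to the same principal that `e1000_probe` set up from the `pci_dev` pointer.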

Other annotation language features
    Part        Syntax              Description
    Capability  write(ptr, size)    Write [ptr, ptr+size]
                ref(a, t)           Pass a as t
                call(a)             Call a
                cap_iterator(obj)   A function iterates all cap. of obj
    Action      copy(cap)           Grant a copy of cap
                check(cap)          Check cap
                transfer(cap)       Revoke cap from all principals, grant cap
                if(c-expr) action   Perform action only if c-expr
    Location    pre(action)         Perform action before function call
                post(action)        Perform action after function return
    Principal   principal(ptr)      Run with privileges of principal ptr
                (global, shared)
• cap_iterator: save annotation effort for complex objects that need multiple capabilities
• if(c-expr): express conditional action such as grant a privilege if return value is OK
• Global: principal with full privilege
• Shared: principal with minimal privilege

Implementation
• Linux 2.6.36, x64, single-core
• gcc plugin: kernel rewriting for callback integrity
• Clang/LLVM plugin: module rewriting
• Annotation propagation saves effort by inferring annotations of module functions

Example: annotation propagation
    // from linux/include/pci_driver.h
    struct pci_driver {
        int (*probe)(struct pci_dev *pcidev)
            principal(pcidev)
            pre(copy(ref(struct pci_dev), pcidev))
    };
• LXFI propagates annotation on probe to modules
    // linux/drivers/net/e1000_main.c
    int e1000_probe(struct pci_dev *pcidev) { …. }
    struct pci_driver e1000_driver = { .probe = e1000_probe };
    // linux/drivers/net/ixgbe_main.c
    int ixgbe_probe(struct pci_dev *pcidev) { …. }
    struct pci_driver ixgbe_driver = { .probe = ixgbe_probe };

Evaluation
• Security
• Annotation effort
• Performance overhead

Security
• Test LXFI with three real privilege escalation exploits
    Exploit   CVE ID                                        Violated Property
    CAN_BCM   CVE-2010-2959                                 Memory Safety
    Econet    CVE-2010-3849, CVE-2010-3850, CVE-2010-4258   API Integrity
    RDS       CVE-2010-3904                                 API Integrity
  [The "Unmodified Linux" and "LXFI" columns held graphical marks that did not survive extraction.]
• Stopping real attacks requires API integrity

Annotation effort
• Annotate kernel APIs for 10 modules, one at a time
• Count:
  • # of annotated core kernel functions a module calls
  • # of function pointer declarations a module exports to core kernel

Sharing reduces annotation effort
    Category               Module         # Functions      # Function Pointers
                                          All    Unique    All    Unique
    net device driver      e1000          81     49        52     47
    sound device driver    snd-intel8x0   59     27        12     2
                           snd-ens1370    48     13        12     2
    net protocol driver    rds            77     30        42     26
                           can            53     7         7      3
                           can-bcm        51     15        17     1
                           econet         54     15        20     3
    block device driver    dm-crypt       50     24        24     14
                           dm-zero        6      3         2      0
                           dm-snapshot    55     16        28     18
    Total                                 334              155

LXFI performance
• netperf, 1 Gigabit e1000 network card, LAN
  • Stresses LXFI
    Test            Throughput (Stock)      Throughput (LXFI)       CPU % (Stock)   CPU % (LXFI)
    TCP_STREAM TX   836 M bits/sec          828 M bits/sec          13%             48%
    UDP_STREAM TX   3.1 M/3.1 M pkt/sec     2.0 M/2.0 M pkt/sec     54%             100%
• UDP_STREAM TX throughput: ~30% decrease
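
The overhead figures follow from simple ratios over the table. The helper names below are hypothetical and the inputs are the rounded values from the slide, so the results are approximate: about 3.7x CPU for TCP_STREAM TX and about 1.9x for UDP_STREAM TX, in line with the 2-4x range quoted elsewhere in the talk. The rounded UDP packet rates give a throughput loss near 35%, which the slide summarizes as ~30%.

```c
/* Ratio of CPU utilization under LXFI to CPU utilization on stock Linux. */
static double cpu_overhead_ratio(double stock_cpu_pct, double lxfi_cpu_pct)
{
    return lxfi_cpu_pct / stock_cpu_pct;
}

/* Fraction of stock throughput that is lost under LXFI. */
static double throughput_loss(double stock, double lxfi)
{
    return (stock - lxfi) / stock;
}
```

For example, `cpu_overhead_ratio(13.0, 48.0)` covers the TCP row and `throughput_loss(3.1, 2.0)` covers the UDP row.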

CPU time of LXFI actions for netperf
• Room for improvement
[Chart, axis marked at 80%: CPU time by LXFI action: capability action, mem-write check, function entry, function exit, indirect call check]

Future work
• Improve performance
  • Faster capability management such as BGI's
• Extend annotation language to enforce other types of API integrity
  • Perhaps based on Singularity's contracts

Related work
• Type-safe kernels: Singularity [MSR-TR 05]
  • LXFI provides similar guarantees in C
  • Good support for revocation (transfer) and principals
• Software fault isolation
  • LXFI extends existing SFI systems (SFI, XFI, BGI) with annotation language

Conclusion
• Extend SFI with annotation language for:
  • Argument integrity
  • Callback integrity
  • Principals
• LXFI: Prototype for Linux
  • Annotated 10 kernel modules
  • Prevented 3 real privilege escalation exploits
  • 2-4x CPU overhead when stressing with netperf

Q&A