Kernel Tracing David Ferry Chris Gill Brian Kocoloski
Kernel Tracing David Ferry, Chris Gill, Brian Kocoloski CSE 422 S - Operating Systems Organization Washington University in St. Louis, MO 63143 1
Things Happen: Kernel Oops vs Panic A kernel panic is unrecoverable and results in an instant halt An oops communicates something bad happened but the kernel tries to continue executing – An oops means the kernel is not totally broken, but is probably in an inconsistent state – An oops in interrupt context, the idle task (pid 0), or the init task (pid 1) results in a panic – How to figure out what went wrong? "Kernel-panic" by Kevin http: //flickr. com/photos/kevincollins/74279815/ CSE 422 S –Operating Systems Organization 2
Debugging Can take two forms 1. Correctness debugging – e. g. , your kernel is panic’ing when you load a kernel module 2. Performance debugging – e. g. , you parallelized a program to run on 4 cores rather than 1, but did not see 4 x speedup … CSE 422 S –Operating Systems Organization 3
Source: https: //opensourceforu. com/2011/01/understanding-a-kernel-oops/ CSE 422 S –Operating Systems Organization 4
Important Bits • BUG: unable to handle kernel NULL pointer dereference – You probably have a pointer that was never set to allocated memory – e. g: int * ptr = NULL; *ptr = x; • Instruction pointer (sometimes called “program counter”) – Address of instruction that generated BUG is stored in a special register • On x 86_64: RIP • On ARM: R 15 • Linux will map this instruction back to the function where this instruction occurred – In the previous example: my_oops_init CSE 422 S –Operating Systems Organization 5
Kernel Correctness Debugging • Debugging in the kernel is much more complicated than debugging in userspace – You don’t have simple to use debuggers (though some like kgdb do exist) – You don’t get graceful kill signals like segfaults – Small errors can bring down the machine (question: why? ) CSE 422 S –Operating Systems Organization 6
Simplest “Tracer”: printk() prints information to the system log – Messages stored in circular buffer – Can be read with dmesg – Eight possible log levels (set with dmesg –n) Example: printk(KERN_ALERT “bad thing %ld”, bad_thing); – Uses same format as printf() – Note there is no comma after log level (KERN_ALERT) CSE 422 S –Operating Systems Organization 7
Other Debugging Mechanisms int some_function(…) { // <code you know is correct> // <code never reached> } CSE 422 S –Operating Systems Organization 8
Other Debugging Mechanisms int some_function(…) { // <code you know is correct> // <code that must have a bug> // <code never reached> } CSE 422 S –Operating Systems Organization 9
Other Debugging Mechanisms int some_function(…) { printk(“Value of some variable: %dn”, var); // stop executing code to avoid // triggering bug while (1) {} Sits in a loop forever, but will never relinquish the CPU (Linux generally does not preempt tasks executing kernel code) CSE 422 S –Operating Systems Organization 10
Other Debugging Mechanisms int some_function(…) { printk(“Value of some variable: %dn”, var); // stop executing code to avoid // triggering bug while (1) { schedule(); } Relinquishes the current CPU, giving it back to Linux to run other tasks CSE 422 S –Operating Systems Organization 11
Performance Debugging The more interesting type of debugging We’ll look at two things today: – monitoring system calls executed by a process (fyi: this is something a rootkit or sandboxer might also want to do ) – Monitoring preemption and scheduling operations experienced by a process CSE 422 S –Operating Systems Organization 12
System Call Tracing: Strace Allows one userspace process (tracer) to inspect the system calls made by another thread (tracee). 1. Tracer calls ptrace() on tracee 2. Tracee halts at every system call, system call return, and signal (except SIGKILL) 3. Tracer records info, and releases tracee to continue Note: • Tracing is per-thread • Seriously warps program timing CSE 422 S –Operating Systems Organization 13
strace. /program CSE 422 S –Operating Systems Organization 14
strace. /program strace KERNEL CSE 422 S –Operating Systems Organization 15
strace. /program strace fork <kernel creates new process> KERNEL . /program CSE 422 S –Operating Systems Organization 16
strace. /program strace fork <kernel creates new process> <kernel begins intercepting syscalls of. /program> KERNEL ptrace (PTRACE_TRACEME). /program CSE 422 S –Operating Systems Organization 17
strace. /program strace ptrace (PTRACE_ATTACH) fork <kernel begins forwarding syscalls of. /program to strace> <kernel creates new process> <kernel begins intercepting syscalls of. /program> KERNEL ptrace (PTRACE_TRACEME). /program CSE 422 S –Operating Systems Organization 18
strace. /program strace ptrace (PTRACE_ATTACH) fork <kernel begins forwarding syscalls of. /program to strace> <kernel stops. /program and wakes up strace> <kernel creates new process> <kernel begins intercepting syscalls of. /program> ptrace (PTRACE_TRACEME) KERNEL open (…) . /program CSE 422 S –Operating Systems Organization 19
strace. /program strace ptrace (PTRACE_ATTACH) fork Inspect open system call <kernel begins forwarding syscalls of. /program to strace> <kernel stops. /program and wakes up strace> <kernel creates new process> <kernel begins intercepting syscalls of. /program> ptrace (PTRACE_TRACEME) KERNEL open (…) . /program CSE 422 S –Operating Systems Organization 20
strace. /program strace ptrace (PTRACE_ATTACH) fork Inspect open system call <kernel begins forwarding syscalls of. /program to strace> <kernel stops. /program and wakes up strace> <kernel creates new process> <kernel begins intercepting syscalls of. /program> ptrace (PTRACE_TRACEME) ptrace (PTRACE_CONT) <kernel stops strace, executes system calls and wakes up. /program> KERNEL open (…) . /program CSE 422 S –Operating Systems Organization 21
strace. /program strace ptrace (PTRACE_ATTACH) fork Inspect open system call <kernel begins forwarding syscalls of. /program to strace> <kernel stops. /program and wakes up strace> <kernel creates new process> <kernel begins intercepting syscalls of. /program> ptrace (PTRACE_TRACEME) ptrace (PTRACE_CONT) <kernel stops strace, executes system calls and wakes up. /program> KERNEL open (…) . /program CSE 422 S –Operating Systems Organization 22
Ftrace – the Function Tracer Tracing beyond system calls; many features: – – – Event tracepoints (scheduler, interrupts, etc. ) Trace any kernel function Call graphs Kernel stack size Latency tracing • How long interrupts disabled • How long preemption disabled Has a user interface called trace-cmd Very nice graphical trace browser called Kernelshark CSE 422 S –Operating Systems Organization 23
Ftrace Internals When tracing is enabled, the kernel maintains: – Per-CPU ring buffer for holding events – Per-CPU kernel thread that empties ring buffer If readers can’t keep up, data is lost (dropped) Tracepoints in kernel: – Kernel maintains list of tracepoint locations – Locations normally converted to no-ops (ftrace_make_nop()) – Trace code is runtime-patched into kernel code when activated (ftrace_make_call()) CSE 422 S –Operating Systems Organization 24
Today’s Studio • By the end of the studio, you will understand how to perform basic analysis of program and kernel interaction – What system calls are made? – Where and when does the kernel schedule processes? • You will use strace to inspect system calls made by a program strace. /dense_mm 100 &> dense 100 trace. txt strace. /dense_mm 300 &> dense 300 trace. txt • You will use ftrace and kernelshark to inspect scheduling behavior CSE 422 S –Operating Systems Organization 25
- Slides: 25