Tracing System Calls
Akash Lal and Saurabh Goyal
3rd May 2005
The Problem
… Network? File system?
Application Level Debugging
• Motivation:
  – Is the application stuck on some device?
  – File system too slow?
  – Network too slow?
  – System calls too slow?
• Problem:
  – Monitor the system calls
• Solution requirements:
  – Low overhead
  – Non-intrusive
Approach
• Tracking all system calls takes too much time
  – Overhead: 30% to 300% (FIXME)
• Our solution: pstrace
  – Don't monitor all calls, only some
  – Overhead: user controllable
  – Precision: minimal loss
Summary
• Probabilistic monitoring of system calls is good!
  – Minimal loss in precision
  – Low overhead
• Can keep pstrace running all the time
Outline
• Tracking system calls in user mode (Linux)
  – Evolution of pstrace as it stands today
• Implementation
• Evaluation
  – Overhead
  – Precision
Tracking system calls
• ptrace(…) system call
  – Can attach to a running process (given sufficient permissions)
  – The traced process stops with SIGTRAP on syscall entry and exit
  – The stop is reported to the tracing process through wait
Tracking system calls
ptrace(pid, ATTACH)
while(true) {
    wait4(pid);            // syscall start
    <process call start>
    ptrace(pid, CONT);
    wait4(pid);            // syscall end
    <process call end>
    ptrace(pid, CONT);
}
Tracking system calls
ptrace(pid, ATTACH)
while(true) {
    wait_n_process_start(pid);
    wait_n_process_end(pid);
}

Tracking system calls
ptrace(pid, ATTACH)
while(true) {
    wait_n_process_start(pid);
    wait_n_process_end(pid);
    ptrace(pid, DETACH);
    sleep(1 system call);
    ptrace(pid, ATTACH);
}
Tracking system calls
ptrace(pid, ATTACH)
while(true) {
    wait_n_process_start(pid);
    wait_n_process_end(pid);
    ptrace(pid, DETACH);
    sleep(n system calls);
    ptrace(pid, ATTACH);
}
• Tracks 1/(n+1) of the total calls
• Take-home slide!
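The 1/(n+1) figure can be checked with a small simulation. This is only a sketch: `traced_indices` is a hypothetical helper that works on call indices rather than on a real process, and it assumes the skip of n calls is exact.

```python
def traced_indices(total_calls, n):
    """Indices of the calls seen by a trace-1 / skip-n schedule."""
    # One call is traced, then n calls go by untraced, so every
    # (n + 1)-th call is observed, starting with the first.
    return list(range(0, total_calls, n + 1))


total = 1200
for n in (1, 2, 5):
    seen = traced_indices(total, n)
    # Fraction observed should match 1 / (n + 1).
    print(n, len(seen) / total, 1 / (n + 1))
```

In practice the skip is only an estimate, so the observed fraction would drift around this value.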
Tracking system calls
sleep(n system calls);
• Linux does not report how many syscalls a detached process has made
• Solution:
  – Guess the time it will take the process to make n syscalls
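One plausible way to make that guess is to extrapolate from the rate seen while attached. The slides do not say how pstrace estimates it, so the function below is an assumption for illustration, and `estimate_sleep_seconds` and its parameters are hypothetical names.

```python
def estimate_sleep_seconds(calls_observed, elapsed_seconds, calls_to_skip):
    """Guess how long the process needs to make `calls_to_skip` syscalls.

    Assumes the syscall rate while detached matches the rate measured
    while attached, which is only an approximation.
    """
    if calls_observed <= 0 or elapsed_seconds <= 0:
        raise ValueError("need a positive call count and elapsed time")
    seconds_per_call = elapsed_seconds / calls_observed
    return calls_to_skip * seconds_per_call


# 100 calls were traced in 0.5 s; estimate the time for the next 300 calls.
print(estimate_sleep_seconds(100, 0.5, 300))
```

Tracing slows the process down, so a rate measured while attached probably overestimates the time per call once detached; a real tool would need to correct for that.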
Tracking system calls
while(true) {
    do(b times) {
        wait_n_process_start(pid);
        wait_n_process_end(pid);
    }
    ptrace(pid, DETACH);
    sleep(n*b system calls);
    ptrace(pid, ATTACH);
}
• b: burst size
• Timing quantities shown on the slide: user_time(pid), user_time(pstrace), real_time
Burst size
• User configurable parameter
• Better time measurements
• Preserves syscall sequences
  – Useful for diagnostic purposes
• Typical value
  – Syscall intensive (toy programs): 10000
  – Real applications: 100 - 1000
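Tracking system calls
• Sufficient model? No.
pstrace loop (b = 100, n = 1):
while(true) {
    do(b times) {
        wait_n_process_start(pid);
        wait_n_process_end(pid);
    }
    ptrace(pid, DETACH);
    sleep(n*b system calls);
    ptrace(pid, ATTACH);
}
Traced application:
while(true) {
    do (100 times) { read(…) }
    do (100 times) { write(…) }
}
• With b = 100 and n = 1, the trace-100 / skip-100 schedule can stay in step with the application's loop, so pstrace may observe only the read(…) calls and never the write(…) calls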
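The failure case can be reproduced on a synthetic call stream. The sketch below assumes the skip is exact and models the application as a list of call names; `workload` and `fixed_schedule` are hypothetical helpers, not part of pstrace.

```python
from itertools import cycle, islice


def workload(total_calls, run_length=100):
    """Synthetic application: run_length reads, then run_length writes, repeated."""
    pattern = ["read"] * run_length + ["write"] * run_length
    return list(islice(cycle(pattern), total_calls))


def fixed_schedule(calls, b, n):
    """Trace b calls, skip n*b calls, repeat."""
    seen = []
    i = 0
    while i < len(calls):
        seen.extend(calls[i:i + b])  # traced burst
        i += b + n * b               # detached for n*b calls
    return seen


calls = workload(10_000)
seen = fixed_schedule(calls, b=100, n=1)
# Every burst lands on a block of reads, so writes never show up.
print(set(seen), len(seen))
```

Half of the calls are traced, yet the sample says the application never writes. That is the bias the deterministic schedule introduces.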
Probabilistic Model
• We want equal probability of sampling each system call
Original calls:    read(…)                 write(…)                 ...
Sampled calls:     if(rand(2)) read(…)     if(rand(2)) write(…)     ...
Probabilistic Model
• We want equal probability of sampling each system call
    if(rand(p)) read(…)
    if(rand(p)) write(…)
    if(rand(p)) read(…)
    if(rand(p)) write(…)
    ...
• For each call: Pr(measure) = p, Pr(skip) = 1 - p
• On average we get a fraction p of the syscalls, and each syscall is measured with equal probability
• Pr(skip 0, measure 1st call) = p
• Pr(skip 1, measure 2nd call) = p(1 - p)
• Pr(skip 2, measure 3rd call) = p(1 - p)^2
• Pr(skip k, measure (k+1)th call) = p(1 - p)^k
• Geometric distribution!
Probabilistic Model
• We want equal probability of sampling each system call
    read(…)
    write(…)
    ...
• Measure the kth call, where k comes from a geometric distribution
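Drawing the gap directly from a geometric distribution avoids flipping a coin per call. The sketch below uses the standard inverse-transform method. The name `geo` follows the slides, but its signature here (a probability p and a random generator) is an assumption.

```python
import math
import random


def geo(p, rng):
    """Draw k >= 0 with Pr(k) = p * (1 - p)**k, the number of calls to skip."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p == 1.0:
        return 0
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    # Inverse transform: Pr(k >= j) = (1 - p)**j.
    return int(math.log(u) / math.log(1.0 - p))


rng = random.Random(0)
draws = [geo(0.25, rng) for _ in range(100_000)]
mean_skip = sum(draws) / len(draws)           # theory: (1 - p) / p = 3
share_zero = draws.count(0) / len(draws)      # theory: p = 0.25
print(round(mean_skip, 2), round(share_zero, 2))
```

The empirical mean gap and the share of zero-length gaps should land close to the theoretical values noted in the comments.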
Tracking system calls
while(true) {
    do(b times) {
        wait_n_process_start(pid);
        wait_n_process_end(pid);
    }
    ptrace(pid, DETACH);
    k = geo(n);
    sleep(k*b system calls);
    ptrace(pid, ATTACH);
}
This is pstrace!
• Burst size: b
  – Better time measurements
  – Preserves syscall sequences
• Syscall percentage: n
  – Controls overhead
• Randomization: k
  – Gives a good probabilistic guarantee
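The effect of the randomized gap can be seen on a periodic read/write workload, where a fixed gap equal to the burst size would only ever land on reads. This is a simulation sketch, not the pstrace implementation: it assumes exact skips, takes a probability p where the slides write geo(n), and `sample_bursts` is a hypothetical helper.

```python
import math
import random
from collections import Counter


def geo(p, rng):
    """Draw k >= 0 with Pr(k) = p * (1 - p)**k (bursts to skip)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p == 1.0:
        return 0
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - p))


def sample_bursts(calls, b, p, rng):
    """Trace b calls, then skip k*b calls with k drawn from geo(p)."""
    seen = []
    i = 0
    while i < len(calls):
        seen.extend(calls[i:i + b])  # traced burst
        k = geo(p, rng)              # randomized gap, counted in bursts
        i += b + k * b               # detached while k*b calls go by
    return seen


# Synthetic application: 100 reads, then 100 writes, repeated.
calls = (["read"] * 100 + ["write"] * 100) * 500
rng = random.Random(1)
seen = sample_bursts(calls, b=100, p=0.5, rng=rng)
mix = Counter(seen)
print({name: round(count / len(seen), 2) for name, count in mix.items()})
```

Both call types should now appear in roughly the 50/50 proportion of the underlying stream, while only part of the stream is traced.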
Implementation
• Built pstrace over strace
  – Linux, SunOS 4.x, System V release 4, Solaris 2.x and Irix 5.x
  – Handles pretty printing of system calls
• Can print out
  – Syscalls made
  – Time per syscall
  – Number of syscalls made
Experiments: Toy-fs
                   strace     pstrace 50%   pstrace 25%
Time (85 sec)      848 sec    376 sec       212 sec
Syscalls traced    10 M       3.5 M         1.6 M
write              43.77 %    43.64 %       43.74 %
read               30.05 %    29.88 %       29.90 %
lseek              26.18 %    26.48 %       26.36 %
Experiments: postmark
                   strace     pstrace 50%   pstrace 5%
Syscalls traced    492 K      200 K         12 K
open               29.22 %    20.98 %       20.56 %
write              24.74 %    30.19 %       25.76 %
unlink             19.59 %    20.85 %       29.39 %
Experiments: Web-server (thttpd)
                   Without    strace     pstrace 25%
Time               75 sec     83 sec     79 sec
Syscalls traced    -          280 K      90 K
Experiments
• Host-based intrusion detection systems (HIDS)
  – Monitor system calls and build a model under normal operation
  – Enforce that model during production runs
HIDS: Modeling phase
[Diagram: trusted users drive the application; its syscalls to the operating system are observed by strace, which builds a model of system calls]
HIDS: Enforcement
[Diagram: untrusted users drive the application; its syscalls to the operating system are observed by strace and checked against the model, giving a yes / no answer]
HIDS
• wu-ftpd
  – wu-ftpd is misconfigured at compile time, allowing users SITE EXEC access to /bin. Users can then run executables such as bash with root privilege.
HIDS
• Requires continuous monitoring
• Can increase the sampling rate when strange system call sequences start to appear
• Preliminary experiments (postmark):
  – A sequence of 480 K syscalls contained only 159 signatures (0.033%)
  – Sampling the sequence down to 30 K still had 125 signatures (0.42%)
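The slides do not define "signature". A common choice in syscall-based intrusion detection is the set of distinct fixed-length runs of consecutive calls, and the sketch below assumes that definition purely for illustration. The window length of 3 and the helper names are hypothetical. It also recomputes the two percentages from the counts quoted above.

```python
def signatures(calls, window=3):
    """Distinct runs of `window` consecutive syscalls (assumed definition)."""
    return {tuple(calls[i:i + window]) for i in range(len(calls) - window + 1)}


def density_percent(num_signatures, num_calls):
    """Distinct signatures as a percentage of the trace length."""
    return 100.0 * num_signatures / num_calls


trace = ["open", "read", "write", "close"] * 3
sigs = signatures(trace)
print(len(sigs), "signatures in", len(trace), "calls")

# The two ratios from the preliminary experiment, recomputed from its counts.
print(round(density_percent(159, 480_000), 3))
print(round(density_percent(125, 30_000), 2))
```

Under this definition, tracing in bursts matters: a window only makes sense over calls that were consecutive in the original stream, which a burst preserves.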
Conclusions
• Probabilistic monitoring of system calls is good
  – Low overhead
  – Minimal loss in precision
• Gives a fair estimate of time spent per syscall
• Useful for applications that require continuous monitoring of syscalls
Limitations
• System timing still too coarse
• System call rate of processes might vary
  – Model syscall rate relative to PC
• Following forks
  – Need to track child processes
Questions …