Overview of the Lab 3 Assignment Kernel Module

Overview of the Lab 3 Assignment: Kernel Module Concurrent Memory Use
Chris Gill
CSE 422 S – Operating Systems Organization

Overall Kernel Module Design

When loaded, the module will allocate and initialize memory, then spawn threads (init)

The threads will find primes in an integer array using a “sieve” algorithm (work)
– Barrier synchronize before starting
– Concurrently cross out multiples
– Barrier synchronize when finished
– Then mark an atomic completion flag

When unloaded, the module will report results and then clean up (exit)
– Print out remaining prime numbers
– Print out efficiency statistics
– Print out timing statistics
– De-allocate memory
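The work phase above can be tried out in user space before moving into the kernel. The sketch below is a POSIX threads analogue, not the lab's kernel code: pthread_create(), pthread_barrier_wait(), and C11 atomics stand in for whatever kernel threads, barrier, and atomic flag the module uses. The struct sieve layout, the lock-protected "current" index, and the names worker() and run_sieve() are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical shared state; field names are illustrative only. */
struct sieve {
    atomic_int *data;           /* data[i] holds i + 2, or 0 once crossed out */
    int upper_bound;
    int current;                /* next index to claim, guarded by lock */
    pthread_mutex_t lock;
    pthread_barrier_t barrier;
    atomic_int done;            /* completion flag */
};

static void *worker(void *arg)
{
    struct sieve *s = arg;
    int n = s->upper_bound - 1;

    pthread_barrier_wait(&s->barrier);          /* start together */
    for (;;) {
        /* Claim the next number that is not crossed out yet. */
        pthread_mutex_lock(&s->lock);
        int i = s->current;
        while (i < n && atomic_load(&s->data[i]) == 0)
            i++;
        s->current = i + 1;
        pthread_mutex_unlock(&s->lock);
        if (i >= n)
            break;

        /* Cross out its multiples; other threads may do so concurrently. */
        int p = i + 2;
        for (long m = 2L * p; m <= s->upper_bound; m += p)
            atomic_store(&s->data[m - 2], 0);
    }
    /* Finish together; exactly one thread then marks the flag. */
    if (pthread_barrier_wait(&s->barrier) == PTHREAD_BARRIER_SERIAL_THREAD)
        atomic_store(&s->done, 1);
    return NULL;
}

/* Returns the number of primes in 2..upper_bound, or -1 on failure.
 * pthread_create() errors are not handled in this sketch. */
int run_sieve(int upper_bound, int num_threads)
{
    assert(upper_bound >= 2 && num_threads >= 1);
    int n = upper_bound - 1, primes = 0;
    struct sieve s = { .upper_bound = upper_bound };
    pthread_t *tids = malloc((size_t)num_threads * sizeof(*tids));

    s.data = malloc((size_t)n * sizeof(*s.data));
    if (!tids || !s.data) {
        free(tids);
        free(s.data);
        return -1;
    }
    for (int i = 0; i < n; i++)
        atomic_init(&s.data[i], i + 2);
    pthread_mutex_init(&s.lock, NULL);
    pthread_barrier_init(&s.barrier, NULL, (unsigned)num_threads);

    for (int t = 0; t < num_threads; t++)
        pthread_create(&tids[t], NULL, worker, &s);
    for (int t = 0; t < num_threads; t++)
        pthread_join(tids[t], NULL);

    for (int i = 0; i < n; i++)
        primes += (atomic_load(&s.data[i]) != 0);
    int ok = atomic_load(&s.done);

    pthread_barrier_destroy(&s.barrier);
    pthread_mutex_destroy(&s.lock);
    free(s.data);
    free(tids);
    return ok ? primes : -1;
}
```

A thread may claim a composite number before another thread has crossed it out. The final result is still correct, because every multiple of a composite is itself composite, but the work is duplicated.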

Arrays and Memory Management

[Figure: data_array holds the numbers (data) 2 through 17, with a “current” pointer into the array and an end marker labeled data_array + upper_bound - 1; ctr_array holds the counters (metadata), all initially 0, with an end marker labeled ctr_array + num_threads]

Module init() function needs to kmalloc() arrays for numbers and counters
– Sizes are given by module parameters (minus 1 for the numbers since they start at 2)
– If first allocation succeeds but second fails, must clean up correctly

Module init() function spawns as many threads as were specified
– Each thread is given a pointer to its own “cross-out” counter (see next slide)
– Threads are allowed to be migrated by Linux (are not pinned to cores)

Module exit() function needs to deallocate memory for arrays
– If initialization succeeded, needs to kfree() arrays for numbers and counters
– Also may need to kfree() numbers array if second allocation failed (your choice)
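The allocation and cleanup rules above can be sketched in user space. Here the allocator is passed in as a function pointer so the "second allocation fails" path can be exercised. In the module, kmalloc() and kfree() would take the place of these hooks. The names struct arrays, arrays_init(), arrays_exit(), and limited_alloc() are hypothetical, and this sketch makes the choice of cleaning up inside init rather than deferring it to exit.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Allocator hooks (assumed for illustration); kmalloc()/kfree() in the module. */
typedef void *(*alloc_fn)(size_t);
typedef void (*free_fn)(void *);

struct arrays {
    int *data_array;    /* numbers 2..upper_bound */
    int *ctr_array;     /* one cross-out counter per thread */
};

/* Helper for experiments: succeeds allocs_allowed times, then fails. */
int allocs_allowed = 2;

void *limited_alloc(size_t size)
{
    if (allocs_allowed <= 0)
        return NULL;
    allocs_allowed--;
    return malloc(size);
}

int arrays_init(struct arrays *a, int upper_bound, int num_threads,
                alloc_fn alloc, free_fn release)
{
    assert(upper_bound >= 2 && num_threads >= 1);
    size_t n = (size_t)upper_bound - 1;     /* minus 1: numbers start at 2 */

    a->ctr_array = NULL;
    a->data_array = alloc(n * sizeof(int));
    if (!a->data_array)
        return -ENOMEM;

    a->ctr_array = alloc((size_t)num_threads * sizeof(int));
    if (!a->ctr_array) {
        /* First allocation succeeded but second failed: undo the first. */
        release(a->data_array);
        a->data_array = NULL;
        return -ENOMEM;
    }

    for (size_t i = 0; i < n; i++)
        a->data_array[i] = (int)i + 2;
    for (int t = 0; t < num_threads; t++)
        a->ctr_array[t] = 0;
    return 0;
}

void arrays_exit(struct arrays *a, free_fn release)
{
    /* free(NULL) is a no-op, as is kfree(NULL), so this is safe
     * even if initialization failed part-way. */
    release(a->ctr_array);
    release(a->data_array);
    a->ctr_array = NULL;
    a->data_array = NULL;
}
```

Setting both pointers to NULL on every failure path means the exit function can run unconditionally without double-freeing.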

Concurrency and Futile Work

Even a single-threaded sieve may cross out the same number multiple times
– Doesn’t impact correctness
– Degrades performance somewhat

Concurrency can make this worse
– Data race for non-prime elements
– A thread may pick up a number that would have been crossed out earlier in a single-threaded implementation
– That thread’s entire “job” will duplicate work done by other threads’ jobs
– In this lab you’ll evaluate this effect, rather than trying to “fix” it

Each thread will count its cross-outs
– Cross-outs in excess of the number of non-primes in the array are futile
– Offers a good measure of efficiency
– Will also measure completion times, since parallelism may help reduce them

[Figure: the numbers 2 through 17, illustrating futile work]
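The futile-work measure can be checked by hand on a single-threaded sieve. The sketch below assumes a simple variant that starts crossing out at 2p for every surviving p and counts every cross-out, including repeats. The names struct sieve_stats and count_futile() are hypothetical. For an upper bound of 17 this variant performs 14 cross-outs against 9 non-primes, so 5 are futile (6, 10, 12, 14, and 15 are each crossed out twice).

```c
#include <assert.h>
#include <stdlib.h>

struct sieve_stats {
    int primes;         /* numbers left standing */
    int non_primes;     /* numbers that needed one cross-out each */
    long cross_outs;    /* cross-outs actually performed */
    long futile;        /* cross_outs in excess of non_primes */
};

/* Single-threaded baseline; returns 0 on success, -1 if allocation fails. */
int count_futile(int upper_bound, struct sieve_stats *out)
{
    assert(upper_bound >= 2 && out != NULL);
    int n = upper_bound - 1;                /* numbers 2..upper_bound */
    int *data = malloc((size_t)n * sizeof(*data));

    if (!data)
        return -1;
    for (int i = 0; i < n; i++)
        data[i] = i + 2;

    out->cross_outs = 0;
    for (int i = 0; i < n; i++) {
        int p = data[i];
        if (p == 0)
            continue;                       /* already crossed out */
        for (long m = 2L * p; m <= upper_bound; m += p) {
            data[m - 2] = 0;                /* may already be 0: futile */
            out->cross_outs++;
        }
    }

    out->primes = 0;
    for (int i = 0; i < n; i++)
        if (data[i] != 0)
            out->primes++;
    out->non_primes = n - out->primes;
    out->futile = out->cross_outs - out->non_primes;

    free(data);
    return 0;
}
```

In the multi-threaded module, the same arithmetic applies once the per-thread counters are summed.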

Lab Write-up

As you work, again record your observations

It’s a good idea to read the entire assignment, and to plan and design a bit before starting to work on it

It’s also a good idea to develop and test incrementally

Write a cohesive report that analyzes, integrates, and offers explanations for what you observed
– Run different combinations of upper bounds, #s of threads
– Think (and write) about what trends emerge initially
– Run additional experiments as needed to confirm trends
- Slides: 5