System Mechanisms
Santosh Kumar Singh

  • Slides: 34

Trap Dispatching
  • Trap handler
  • Interrupt vs. exception
  • What is an interrupt?
  • What is an exception?
  • The kernel distinguishes between the two

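Trap Dispatching
[Figure: trap handlers route each kind of trap to its handler. Interrupts go to interrupt service routines; system service calls go to system services; hardware and software exceptions go, via an exception frame, to the exception dispatcher and exception handlers; virtual address exceptions go to the virtual memory manager's pager.]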

Trap Dispatching
  • After an exception or interrupt, the processor records enough machine state on the kernel stack to return to the interrupted point later.
  • Windows switches to the thread's kernel-mode stack and creates a "trap frame" there.
  • The kernel also has interrupt service routines for trap handling tasks, mainly for device interrupts.

Trap Dispatching
  • When a trap is unexpected, the trap handler typically executes the system function KeBugCheckEx, which halts the computer when the kernel detects incorrect behavior.

Interrupt Dispatching
  • Hardware-generated interrupts originate from I/O devices, which must notify the processor when they need service.
  • System software can also generate interrupts.
  • The kernel installs interrupt trap handlers to respond to device interrupts.
  • Interrupt trap handlers transfer control either to an ISR that handles the interrupt or to an internal kernel routine that responds to it.
  • Device drivers supply ISRs to service device interrupts.

Hardware Interrupt Processing
  • On the hardware platforms that Windows supports, external I/O interrupts come into one of the lines on an interrupt controller.
  • The controller in turn interrupts the processor on a single line.
  • Once the processor is interrupted, it queries the controller to get the interrupt request (IRQ).
  • The controller translates the IRQ to an interrupt number, which is used as an index into a structure called the interrupt dispatch table (IDT).
  • At boot time, Windows fills the IDT with pointers to the kernel routines that handle each interrupt and exception.
  • Each processor has a separate IDT, so different processors can run different ISRs.
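
The table-driven dispatch described on this slide can be sketched in portable C. This is only a user-mode simulation, not kernel code: the names `cpu_t`, `idt_connect`, `idt_dispatch`, `keyboard_isr`, and `unexpected_trap` are hypothetical, and the 256-entry table size is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define IDT_ENTRIES 256                 /* assumption: one slot per interrupt number */

typedef void (*trap_handler_t)(int vector);

/* Hypothetical per-processor state: every CPU owns its own dispatch table. */
typedef struct {
    trap_handler_t idt[IDT_ENTRIES];
    int last_vector;                    /* vector of the last ISR that ran (demo only) */
    int unexpected_count;               /* traps that hit the default handler */
} cpu_t;

static cpu_t *current_cpu;              /* CPU whose handler is currently running */

/* Default handler installed in every slot at "boot". */
void unexpected_trap(int vector) {
    (void)vector;
    current_cpu->unexpected_count++;
}

/* Example ISR that a driver might supply. */
void keyboard_isr(int vector) {
    current_cpu->last_vector = vector;
}

/* Fill the whole table so that every interrupt number has a handler. */
void idt_init(cpu_t *cpu) {
    for (size_t i = 0; i < IDT_ENTRIES; i++) {
        cpu->idt[i] = unexpected_trap;
    }
    cpu->last_vector = -1;
    cpu->unexpected_count = 0;
}

/* Register a handler for one interrupt number on one CPU. */
void idt_connect(cpu_t *cpu, int vector, trap_handler_t handler) {
    assert(vector >= 0 && vector < IDT_ENTRIES);
    cpu->idt[vector] = handler;
}

/* Use the interrupt number as an index and transfer control to the handler. */
void idt_dispatch(cpu_t *cpu, int vector) {
    assert(vector >= 0 && vector < IDT_ENTRIES);
    current_cpu = cpu;
    cpu->idt[vector](vector);
}
```

Because each `cpu_t` carries its own table, connecting `keyboard_isr` on one simulated CPU leaves the same interrupt number pointing at `unexpected_trap` on another, which mirrors the per-processor IDT point above.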

x86 / x64 / IA64 Interrupt Controllers
x86 interrupt controller
  • Most x86 systems have either an i8259A Programmable Interrupt Controller (PIC) or an i82489 Advanced Programmable Interrupt Controller (APIC).
  • The PIC works only on uniprocessor systems and has 15 interrupt lines.
  • The APIC works with multiprocessor systems and has 256 interrupt lines.
x64 interrupt controller
  • Because the x64 architecture is compatible with x86, x64 systems use the same interrupt controllers as x86.
IA64 interrupt controller
  • IA64 uses the Streamlined Advanced Programmable Interrupt Controller (SAPIC), an evolution of the APIC.
  • The major difference is that the I/O APICs on an APIC system deliver interrupts to local APICs over a private APIC bus, whereas on a SAPIC system interrupts traverse the I/O and system bus for faster delivery.
  • Another difference is that interrupt routing and load balancing are handled by the private APIC bus on an APIC system, whereas a SAPIC system requires that support to be present in the firmware.

x86 APIC architecture
[Figure: device interrupts arrive at the I/O APIC and at an i8259A-equivalent PIC; the I/O APIC delivers them to the local APICs of CPU 0 and CPU 1.]

Software Interrupt Request Levels (IRQLs)
  • Windows imposes its own interrupt priority scheme, known as interrupt request levels (IRQLs).
  • The kernel represents IRQLs internally as numbers from 0 through 31 on x86 and 0 through 15 on IA64, with higher numbers representing higher-priority interrupts.
  • The HAL maps hardware interrupt numbers to IRQLs.
  • Interrupts are serviced in priority order, and a higher-priority interrupt preempts the servicing of a lower-priority interrupt.
  • Thread scheduling priority is an attribute of a thread, whereas an IRQL is an attribute of an interrupt source.

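x86 interrupt request levels (IRQLs)
IRQL   Name                        Category
31     High                        Hardware interrupt
30     Power fail                  Hardware interrupt
29     Inter-processor interrupt   Hardware interrupt
28     Clock                       Hardware interrupt
27     Profile                     Hardware interrupt
26     Device n                    Hardware interrupt
...    ...                         Hardware interrupt
3      Device 1                    Hardware interrupt
2      DPC/dispatch                Software interrupt
1      APC                         Software interrupt
0      Passive                     Normal thread execution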
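
The priority rule behind this table (an interrupt is serviced only if its IRQL is above the processor's current IRQL; otherwise it stays pending until the IRQL drops) can be sketched as a small simulation. The level values come from the table; the type `cpu_irql_t` and the functions `irql_raise`, `irql_lower`, and `irql_request` are hypothetical names for this illustration and are not Windows APIs.

```c
#include <assert.h>
#include <stdint.h>

/* x86 IRQL values as listed in the table. */
enum {
    PASSIVE_LEVEL  = 0,
    APC_LEVEL      = 1,
    DISPATCH_LEVEL = 2,
    DEVICE_1_LEVEL = 3,
    DEVICE_N_LEVEL = 26,
    PROFILE_LEVEL  = 27,
    CLOCK_LEVEL    = 28,
    IPI_LEVEL      = 29,
    POWER_LEVEL    = 30,
    HIGH_LEVEL     = 31
};

typedef struct {
    int      current_irql;   /* IRQL the simulated processor is running at */
    uint32_t pending;        /* bit n set: an interrupt at IRQL n is waiting */
    int      serviced[32];   /* how many interrupts were serviced per level */
} cpu_irql_t;

/* Run the "ISR" for one level: raise, count, then restore the old level. */
static void service(cpu_irql_t *cpu, int irql) {
    int saved = cpu->current_irql;
    cpu->current_irql = irql;
    cpu->serviced[irql]++;
    cpu->current_irql = saved;
}

/* Raise the current IRQL and return the previous one. */
int irql_raise(cpu_irql_t *cpu, int new_irql) {
    assert(new_irql >= cpu->current_irql && new_irql <= HIGH_LEVEL);
    int old = cpu->current_irql;
    cpu->current_irql = new_irql;
    return old;
}

/* An interrupt source at the given IRQL asks for service. */
void irql_request(cpu_irql_t *cpu, int irql) {
    assert(irql > PASSIVE_LEVEL && irql <= HIGH_LEVEL);
    if (irql > cpu->current_irql) {
        service(cpu, irql);                    /* preempts the current work */
    } else {
        cpu->pending |= (uint32_t)1 << irql;   /* masked: remember it */
    }
}

/* Lower the IRQL and deliver pending interrupts, highest level first. */
void irql_lower(cpu_irql_t *cpu, int new_irql) {
    assert(new_irql >= PASSIVE_LEVEL && new_irql <= cpu->current_irql);
    cpu->current_irql = new_irql;
    for (int level = HIGH_LEVEL; level > new_irql; level--) {
        uint32_t bit = (uint32_t)1 << level;
        if (cpu->pending & bit) {
            cpu->pending &= ~bit;
            service(cpu, level);
        }
    }
}
```

In this model, code running at clock level sees a device interrupt only after it lowers the IRQL again, while a high-level interrupt is serviced at once.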

Mapping interrupts to IRQLs
  • In Windows, a type of device driver called a bus driver determines which devices are present on its bus and which interrupts can be assigned to each device.
  • The bus driver reports this interrupt information to the Plug and Play manager, which decides on the assignment.
  • The Plug and Play manager then calls the HAL function HalpGetSystemInterruptVector, which maps interrupts to IRQLs.

Pre-defined IRQLs
  • The kernel uses high level when it is halting the system in KeBugCheckEx and masking out all interrupts.
  • Power fail level is used for system power failure code.
  • Inter-processor interrupt level is used to request another processor to perform an action.
  • Clock level is used for the system's clock, which keeps track of the time of day and schedules tasks.
  • The system's real-time clock uses profile level when kernel profiling, a performance measurement mechanism, is enabled.
  • The device IRQLs are used to prioritize device interrupts.
  • DPC/dispatch-level and APC-level interrupts are software interrupts that the kernel and device drivers generate.
  • The lowest IRQL, passive level, isn't really an interrupt level at all; it is the setting at which normal thread execution takes place and all interrupts are allowed to occur.
Interrupt objects
  • The kernel provides a portable mechanism, a kernel control object called an interrupt object, that allows device drivers to register ISRs for their devices.

Software Interrupts
The kernel generates software interrupts for a variety of tasks:
  • Initiating thread dispatching.
  • Non-time-critical interrupt processing.
  • Handling timer expiration.
  • Asynchronously executing a procedure in the context of a particular thread.
  • Supporting asynchronous I/O operations.
Dispatch or deferred procedure call (DPC) interrupts
  • When a thread can no longer continue executing, perhaps because it has terminated or because it voluntarily enters a wait state, the kernel calls the dispatcher directly to effect an immediate context switch.
  • A deferred procedure call (DPC) is a function that performs a system task that is less time-critical than the current one.
  • The kernel uses DPCs to process timer expiration and to reschedule the processor after a thread's quantum expires.
  • A DPC is represented by a DPC object, which is not visible to user-mode programs but is visible to device drivers and other system code.
  • A DPC object contains the address of the system function that the kernel will call when it processes the DPC interrupt. DPC objects waiting to execute are stored in DPC queues.
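
As a rough illustration of the DPC idea, the sketch below models a DPC object as a routine address plus a context pointer, queued now and run later when the queue is drained. It is a single-threaded simulation under assumed names (`dpc_t`, `dpc_queue_t`, `dpc_insert`, `dpc_drain`); the real kernel structures and the IRQL handling are not shown.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*dpc_routine_t)(void *context);

/* Hypothetical DPC object: the routine to call later and its argument. */
typedef struct dpc {
    dpc_routine_t routine;
    void         *context;
    struct dpc   *next;
    int           queued;     /* nonzero while the object sits in a queue */
} dpc_t;

/* Hypothetical DPC queue (FIFO). */
typedef struct {
    dpc_t *head;
    dpc_t *tail;
} dpc_queue_t;

void dpc_init(dpc_t *dpc, dpc_routine_t routine, void *context) {
    dpc->routine = routine;
    dpc->context = context;
    dpc->next = NULL;
    dpc->queued = 0;
}

/* Queue a DPC for later execution. Returns 0 if it was already queued. */
int dpc_insert(dpc_queue_t *q, dpc_t *dpc) {
    if (dpc->queued) {
        return 0;
    }
    dpc->queued = 1;
    dpc->next = NULL;
    if (q->tail != NULL) {
        q->tail->next = dpc;
    } else {
        q->head = dpc;
    }
    q->tail = dpc;
    return 1;
}

/* Run every queued DPC in FIFO order; returns how many routines ran. */
int dpc_drain(dpc_queue_t *q) {
    int ran = 0;
    while (q->head != NULL) {
        dpc_t *dpc = q->head;
        q->head = dpc->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        dpc->queued = 0;              /* may be queued again from its routine */
        dpc->routine(dpc->context);
        ran++;
    }
    return ran;
}

/* Example deferred routine: bump a counter passed as context. */
void add_to_counter(void *context) {
    *(int *)context += 1;
}
```

The point of the split is that time-critical code only calls `dpc_insert`, and the less urgent work in the routine runs when `dpc_drain` is invoked afterwards.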

Asynchronous Procedure Call (APC) Interrupts
  • Asynchronous procedure calls provide a way for user programs and system code to execute in the context of a particular user thread.
  • APCs are described by a kernel control object called an APC object. APCs waiting to execute reside in a kernel-managed APC queue.
  • There are two kinds of APCs: kernel mode and user mode.
  • Kernel-mode APCs don't require permission from a target thread to run in that thread's context, while user-mode APCs do.
  • Several Windows APIs, such as ReadFileEx, WriteFileEx, and QueueUserAPC, use user-mode APCs.
  • Device drivers use kernel-mode APCs. The POSIX subsystem uses kernel-mode APCs to deliver POSIX signals to POSIX processes.

Exception Dispatching
  • Exceptions are conditions that result directly from the execution of the program that is running.
  • Windows introduced a facility known as structured exception handling, which allows applications to gain control when exceptions occur.
  • On x86, all exceptions have predefined interrupt numbers that directly correspond to the entry in the IDT that points to the trap handler for a particular exception.
Interrupt number (hex)   Exception
0                        Divide Error
4                        Overflow
B                        Segment Not Present
E                        Page Fault
  • All exceptions, except those simple enough to be resolved by the trap handler, are serviced by a kernel module called the exception dispatcher.
  • Encountering a breakpoint while executing a program being debugged generates an exception, which the kernel handles by calling the debugger.

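Dispatching an Exception
  • The exception dispatcher's job is to find an exception handler that can dispose of the exception.
[Figure: dispatching an exception. The trap handler passes an exception record to the exception dispatcher, which tries in turn: the debugger (first chance) through the debugger port, the frame-based handlers through function calls, the debugger (second chance) through the debugger port, the environment subsystem through the exception port, and finally the kernel default handler. The ports are reached by LPC.]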

Unhandled Exceptions
  • All Windows threads have an exception handler at the top of the stack that processes unhandled exceptions. It is declared in the internal Windows start-of-process or start-of-thread function.
  • The generic code for these internal start functions:

    void Win32StartOfProcess(
        LPTHREAD_START_ROUTINE lpStartAddr,
        LPVOID lpvThreadParm)
    {
        __try {
            /* Run the thread's start routine and exit with its result. */
            DWORD dwThreadExitCode = lpStartAddr(lpvThreadParm);
            ExitThread(dwThreadExitCode);
        } __except (UnhandledExceptionFilter(GetExceptionInformation())) {
            /* No other handler disposed of the exception. */
            ExitProcess(GetExceptionCode());
        }
    }

  • The filter's behavior is based on the contents of the HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug registry key.
  • The key has two values: Auto and Debugger. Auto tells the unhandled exception filter whether to run the debugger automatically or to ask the user what to do.
  • By default, Auto is set to 1, which means the debugger is launched automatically. However, installing Visual Studio sets the value to 0.

Unhandled Exceptions
  • A tool, \Windows\System32\Drwtsn32.exe (Dr. Watson), captures the state of the application crash and records it in a log file (Drwtsn32.log) and a process crash dump file (User.dmp).
Windows Error Reporting
  • Windows Error Reporting automates the submission of both user-mode process crashes and kernel-mode system crashes.
  • It can be configured in My Computer -> Properties -> Advanced -> Error Reporting or under HKLM\Software\Microsoft\PCHealth\ErrorReporting.
  • In environments where systems are not connected to the Internet, or where the administrator wants to control which error reports are submitted to Microsoft, the destination for the error report can be set to an internal file server.
  • Corporate Error Reporting is a tool that understands the directory structure created by Windows Error Reporting and lets the administrator select error reports and submit them to Microsoft.

System Service Dispatching
  • A system service dispatch is triggered by executing an instruction assigned to system service dispatching.
  • The instruction that Windows uses for system service dispatching depends on the processor on which it is executing.
32-bit system service dispatching
  • On x86 processors prior to the Pentium II, Windows uses the int 0x2e instruction (46 decimal), which results in a trap.
  • The trap causes the executing thread to transition into kernel mode and enter the system service dispatcher.
  • On x86 Pentium II processors and higher, Windows uses the special sysenter instruction.
  • At boot time, Windows detects the type of processor on which it is executing and sets up the appropriate system call code.
  • The system service code for NtReadFile in user mode:

    ntdll!NtReadFile:
    77f5bfa8 b8b7000000   mov   eax,0xb7
    77f5bfad ba0003fe7f   mov   edx,0x7ffe0300
    77f5bfb2 ffd2         call  edx
    77f5bfb4 c22400       ret   0x24

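Kernel-Mode System Service Dispatching
  • The kernel uses the system service number passed by the caller to locate the system service information in the system service dispatch table.
[Figure: a system service call made in user mode enters the system service dispatcher in kernel mode, which indexes the system service dispatch table (32-bit entries 0, 1, 2, 3, ..., n) to reach the routine; in the example, entry 2 leads to System Service 2.]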

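Service Descriptor Tables
[Figure: layout of the 32-bit system service number, with bit positions 31, 13, 11, and 0 marked. A "Table Index" field selects a service table, and the low-order field is the index into that table.]
Table index   Service table
0             Native API
1             Unused
2             IIS Spud driver
3             (entry not legible in the source)
A second table, KeServiceDescriptorTableShadow, is also shown in the figure.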
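
A minimal sketch of how a service number could be split into a table index and a service index, and then used to pick a routine. The bit layout is an assumption read from the figure's bit markings (service index in bits 0 to 11, table index in bits 12 and 13); all type and function names here are hypothetical and the sample services are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SERVICE_INDEX_BITS 12u    /* assumption: bits 0-11 hold the service index */
#define TABLE_INDEX_MASK   0x3u   /* assumption: bits 12-13 hold the table index */
#define TABLE_COUNT        4
#define INVALID_SERVICE    (-1L)

typedef long (*service_fn_t)(void);

/* Hypothetical service table: an array of routines and its length. */
typedef struct {
    const service_fn_t *base;
    unsigned            limit;
} service_table_t;

/* Low-order field: index of the service within its table. */
unsigned service_index(uint32_t number) {
    return number & ((1u << SERVICE_INDEX_BITS) - 1u);
}

/* Next field up: which of the four service tables to use. */
unsigned table_index(uint32_t number) {
    return (number >> SERVICE_INDEX_BITS) & TABLE_INDEX_MASK;
}

/* Look up the routine for a service number and call it. */
long dispatch_service(const service_table_t tables[TABLE_COUNT], uint32_t number) {
    const service_table_t *table = &tables[table_index(number)];
    unsigned index = service_index(number);
    if (table->base == NULL || index >= table->limit) {
        return INVALID_SERVICE;       /* unused table or index out of range */
    }
    return table->base[index]();
}

/* Placeholder services for the demonstration. */
long sample_service_zero(void) { return 100; }
long sample_service_one(void)  { return 101; }
```

Under these assumptions the value 0xb7 loaded into eax by the NtReadFile stub decodes to table 0, service index 0xb7.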

Objects
  • The Windows object manager is an executive component responsible for creating, deleting, protecting, and tracking objects.
  • WinObj (available at www.sysinternals.com) displays the internal object manager's namespace.
[Figure: executive objects that contain kernel objects. Within an executive object, the Name, HandleCount, ReferenceCount, and Type fields are owned by the object manager, the embedded kernel object is owned by the kernel, and the rest is owned by the executive.]
Structure of an object
  • Object header: object name, object directory, security descriptor, quota charges, open handles count, open handles list (Process 1, Process 2, Process 3), object type, reference count
  • Object body: object-specific data
  • Type object: type name, pool type, default quota charges, access types, generic access rights mapping, synchronizable? (Y/N), methods (open, close, delete, parse, security, query name)

Object Specifics
  • Process objects and the process type object
[Figure: the process type object linked to Process Object 1 through Process Object 4; the links are present if the object-tracking debug flag is set.]
Object methods
Method       When the method is called
Open         When an object handle is opened
Close        When an object handle is closed
Delete       Before the object manager deletes an object
Query name   When a thread requests the name of an object, such as a file, that exists in a secondary object namespace
Parse        When the object manager is searching for an object name that exists in a secondary object namespace
Security     When a process reads or changes the protection of an object, such as a file, that exists in a secondary object namespace

Object Handles
  • When a process creates or opens an object by name, it receives a handle that represents its access to the object.
[Figure: Windows 2000 process handle table architecture. The process points to its handle table; top-level pointers (0 to 255) lead to middle-level pointers (0 to 255), which lead to subhandle tables.]
[Figure: structure of a handle table entry. One 32-bit word holds the lock bit, the pointer to the object header, and the flags A (audit on close), I (inheritable), and P (protect from close); a second 32-bit word holds the access mask.]

Object Properties
  • Object security: whenever a process creates an object or opens a handle to an existing object, the process must specify a set of desired access rights.
  • Object retention: objects are of two types, permanent and temporary.
  • Resource accounting: an open object handle count indicates that some process is using that resource. The Windows object manager provides a central facility for resource accounting. Windows uses quota charges; the registry values are 0 by default and are set under HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management.
  • Object names: the means of accessing an object.
Standard object directories
Directory           Types of object names stored
\GLOBAL??           MS-DOS device names (\DosDevices is a symbolic link to this directory.)
\BaseNamedObjects   Mutexes, events, semaphores, waitable timers, and section objects
\Callback           Callback objects
\Device             Device objects
\FileSystem         File system driver objects and file system recognizer device objects
\KnownDlls          Section names and path for known DLLs (DLLs mapped by the system at startup time)
\Nls                Section names for mapped national language support tables
\ObjectTypes        Names of types of objects
\RPC Control        Port objects used by remote procedure calls (RPCs)
\Security           Names of objects specific to the security subsystem
\Windows            Windows subsystem ports and window stations

Synchronization
  • Incorrect sharing of memory: two threads running on different processors both write data to a circular queue.
Time runs downward.
Processor A                         Processor B
Get queue tail
Insert data at current location
                                    Get queue tail
Increment tail pointer
                                    Insert data at current location /* Error */
                                    Increment tail pointer

High IRQL Synchronization
  • Interlocked operations
  • Spinlocks
[Figure: two processors contend for the DPC queue spinlock. Each one runs the following sequence, and only the holder of the spinlock is inside the critical section:]
    Do
        Try to acquire DPC queue spinlock
    Until SUCCESS
    Begin
        Remove DPC from queue          (critical section)
    End
    Release DPC queue spinlock
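
The "try until SUCCESS" loop in the figure is a test-and-set spinlock. A minimal sketch using the C11 `atomic_flag` type follows; it protects the circular-queue insert from the earlier synchronization slide. The names `spinlock_t`, `queue_t`, and the functions are hypothetical, and the sketch leaves out the IRQL raising that kernel spinlocks also involve.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag locked;
} spinlock_t;

void spin_init(spinlock_t *lock) {
    atomic_flag_clear(&lock->locked);
}

/* One attempt: true if this caller took the lock. */
bool spin_try_acquire(spinlock_t *lock) {
    return !atomic_flag_test_and_set_explicit(&lock->locked, memory_order_acquire);
}

/* "Do ... Until SUCCESS": keep trying until the lock is ours. */
void spin_acquire(spinlock_t *lock) {
    while (!spin_try_acquire(lock)) {
        /* busy-wait */
    }
}

void spin_release(spinlock_t *lock) {
    atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}

#define QUEUE_SIZE 8

/* Circular queue whose tail is protected by the spinlock. */
typedef struct {
    spinlock_t lock;
    int        slots[QUEUE_SIZE];
    unsigned   tail;
} queue_t;

void queue_init(queue_t *q) {
    spin_init(&q->lock);
    q->tail = 0;
}

/* Read the tail, store the data, and advance the tail as one critical section. */
void queue_insert(queue_t *q, int value) {
    spin_acquire(&q->lock);
    q->slots[q->tail % QUEUE_SIZE] = value;
    q->tail++;
    spin_release(&q->lock);
}
```

With the lock held across all three steps, a second processor cannot read the old tail between the store and the increment, which is the interleaving that caused the error in the queue example.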

High IRQL Synchronization
  • Queued spinlocks: a form of spinlock that scales better on multiprocessors than a standard spinlock. Windows defines a number of global queued spinlocks by storing pointers to them in an array contained in each processor's processor control region (PCR). A global spinlock can be acquired by calling KeAcquireQueuedSpinLock with the index into the PCR array at which the pointer to the spinlock is stored.
  • In-stack queued spinlocks: the Windows XP and Windows Server 2003 kernels support dynamically allocated queued spinlocks with the KeAcquireInStackQueuedSpinLock and KeReleaseInStackQueuedSpinLock functions.
  • Executive interlocked operations: the kernel supplies a number of simple synchronization functions built on spinlocks for more advanced operations, such as adding and removing entries from singly and doubly linked lists. Examples include ExInterlockedPopEntryList and ExInterlockedPushEntryList for singly linked lists, and ExInterlockedInsertHeadList and ExInterlockedRemoveHeadList for doubly linked lists. All these functions require a standard spinlock as a parameter and are used throughout the kernel and device drivers.
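
To illustrate the pattern of a list operation that takes a spinlock as a parameter, here is a small portable sketch for a singly linked list. It is not the kernel implementation: `interlocked_push`, `interlocked_pop`, `entry_t`, and `lock_t` are hypothetical names, and the lock is a plain C11 `atomic_flag`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* List link embedded at the start of each queued item. */
typedef struct entry {
    struct entry *next;
} entry_t;

typedef atomic_flag lock_t;

static void lock_acquire(lock_t *lock) {
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire)) {
        /* spin until the holder releases the lock */
    }
}

static void lock_release(lock_t *lock) {
    atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Push an entry at the front; returns the previous first entry (NULL if empty). */
entry_t *interlocked_push(entry_t *head, entry_t *item, lock_t *lock) {
    lock_acquire(lock);
    entry_t *previous_first = head->next;
    item->next = previous_first;
    head->next = item;
    lock_release(lock);
    return previous_first;
}

/* Pop the first entry; returns NULL if the list is empty. */
entry_t *interlocked_pop(entry_t *head, lock_t *lock) {
    lock_acquire(lock);
    entry_t *first = head->next;
    if (first != NULL) {
        head->next = first->next;
    }
    lock_release(lock);
    return first;
}
```

Passing the lock explicitly lets several lists share one lock or each list have its own, at the caller's choice.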

Low-IRQL Synchronization
  • Executive software outside the kernel also needs to synchronize access to global data structures in a multiprocessor environment.
Kernel dispatcher objects
  • The kernel furnishes additional synchronization mechanisms to the executive in the form of kernel objects, known as dispatcher objects.
  • The user-visible synchronization objects acquire their synchronization capabilities from these kernel dispatcher objects.
  • The executive's synchronization semantics are visible to Windows programmers through the WaitForSingleObject and WaitForMultipleObjects functions.
  • One executive synchronization object, the executive resource, provides both exclusive access (like a mutex) and shared read access.

Low-IRQL Synchronization
Waiting for dispatcher objects
  • A thread can synchronize with a dispatcher object by waiting on the object's handle.
  • The kernel suspends the thread and changes its dispatcher state from running to waiting.
  • A synchronization object can be in one of two states: signaled or nonsignaled.
[Figure: waiting for a dispatcher object. Thread states shown: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. A running thread that waits on an object handle enters the Waiting state; setting the object to the signaled state releases it.]

Low-IRQL Synchronization
  • A thread object is in the nonsignaled state during its lifetime and is set to the signaled state by the kernel when the thread terminates.
  • When an object is set to the signaled state, waiting threads are generally released from their wait states immediately.
Data structures
  • Two data structures are key to tracking who is waiting for what: dispatcher headers and wait blocks. Both are publicly defined in the DDK include file Ntddk.h:

    typedef struct _DISPATCHER_HEADER {
        UCHAR Type;
        UCHAR Absolute;
        UCHAR Size;
        UCHAR Inserted;
        LONG SignalState;
        LIST_ENTRY WaitListHead;
    } DISPATCHER_HEADER;

    typedef struct _KWAIT_BLOCK {
        LIST_ENTRY WaitListEntry;
        struct _KTHREAD *RESTRICTED_POINTER Thread;
        PVOID Object;
        struct _KWAIT_BLOCK *RESTRICTED_POINTER NextWaitBlock;
        USHORT WaitKey;
        USHORT WaitType;
    } KWAIT_BLOCK, *RESTRICTED_POINTER PRKWAIT_BLOCK;

Low-IRQL Synchronization
[Figure: wait data structures. Thread objects Thread 1 and Thread 2 each point to a wait block list. Dispatcher objects A and B each consist of a header (size, type, state, wait list head) followed by object-type-specific data. Each wait block holds a list entry, thread, object, key, type, and next link. Object A's wait list holds a Thread 2 wait block; Object B's wait list holds a Thread 2 wait block and a Thread 1 wait block.]
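
The relationship between a dispatcher header, its wait list, and the wait blocks can be sketched with simplified stand-in types. This is a single-threaded model with hypothetical names (`dispatcher_header_t`, `wait_block_t`, `wait_for_object`, `set_signaled`), not the Ntddk.h definitions, and it models only objects that release every waiter when signaled.

```c
#include <assert.h>
#include <stddef.h>

typedef struct wait_block wait_block_t;

/* Simplified stand-in for a dispatcher header. */
typedef struct {
    long          signal_state;   /* greater than 0 means signaled */
    wait_block_t *wait_list;      /* wait blocks of threads waiting on this object */
} dispatcher_header_t;

/* Simplified stand-in for a thread. */
typedef struct {
    int waiting;                  /* 1 while the thread is in a wait state */
} thread_t;

/* Links one waiting thread to one object. */
struct wait_block {
    thread_t            *thread;
    dispatcher_header_t *object;
    wait_block_t        *next;
};

/* Returns 1 if the wait is satisfied at once, 0 if the thread now waits. */
int wait_for_object(thread_t *thread, dispatcher_header_t *object, wait_block_t *block) {
    if (object->signal_state > 0) {
        return 1;
    }
    block->thread = thread;
    block->object = object;
    block->next = object->wait_list;
    object->wait_list = block;
    thread->waiting = 1;
    return 0;
}

/* Set the object to the signaled state and release every waiter. */
int set_signaled(dispatcher_header_t *object) {
    int released = 0;
    object->signal_state = 1;
    while (object->wait_list != NULL) {
        wait_block_t *block = object->wait_list;
        object->wait_list = block->next;
        block->thread->waiting = 0;
        released++;
    }
    return released;
}
```

This matches the thread-termination case described above, where the object stays signaled and later waits are satisfied immediately; objects that release only one waiter per signal would need extra logic.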

Low-IRQL Synchronization
  • Fast mutexes and guarded mutexes: they avoid waiting for the event object if there is no contention for the fast mutex, which gives better performance on multiprocessor systems. The executive defines the ExAcquireFastMutex and ExAcquireFastMutexUnsafe functions. Guarded mutexes are used primarily by the memory manager, which uses them to protect global operations.
  • Executive resources: a synchronization mechanism used throughout the system, especially in file-system drivers. Threads waiting to acquire a resource for shared access wait for a semaphore associated with the resource, and threads waiting to acquire a resource for exclusive access wait for an event.
  • Push locks: introduced in Windows XP, push locks are an optimized synchronization mechanism built on the event object. Like fast mutexes, they wait for an event object only when there is contention on the lock. There are two types of push locks:
      • Normal
      • Cache aware