Windows NT Internals David Solomon Expert Seminars Microsoft

  • Slides: 119
Windows NT® Internals
David Solomon Expert Seminars
Microsoft Corporation

Agenda
  • Introduction
  • Tools
  • System Architecture
  • Processes and Threads
  • Memory Management

About The Speaker: David Solomon
  • 14 years at Digital, the last 10 as a developer in the VMS operating system development group
  • Started Windows NT developer training company in 1992
  • Author of Inside Windows NT, 2nd edition (Microsoft Press) and Windows NT for OpenVMS Professionals (Digital Press)
  • Regular speaker at industry conferences (WinDev, Tech•Ed, Software Development, DECUS...)
  • Recipient of past Microsoft MVP award for MSWIN32 technical support

About The Company
  • David Solomon Expert Seminars offers high-quality Windows developer training
    - Taught by well-known industry experts and authors who develop and teach their own courses
  • Instructors include:
    - Doug Boling, Brian Catlin, Jamie Hanrahan, Jeff Prosise, Jeffrey Richter, and David Solomon
  • Topics include:
    - Windows CE
    - Windows NT Internals
    - Windows NT and WDM Device Drivers
    - Windows NT® Server Applications
    - Win32® Programming
    - Visual C++® and MFC
    - COM/ActiveX® Programming
  • To be notified of new classes and other developments, join our e-mail interest list

Session Goals
  • Goals
    - Explain internal architecture and operation of core Windows NT components
    - Use various tools that demonstrate internal Windows NT behavior
  • Audience assumptions
    - Familiar with basic 32-bit OS concepts
    - Familiar with Win32 API (processes, threads, memory management)
  • Acknowledgements
    - Jamie Hanrahan (jeh@cmkrnl.com, www.cmkrnl.com), co-author of the Windows NT internals seminar from which these slides were taken
    - Dave Cutler, Helen Custer, John Balciunas, Lou Perazzoli, Mark Lucovsky, Steve Wood, Tom Miller, Gary Kimura, and Landy Wang for their support and assistance in understanding Windows NT internals

Windows NT Architecture
[Diagram: Windows NT architecture. Copyright by Microsoft Corporation. Used by permission.]
  • User mode
    - System processes: Service Controller, WinLogon, Session Manager
    - Services: Alerter, RPC, Event Logger, Replicator
    - Applications: User Application, Subsystem DLLs
    - Environment subsystems: POSIX, OS/2, Win32
    - NTDLL.DLL
  • Kernel mode
    - Executive API
    - I/O Manager, file systems, Cache Manager
    - Processes & Threads, Security, Virtual Memory, Win32 User, GDI
    - System threads
    - Object management / Executive RTL
    - Device drivers, Kernel
    - Hardware Abstraction Layer (HAL)
  • Hardware interfaces (buses, I/O, interrupts, timers, clocks, DMA, cache control, etc.)

Windows NT 5.0 Internal changes
  • In one sense, much is the same
    - Basic architecture of many components unchanged: Win32 subsystem, memory manager, process model, thread scheduling, security model, file system
  • But lots of additions of major new functionality:
    - Active Directory, distributed security, Kerberos, Microsoft Management Console, IntelliMirror™, NTFS extensions (content indexing, quotas, reparse points, sparse files, link tracking)

Windows NT 5.0 Internal changes
  • Kernel/core changes include:
    - I/O system (Plug and Play and power management)
    - 64-bit Very Large Memory support for Alpha
    - Job object
    - Integration of Terminal Server
  • Comparable to level of change from 3.51 to 4.0
  • Also many incremental performance improvements:
    - Object Manager, Memory Manager (e.g., working set management algorithms), SMP scalability…

Tools Preview

  Tool                     Executable
  Performance Monitor      PerfMon
  Registry Editor          RegEdt32
  Windows NT Diagnostics   WinMSD
  Kernel Debugger          i386kd, alphakd
  Pool Monitor             poolmon
  Global Flags             gflags
  Open Handles             oh
  QuickSlice               qslice
  Process Viewer           pviewer
  Process Exploder         pview
  Process Status           pstat
  Pmon                     pmon
  Object Viewer            WinObj
  Process Walker           PWalk
  Page Fault Monitor       PFMon
  Spy++

  Origin column (per-tool alignment lost in extraction): Windows NT; Windows NT CD \support\debug; Windows NT Resource Kit; Windows NT Resource Kit; Platform SDK, VC++; Windows NT Resource Kit 4.0; Windows NT Resource Kit; Platform SDK; Visual C++

Windows NT Resource Kits
  • Full “Windows NT 5.0 Resource Kit”
    - 250+ utilities
    - Combines what was in the 4.0 Server and Workstation resource kits
  • Subset “Windows NT 5.0 Resource Kit Support Tools”
    - 50 utilities
    - Ships in \support\reskit on Windows NT CD

www.sysinternals.com
  • Windows NT internals articles and tools
    - Some generated using reverse engineering (i.e., no source access)
  • Some examples:
    - winobj - view object manager namespace and objects
    - nthandlex - show open handles by process
    - ntfilmon - log all file I/O operations
    - ntregmon - log all registry accesses
    - cpufrob - change thread quantum
  • Caveat: Most include a device driver, hence you’ve added “trusted code”
    - No warranty on using these on your system!

GFLAGS (Global Flags)
  • Changes system-wide or image-wide debugging flags
  • Poolmon requires “enable pool tagging”
  • Oh (open handles) requires “maintain a list of objects for each type”

Windows NT Kernel Debugger (1 Of 4)
  • Two versions:
    - Command line: I386KD.EXE, ALPHAKD, etc., shipped with Windows NT
      * In NTcdrom:\support\debug\i386, …\debug\alpha, etc.
      * Select directory to match host system (where you will run the debugger executable); select executable to match target system (system being debugged)
      * Also need many DLLs from this directory
      * Also need symbol files from NTcdrom:\support\debug\targetarch\symbols\…
    - Extended via WinDbg shipped with Platform SDK (part of MSDN Professional)
      * Provides GUI, fully-symbolic, source-level debugging
      * Needs same DLLs and symbol files

Windows NT Kernel Debugger (2 Of 4)
  • Documentation:
    - Windows NT Workstation Resource Guide (see “Windows NT Debugger”)
    - Windows NT Device Driver Kit (DDK)
    - See i386kd -?
    - Help within debugger: commands “?” and “!help”

Windows NT Kernel Debugger (3 Of 4)
  • Two modes of operation:
    - Open a crash dump file:
        C:\> set _NT_SYMBOL_PATH=ntcdrom:\support\debug\i386\symbols
        C:\> i386kd -Z dumpfilename
    - Connect to a live system via null modem cable (must boot target system with /DEBUGPORT=COMn in boot.ini):
        C:\> set _NT_SYMBOL_PATH=ntcdrom:\support\debug\i386\symbols
        C:\> set _NT_DEBUG_PORT=COMn           (default COM1)
        C:\> set _NT_DEBUG_BAUD_RATE=nnnnn     (default 19200)
        C:\> i386kd
  [Diagram: host (for debugger) connected to target by a serial “null modem” cable]

Windows NT Kernel Debuggers (4 Of 4)
  • Third-party product: SoftICE for Windows NT (NuMega)
    - Runs on same system, i.e., doesn’t require second system for live debugging
    - x86 only
    - See www.numega.com

Agenda
  • Introduction
  • Tools
  • System Architecture
    - Kernel Mode Environment
    - Executive, Kernel, HAL, Drivers
    - Product Packaging
    - System Threads
    - Environment Subsystems
    - System Service Dispatching
    - Process-based Windows NT code
    - Summary
  • Processes and Threads
  • Memory Management

Kernel Mode Versus User Mode
  • A processor state
    - Controls access to memory
      * Each memory page is tagged to show the required mode for reading and for writing
      * Protects the system from the users
      * Protects the user (process) from themselves
      * System is not protected from system
      * Code regions are tagged “no write in any mode”
    - Controls ability to execute privileged instructions
    - A Windows NT abstraction
      * Intel: Ring 0, Ring 3
      * PerfMon, Processor: “Privileged Time” and “User Time”
  • Associated with threads
    - Threads can change from user to kernel and back
    - Part of saved context, along with registers, etc.
    - Does not affect scheduling

  Components             Access mode
  Applications           User
  Subsystem processes    User
  Executive              Kernel
  Drivers                Kernel
  HAL                    Kernel

Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
  1. Requests from user mode
    - Via the system service dispatch mechanism
    - Kernel-mode code runs in the context of the requesting thread
  2. Interrupts from external devices
    - Windows NT-supplied interrupt dispatcher invokes the interrupt service routine
    - ISR runs in the context of the interrupted thread (so-called “arbitrary thread context”)
    - ISR often requests the execution of a “DPC routine,” which also runs in kernel mode
    - Time not charged to interrupted thread
  3. Dedicated kernel-mode system threads
    - Some threads in the system stay in kernel mode at all times (mostly in the “System” process)
    - Scheduled, preempted, etc., like any other threads

Interrupt Dispatching
[Diagram: an interrupt arrives while user or kernel mode code is running; control passes, in kernel mode, to the interrupt dispatch routine, which calls the interrupt service routine. Note, no thread or process context switch!]
  • Interrupt dispatch routine
    - Disable interrupts
    - Record machine state (trap frame) to allow resume
    - Mask equal- and lower-IRQL interrupts
    - Find and call appropriate ISR
    - Dismiss interrupt
    - Restore machine state (including mode and enabled interrupts)
  • Interrupt service routine
    - Tell the device to stop interrupting
    - Interrogate device state, start next operation on device, etc.
    - Request a DPC
    - Return to caller

Interrupt Precedence Via IRQLs
  • IRQL = Interrupt Request Level
    - The “precedence” of the interrupt with respect to other interrupts
    - Different interrupt sources have different IRQLs
    - Not the same as IRQ
  • IRQL is also a state of the processor
    - Servicing an interrupt raises processor IRQL to that interrupt’s IRQL
    - This masks subsequent interrupts at equal and lower IRQLs
  • User mode is limited to IRQL 0

  IRQL   Name                       Category
  31     High                       Hardware interrupts
  30     Power fail                 Hardware interrupts
  29     Interprocessor Interrupt   Hardware interrupts
  28     Clock                      Hardware interrupts
  ...    Device n ... Device 1      Hardware interrupts
  2      Dispatch/DPC               Deferrable software interrupts
  1      APC                        Deferrable software interrupts
  0      Low                        Normal thread execution
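
The masking rule on this slide can be sketched as a small Python toy model. This is not Windows NT code: the `Processor` class, its method names, and the pending list are assumptions made for illustration. Only the IRQL numbers and the rule "interrupts at equal and lower IRQLs are masked" come from the slide.

```python
# Toy model of IRQL masking; illustrative only, not actual Windows NT code.
# IRQL numbers follow the x86 table on the slide.
HIGH, POWER_FAIL, IPI, CLOCK = 31, 30, 29, 28
DISPATCH_DPC, APC, LOW = 2, 1, 0


class Processor:
    """Hypothetical processor that tracks its current IRQL."""

    def __init__(self):
        self.irql = LOW       # normal thread execution
        self.pending = []     # interrupts that arrived while masked
        self.serviced = []    # IRQLs in the order they were serviced

    def raise_irql(self, new_irql):
        """Raise the processor IRQL and return the previous value."""
        old = self.irql
        self.irql = max(self.irql, new_irql)
        return old

    def lower_irql(self, old_irql):
        """Restore a lower IRQL, then service whatever is now unmasked."""
        self.irql = old_irql
        for irql in sorted(self.pending, reverse=True):  # highest first
            if irql > self.irql:
                self.pending.remove(irql)
                self.serviced.append(irql)

    def interrupt(self, irql):
        """Deliver an interrupt; it stays pending if irql <= current IRQL."""
        if irql <= self.irql:
            self.pending.append(irql)
            return False
        self.serviced.append(irql)
        return True
```

In this sketch, a processor that has raised itself to the clock IRQL defers a device interrupt at IRQL 5 but still takes an interprocessor interrupt; the deferred interrupt is serviced once the IRQL is lowered again.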

Alpha IRQLs
  • IRQL on Alpha implemented in PAL code

  IRQL   Name
  7      High
  6      Interprocessor Interrupt
  5      Clock
  4      Device High
  3      Device
  2      Dispatch/DPC
  1      APC
  0      Low

DPCs (Deferred Procedure Calls)
  • A list of “work requests”
    - One queue per processor (but processors can run each others’ DPCs)
    - Implicitly ordered by time of request (FIFO)
  • Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
    - Used heavily for driver “after interrupt” functions
    - Used for quantum end and timer expiration
  [Diagram: queue head pointing to a chain of DPC objects, each holding DfrdCtx, SysArg1, SysArg2]

  XydriverDpcRtn(DpcObj, DfrdCtx, SysArg1, SysArg2)
  {
      // ...
  }
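
As a rough illustration of the FIFO behaviour described above, here is a minimal Python sketch of a per-processor DPC queue. The class and method names (`DpcObject`, `DpcQueue`, `request`, `drain`) are hypothetical; the only parts taken from the slide are the FIFO ordering and the four arguments passed to the DPC routine.

```python
from collections import deque


class DpcObject:
    """Hypothetical stand-in for a DPC object: a routine plus its arguments."""

    def __init__(self, routine, deferred_context, sys_arg1=None, sys_arg2=None):
        self.routine = routine
        self.deferred_context = deferred_context
        self.sys_arg1 = sys_arg1
        self.sys_arg2 = sys_arg2


class DpcQueue:
    """Toy per-processor DPC queue, ordered by time of request (FIFO)."""

    def __init__(self):
        self._queue = deque()

    def request(self, dpc):
        """Called from 'interrupt level': just record the work request."""
        self._queue.append(dpc)

    def drain(self):
        """Called at 'dispatch level': run queued routines, oldest first."""
        results = []
        while self._queue:
            dpc = self._queue.popleft()
            results.append(
                dpc.routine(dpc, dpc.deferred_context, dpc.sys_arg1, dpc.sys_arg2)
            )
        return results
```

The point of the split is that `request` does almost nothing, so the interrupt-level code returns quickly, while the real work happens later in `drain`.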

Accounting For Kernel-Mode Time
  • “Processor Time” = total busy time of processor (equal to elapsed real time - idle time)
  • “Processor Time” = “User Time” + “Privileged Time”
  • “Privileged Time” = time spent in kernel mode
  • “Privileged Time” includes:
    - Interrupt Time
    - DPC Time
  • Again note: interrupts and DPCs are not charged to any process or thread
  [Screen snapshot from: Programs | Administrative Tools | Performance Monitor; click on “+” button, or select Edit | Add to chart…]
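
The two identities on this slide amount to simple arithmetic. The sketch below works them through in Python; the function name `breakdown` and the sample numbers are made up for illustration and are not Performance Monitor output.

```python
def breakdown(elapsed, idle, user):
    """Apply the slide's identities to times given in the same unit.

    processor  = elapsed - idle
    privileged = processor - user   (since processor = user + privileged)
    """
    processor = elapsed - idle
    privileged = processor - user
    return {"processor": processor, "user": user, "privileged": privileged}


# Hypothetical sample: 60 s elapsed, 45 s idle, 9 s in user mode.
sample = breakdown(elapsed=60, idle=45, user=9)
```

With these assumed inputs the processor was busy for 15 seconds, of which 6 were privileged time; any interrupt and DPC time is part of those 6 seconds.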

Windows NT Executive
  • Upper layers of operating system
  • Provides “generic OS” services
    - Processes, threads, memory management, I/O, interprocess communication, synchronization, security
  • Almost completely portable C code
  • Exports functions (“services”) which may be invoked via user-mode APIs
    - Interface is NTDLL.DLL
    - E.g., Win32 ReadFile -> executive NtReadFile
  • Most interfaces to executive services not documented
    - Used by subsystem writers

Windows NT Kernel
  • Abstracts differences between processor architectures
    - x86 vs. Alpha vs. etc.
  • Main services
    - Thread scheduling and context switching
    - Generic wait operations
    - Exception and interrupt dispatching
    - Operating system synchronization primitives (MP and UP)
  • Not a classic “microkernel”
    - Shares address space with rest of kernel-mode components
  [Diagram labels: Machine Independent C; Machine Dep. C; Assembler]

HAL - Hardware Abstraction Layer
  • A separate loaded binary (c:\winnt\system32\hal.dll)
    - Several different versions for different motherboards, UP vs. MP, etc.
    - Installation procedure selects appropriate HAL for platform and copies to Hal.Dll on system disk
  • Purpose: Isolate (abstract) Kernel and Executive from platform-specific details
    - Present uniform model for ease of driver development
  • HAL abstracts:
    - I/O system specifics (bus interfaces, DMA…)
    - System timers, cache coherency and flushing
    - SMP support, hardware interrupt priorities
  • OEM Development Kit needed to build HALs
  • HAL contains some Executive and Kernel subroutines
  • Sample HAL routines:
    - HalGetBusData
    - HalGetBusDataByOffset
    - HalAssignSlotResources
    - HalSetBusData
    - HalSetBusDataByOffset
    - HalTranslateBusAddress
    - HalGetInterruptVector
    - HalGetAdapter
    - READ_REGISTER_ULONG
    - WRITE_PORT_UCHAR

Kernel-Mode Device Drivers
  • Separate loadable modules (drivername.SYS)
    - Linked like .EXEs
    - Linked against NTOSKRNL.EXE and HAL.DLL
  • Only way to add “kernel extensions” or to access kernel mode system routines
  • Defined in registry
    - Same area as Win32 services (t.b.d.)
    - Differentiated by Type value
  • View loaded drivers with pstat.exe, drivers.exe
  • Several types:
    - “Ordinary” hardware drivers
    - File system
    - NDIS miniport, SCSI miniport (linked against port drivers)
    - Win32K.Sys - Windowing system

WDM (Win32 Driver Model)
  • Extension to Windows NT driver model to support Plug and Play and Power Management
  • Allows source/(x86) binary-compatible drivers across Windows 98 and Windows NT 5.0
  • Non-trivial additions to existing drivers:
    - 3 new major IRP types
    - 36 new minor IRPs added
    - 6 new miniport driver types
    - Supporting WDM affects every area of a driver

WDM Drivers
  • What’s covered in WDM:
    - IEEE 1394 (FireWire)
    - Universal Serial Bus (USB)
    - Audio: speakers, microphone, CODEC
    - Human Interface Devices: mouse, keyboard, monitor controls, game devices
    - Still Imaging: cameras, scanners
    - Video Devices: video capture, DVD
    - Advanced Configuration and Power Interface (ACPI) BIOS support
  • Not covered by WDM:
    - Network
    - Storage
    - File System
    - Video

NTOSKRNL.EXE
[Diagram: the Windows NT architecture with a label “NtosKrnl.Exe” marking part of the kernel-mode region. User mode: system processes (Service Controller, WinLogon, Session Manager), services (Alerter, RPC, Event Logger, Replicator), applications (User Application, Subsystem DLLs), environment subsystems (POSIX, OS/2, Win32), NTDLL.DLL. Kernel mode: Executive API, I/O Manager, file systems, Cache Manager, Processes & Threads, Security, Virtual Memory, Win32 User, GDI, system threads, Object management / Executive RTL, device drivers, Kernel, Hardware Abstraction Layer (HAL). Hardware interfaces (buses, I/O, interrupts, timers, clocks, DMA, cache control, etc.). Copyright by Microsoft Corporation. Used by permission.]

NTOSKRNL.EXE
  • NTOSKRNL.EXE
    - Windows NT executive and kernel
  • HAL.DLL
    - Hardware Abstraction Layer - interface to hardware platform
  • BOOTVID.DLL
    - Boot video driver

Naming Convention For Internal Windows NT Routines
  • Two- or three-letter component code in beginning of function name

  Executive
    Ex      - General executive routine
    Ob      - Object management
    Exp     - Executive private (not exported)
    Io      - I/O subsystem
    Cc      - Cache manager
    Se      - Security
    Mm      - Memory management
    Ps      - Process structure
    Rtl     - Run-Time Library
    Lsa     - Security Authentication
    FsRtl   - File System Run-Time Lib
    Zw      - File access, etc.
  Kernel
    Ke      - Kernel
    Ki      - Kernel internal (not available outside the kernel)
  HAL
    Hal     - Hardware Abstraction Layer
    READ_, WRITE_ - I/O port and register access
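
The convention can be turned into a small lookup. The Python sketch below is an illustrative helper, not a Microsoft tool: the function name `component_for` and the matching heuristic (longest prefix first, and the prefix must be followed by an uppercase letter unless it ends in an underscore) are assumptions. The prefix table itself is copied from the slide.

```python
# Prefix table copied from the slide; the lookup logic is an assumption.
PREFIXES = {
    "Ex": "General executive routine",
    "Ob": "Object management",
    "Exp": "Executive private (not exported)",
    "Io": "I/O subsystem",
    "Cc": "Cache manager",
    "Se": "Security",
    "Mm": "Memory management",
    "Ps": "Process structure",
    "Rtl": "Run-Time Library",
    "Lsa": "Security Authentication",
    "FsRtl": "File System Run-Time Lib",
    "Zw": "File access, etc.",
    "Ke": "Kernel",
    "Ki": "Kernel internal",
    "Hal": "Hardware Abstraction Layer",
    "READ_": "I/O port and register access",
    "WRITE_": "I/O port and register access",
}


def component_for(name):
    """Return (prefix, description) for a routine name, or None if no match."""
    # Try longer prefixes first so that "Exp" wins over "Ex".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        # Require a word boundary: next letter uppercase, or prefix ends in "_".
        if prefix.endswith("_") or rest[:1].isupper():
            return prefix, PREFIXES[prefix]
    return None
```

For example, `component_for("ExpWorkerThread")` resolves to the `Exp` entry rather than `Ex`, while a Win32 name such as `WriteFile` matches nothing.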

Multiprocessor Support
  • Code comprising NTOSKRNL compiled twice: once for uniprocessor, once for multiprocessor
    - Avoids penalizing uniprocessor systems for added MP complexity
  • Two files on Windows NT media:
    - UP version: NTOSKRNL.EXE
    - MP version: NTKRNLMP.EXE
    - Selected at installation time, but copied to NTOSKRNL
  • All drivers, DLLs, EXEs are built to run on MP
  • Upgrading from uniprocessor to multiprocessor
    - See uptomp.exe (in Resource Kit)
    - 2 files replaced with different code
      * NTKRNLMP.EXE replaces NTOSKRNL.EXE
      * New HAL replaces HAL.DLL
    - 4 files replaced with same code, but modified image header
      * KERNEL32.DLL, NTDLL.DLL, WINSRV.DLL, WIN32K.SYS

Identifying Your NTOSKRNL
  • Build numbers
    - Incremented each time Windows NT is built from sources (i.e., different for beta releases)
  • Service packs
    - Replace .EXEs (usually including NTOSKRNL), .DLLs, etc.
    - Do not change Windows NT build number
  • Free versus Checked build
    - Free = retail version; Checked = debug version
    - Used primarily in driver testing
    - Build number is the same
    - Recompilation of system with DEBUG flag true
      * Therefore a different NTOSKRNL.EXE
      * Note: MP only (NTOSKRNL and NTKRNLMP.EXE identical)
  [Screen snapshot from: Programs | Administrative Tools | Windows NT Diagnostics]

Workstation Vs Server
  • Core operating system executables are identical
    - NTOSKRNL.EXE, HAL.DLL, xxxDRIVER.SYS, etc. (t.b.d.)
  • Windows NT Server a superset of Workstation
    - Domains, host-based RAID 5, NetWare gateway, DHCP server, WINS, DNS, full Internet Information Server…
    - Enterprise Server adds yet more functionality (clusters, 3 GB address space)
    - Terminal Server enables multi-user thin client support
  • MP limits: Workstation: 2 CPUs, Server: 4 CPUs, Server Enterprise: 8 CPUs

Workstation Vs Server
  • Registry indicates system type
    - HKLM\SYSTEM\CurrentControlSet\Control\ProductOptions
      * ProductType: WinNT=Workstation, ServerNT=Server not a domain controller, LanManNT=Server that is a Domain Controller
      * ProductSuite: Indicates Enterprise Edition, Terminal Server…
  • Code in the operating system tests these values and behaves slightly differently in a few places
    - Licensing limits (number of processors, number of inbound network connections, etc.)
    - Boot-time calculations (memory manager)
    - Default length of time slice
    - See DDK: MmIsThisAnNtAsSystem
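
A minimal Python sketch of how a tool might interpret the ProductType string once it has been read from the registry. The function names are hypothetical and the registry read itself is deliberately left out (on Windows it could be done with the standard `winreg` module); only the three value strings and their meanings come from the slide.

```python
# Value strings and meanings are from the slide; everything else is illustrative.
PRODUCT_TYPES = {
    "WinNT": "Workstation",
    "ServerNT": "Server, not a domain controller",
    "LanManNT": "Server that is a domain controller",
}


def describe_product_type(value):
    """Map a ProductType registry string to a readable description."""
    return PRODUCT_TYPES.get(value, "unknown product type")


def is_server(value):
    """True for either of the two server ProductType values."""
    return value in ("ServerNT", "LanManNT")
```

Keeping the mapping in one table mirrors the idea that a handful of places test this value and adjust behavior accordingly.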

System Threads
  • Internal worker routines that need thread context
  • Drivers or Executive can create system threads
    - Always run in kernel mode
    - Usually associated with the “System” process by default
      * But can be tied to any process
    - Preemptible (unless they raise IRQL to 2 or above)
  • Kernel mode APIs:
    - PsCreateSystemThread
    - PsTerminateSystemThread
    - KeSetBasePriorityThread
    - KeSetPriorityThread

Threads In The “System” Process
  • Note CPU time is 100% kernel mode
  • “Start address” is address of thread function
    - On Intel (at least):
      * Addresses 8xxxxxxx will correspond to symbols in NtosKrnl.Exe
      * Addresses Axxxxxxx are routines in Win32K.Sys
      * Addresses Fxxxxxxx are routines in loaded device drivers
  [Screen snapshot from: Programs | Resource Kit | Diagnostics | Process Viewer; select “System” process]
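
The address ranges above can be expressed as a check on the top hex digit of a 32-bit start address. The helper below is a hypothetical sketch (the function name and return strings are made up) that encodes only the three Intel ranges the slide lists.

```python
def classify_start_address(address):
    """Guess the owning image of a 32-bit thread start address (Intel layout).

    Only the three ranges named on the slide are recognized.
    """
    top_digit = (address >> 28) & 0xF   # most significant hex digit
    if top_digit == 0x8:
        return "NtosKrnl.Exe"
    if top_digit == 0xA:
        return "Win32K.Sys"
    if top_digit == 0xF:
        return "loaded device driver"
    return "unknown"
```

For instance, a start address of `0x80123456` would be attributed to NtosKrnl.Exe, while a typical user-mode address such as `0x00401000` falls outside all three ranges.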

Threads In The “System” Process
  • Memory Management
    - Modified Page Writer for mapped files
    - Modified Page Writer for paging files
    - Balance Set Manager
    - Swapper (kernel stack, working sets)
    - Zero page thread (thread 0, priority 0)
  • Security Reference Monitor
    - Command Server Thread
  • Network
    - Redirector and Server Worker Threads
  • Threads created by drivers for their exclusive use
    - Examples: Floppy driver, parallel port driver
  • Pool of Executive Worker Threads
    - Used by drivers, file systems…
    - Accessed via ExQueueWorkItem

Threads In System Process
(Observed on Intel Windows NT Workstation 4.0)

  Routine Name                 Priority   Notes
  Phase1Initialization         0          First thread in life of system; becomes zero page thread
  ExpWorkerThread              9-16       Pool of worker threads
  MiDereferenceSegmentThread   18         Dereferences segments; also expands paging file
  MiModifiedPageWriter         17         Writes modified pages to paging file
  KeBalanceSetManager          16         Reclaims memory from processes, with aid of...
  KeSwapProcessOrStack         23         Scheduled by balance set manager
  FsRtlWorkerThread            16, 17     Dedicated worker threads for FSDs
  SepRmCommandServerThread     15         Security Reference Monitor Command Server
  MiMappedPageWriter           17         Writes modified pages to mapped files
  (Win32 threads)              16         Routines in Win32K.Sys (0xA0000000)
  (driver threads)             various    Routines in *driver.Sys (0xF0000000)

Environment Subsystems
  • Expose “native API”
    - “Wrap” and extend Windows NT native functionality
    - Interfaces to write subsystems not documented
  • Two main components
    - Subsystem DLLs - convert documented API to native API
    - Environment Subsystem Process - maintain state of client processes; implement some subsystem APIs
  • Three provided with Windows NT:
    - Win32
    - Posix
      * Bare minimum Posix standards, no optional components
    - OS/2
      * Support for 1.x character-mode applications only

Subsystem Extensions
  • OS/2
    - Microsoft sells an add-on to the OS/2 subsystem
    - Supports 1.x Presentation Manager
  • Posix
    - OpenNT from SoftWay
    - More-featured replacement for Posix subsystem
    - www.opennt.com

Environment Subsystems
  • Subsystem for each .exe specified in image header
    - See winnt.h:
        IMAGE_SUBSYSTEM_UNKNOWN        0   // Unknown subsystem
        IMAGE_SUBSYSTEM_NATIVE         1   // Image doesn't require a subsystem
        IMAGE_SUBSYSTEM_WINDOWS_GUI    2   // Win32 subsystem (graphical app)
        IMAGE_SUBSYSTEM_WINDOWS_CUI    3   // Win32 subsystem (character cell)
        IMAGE_SUBSYSTEM_OS2_CUI        5   // OS/2 subsystem
        IMAGE_SUBSYSTEM_POSIX_CUI      7   // Posix subsystem
    - See Explorer / QuickView (right-click on .exe or .dll file)
    - Or reskit\exetype image.exe
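
To make "specified in image header" concrete, here is a minimal Python sketch that pulls the subsystem value out of the bytes of a PE image. The function name `read_subsystem` is hypothetical. The offsets used (the PE header pointer at 0x3C, a 4-byte signature, a 20-byte file header, and the Subsystem field 68 bytes into the optional header) come from the published PE file format rather than from the slide.

```python
import struct

# Names and values as listed from winnt.h on the slide.
SUBSYSTEM_NAMES = {
    0: "IMAGE_SUBSYSTEM_UNKNOWN",
    1: "IMAGE_SUBSYSTEM_NATIVE",
    2: "IMAGE_SUBSYSTEM_WINDOWS_GUI",
    3: "IMAGE_SUBSYSTEM_WINDOWS_CUI",
    5: "IMAGE_SUBSYSTEM_OS2_CUI",
    7: "IMAGE_SUBSYSTEM_POSIX_CUI",
}


def read_subsystem(image):
    """Return the Subsystem field from the bytes of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # Offset 0x3C of the DOS header holds the file offset of the PE header.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Skip the 4-byte signature and the 20-byte file header.
    optional_header = pe_offset + 4 + 20
    # Subsystem is a 16-bit field at offset 68 of the optional header.
    (subsystem,) = struct.unpack_from("<H", image, optional_header + 68)
    return subsystem
```

Reading an executable with `open(path, "rb").read()` and passing the bytes to `read_subsystem` would give a number that can be looked up in `SUBSYSTEM_NAMES`.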

Showing .exe Type With QuickView
  • In Explorer:
    - Right-click on an executable file or .DLL
    - “Context menu” appears
    - Select Quick View

Environment Subsystems Loading
  • Subsystems to load specified in registry:
    - SYSTEM\CurrentControlSet\Control\Session Manager\SubSystems
  • Values:
    Required   - list of value names for subsystems to load at boot time
    Optional   - list of value names for subsystems to load when needed
    Windows    - value giving filespec of Win32 subsystem (csrss.exe)
    Kmode      - value giving filespec of Win32K.Sys (kernel-mode component of Win32)

    csrss.exe   Win32 APIs   required   (Client Server Runtime SubSystem)
    os2ss.exe   OS/2 APIs    optional
    psxss.exe   Posix APIs   optional
  • Some Win32 API DLLs are in “known DLLs” registry entry:
    - SYSTEM\CurrentControlSet\Control\Session Manager\KnownDLLs

Environment Subsystems Components
  • Subsystem process
    - For Win32: CSRSS.EXE
  • API DLLs
    - For Win32: Kernel32.DLL, Gdi32.DLL, User32.DLL, etc.
  • Kernel-mode extension to executive
    - Win32 only: Win32K.SYS
  [Diagram: user mode contains System and Server Processes, Environment Subsystems (OS/2, Win32, POSIX), User Application, Subsystem DLL, and NTDLL.DLL; kernel mode contains Executive, Win32 User/GDI, Device Drivers, Kernel, and Hardware Abstraction Layer (HAL)]

Windows NT Simplified Architecture (3.51 and earlier)
[Diagram: user mode contains System and Server Processes, Environment Subsystems (OS/2, Win32, POSIX), User Application, Subsystem DLL, and NTDLL.DLL; kernel mode contains Executive, LPC, Device Drivers, Kernel, and Hardware Abstraction Layer (HAL). Numbered call paths:]
  1. Most Win32 Kernel APIs
  2. All other Win32 APIs, including User and GDI APIs

Windows NT Simplified Architecture (4.0 and later)
[Diagram: user mode contains System and Server Processes, Environment Subsystems (OS/2, Win32, POSIX), User Application, Subsystem DLL, and NTDLL.DLL; kernel mode contains Executive, LPC, Win32 User/GDI, Device Drivers, Kernel, and Hardware Abstraction Layer (HAL). Numbered call paths:]
  1. Most Win32 Kernel APIs
  2. Most Win32 User and GDI APIs
  3. A few Win32 APIs

(Reduced) Role Of Win32 Subsystem Process
  • Process creation and deletion
  • Thread creation and deletion
  • Get temporary file name
  • Drive letters
  • Security checks for file system redirector
  • Window management for console (character cell) applications
  • Some support for 16-bit DOS applications (NTVDM.EXE)

Invoking System Functions From User Mode
  • Kernel-mode functions (“services”) are invoked from user mode via a protected mechanism
    - x86: INT 2E; Alpha: SYSCALL (PALcode)
    - I.e., on a call to an OS service from user mode, the last thing that happens in user mode is this “change mode to kernel” instruction
    - Causes an interrupt, handled by the system service dispatcher (KiSystemService) in kernel mode
    - Return to user mode is done by dismissing the interrupt or exception
  • The desired system function is selected by the “system service number”
    - Every Windows NT function exported to user mode has a unique number
    - Push this number on the stack just before the “change mode” instruction (after pushing the arguments to the service)
    - This number is an index into the system service dispatch table
    - Table gives kernel-mode entry point address and argument list length for each exported function

Invoking System Functions From User Mode
  • All validity checks are done after the user to kernel transition
    - KiSystemService probes argument list, copies it to kernel-mode stack, and calls the executive or kernel routine pointed to by the table
    - Service-specific routine checks argument values, probes pointed-to buffers, etc.
    - Once past that point, everything is “trusted”
  • This is safe, because:
    - The system service table is in kernel-protected memory; and
    - The kernel mode routines pointed to by the system service table are in kernel-protected memory; therefore:
      * User mode can’t supply the code to be run in kernel mode; it can only select from among a predefined list
    - Arguments are copied to the kernel mode stack before validation; therefore:
      * Other threads in the process can’t corrupt the arguments “out from under” the service
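
The dispatch-and-validate sequence can be sketched as a toy model in Python. None of this is NT code: the two sample services, the table contents, and the function names are invented for illustration. What the sketch keeps from the slides is the order of operations: select a routine by service number from a fixed table, copy the argument list, and only then let the service-specific routine validate the copied values.

```python
def _svc_add(a, b):
    """Hypothetical service: validates its own arguments, then does the work."""
    if not isinstance(a, int) or not isinstance(b, int):
        raise ValueError("invalid parameter")
    return a + b


def _svc_length(buffer):
    """Hypothetical service: checks the 'buffer' before using it."""
    if not isinstance(buffer, (bytes, bytearray)):
        raise ValueError("invalid parameter")
    return len(buffer)


# Fixed table of (entry point, argument count); callers can only pick an index.
SERVICE_TABLE = ((_svc_add, 2), (_svc_length, 1))


def capture_arguments(user_args, arg_count):
    """Copy the argument list so later caller-side changes cannot affect it."""
    if len(user_args) < arg_count:
        raise ValueError("argument list too short")
    return tuple(user_args[:arg_count])


def system_service_dispatch(service_number, user_args):
    """Toy dispatcher: index the table, capture arguments, call the routine."""
    if not 0 <= service_number < len(SERVICE_TABLE):
        raise ValueError("invalid system service number")
    routine, arg_count = SERVICE_TABLE[service_number]
    kernel_args = capture_arguments(user_args, arg_count)
    return routine(*kernel_args)   # validation happens on the copy
```

Because the caller supplies only an index and a list of values, it can choose among the predefined routines but cannot substitute code of its own, and changes made to `user_args` after the copy do not reach the service.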

NTDLL.DLL
  • PUSH of service # and INT 2E are “wrapped” by small “jacket” procedures in NTDLL.DLL
    - These user-mode routines have the same function names and arguments as the kernel mode routines they call
      * E.g., NtWriteFile in NtDll invokes NtWriteFile in NtosKrnl.Exe
    - Therefore exports of NTDLL are the “NT native API”
  • Entry points in NtDll are not supported or documented for use from user mode apps
    - A few are documented in the DDK for call from kernel mode
    - A few images that come with Windows NT are written to the “native API” exposed by NtDll (“Windows NT native images”)
    - See article on www.sysinternals.com
  • NTDLL also contains image loader and other support functions
  • What about getting to USER and GDI functions in Win32K.SYS?
    - System service wrapper exists in USER32.DLL, GDI32.DLL
    - Does not go through NTDLL.DLL

Tracing An Example Win32 Call
[Diagram: call flow for WriteFile; U = user mode, K = kernel mode. Source: MSJ, August 1996, page 21 (by Matt Pietrek)]
  1. Win32 application: call WriteFile(…)
  2. WriteFile in Kernel32.Dll (Win32 specific): call NtWriteFile; return to caller
  3. NtWriteFile in NtDll (used by all subsystems): Int 2E; return to caller
     (software interrupt: transition from U to K)
  4. KiSystemService in NtosKrnl.Exe: call NtWriteFile; dismiss interrupt
  5. NtWriteFile in NtosKrnl.Exe: do the operation; return to caller

Tracing An Example Win32 Call
- Depends.Exe in Resource Kit and Platform SDK
- Allows viewing of image->DLL relationships, imports, and exports

Examining Symbols In Key Images
- Examine imports and exports of an .EXE down to the OS
  - In Explorer, right mouse click on EXE or DLL, then “quick view” (built in) or “View Dependencies” (Dependency Walker tool in ResKit and Platform SDK)
  - Or use LINK /DUMP /EXPORTS, /IMPORTS
1. Look at imports of \winnt\system32\notepad.exe
2. Look at exports and imports of kernel32.dll
  - Most of the exports are documented Win32 calls
3. Look at exports and imports of ntdll.dll
  - None of the exports are documented
  - Some are the same as exports from ntoskrnl.exe, documented in DDK, with identical

Examining Symbols In Key Images
4. Look at exports and imports of ntoskrnl.exe
  - About 1000 total exported symbols
  - About 300 of the exported routine names are documented in DDK
  - Callable only from kernel mode
5. Look at all global symbols in ntoskrnl.exe
  - Defined in support\symbols\xxx\debug\exe\ntoskrnl.dbg
  - Quick viewer won’t display - use Kernel Debugger “x *” with just this .dbg file loaded
  - About 4000 total symbols (includes executive data cells in addition to routines)
  - Exports of ntoskrnl.exe are a subset of this list

Agenda
- Introduction
- Tools
- System Architecture
  - Kernel Mode Environment
  - Executive, Kernel, HAL, Drivers
  - Product Packaging
  - System Threads
  - Environment Subsystems
  - System Service Dispatching
  - Process-based Windows NT code
  - Summary
- Processes and Threads
- Memory Management

Process-Based Windows NT Code
- Pieces of Windows NT that run in separate executables (.exe’s), in separate processes
  - Started by system
  - Not tied to a user logon
- Have full process context
- Three types:
  - Environment Subsystems (already described)
  - Win32 Services
  - System startup processes
    - Note: “system startup processes” is not an official MS-defined name

Process Creation Hierarchy
- tlist.exe (from resource kit)
- tlist /t shows creation hierarchy
- Creating process can exit, leaving created process running - hence this display does not show all creators
  - Explorer.exe is actually started by userinit.exe, which then exits
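
Why a process can look like an orphan in a creation tree can be shown with a small sketch. It is not how tlist is implemented; the function name `creation_tree`, the process ids, and the sample table are made up for illustration. Any process whose recorded creator is no longer in the table is printed at the top level.

```python
def creation_tree(processes):
    """processes: dict pid -> (name, creator_pid). Returns indented lines."""
    children = {}
    roots = []
    for pid, (name, creator) in sorted(processes.items()):
        if creator in processes:
            children.setdefault(creator, []).append(pid)
        else:
            roots.append(pid)  # creator is unknown or has already exited

    lines = []

    def walk(pid, depth):
        lines.append("  " * depth + f"{processes[pid][0]} ({pid})")
        for child in children.get(pid, []):
            walk(child, depth + 1)

    for pid in roots:
        walk(pid, 0)
    return lines


# Hypothetical snapshot: userinit.exe (pid 70) has exited, so explorer.exe
# is listed at the top level even though userinit.exe created it.
snapshot = {
    20: ("smss.exe", 2),
    30: ("winlogon.exe", 20),
    80: ("explorer.exe", 70),
    90: ("notepad.exe", 80),
}
```

Calling `creation_tree(snapshot)` places `explorer.exe` at depth zero, which matches the "appears to be an orphan" behaviour the slides describe.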

Process-Based Windows NT Code: Win32 services
- Win32 .EXEs (applications) that run independently of a logged on user
  - Start at boot or logon time, survive logoff
  - Defined by CreateService API - view through Control Panel
  - See srvany.exe, sc.exe, srvinstw.exe, instsrv.exe in Resource Kit
  - Typically do not interact with the desktop
    - Get startup configuration parameters from Registry
    - Log errors to Windows NT Event Log
  - Use some form of IPC mechanism for client communication and control
  - Services will likely make use of Windows NT security impersonation
  - Remotely manageable (start, stop, user-defined codes)
    - Server Manager allows remote control of services
    - Code is the same to control services locally vs. remotely
- Examples of built-in Windows NT Services
  - Schedule service (at command), Event Log, Remote Access Server, etc.

Life Of A Service
- Install time
  - Setup application tells Service Controller about the service
- System boot / initialization
  - SCM reads registry, starts services as directed
- Management / maintenance
  - Control panel can start and stop services and change startup parameters
(Diagram labels: Setup Application, CreateService, Registry, Control Panel, Service Controller, Service Processes)

Where Are Services Defined?
- Maintained in Windows NT Registry:
  - HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services
  - One key per installed service
- Mandatory information kept on each service:
  - Type of service (Win32, Driver…)
  - Imagename of service .EXE
    - NOTE: Some service .EXEs contain more than one service
  - Start type (automatic, manual, or disabled)
- Optional information:
  - Display Name
  - Dependencies
  - Account and password to run under
- Can store application-specific configuration parameters
  - “Parameters” under service key
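
The per-service information can be pictured as a small table of values under each key. The sketch below models a few keys as Python dictionaries and picks out the Win32 services marked for automatic start. The value names `Type`, `Start`, and `ImagePath` and their numeric encodings are given as commonly documented for this registry key rather than taken from the slides; the sample entries and the function name are hypothetical.

```python
# Assumed encodings of the Start and Type values under each service key.
START_TYPES = {0: "boot", 1: "system", 2: "automatic", 3: "manual", 4: "disabled"}
WIN32_OWN_PROCESS = 0x10    # service runs in its own process
WIN32_SHARE_PROCESS = 0x20  # service shares an .EXE with other services


def auto_start_win32_services(services):
    """services: dict of key name -> dict of values. Returns sorted names."""
    selected = []
    for name, values in sorted(services.items()):
        is_win32 = bool(values["Type"] & (WIN32_OWN_PROCESS | WIN32_SHARE_PROCESS))
        if is_win32 and START_TYPES[values["Start"]] == "automatic":
            selected.append(name)
    return selected


# Hypothetical sample entries, for illustration only.
sample_services = {
    "EventLog": {"Type": 0x20, "Start": 2, "ImagePath": r"%SystemRoot%\system32\services.exe"},
    "Schedule": {"Type": 0x10, "Start": 3, "ImagePath": r"%SystemRoot%\system32\atsvc.exe"},
    "Disk": {"Type": 0x01, "Start": 0, "ImagePath": r"system32\drivers\disk.sys"},
}
```

With this sample, only `EventLog` is selected: `Schedule` is manual start and `Disk` is a driver rather than a Win32 service.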

Process-Based Windows NT Code: System startup processes
- Separate processes loaded or started at boot time (not as services or environment subsystems)
- Names of images are not in registry
  - “Hardwired” in the source code
- Most are Win32 executables, one (smss) is a “native image”
(Idle)
  - Process id 0
  - Part of the loaded system image
  - Home for idle thread(s) (not a real process nor real threads)
  - Called “System Process” in many displays
(System)
  - Process id 2
  - Part of the loaded system image
  - Home for kernel-defined threads (not a real process)
  - Thread 0 (routine name Phase1Initialization) launches the first “real” process, running smss.exe… …and then becomes the zero page thread

Process-Based Windows NT Code: System startup processes
- smss.exe: Session Manager
  - The first “created” process
  - Takes parameters from \Registry\Machine\System\CurrentControlSet\Control\Session Manager
  - Launches required subsystems (csrss) and winlogon.exe
- winlogon.exe: Logon process
  - Presents first login prompt
  - Presents “enter username and password” dialog
  - Launches services.exe, lsass.exe, and nddeagnt.exe
  - When someone logs in, launches userinit.exe
- services.exe: Service Controller; also, home for many NT-supplied services
  - Starts processes for services not part of services.exe (driven by \Registry\Machine\System\CurrentControlSet\Services)
- lsass.exe: Local Security Authentication Server
- userinit.exe: Started after logon; starts desktop (Explorer.Exe) and exits (hence does not show up in tlist output; Explorer appears to be an orphan)
- explorer.exe and its children are the creators of all interactive apps

Four Contexts For Executing Code
- Full process and thread context:
  - User applications
  - Win32 Services
  - Environment subsystem processes
  - System startup processes
- Have thread context but no “real” process:
  - Threads in “System” process
- Routines called by other threads / processes:
  - Subsystem DLLs
  - Executive system services (NtReadFile, etc.)
  - GDI routines in Win32K.Sys (and graphics drivers)
- No process or thread context (“Arbitrary thread context”)
  - Interrupt dispatching
  - Device drivers

Where Is The Code?
- Kernel32.Dll, Gdi32.Dll, User32.Dll
  - Export Win32 entry points
- NtDll
  - Provides user-mode access to system-space routines
  - Also contains heap manager, image loader, thread startup routine
- NtosKrnl.Exe (or Ntkrnlmp.exe)
  - Executive and kernel
  - Includes most routines that run as threads in “system” process
- Win32K.Sys
  - The loadable module that includes the now-kernel-mode Win32 code (formerly in csrss.exe)
- Hal.Dll
  - Hardware Abstraction Layer
- drivername.Sys
  - Loadable kernel drivers

Processes And Threads
- What is a process?
  - Represents an instance of a running program
    - You create a process to run a program
    - Starting an application creates a process
  - Primary argument to CreateProcess is image file name (or command line)
- What is a thread?
  - An execution context within a process
  - Primary argument to CreateThread is a function entry point address
  - All threads in a process share the same per-process address space
- Every process starts with one thread
  - Running the program’s “main” function
  - Can create other threads in the same process
  - Can create additional processes
(Diagram labels: Thread, Thread, Per-process address space, Systemwide Address Space)
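
The point that all threads in a process share one address space can be seen in any threaded program. The sketch below uses Python's `threading` module as an analogy rather than the Win32 `CreateThread` call; the names `worker` and `shared` are illustrative. Each thread is started with a function entry point, and both append to the same list without any copying between them.

```python
import threading

shared = []              # lives in the single address space of this process
lock = threading.Lock()  # serialises updates to the shared list


def worker(tag, count):
    # This function is the thread's entry point. Every thread sees the
    # same 'shared' object because they all run in one process.
    for i in range(count):
        with lock:
            shared.append((tag, i))


threads = [threading.Thread(target=worker, args=(tag, 100)) for tag in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After both threads finish, `shared` holds the entries written by each of them.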

Tools To Examine Processes
- Task Manager
- Performance Monitor
- pviewer.exe (pview in Platform SDK): shows processes, threads within processes, memory details
- pview.exe (process explode): thread and process ACLs and tokens
- tlist.exe - tlist /t shows parent/child relationships
- QuickSlice
  - qslice.exe
  - CPU usage by process, and by thread within each process
- Pulist - process user list
- Vadump - dump virtual address space of a process

Tools To Examine Processes
- Page fault monitor (pfmon.exe)
  - Shows page fault type and origin of subject application
  - Can provide data to working set tuner (part of Platform SDK)
- Pstat
  - pstat.exe (char mode, no icon)
  - One-time snapshot of system
  - Shows state of threads within all processes, with wait reasons
- Kernel debugger
  - Shows various internal structures
  - See Windows NT® Workstation Resource Kit documentation
- oh.exe (ResKit), nthandleex (www.sysinternals.com) - show open handles
- Ntpmon (www.sysinternals.com)

Windows NT 5.0 Job Object
- New kernel object to collect a group of related processes
  - CreateJobObject/OpenJobObject
- System enforces job quotas and security context
  - Limits: Total and current CPU time, total and active processes, per-process and per-job CPU time, min and max working set, CPU affinity, priority class
  - Security limits: No administrators token, only restricted token, only specific token, filter token, no accessing windows outside the job, no reading/writing the clipboard
- To examine: See new performance counters + new !job command in kernel debugger

Processes And Threads Internal Structures
(Diagram: Process object with its access token, virtual address space descriptors (VADs), and handle table; Thread objects, each optionally with its own access token)
- See kernel debugger commands: !processfields, !threadfields, !process, !thread, !tokenfields, !token, !handle, !object

!processfields
Field names, first column: Pcb, ExitStatus, LockEvent, LockCount, CreateTime, ExitTime, LockOwner, UniqueProcessId, ActiveProcessLinks, QuotaPeakPoolUsage[0], QuotaPoolUsage[0], PagefileUsage, CommitCharge, PeakPagefileUsage, PeakVirtualSize, Vm, LastProtoPteFault, DebugPort, ExceptionPort, ObjectTable, Token, WorkingSetLock, WorkingSetPage, ProcessOutswapEnabled, ProcessOutswapped, AddressSpaceInitialized, AddressSpaceDeleted, AddressCreationLock
Offsets, first column: 0x0, 0x68, 0x6c, 0x7c, 0x80, 0x88, 0x90, 0x94, 0x98, 0xa0, 0xa8, 0xb0, 0xb4, 0xb8, 0xbc, 0xc0, 0xc8, 0xfc, 0x100, 0x104, 0x108, 0x10c, 0x12c, 0x130, 0x131, 0x132, 0x133, 0x134
Field names, second column: ForkInProgress, VmOperationEvent, PageDirectoryPte, LastFaultCount, VadRoot, VadHint, CloneRoot, NumberOfPrivatePages, NumberOfLockedPages, ForkWasSuccessful, ExitProcessCalled, CreateProcessReported, SectionHandle, Peb, SectionBaseAddress, QuotaBlock, LastThreadExitStatus, WorkingSetWatch, LpcPort, InheritedFromUniqueProcessId, GrantedAccess, DefaultHardErrorProcessing, LdtInformation, VadFreeHint, VdmObjects, ProcessMutant, ImageFileName[0], VmTrimFaultValue
Offsets, second column: 0x158, 0x15c, 0x160, 0x164, 0x168, 0x170, 0x174, 0x178, 0x17c, 0x180, 0x184, 0x186, 0x187, 0x188, 0x18c, 0x190, 0x194, 0x198, 0x19c, 0x1a0, 0x1a4, 0x1a8, 0x1ac, 0x1b0, 0x1b4, 0x1b8, 0x1bc, 0x1dc, 0x1ec

!threadfields
Field names: Tcb, CreateTime, ExitStatus, PostBlockList, TerminationPortList, ActiveTimerListLock, ActiveTimerListHead, Cid, LpcReplySemaphore, LpcReplyMessageId, Client, IrpList, TopLevelIrp, ReadClusterSize, ForwardClusterOnly, DisablePageFaultClustering, DeadThread, HasTerminated, EventPair, GrantedAccess, ThreadsProcess, StartAddress, Win32StartAddress, LpcExitThreadCalled, HardErrorsAreDisabled
Offsets: 0x0, 0x1b8, 0x1c0, 0x1c4, 0x1cc, 0x1d4, 0x1d8, 0x1e0, 0x1e8, 0x1fc, 0x200, 0x208, 0x20c, 0x214, 0x21c, 0x220, 0x221, 0x222, 0x223, 0x224, 0x228, 0x22c, 0x230, 0x234, 0x238, 0x239

Looking At Waiting Threads
- pstat.exe (Resource Kit)
  - Shows state of every thread in every process
  - But for threads that are waiting, that’s all we know…

Looking At Waiting Threads
- !thread command in kernel debugger shows what a thread is waiting on

Dispatcher Objects
- Any kernel object you can wait for is a “dispatcher object”
  - Some exclusively for synchronization
    - E.g., events, mutexes (“mutants”), semaphores, queues, timers
  - Others can be waited for as a side effect of their prime function
    - E.g., processes, threads, file objects
  - Non-waitable kernel objects are called “control objects”
- All dispatcher objects have a common header
- All dispatcher objects are in one of two states
  - “Signalled” versus “nonsignalled”
  - When signalled, a wait on the object is satisfied
  - Different object types differ in terms of what changes their state
  - Wait and unwait implementation is common to all types of dispatcher objects
(Diagram: dispatcher object header with Size, Type, State, and Wait listhead, followed by object-type-specific data; see ddk\inc\ntddk.h)

Wait Blocks
- Represent a thread’s reference to something it’s waiting for (one per handle passed to WaitFor…)
- All wait blocks from a given wait call are chained to the waiting thread
- Type indicates wait for “any” or “all”
- Key denotes argument list position for WaitForMultipleObjects
(Diagram: thread objects, each with a WaitBlockList; dispatcher objects, each with Size, Type, State, Wait listhead, and object-type-specific data; wait blocks, each with List entry, Thread, Object, Key, Type, and Next link)
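
The relationship between threads, wait blocks, and dispatcher objects can be sketched as a small data-structure model. This is a simplified illustration, not the kernel's implementation: the class and function names are hypothetical, and it ignores details such as objects that reset their state when a wait is satisfied. Each wait call creates one block per object, every block is queued on its object's wait list and chained to the thread, and the key records the object's position in the caller's list.

```python
class DispatcherObject:
    def __init__(self, name, signalled=False):
        self.name = name
        self.signalled = signalled
        self.wait_list = []          # wait blocks queued on this object


class WaitBlock:
    def __init__(self, thread, obj, key, wait_type):
        self.thread = thread
        self.obj = obj
        self.key = key               # position in the caller's object list
        self.wait_type = wait_type   # "any" or "all"


class Thread:
    def __init__(self, name):
        self.name = name
        self.wait_blocks = []        # all blocks from the current wait call
        self.wake_key = None


def _try_satisfy(thread):
    blocks = thread.wait_blocks
    if not blocks:
        return False
    signalled = [b for b in blocks if b.obj.signalled]
    if blocks[0].wait_type == "all":
        done = len(signalled) == len(blocks)
    else:
        done = bool(signalled)
    if done:
        thread.wake_key = signalled[0].key
        for b in blocks:             # unlink every block of this wait call
            b.obj.wait_list.remove(b)
        thread.wait_blocks = []
    return done


def wait_for_multiple(thread, objects, wait_type):
    """Queue one wait block per object; return True if the thread must block."""
    thread.wait_blocks = [WaitBlock(thread, obj, i, wait_type)
                          for i, obj in enumerate(objects)]
    for block in thread.wait_blocks:
        block.obj.wait_list.append(block)
    return not _try_satisfy(thread)


def signal(obj):
    obj.signalled = True
    for block in list(obj.wait_list):
        _try_satisfy(block.thread)
```

In this model a thread waiting for “any” of two events is released as soon as one is signalled, and its wake key tells it which one; a thread waiting for “all” stays queued until both are signalled.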

Agenda
- Introduction
- Tools
- System Architecture
- Processes and Threads
- Memory Management
  - Virtual Address Space Layout
  - Process Memory Usage
  - Global System Cache
  - System Memory Usage

4 GB Virtual Address Space
- 2 GB per-process
  - Address space of one process is not directly reachable from other processes
- 2 GB systemwide
  - The operating system is loaded here, and appears in every process’s address space
  - There is no process for “the operating system” (though there are processes that do things for the OS, more or less in “background”)
Diagram (address labels 00000000, 7FFFFFFF, 80000000, C0000000, FFFFFFFF):
  - Unique per process, accessible in user or kernel mode: .EXE code, globals, per-thread user mode stacks, process heaps, .DLL code
  - Per process, accessible only in kernel mode: process page tables, hyperspace
  - System wide, accessible only in kernel mode: Exec, Kernel, HAL, drivers, per-thread kernel mode stacks, Win32K.Sys; file system cache; paged pool; non-paged pool

System Space Layout
x86
  - Address labels: 80000000, A0000000, A4000000, C0400000, C0800000, C0C00000, C1000000, EB000000 (min), FFBE0000, FFC00000
  - Regions, low to high: System code (NTOSKRNL, HAL, boot drivers) and initial nonpaged pool; System Mapped Views (e.g. WIN32K.SYS) or session space (Terminal Server only); Additional System PTEs (& big cache); Process Page Tables and Page Directory; Hyperspace and process working set list; Unused - No Access; System Working Set List; System Cache; Paged Pool; System PTEs; Non-Paged Pool expansion; Crash dump information; HAL usage
Alpha AXP
  - Address labels: 80000000, C0000000, C1000000, C2000000, C3000000, C4000000, DE000000, E1000000, EB000000 (min), FDFEC000
  - Regions, low to high: System code (NTOSKRNL, HAL, boot drivers) and initial nonpaged pool; Process Page Tables and Page Directory; Hyperspace and process working set list; Unused - No Access; System Working Set List; System Cache; System Mapped Views (e.g. WIN32K.SYS); Paged Pool; System PTEs; Non-Paged Pool expansion; Crash dump information & HAL usage

3 GB Process Space Option
- Only available on x86 Server Enterprise Edition
  - Boot with /3GB option in BOOT.INI
  - Chief “loser” in system space is file system cache
- Expands per-process address space
  - But image must be marked as “large address space aware”
- A stopgap while we wait for 64-bit Windows NT (Merced and Alpha; post Windows NT 5.0)
Diagram (address labels 00000000, BFFFFFFF, C0000000, FFFFFFFF):
  - Unique per process (= per appl.), accessible in user or kernel mode: .EXE code, globals, per-thread user mode stacks, .DLL code, process heaps
  - Per process, accessible only in kernel mode: process page tables, hyperspace
  - System wide, accessible only in kernel mode: Exec, kernel, HAL, drivers, etc.
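
The user/system split in the two layouts comes down to one boundary address. A minimal sketch, assuming only the boundaries shown on the slides (80000000 by default, C0000000 with the /3GB option); the function name `address_region` is made up for illustration.

```python
def address_region(virtual_address, three_gb=False):
    """Classify a 32-bit virtual address as user space or system space."""
    if not 0 <= virtual_address <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit virtual address")
    # System space starts at 2 GB by default and at 3 GB with /3GB.
    boundary = 0xC0000000 if three_gb else 0x80000000
    return "user" if virtual_address < boundary else "system"
```

For example, address 0x90000000 is in system space on a default system but becomes a per-process user address when the system is booted with /3GB.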

64-bit Very Large Memory In Windows NT 5.0
- Alpha Windows NT Server Enterprise Edition only
- Referenced by 64-bit pointers
  - Cannot be paged out - must be resident at all times
  - Cannot be used for code, only data file mapping
  - New APIs: VirtualAllocVlm, MapViewOfFileVlm, Read/WriteProcessMemoryVlm, etc.
- Yet another stopgap prior to 64-bit Windows NT
Diagram:
  - 00000000 00000000 to 00000000 7FFFFFFF: 2 GB user space (2 GB process space)
  - 00000001 00000000 to 00000007 FFFFFFFF: 28 GB Large Memory Area
  - 00000008 00000000 to FFFFFFFF 7FFFFFFF: Invalid (inaccessible) (about 1.8 x 10^19 bytes; not to scale!)
  - FFFFFFFF 80000000 to FFFFFFFF FFFFFFFF: 2 GB system space

Application Startup Maps V.A.S. To Code On Disk
(Diagram: process address space from 00000000 to 7FFFFFFF, with regions mapped to the .exe, the .dll, and the paging file on disk)
- See link /dump /headers, or QuickView for .exe’s and .dll’s
- CreateFileMapping, MapViewOfFile simply make the mechanism available to application-level code
- All of these files may simultaneously be mapped by other processes

Process Virtual Address Layout
Screen snapshot from: Programs | SDK Tools | Process Walker; Process | Load Process | notepad

Process Memory Usage
- Working set: All the physical pages “owned” by a process
  - Essentially, all the pages the process can reference without incurring a page fault
  - Upper limit on size for each process
  - When limit is reached, a page must be released for every page that’s brought in (“working set replacement”)
- Working set limit: The maximum pages the process can own
  - Maximum is calculated as (available pages - 512 pages)
  - Result stored in MmMaximumWorkingSetSize

Working Set List
- A FIFO list for each process (diagram: newer pages at one end, older pages at the other)
- PerfMon: Process “WorkingSet”

Working Set Replacement
- When working set “count” = working set size, must give up pages to make room for new pages
- Page replacement is “modified FIFO”
  - MP x86 and Alpha: no regard to accessed bit
  - Windows NT 5.0 on uniprocessor x86 takes into account age
(Diagram: replaced pages go to standby or modified page list; PerfMon: Process “WorkingSet”)
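
Working set replacement can be illustrated with a small simulation. This is a sketch of plain FIFO replacement only, without the accessed-bit or aging refinements mentioned above; the class name `WorkingSet` and its methods are hypothetical. When the set is full, the oldest page leaves: to the modified list if it was written, otherwise to the standby list. A later reference to a page still on one of those lists is a soft fault.

```python
from collections import deque


class WorkingSet:
    def __init__(self, limit):
        self.limit = limit
        self.pages = deque()   # [page, dirty] pairs, oldest on the left
        self.standby = []      # clean pages removed from the working set
        self.modified = []     # dirty pages removed from the working set

    def touch(self, page, write=False):
        """Reference a page; return 'hit', 'soft fault', or 'hard fault'."""
        for entry in self.pages:
            if entry[0] == page:          # already in the working set
                entry[1] = entry[1] or write
                return "hit"
        dirty = write
        if page in self.standby:
            self.standby.remove(page)
            kind = "soft fault"
        elif page in self.modified:
            self.modified.remove(page)
            dirty = True
            kind = "soft fault"
        else:
            kind = "hard fault"
        if len(self.pages) == self.limit:
            # FIFO: the oldest page goes, however recently it was used.
            old_page, old_dirty = self.pages.popleft()
            (self.modified if old_dirty else self.standby).append(old_page)
        self.pages.append([page, dirty])
        return kind
```

With a limit of two pages, touching pages 1, 2, 1, 3 evicts page 1 even though it was just referenced, which is the behaviour that distinguishes FIFO from least-recently-used replacement.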

Locking Pages
- Pages may be locked into the process working set
  - Locked pages are guaranteed in physical memory (“resident”) when any thread in process is executing
  - Win32: status = VirtualLock(baseAddress, size); status = VirtualUnlock(baseAddress, size);
- Number of lockable pages is a fraction of the maximum working set size
  - Changed by SetProcessWorkingSetSize
- Pages can be locked into physical memory (by drivers only)
  - Pages are then immune from outswapping as well as paging
  - MmProbeAndLockPages

Memory Management Information: Task Manager Processes tab
- (1) “Mem Usage” = physical memory used by process (working set size, not working set limit)
- (2) “VM Size” = private (not shared) committed virtual space in processes
- (3, 4) “Mem Usage” in status bar is total of “VM Size” column / maximum allowed, i.e., same as “commit charge” in “Performance” tab (see next slide) - not same as “Mem Usage” column here!
Screen snapshot from: Task Manager | Processes tab

Memory Management Information: PerfMon - process object
- (1) “Working Set” = working set size (not limit)
- (2) “Private Bytes” = same as “VM Size” from Task Manager Processes list
- (6) “Virtual Bytes” = committed virtual space, including shared pages
Screen snapshot from: Performance Monitor counters from Process object

Memory Management Information: Task Manager Performance tab
- (3) “Commit charge total” = total of private (not shared) committed virtual space in all processes (i.e. total of “VM Size” from processes display)
- (4) “Commit charge limit” = sum of available physical memory + free space in paging file
Screen snapshot from: Task Manager | Performance tab

File System Virtual Block Cache
- Shared by all file systems (local or remote)
- Caches all files
  - Including file system metadata files
- Virtual block cache (not logical block)
  - Managed in terms of blocks within files, not blocks within partition
  - Uses standard Windows NT virtual memory mechanisms
  - Coherency maintained between mapped files and read/write access
- Virtual size: 64-512 MB (960 MB if large cache size set)
  - In system virtual address space, so visible to all
  - Divided into 256 KB “views”

Cached File Operations
- Open a file:
  - Find an available view
  - Map the first 256 KB of the file into the view
- Read from or write to a cached file:
  - Remap as necessary to map referenced section of file into the cache
  - Copy data between application buffer and cache’s virtual address space
  - Actual I/O is due to paging
(Diagram labels: Process address space, System address space, File)
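
How a read or write maps onto 256 KB views can be worked out with simple arithmetic. The sketch below assumes that views start at file offsets that are multiples of 256 KB, which the slides do not state explicitly; the function name `views_for_request` is made up for illustration.

```python
VIEW_SIZE = 256 * 1024  # size of one cache view, per the slides


def views_for_request(offset, length):
    """Split a file request into (view_start, offset_in_view, byte_count) pieces.

    Assumes views are aligned on 256 KB boundaries within the file.
    """
    pieces = []
    end = offset + length
    while offset < end:
        view_start = offset - offset % VIEW_SIZE
        count = min(end, view_start + VIEW_SIZE) - offset
        pieces.append((view_start, offset - view_start, count))
        offset += count
    return pieces
```

A 30-byte read that starts 10 bytes before the 256 KB mark touches two views, so under this assumption the cache would need both sections of the file mapped to satisfy it.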

Fast I/O
- Fast I/O path
  - Allows executive I/O APIs to access cache directly
  - Bypasses file system driver
  - Bypasses IRP generation, probe-and-lock of user buffer, etc.
(Diagram labels: I/O Subsystem API (Ntxxx); Cache Manager, reached by the Fast I/O path; I/O Manager (Ioxxx); Driver Support Routines (Io, Ex, Ke, Mm, Hal, FsRtl, ...); File System drivers (e.g. NTFS); Disk device driver; HAL I/O access routines; I/O ports and registers)

Cache Size
- Physical size: Depends on available memory
  - Competes for physical memory with processes, paged pool, pageable system code
  - Part of “system working set”
    - Automatically expanded / shrunk by system
    - Normal working set adjustment mechanisms
    - Relies on Memory Manager for global memory policy
    - Performance Monitor: Memory object | System cache resident bytes shows current physical space occupied by cache
  - See SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\LargeSystemCache
    - Default is 0 for both Workstation and Server
    - 1 = favor system working set vs. process working set
    - also allows cache to be >512 MB virtual size
    - Can modify with Control Panel->Network->Services->Server properties

Cache Functions And Control
- Automatic asynchronous readahead
  - Done by separate “Readahead” system thread
  - 64 KB readaheads by default
  - Predicts next read location based on history of last 3 reads
  - Readahead hints can be provided to CreateFile:
    - FILE_FLAG_SEQUENTIAL_SCAN does 192 KB read ahead
    - FILE_FLAG_RANDOM_ACCESS disables read ahead
- Write-back, not write-through
  - Dirty page threshold forces writing
    - Small system: Physical Pages / 8; medium system: Physical Pages / 4
    - Large system: add above 2 together
  - “Lazy writer” thread queues 1/4 of dirty pages every second to separate “Write Behind” system thread (note, does not flush mapped files)
  - Can override via CreateFile with FILE_FLAG_WRITE_THROUGH
    - Or explicitly call FlushFileBuffers when you care (does flush mapped files)
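
The write-back figures above lend themselves to a short worked example. The sketch uses the thresholds as stated on the slide; the integer rounding, the assumption that no new pages are dirtied while the lazy writer runs, and both function names are illustrative choices rather than documented behaviour.

```python
def dirty_page_threshold(physical_pages, system_size):
    """Dirty page threshold for a 'small', 'medium', or 'large' system."""
    small = physical_pages // 8
    medium = physical_pages // 4
    thresholds = {"small": small, "medium": medium, "large": small + medium}
    return thresholds[system_size]


def lazy_writer_schedule(dirty_pages, seconds):
    """Pages queued in each one-second pass, a quarter of what is left.

    Assumes no additional pages become dirty in the meantime.
    """
    queued = []
    for _ in range(seconds):
        batch = dirty_pages // 4
        queued.append(batch)
        dirty_pages -= batch
    return queued
```

For a machine with 16384 physical pages (64 MB with 4 KB pages), the thresholds come out at 2048, 4096, and 6144 pages. Starting from 1000 dirty pages, the first three passes queue 250, 187, and 140 pages, so the backlog shrinks geometrically rather than being written in one burst.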

Cache Functions And Control
- Can disable cache completely on a per-file basis
  - CreateFile with FILE_FLAG_NO_BUFFERING
  - Requires reads/writes to be done on sector boundaries
  - Buffers must be aligned in memory on sector boundaries
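
The alignment rules for unbuffered I/O can be checked before issuing a request. A minimal sketch, assuming a 512-byte sector as the default (the real sector size depends on the volume) and treating the transfer length as needing to be a whole number of sectors; the function names are hypothetical.

```python
def unbuffered_io_ok(file_offset, length, buffer_address, sector_size=512):
    """True if offset, length, and buffer address are all sector aligned."""
    return all(value % sector_size == 0
               for value in (file_offset, length, buffer_address))


def round_up_to_sector(length, sector_size=512):
    """Smallest multiple of sector_size that is at least length."""
    return (length + sector_size - 1) // sector_size * sector_size
```

A 700-byte record would therefore be transferred as `round_up_to_sector(700)` bytes, that is 1024, into a buffer whose address is itself a multiple of the sector size.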

System Paged Memory
- Just as processes have working sets, Windows NT’s pageable system-space code and data lives in the “system working set”
  - Cache is one of 4 components of “system working set”
- Pageable components of system working set:
  - Paged pool
  - Pageable code and data in the exec
  - Pageable code and data in kernel-mode drivers, Win32K.Sys, graphics drivers, etc.
  - Global file system data cache
- To get physical (resident) size of these with PerfMon, look at:
  - Memory | Pool Paged Resident Bytes
  - Memory | System Code Resident Bytes
  - Memory | System Driver Resident Bytes
  - Memory | System Cache Resident Bytes
  - Memory | Cache bytes counter is total of these four “resident” (physical) counters (not just the cache; same as “File Cache” on Task Manager / Performance tab)

Sessions
- New memory management object to support Windows NT® Server 5.0
- All processes in an interactive session share a:
  - Session-specific copy of Win32K.Sys
  - Instance of Winlogon
  - Session working set
Diagram (x86):
  - 80000000: System code (NTOSKRNL, HAL, boot drivers); initial nonpaged pool
  - A0000000: Win32k.sys (8 MB)
  - A0800000: Session Working Set Lists
  - A0C00000: Mapped Views for Session
  - A2000000: Paged Pool for Session

System Nonpaged Memory
- Nonpageable components:
  - Nonpageable parts of NtosKrnl.Exe, drivers
  - Nonpaged pool (see PerfMon, Memory object: Pool nonpaged bytes)
- To get size of nonpageable system code, run \ntreskit\pstat.exe & add columns 1 & 2
  - Screenshot callouts: (7) non-paged code, (8) non-paged data, (9) pageable code+data
  - output of “drivers” (\ntreskit\drivers.exe) is similar
  - Win32K.Sys is paged, even though it shows up as nonpaged

Monitoring Pool Usage
- Poolmon.exe in \support\debug
- Must first turn on pool tagging with gflags
- “p” to toggle between nonpaged, paged pool, or both
- Sorting:
  - “b” to sort by total # of bytes
  - “a” to sort by # of allocations
  - “t” to sort by structure tag

“Free” Memory
- System keeps unassigned physical pages (those not part of any working set) on five lists
  - Free page list
  - Modified page list
  - Standby page list
  - Zero page list
  - Bad page list - pages that failed memory test at system startup

Managing Physical Pages
(Diagram: pages move between Process Working Sets and the Standby, Modified, Free, Zero, and Bad page lists. Transition labels: demand zero page faults; pages read from disk; “soft” page faults; working set replacement; modified page writer; zero page thread)

Memory Management Information: Task Manager Performance tab
- (1) “Available” memory = total of free, zero, and standby lists (majority usually are standby pages)
- (2) “File cache” is really total physical size of pageable portions of: paged pool, NtosKrnl.Exe code and data, drivers code and data, and file system cache (same as PerfMon “cache bytes” counter)
- (3) “Kernel Memory Paged” is resident size of paged pool
- (4) “Kernel Memory Nonpaged” is actual size of nonpaged pool
Screen snapshot from: Task Manager | Performance tab

Summary: Accounting For Physical Memory Usage
- Process working sets
  - Perfmon: Process / Working set
  - Note, shared resident pages are counted in the process working set of every process that’s faulted them in
  - Hence, the total of all of these may be greater than physical memory
- Nonpageable system code (NTOSKRNL + drivers, including win32k.sys & graphics drivers)
  - See total displayed by DRIVERS utility in Windows NT Resource Kit
- Nonpageable pool
  - Perfmon: Memory / Pool nonpaged bytes
- Free, zero, and standby page lists
  - Perfmon: Memory / Available bytes
  - Or: Task Manager / Performance tab: Physical memory: Available
- Pageable, but currently-resident, system-space memory
  - Perfmon: Memory / Pool paged resident bytes
  - Perfmon: Memory / System cache resident bytes
  - Perfmon: Memory / System code resident bytes
  - Perfmon: Memory / System driver resident bytes
  - Memory | Cache bytes counter is really total of these four “resident” (physical) counters
- Modified, Bad page lists
  - can only see size with !memusage command in Kernel Debugger
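
The accounting in this summary is a sum over the categories listed. A small sketch follows, using the counter names from the slides as dictionary keys; the function names and the sample figures (in MB) are invented for illustration, and the result can exceed physical memory because shared pages are counted once per process. The modified and bad page lists are left out, since no counter is given for them.

```python
RESIDENT_COUNTERS = (
    "Pool Paged Resident Bytes",
    "System Code Resident Bytes",
    "System Driver Resident Bytes",
    "System Cache Resident Bytes",
)


def cache_bytes(memory_counters):
    """Total of the four 'resident' counters, as the Cache bytes counter reports."""
    return sum(memory_counters[name] for name in RESIDENT_COUNTERS)


def accounted_memory(memory_counters, working_sets, nonpageable_code):
    """Sum of the categories in the summary, excluding modified and bad lists."""
    return (sum(working_sets)
            + nonpageable_code
            + memory_counters["Pool Nonpaged Bytes"]
            + memory_counters["Available Bytes"]
            + cache_bytes(memory_counters))


# Hypothetical figures in MB, for illustration only.
sample_counters = {
    "Pool Paged Resident Bytes": 8,
    "System Code Resident Bytes": 2,
    "System Driver Resident Bytes": 1,
    "System Cache Resident Bytes": 12,
    "Pool Nonpaged Bytes": 4,
    "Available Bytes": 20,
}
```

With two process working sets of 10 MB and 6 MB and 3 MB of nonpageable system code, `accounted_memory(sample_counters, [10, 6], 3)` adds up to 66 MB.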

Windows NT Internals Information Sources
- Books
  - Inside Windows NT (Solomon, MS Press)
  - Advanced Windows (Richter, MS Press)
  - Windows NT Workstation Resource Guide (MS Press)
- MSDN Library
  - Platform SDK API documentation
  - Windows NT Device Driver Kit (DDK) documentation
  - Win32 Knowledge Base - has some Windows NT internals articles
- Past Windows NT conferences audio/video tapes (www.mobiletape.com)
- www.sysinternals.com - Windows NT internals articles and tools
- www.microsoft.com/hwdev - hardware developers and driver writers
- www.microsoft.com/hwdev/ntifskit - Installable File System Developers Kit
- comp.os.ms-windows.programmer.nt.kernel-mode - drivers newsgroup
- www.cmkrnl.com - Windows NT device driver FAQ