Chapter 11: Implementing File Systems

- Slides: 40

Chapter 11: Implementing File Systems
- File-System Structure
- File-System Implementation
- Directory Implementation
- Allocation Methods
- Free-Space Management
- Efficiency and Performance
- Recovery
- (skip 11.8, 11.9)
Objectives
- To describe the details of implementing local file systems and directory structures
- To discuss block allocation and free-block algorithms and tradeoffs
Operating System Principles 11.1 Silberschatz, Galvin and Gagne © 2005

11.1 File-System Structure
- Disk characteristics for storing multiple files
  - Can be rewritten in place
  - Can access directly any block of information it contains
- File structure
  - Logical storage unit
  - Collection of related information
- File system resides on secondary storage (disks)
  - Allows the data on disk to be stored, located, and retrieved easily
    - How the file system should look to the user
    - How to map the logical file system to the physical secondary storage devices
- File system organized into layers
- File control block (FCB): storage structure consisting of information about a file

Layered File System
- Layer annotations from the figure:
  - Calls like create(), open(), close()
  - Manages metadata information
  - Translates logical block addresses to physical block addresses
  - Device driver

11.2 File-System Implementation
- On-disk and in-memory structures are used to implement a file system
- On disk
  - A boot control block
  - A volume control block
  - A per-file FCB: file permissions, ownership, size, and location of the data blocks
    - In the UNIX File System, it is called an inode
    - In Windows NTFS, it is stored as a record in the master file table
  - A directory structure
    - In the UNIX File System, this includes file names and associated inode numbers
    - In NTFS, it is stored in the master file table

11.2 File-System Implementation
- In-memory information is used for file-system management and performance improvement via caching
  - In-memory mount table
  - In-memory directory-structure cache
  - System-wide open-file table
    - A copy of the FCB for each open file
  - Per-process open-file table
    - A pointer to the appropriate entry in the system-wide open-file table
    - In UNIX, it is called a file descriptor
    - In Windows, it is called a file handle
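The relationship between the two open-file tables can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular kernel does it: the names `FCB`, `SystemEntry`, `system_table`, and `Process` are hypothetical, and the system-wide table is keyed by file name only for simplicity.

```python
from dataclasses import dataclass

@dataclass
class FCB:
    name: str
    size: int

@dataclass
class SystemEntry:
    fcb: FCB             # copy of the FCB kept while the file is open
    open_count: int = 0  # how many per-process entries point here

# System-wide open-file table (hypothetical: keyed by file name).
system_table = {}

class Process:
    def __init__(self):
        # Per-process open-file table; the list index plays the role
        # of a UNIX file descriptor.
        self.fd_table = []

    def open(self, fcb):
        entry = system_table.setdefault(fcb.name, SystemEntry(fcb))
        entry.open_count += 1
        self.fd_table.append(entry)      # pointer into the system-wide table
        return len(self.fd_table) - 1

    def close(self, fd):
        entry = self.fd_table[fd]
        self.fd_table[fd] = None
        entry.open_count -= 1
        if entry.open_count == 0:        # last closer removes the shared entry
            del system_table[entry.fcb.name]
```

Two processes that open the same file share one system-wide entry, and the entry disappears only when the last of them closes the file.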

A Typical File Control Block

In-Memory File System Structures
- Figure 11-3 illustrates the necessary file-system structures provided by the operating system
- Figure 11-3(a) refers to opening a file

In-Memory File System Structures
- Figure 11-3(b) refers to reading a file

Partitions and Mounting
- A disk can be sliced into multiple partitions. A volume can span multiple partitions on multiple disks (RAID, Section 12.7)
- Raw disk: no file system
  - Used for UNIX swap space and by database management systems
- Boot information has its own format, and is usually a sequential series of blocks, loaded as an image into memory
  - Allows dual booting when multiple operating systems are installed
- The root partition, containing the OS kernel and other system files, is mounted at boot time
  - Other volumes can be automatically mounted at boot time or manually mounted later
  - The OS maintains a mount table for mounted file systems
- Skip 11.2.3

11.3 Directory Implementation
- Linear list of file names with pointers to the data blocks
  - Simple to program but time-consuming to execute
  - To create a new file, the directory must be searched to be sure that no existing file has the same name
  - To delete a file, we search the directory for the named file, then release the space allocated to it
  - To reuse the directory entry, there are several options
    - Mark the entry as unused, either by assigning it a special name (such as an all-blank name) or with a used-unused bit
    - Attach it to a list of free directory entries
    - Copy the last entry in the directory into the freed location and decrease the length of the directory
  - Disadvantage: finding a file requires a linear search
    - Keeping the list sorted would complicate the creating and deleting of files
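A minimal Python sketch of the linear-list directory, using the third reuse option (copy the last entry into the freed slot). The class name `LinearDirectory` and the `(name, first_block)` entry layout are assumptions made for illustration.

```python
class LinearDirectory:
    def __init__(self):
        self.entries = []  # each entry: (file name, pointer to first data block)

    def find(self, name):
        # Linear search: cost grows with the number of entries.
        for i, (entry_name, _) in enumerate(self.entries):
            if entry_name == name:
                return i
        return -1

    def create(self, name, first_block):
        # The whole directory is searched first to rule out a duplicate name.
        if self.find(name) != -1:
            raise FileExistsError(name)
        self.entries.append((name, first_block))

    def delete(self, name):
        i = self.find(name)
        if i == -1:
            raise FileNotFoundError(name)
        # Copy the last entry into the freed location, then shrink the list.
        self.entries[i] = self.entries[-1]
        self.entries.pop()
```

Deletion stays cheap once the entry is found, but the order of the remaining entries changes, which is one reason a sorted list is harder to maintain.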

Directory Implementation
- Hash table: a linear list stores the directory entries, and a hash table indexes it
  - The hash table takes a value computed from the file name and returns a pointer to the file name in the linear list
  - Decreases directory search time
  - Some provision must be made for collisions: situations where two file names hash to the same location
  - Difficulties:
    - Fixed size (because it is a table)
    - The dependence of the hash function on that size
  - Alternatively, a chained-overflow hash table can be used
    - Each hash entry is a linked list instead of an individual value
    - Collisions are resolved by adding the new entry to the linked list
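The chained-overflow variant might look like the following sketch. The hash function (sum of the name's bytes modulo the table size) is a deliberately weak, deterministic stand-in chosen so that collisions are easy to provoke; `HashedDirectory` and its methods are hypothetical names.

```python
class HashedDirectory:
    def __init__(self, size=8):
        # Fixed-size table; each slot is a chain (Python list) of entries.
        self.buckets = [[] for _ in range(size)]

    def _index(self, name):
        # Toy hash: depends on the table size, as noted on the slide.
        return sum(name.encode()) % len(self.buckets)

    def create(self, name, first_block):
        chain = self.buckets[self._index(name)]
        if any(entry_name == name for entry_name, _ in chain):
            raise FileExistsError(name)
        chain.append((name, first_block))  # a collision just extends the chain

    def lookup(self, name):
        # Only one chain is searched, not the whole directory.
        for entry_name, first_block in self.buckets[self._index(name)]:
            if entry_name == name:
                return first_block
        return None
```

With this toy hash, "ab" and "ba" land in the same bucket, and both remain retrievable through the chain.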

11.4 Allocation Methods
- An allocation method refers to how disk blocks are allocated for files
  - How to allocate space to files so that disk space is utilized effectively and files can be accessed quickly
- Three major methods
  - Contiguous allocation
  - Linked allocation
  - Indexed allocation

Contiguous Allocation
- Each file occupies a set of contiguous blocks on the disk
  - Simple: only the starting location (block #) and length (number of blocks) are required
  - Random access (next page)
- Problems
  - Dynamic storage-allocation problem
    - First-fit, best-fit, worst-fit
    - Repacking off-line or on-line
  - Determining how much space is needed for a file when it is created
    - If we allocate too little space to a file, it cannot be extended
    - Pre-allocation may be inefficient

Contiguous Allocation
- Mapping from logical to physical
  - Logical address / 512 gives quotient Q and remainder R
  - Block to be accessed = Q + starting address
  - Displacement into block = R
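The mapping is a single division. A short Python sketch, assuming 512-word blocks as in the slide's formula (the function name `contiguous_map` is made up for illustration):

```python
BLOCK_SIZE = 512  # words per block, matching the divisor on the slide

def contiguous_map(logical_address, start_block):
    """Return (physical block, displacement) for a contiguously allocated file."""
    q, r = divmod(logical_address, BLOCK_SIZE)
    return start_block + q, r
```

For a file starting at block 14, logical address 1300 gives Q = 2 and R = 276, so the access goes to block 16 at displacement 276.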

Contiguous Allocation of Disk Space

Extent-Based Systems
- Many newer file systems (e.g., Veritas File System) use this modified contiguous allocation scheme
- Extent-based file systems allocate disk blocks in extents
  - A contiguous chunk of space is allocated initially
  - If that amount is not large enough later, another chunk of contiguous space, called an extent, is added
  - A file consists of one or more extents
    - The location of a file’s blocks is recorded as a location and a block count, plus a link to the first block of the next extent

Linked Allocation
- Each file is a linked list of disk blocks; the blocks may be scattered anywhere on the disk
- The directory contains, for each file, a pointer to the first and last blocks of the file
- Block layout: pointer to the next block, followed by the contents of the block

Linked Allocation (Cont.)
- Simple: need only the starting address
- Free-space management system: no waste of space
- No random access
- Mapping (each 512-word block gives up one word to the pointer, leaving 511 words of data)
  - Logical address / 511 gives quotient Q and remainder R
  - Block to be accessed is the Q-th block (counting from 0) in the linked chain of blocks representing the file
  - Displacement into block = R + 1
- File-allocation table (FAT): disk-space allocation used by MS-DOS and OS/2
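A small Python sketch of this mapping, assuming 512-word blocks whose first word is the next-block pointer. The `next_block` dictionary stands in for reading each block's pointer from disk, and the chain used in the example is hypothetical.

```python
DATA_WORDS = 511  # 512-word block minus one word for the pointer

def linked_map(logical_address, first_block, next_block):
    """Return (physical block, displacement) by walking the chain Q steps."""
    q, r = divmod(logical_address, DATA_WORDS)
    block = first_block
    for _ in range(q):            # no random access: every hop is a disk read
        block = next_block[block]
    return block, r + 1           # +1 skips the pointer word at offset 0

# Hypothetical file whose blocks are chained 9 -> 16 -> 1 -> 10 -> 25.
chain = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}
```

With this chain, logical address 1300 yields Q = 2 and R = 278, so the access lands in block 1 at displacement 279, after two pointer hops.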

Linked Allocation
- Disadvantages:
  1. Can be used effectively only for sequential-access files
  2. The space required for the pointers. Solution: collect blocks into clusters, and allocate clusters rather than blocks. The cost is increased internal fragmentation.
  3. Reliability: disaster if pointers are lost or damaged. Solution: doubly linked lists, or store the file name and relative block number in each block.

Directory Implementation (summary)
- Linear list of file names with pointers to the data blocks
  - Simple to program
  - Time-consuming to execute
- Hash table: linear list with a hash data structure
  - Decreases directory search time
  - Collisions: situations where two file names hash to the same location
  - Fixed size

File-Allocation Table (MS-DOS and OS/2)
- Unused block: a table value of 0
- Allocating a new block to a file:
  - Find the first 0-valued table entry and replace the previous end-of-file value with the address of the new block
  - The 0 table entry is then replaced by the end-of-file value
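The allocation step can be sketched on an in-memory table. This is a toy model, not the on-disk FAT format: the end-of-file marker `EOF = -1` and the helper names are assumptions, and entry 0 is marked as reserved here so that a table value of 0 can only ever mean "unused".

```python
EOF = -1  # hypothetical end-of-file marker

def fat_append(fat, last_block):
    """Allocate a new block for the file whose current last block is last_block."""
    assert fat[last_block] == EOF
    new_block = fat.index(0)      # first 0-valued entry; ValueError if disk is full
    fat[last_block] = new_block   # previous end-of-file value now points onward
    fat[new_block] = EOF          # the new block becomes the end of the file
    return new_block

def chain(fat, first_block):
    """List a file's blocks by following table entries until EOF."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks

# Entry 0 reserved; a file occupies blocks 2 -> 3; blocks 1, 4, 5 are unused.
fat = [EOF, 0, 3, EOF, 0, 0]
```

Calling `fat_append(fat, 3)` on this table picks block 1, after which the file's chain is 2, 3, 1.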

Indexed Allocation
- Brings all pointers together into the index block
- Each file has its own index block
- Logical view: index table

Example of Indexed Allocation

Indexed Allocation
- Need index table
- Random access
- Dynamic access without external fragmentation, but with the overhead of the index block
- Mapping from logical to physical with only 1 block for the index table and a block size of 512 words (maximum file size of 256 K = 512 * 512 words)
  - Logical address / 512 gives quotient Q and remainder R
  - Q = displacement into index table
  - R = displacement into block
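The single-index-block mapping can be sketched as follows, assuming 512-word blocks; the function name and the example index contents are hypothetical.

```python
BLOCK_SIZE = 512  # words per block; also the number of pointers in one index block

def indexed_map(logical_address, index_block):
    """Return (physical block, displacement) using one index block."""
    q, r = divmod(logical_address, BLOCK_SIZE)
    if q >= len(index_block):
        raise IndexError("address beyond the blocks listed in the index block")
    return index_block[q], r      # one table lookup gives random access

# Hypothetical index block listing the file's data blocks in order.
index_block = [9, 16, 1, 10, 25]
```

Logical address 1300 gives Q = 2 and R = 276, so the access goes straight to block 1 at displacement 276 without following any chain.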

Indexed Allocation
- If the index block is too small, it will not be able to hold enough pointers for a large file. Mechanisms to handle this issue:
  - Linked scheme
    - The last word in the index block is nil (for a small file) or is a pointer to another index block (for a large file)
  - Multilevel index
    - Use a first-level index block to point to a set of second-level index blocks, which point to the file blocks
  - Combined scheme
    - Example: in the UNIX File System, for the 15 pointers of the index block in the file’s inode
      - The first 12 point to data of the file
      - The next three pointers point to (single, double, triple) indirect blocks

Indexed Allocation – Mapping (Cont.)
- Mapping from logical to physical in a file of unbounded length (block size of 512 words)
- Linked scheme: link blocks of the index table (no limit on size)
  - LA / (512 x 511) gives quotient Q1 and remainder R1
  - Q1 = block of index table
  - R1 is used as follows: R1 / 512 gives quotient Q2 and remainder R2
  - Q2 = displacement into block of index table
  - R2 = displacement into block of file

Indexed Allocation – Mapping (Cont.)
- Two-level index (maximum file size is 512^3 words)
  - LA / (512 x 512) gives quotient Q1 and remainder R1
  - Q1 = displacement into outer index
  - R1 is used as follows: R1 / 512 gives quotient Q2 and remainder R2
  - Q2 = displacement into block of index table
  - R2 = displacement into block of file
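Both index mappings, the linked scheme and the two-level index, reduce to two successive divisions. The sketch below only computes the three coordinates and does not model the index blocks themselves; the function names are hypothetical and a 512-word block is assumed throughout.

```python
BLOCK = 512  # words per block

def linked_index_map(la):
    """Linked scheme: each index block holds 511 pointers plus one link word."""
    q1, r1 = divmod(la, BLOCK * (BLOCK - 1))  # q1: which index block in the chain
    q2, r2 = divmod(r1, BLOCK)                # q2: slot in that index block
    return q1, q2, r2                         # r2: displacement into the data block

def two_level_map(la):
    """Two-level index: the outer index points to 512 inner index blocks."""
    if la >= BLOCK ** 3:
        raise ValueError("beyond the maximum file size of 512^3 words")
    q1, r1 = divmod(la, BLOCK * BLOCK)        # q1: slot in the outer index
    q2, r2 = divmod(r1, BLOCK)                # q2: slot in the inner index block
    return q1, q2, r2
```

The only difference between the two is the first divisor: 512 x 511 when one word per index block is spent on the link, 512 x 512 when every word is a pointer.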

Combined Scheme: UNIX (4 K bytes per block)
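To get a feel for how far the combined scheme reaches, the following sketch counts addressable blocks with 12 direct pointers and single, double, and triple indirect blocks. The 4 KB block size comes from the slide; the 4-byte pointer size is an assumption that the slide does not state, and all names are hypothetical.

```python
BLOCK_BYTES = 4096    # from the slide: 4 K bytes per block
POINTER_BYTES = 4     # assumption: 32-bit block pointers
DIRECT = 12           # direct pointers in the inode
PER_BLOCK = BLOCK_BYTES // POINTER_BYTES  # pointers held by one indirect block

def level_of(block_no):
    """Say which inode pointer group reaches the given logical block number."""
    groups = [("direct", DIRECT), ("single", PER_BLOCK),
              ("double", PER_BLOCK ** 2), ("triple", PER_BLOCK ** 3)]
    for name, count in groups:
        if block_no < count:
            return name
        block_no -= count
    raise ValueError("block number beyond the maximum file size")

MAX_BLOCKS = DIRECT + PER_BLOCK + PER_BLOCK ** 2 + PER_BLOCK ** 3
MAX_BYTES = MAX_BLOCKS * BLOCK_BYTES
```

Under these assumptions the first 48 KB of a file are reached without any indirection, and the triple indirect block pushes the limit to slightly over 4 TiB.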

Performance
- Before selecting an allocation method, we need to know how the system will be used
  - Contiguous allocation requires only one access to get a disk block
  - Linked allocation is good only for sequential access
    - Some systems support both, but require the type of access to be declared when the file is created
  - Keeping the index block in memory requires considerable space. The performance of indexed allocation depends on the index structure (how many levels), on the size of the file, and on the position of the block desired
    - Some systems use contiguous allocation for small files and automatically switch to indexed allocation if the file grows large
  - Many other optimizations are in use
    - It is reasonable to add (hundreds of) thousands of instructions to save a few disk-head movements

11.5 Free-Space Management
- Bit vector (n blocks, numbered 0, 1, 2, ..., n-1)
  - bit[i] = 1: block[i] is free
  - bit[i] = 0: block[i] is occupied
- The words of the bit vector are checked in order (one word may be 8, 16, or 32 bits); the first non-0 word is scanned to find its first 1 bit, which is the location of the first free block
- Block number = (number of bits per word) * (number of 0-value words) + offset of first 1 bit
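The block-number formula can be checked with a short sketch. It assumes 8-bit words by default and that bit 0 of each word is its most significant bit, so the vector reads left to right; that ordering is an assumption, as is the function name.

```python
def first_free_block(words, bits_per_word=8):
    """Return the number of the first free block (bit = 1), or -1 if none."""
    for zero_words, word in enumerate(words):
        if word != 0:
            # Offset of the first 1 bit, counted from the most significant bit.
            offset = bits_per_word - word.bit_length()
            return bits_per_word * zero_words + offset
    return -1
```

With words `[0, 0, 0b00101000]`, two all-zero words are skipped and the first 1 bit sits at offset 2, giving block 8 * 2 + 2 = 18.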

Free-Space Management
- Bit map requires extra space
  - Example: block size = 2^9 bytes = 512 bytes, disk size = 2^30 bytes (1 gigabyte)
    - n = 2^30 / 2^9 = 2^21 bits (or 2^18 bytes = 2^8 K bytes = 256 K bytes)
  - Example: block size = 1024 bytes = 2^10 bytes, disk size = 40 GB = 10 * 2^32 bytes
    - n = 10 * 2^32 / 2^10 = 10 * 2^22 bits (or 10 * 2^19 bytes = 5 * 2 * 2^19 bytes = 5 * 2^20 bytes = 5 M bytes)
- In the first example, clustering the blocks in groups of four reduces the bit map from 256 KB to 64 KB
- Easy to get contiguous files
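The bit-map sizes in these examples follow from one bit per block (or per cluster). A small sketch that reproduces the arithmetic; the function name and the `cluster` parameter are introduced here only for illustration.

```python
def bitmap_bytes(disk_bytes, block_bytes, cluster=1):
    """Size of the bit map in bytes: one bit per cluster of `cluster` blocks."""
    bits = disk_bytes // (block_bytes * cluster)
    return bits // 8

KB, MB, GB = 2 ** 10, 2 ** 20, 2 ** 30
```

`bitmap_bytes(1 * GB, 512)` is 256 KB, `bitmap_bytes(40 * GB, 1024)` is 5 MB, and clustering the first case in groups of four, `bitmap_bytes(1 * GB, 512, cluster=4)`, gives 64 KB.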

Free-Space Management
- Linked list (free list)
  - Link together all the free disk blocks, keeping a pointer to the first free block in a special location on the disk and caching it in memory. The first block contains a pointer to the next free disk block, and so on
  - Cannot get contiguous space easily
  - No waste of space

Free-Space Management
- Grouping
  - A modification of the linked free list
  - Stores the addresses of n free blocks in the first free block
    - The first n-1 of these blocks are actually free
    - The last block contains the addresses of another n free blocks, and so on
  - The addresses of a large number of free blocks can be found quickly
- Counting
  - A modification of the linked free list
  - Keeps the address of the first free block and the number n of free contiguous blocks that follow the first block. Each entry in the free-space list then consists of a disk address and a count
  - Useful when space is allocated with the contiguous-allocation algorithm or through clustering
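The counting representation can be illustrated by collapsing a list of free block numbers into (address, count) entries. This sketch counts the first block of each run in its count, which is one possible reading of the scheme; the function name is hypothetical.

```python
def to_counted_runs(free_blocks):
    """Collapse free block numbers into (first block, run length) entries."""
    runs = []
    for block in sorted(free_blocks):
        if runs and runs[-1][0] + runs[-1][1] == block:
            start, count = runs[-1]
            runs[-1] = (start, count + 1)  # block extends the current run
        else:
            runs.append((block, 1))        # block starts a new run
    return runs
```

Free blocks 2, 3, 4, 5, 8, 9, 13 collapse to `[(2, 4), (8, 2), (13, 1)]`: three entries instead of seven, and the saving grows when free space is mostly contiguous.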

Free-Space Management (extra material)
- Need to protect:
  - Pointer to free list
  - Bit map
    - Must be kept on disk
    - Copy in memory and copy on disk may differ
    - Cannot allow a situation where block[i] is marked occupied in memory (bit[i] = 0) but still marked free on disk (bit[i] = 1)
- Solution (with bit[i] = 1 meaning free, as in the bit vector of Section 11.5):
  1. Set bit[i] = 0 on disk
  2. Allocate block[i]
  3. Set bit[i] = 0 in memory

11.6 Efficiency and Performance
- Efficiency is dependent on:
  - Disk allocation and directory algorithms
  - Types of data kept in a file’s directory entry
- In UNIX, the file system’s performance is improved by preallocating the inodes and spreading them across the volume, and by keeping a file’s data blocks near that file’s inode block to reduce seek time
  - BSD UNIX varies the cluster size as a file grows to reduce internal fragmentation
- Consideration of the data kept in a file’s directory entry
  - Last write date, last access date
- The size of the pointers in the allocation list limits the length of a file
  - Need to plan for changing technology (FAT from 12 to 16 to 32 bits)
- Early Solaris needed a system reboot to change system table sizes

Efficiency and Performance
- Performance
  - Most disk controllers include local memory that forms an on-board cache large enough to store entire tracks at a time
  - Buffer cache: separate section of main memory for blocks that will be used again shortly
  - A page cache caches pages rather than disk blocks, using virtual memory techniques
  - Free-behind and read-ahead: techniques to optimize sequential access
    - Free-behind removes a page from the buffer as soon as the next page is requested
    - With read-ahead, a requested page and several subsequent pages are read and cached
  - PC performance can be improved by dedicating a section of memory as a virtual disk, or RAM disk

Page Cache
- Memory-mapped I/O uses a page cache
- Routine I/O through the file system uses the buffer (disk) cache
- Unified virtual memory: many OSs use page caching to cache both process pages and file data
- This leads to the following figure

A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file system I/O
- Figure: I/O Without a Unified Buffer Cache (double caching)
- Figure: I/O Using a Unified Buffer Cache

Issues
- Whether writes to the file system occur synchronously or asynchronously
  - Synchronous writes: the calling routine must wait for the data to reach the disk drive before it can proceed
  - Asynchronous writes: done the majority of the time. Data are stored in the cache, and control returns to the caller. Metadata writes can be synchronous
- Interactions among the page cache, the file system, and the disk drivers
  - When data are written to a disk file, the pages are buffered in the cache, and the disk driver sorts its output queue according to disk address, to minimize disk-head seeks and to write data at times optimized for disk rotation
  - Thus, output to the disk through the file system is often faster than input for large transfers
- Skip: lines 3 to 17 of p. 485

11.7 Recovery
- Consistency checking: compares data in the directory structure with data blocks on disk, and tries to fix inconsistencies
  - fsck in UNIX, chkdsk in MS-DOS
  - A special program is run at reboot time to check for and correct disk inconsistencies
  - The allocation and free-space management algorithms dictate what types of problems the checker can find and how successfully it can fix them
- Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape, other magnetic disk, optical)
  - Full backup and incremental backup
  - Make a full backup, then each day back up all files that have changed since the full backup
  - The full backup may have to be saved forever
- Recover a lost file or disk by restoring data from backup