File Systems and Disk Management Sarah Diesburg Operating Systems CS 3430
Design Goals of File Systems
  Physical reality                            | File system abstraction
  Block-oriented                              | Byte-oriented
  Physical sectors                            | Named files
  No protection                               | Users protected from one another
  Data might be corrupted if machine crashes  | Robust to machine failures
File System Components
* Disk management organizes disk blocks into files
* Naming provides file names and directories to users, instead of track and sector numbers (e.g., Diesburg)
* Protection keeps information secure from other users
* Reliability protects against information loss due to system crashes
User vs. System View of a File
* User level: individual files
* System call level: collection of bytes
* Operating system level:
  * A block is a logical transfer unit
    * Even for getc() and putc()
    * 4 Kbytes under UNIX
  * A sector is a physical transfer unit
    * 512-byte sectors on disks
* File: a named collection of blocks
User vs. System View of a File
* A process
  * Read bytes 2 to 12
* OS
  * Fetch the block containing those bytes
  * Return those bytes to the process
User vs. System View of a File
* A process
  * Write bytes 2 to 12
* OS
  * Fetch the block containing those bytes
  * Modify those bytes
  * Write out the block
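The two slides above describe byte-level reads and writes being served by whole-block transfers. A minimal sketch of that idea follows, assuming a hypothetical in-memory `ToyDisk` that can only move 4-Kbyte blocks; the class and function names are illustrative and not from the lecture.

```python
BLOCK_SIZE = 4096  # 4 Kbytes, the UNIX block size quoted in the slides


class ToyDisk:
    """Hypothetical in-memory disk that only transfers whole blocks."""

    def __init__(self, num_blocks):
        self.blocks = [bytearray(BLOCK_SIZE) for _ in range(num_blocks)]

    def read_block(self, n):
        return bytearray(self.blocks[n])  # return a copy, like a buffer in memory

    def write_block(self, n, data):
        assert len(data) == BLOCK_SIZE
        self.blocks[n] = bytearray(data)


def read_bytes(disk, start, end):
    """Return bytes start..end (inclusive), fetching each block that holds them."""
    out = bytearray()
    pos = start
    while pos <= end:
        block_no, offset = divmod(pos, BLOCK_SIZE)
        block = disk.read_block(block_no)               # fetch the block
        take = min(BLOCK_SIZE - offset, end - pos + 1)
        out += block[offset:offset + take]              # keep only the requested bytes
        pos += take
    return bytes(out)


def write_bytes(disk, start, data):
    """Write data at byte offset start using read-modify-write on whole blocks."""
    written = 0
    while written < len(data):
        block_no, offset = divmod(start + written, BLOCK_SIZE)
        block = disk.read_block(block_no)               # fetch the block
        take = min(BLOCK_SIZE - offset, len(data) - written)
        block[offset:offset + take] = data[written:written + take]  # modify
        disk.write_block(block_no, block)               # write out the block
        written += take
```

Writing 11 bytes at offset 2 touches bytes 2 to 12, yet the disk still sees one full block read and one full block write.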
Ways to Access a File
* People use file systems
* Design of file systems involves understanding how people use file systems
  * Sequential access: bytes are accessed in order
  * Random access (direct access): bytes are accessed in any order
  * Content-based access: bytes are accessed according to constraints on byte contents
    * e.g., return 100 bytes starting with "recipe"
File Usage Patterns
* Most files are small, and most references are to small files
  * e.g., .login and .c files
* Large files use up most of the disk space
  * e.g., mp3 files
* Large files account for most of the bytes transferred between memory and disk
* Bad news for file system designers
File System Design Constraints
* High performance
* Efficient access of small files
  * Many small files
  * Used frequently
* Efficient access of large files
  * Consume most disk space
  * Account for most of the data movement
Some Definitions
* A file contains a file header, which associates the file with its disk sectors
[Figure: a file header holding the file name and the data block locations]
Some Definitions
* A file system needs a disk allocation bitmap to represent free space on the disk, one bit per block
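A disk allocation bitmap can be sketched in a few lines. The version below is only an illustration: the class name, the convention that 1 means "in use" and 0 means "free", and the first-free allocation order are all assumptions, not details from the lecture.

```python
class AllocationBitmap:
    """One bit per disk block. Assumed convention: 1 = in use, 0 = free."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)  # round up to whole bytes

    def is_used(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def mark_used(self, block):
        self.bits[block // 8] |= 1 << (block % 8)

    def mark_free(self, block):
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def allocate(self):
        """Return the lowest-numbered free block and mark it used, or None if full."""
        for block in range(self.num_blocks):
            if not self.is_used(block):
                self.mark_used(block)
                return block
        return None
```

With one bit per block, tracking ten blocks takes two bytes of bitmap.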
Disk Allocation Policies
* Contiguous allocation
* Linked-list allocation
* Segment-based allocation
* Indexed allocation
* Multilevel indexed allocation
* Hashed allocation
Contiguous Allocation
* File blocks are stored contiguously on disk
* To allocate a file,
  * Specify the file size
  * Search the disk allocation bitmap for consecutive free blocks
[Figure: a file header holding the data block location and the number of blocks]
Pros and Cons of Contiguous Allocation
+ Fast sequential access
+ Ease of computing random file locations
  * Adding an offset to the first disk block location
- External fragmentation
- Difficulty in growing files
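The allocation search and the offset computation for contiguous allocation can be sketched as follows. This is a rough illustration under stated assumptions: the bitmap is modeled as a Python list of booleans (True = in use), the search is first-fit, and the header is a hypothetical `(first_block, number_of_blocks)` tuple.

```python
def find_contiguous(bitmap, count):
    """First-fit search: start of the first run of `count` free blocks, or None."""
    if count < 1:
        raise ValueError("count must be at least 1")
    run_start, run_len = 0, 0
    for i, used in enumerate(bitmap):
        if used:
            run_len = 0                  # run broken, start over
        else:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == count:
                return run_start
    return None


def allocate_contiguous(bitmap, count):
    """Mark `count` consecutive blocks used and return a (start, count) header."""
    start = find_contiguous(bitmap, count)
    if start is None:
        return None
    for i in range(start, start + count):
        bitmap[i] = True
    return (start, count)


def block_for_offset(header, byte_offset, block_size=4096):
    """Random access: add the block offset to the first disk block location."""
    start, count = header
    index = byte_offset // block_size
    if index >= count:
        raise IndexError("offset past end of file")
    return start + index
```

External fragmentation appears when enough free blocks exist in total but no single run is long enough, in which case `find_contiguous` returns `None`.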
Linked-List Allocation
* Each file block on a disk is associated with a pointer to the next block
  * A special marker to indicate the end of the file
  * e.g., MS-DOS file system
    * File allocation table (FAT)
[Figure: a file header holding the data block location; each table entry holds the next block entry]
Pros and Cons of Linked-List Allocation
+ Files can grow dynamically with incremental allocation of blocks
- Sequential access may suffer
  * Blocks may not be contiguous
- Horrible random accesses
  * May involve multiple sequential searches
- Unreliable
  * A corrupted pointer can lead to loss of the remaining file
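A FAT-style table makes the cost of random access easy to see. In this sketch the table is a plain dictionary from block number to next block number, and `END_OF_FILE = -1` is a hypothetical end marker; real FAT layouts use different encodings.

```python
END_OF_FILE = -1  # hypothetical end-of-file marker, not the real FAT encoding


def file_blocks(fat, first_block):
    """Sequential access: follow next-block pointers until the end marker."""
    blocks = []
    block = first_block
    while block != END_OF_FILE:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, first_block, n):
    """Random access: reaching block n of the file takes n pointer hops."""
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == END_OF_FILE:
            raise IndexError("block index past end of file")
    return block
```

If the pointer stored for one block is corrupted, every block after it in the chain becomes unreachable, which is the reliability problem listed above.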
Indexed Allocation
* Uses a preallocated index to directly track the file block locations
[Figure: a file header holding a list of data block locations]
Pros and Cons of Indexed Allocation
+ Fast lookups and random accesses
- File blocks may be scattered all over the disk
  * Poor sequential access
  * Needs defragmenter
- Needs to reallocate index as the file size increases
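The trade-off between fast lookups and index reallocation can be illustrated with a small sketch. The `IndexedFile` class and its doubling growth policy are assumptions made for the example; the slides do not specify how the index is resized.

```python
class IndexedFile:
    """File header with a preallocated index of disk block locations."""

    def __init__(self, capacity):
        self.index = [None] * capacity  # preallocated index slots
        self.length = 0                 # number of blocks currently in the file
        self.reallocations = 0

    def append_block(self, disk_block):
        if self.length == len(self.index):
            # Index is full: reallocate a larger one (doubling is an assumed policy).
            new_capacity = max(1, 2 * len(self.index))
            self.index = self.index + [None] * (new_capacity - len(self.index))
            self.reallocations += 1
        self.index[self.length] = disk_block
        self.length += 1

    def lookup(self, byte_offset, block_size=4096):
        """Random access is a single index lookup, wherever the block lives."""
        i = byte_offset // block_size
        if i >= self.length:
            raise IndexError("offset past end of file")
        return self.index[i]
```

The blocks passed to `append_block` may be scattered anywhere on disk, which keeps lookups fast but does nothing for sequential access.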
Segment-Based Allocation
* Needs a segment table to allocate multiple, contiguous regions of blocks
[Figure: a file header holding (begin, end) block pairs]
Pros and Cons of Segment-Based Allocation
+ Relax the requirements for large contiguous disk regions
- As fragmentation approaches 100%, segment-based allocation degenerates into indexed allocation
- Random accesses not as fast as pure contiguous allocation
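A lookup through a segment table might look like the sketch below. It assumes each segment is a `(begin, end)` pair of disk block numbers with both ends inclusive and that segments are stored in file order; both are assumptions made for the example.

```python
def locate(segments, file_block):
    """Map a file-relative block number to a disk block via the segment table.

    segments: list of (begin, end) disk block ranges, inclusive, in file order.
    """
    remaining = file_block
    for begin, end in segments:
        length = end - begin + 1
        if remaining < length:
            return begin + remaining  # target lies inside this segment
        remaining -= length           # skip the whole segment
    raise IndexError("block index past end of file")
```

The loop shows why random access is slower than with pure contiguous allocation: the segment table is scanned before the offset can be added. When every segment shrinks to a single block, the table holds one entry per block, just like an index.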
Multilevel Indexed Allocation
* Certain index entries point to index blocks, as opposed to data blocks (e.g., Linux ext2)
[Figure: a file header holding 12 data block locations, followed by index block locations]
Multilevel Indexed Allocation
* A single indirect block contains pointers to data blocks
* A double indirect block contains pointers to single indirect blocks
* A triple indirect block contains pointers to double indirect blocks
Pros and Cons of Multilevel Indexed Allocation
+ Optimized for small and large files
  * Small files accessed through the first 12 pointers
  * Large files can grow incrementally
- Multiple disk accesses to fetch a data block under triple indirect block
- Largest file size capped by the number of pointers
- Arbitrary file size boundaries among levels
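The level boundaries and the file size cap can be worked out with a short sketch. It takes the 12 direct pointers and 4-Kbyte blocks from the slides and assumes 4-byte block pointers, so each index block holds 1024 pointers; the pointer size and the function names are assumptions for illustration.

```python
BLOCK_SIZE = 4096        # 4-Kbyte blocks, as in the slides
POINTER_SIZE = 4         # assumption: 4-byte block pointers
DIRECT = 12              # the first 12 pointers go straight to data blocks
PER_BLOCK = BLOCK_SIZE // POINTER_SIZE  # pointers held by one index block


def level_for_block(n):
    """Return (level, index blocks read before the data block) for file block n."""
    if n < DIRECT:
        return ("direct", 0)
    n -= DIRECT
    if n < PER_BLOCK:
        return ("single indirect", 1)
    n -= PER_BLOCK
    if n < PER_BLOCK ** 2:
        return ("double indirect", 2)
    n -= PER_BLOCK ** 2
    if n < PER_BLOCK ** 3:
        return ("triple indirect", 3)
    raise ValueError("block number exceeds the largest supported file")


def max_file_blocks():
    """Largest file, in blocks, that the pointer structure can address."""
    return DIRECT + PER_BLOCK + PER_BLOCK ** 2 + PER_BLOCK ** 3
```

Under these assumptions a file of up to 12 blocks (48 Kbytes) needs no index block reads, while a block reached through the triple indirect pointer costs three extra reads.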
Hashed Allocation
* Allocates a disk block by hashing the block content to a disk location
[Figure: an old file header and a new file header, each holding data block locations]
Pros and Cons of Hashed Allocation
+ File blocks of the same content can share the same disk block to save storage
  * e.g., empty blocks
+ Good for backups and archival
  * Small modifications to a large file result in only additional storage of the changes
- Poor disk performance
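Content-based block sharing can be sketched with a dictionary keyed by a content hash. The `HashedStore` class is hypothetical, the dictionary stands in for the mapping from hash to disk location, and SHA-256 is an assumed choice of hash function; the slides do not name one.

```python
import hashlib


class HashedStore:
    """Toy block store where the content hash decides where a block lives."""

    def __init__(self):
        # digest -> block content; stands in for "hash -> disk location"
        self.blocks = {}

    def put(self, content):
        key = hashlib.sha256(content).hexdigest()  # SHA-256 is an assumption
        self.blocks.setdefault(key, bytes(content))  # identical content is stored once
        return key


def make_header(store, data, block_size=4096):
    """Store a file block by block and return its header (a list of block keys)."""
    return [store.put(data[i:i + block_size])
            for i in range(0, len(data), block_size)]
```

Storing a file of three empty blocks uses one stored block, and saving a new version with one block modified adds only that one block; the old and new headers share the rest.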