File Systems and Disk Management
Andy Wang
Operating Systems COP 4610 / CGS 5765

Design Goals of File Systems
  Physical reality                              File system abstraction
  Block-oriented                                Byte-oriented
  Physical sectors                              Named files
  No protection                                 Users protected from one another
  Data might be corrupted if machine crashes    Robust to machine failures

File System Components
  • Disk management organizes disk blocks into files
  • Naming provides file names and directories to users, instead of track and sector numbers (e.g., Takashita)
  • Protection keeps information secure from other users
  • Reliability protects against information loss due to system crashes

User vs. System View of a File
  • User level: individual files
  • System call level: collection of bytes
  • Operating system level:
      • A block is a logical transfer unit
          • Even for getc() and putc()
          • 4 Kbytes under UNIX
      • A sector is a physical transfer unit
          • 512-byte sectors on disks
  • File: a named collection of blocks

User vs. System View of a File
  • A process
      • Read bytes 2 to 12
  • OS
      • Fetch the block containing those bytes
      • Return those bytes to the process
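
The read path above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file is modeled as an in-memory list of 4-Kbyte blocks, and the names `BLOCK_SIZE`, `read_bytes`, and `file_blocks` are hypothetical, not part of any real file system API.

```python
BLOCK_SIZE = 4096  # 4 Kbytes, the UNIX block size quoted earlier


def read_bytes(blocks, first, last):
    """Return bytes first..last (inclusive) of a file stored as a list of blocks."""
    out = bytearray()
    # Visit every block that contains part of the requested byte range.
    for block_no in range(first // BLOCK_SIZE, last // BLOCK_SIZE + 1):
        block = blocks[block_no]  # the OS always fetches a whole block
        base = block_no * BLOCK_SIZE
        start = max(first, base) - base
        end = min(last, base + BLOCK_SIZE - 1) - base
        out += block[start:end + 1]  # keep only the requested bytes
    return bytes(out)


# A one-block file whose byte i holds the value i % 256 (made-up test data).
file_blocks = [bytes(range(256)) * 16]
requested = read_bytes(file_blocks, 2, 12)
```

Even though the process asks for only 11 bytes, the sketch touches a full 4096-byte block, which is the point of the slide.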

User vs. System View of a File
  • A process
      • Write bytes 2 to 12
  • OS
      • Fetch the block containing those bytes
      • Modify those bytes
      • Write out the block
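
The three OS steps for a write amount to a read-modify-write cycle. A minimal sketch, assuming the disk is modeled as a Python list of 4-Kbyte blocks and the write fits inside one block (`write_bytes` and `disk` are hypothetical names used only for illustration):

```python
BLOCK_SIZE = 4096  # 4 Kbytes, the UNIX block size quoted earlier


def write_bytes(disk, block_no, offset, data):
    """Overwrite len(data) bytes at `offset` within one block."""
    if offset + len(data) > BLOCK_SIZE:
        raise ValueError("this sketch handles writes within a single block only")
    block = bytearray(disk[block_no])         # 1. fetch the block
    block[offset:offset + len(data)] = data   # 2. modify those bytes
    disk[block_no] = bytes(block)             # 3. write out the whole block


disk = [bytes(BLOCK_SIZE)]                    # one zero-filled block
write_bytes(disk, 0, 2, b"hello world")       # 11 bytes land at offsets 2..12
```

Changing 11 bytes costs one full block read and one full block write, because the block is the unit of transfer.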

Ways to Access a File
  • People use file systems
  • Design of file systems involves understanding how people use file systems
      • Sequential access: bytes are accessed in order
      • Random access (direct access): bytes are accessed in any order
      • Content-based access: bytes are accessed according to constraints on byte contents
          • e.g., return 100 bytes starting with "aye carumba"

File Usage Patterns
  • Most files are small, and most references are to small files
      • e.g., .login and .c files
  • Large files use up most of the disk space
      • e.g., mp3 files
  • Large files account for most of the bytes transferred between memory and disk
  • Bad news for file system designers

File System Design Constraints
  • High performance
  • Efficient access of small files
      • Many small files
      • Used frequently
  • Efficient access of large files
      • Consume most disk space
      • Account for most of the data movement

Some Definitions
  • A file contains a file header, which associates the file with its disk sectors
  [Figure: file header; labels: name, data block location]

Some Definitions
  • A file system needs a disk allocation bitmap to represent free space on the disk, one bit per block
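
A disk allocation bitmap can be sketched as a byte array with one bit per block. The class below is an assumption-laden illustration, not the layout of any particular file system: the name `Bitmap`, its method names, and the convention that 0 means free are all chosen here for the example.

```python
class Bitmap:
    """One bit per disk block: 0 = free, 1 = in use (convention assumed here)."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)  # all zero: every block free

    def is_used(self, n):
        return bool(self.bits[n // 8] & (1 << (n % 8)))

    def set_used(self, n):
        self.bits[n // 8] |= 1 << (n % 8)

    def set_free(self, n):
        self.bits[n // 8] &= ~(1 << (n % 8)) & 0xFF


bitmap = Bitmap(20)   # a toy disk with 20 blocks needs only 3 bytes of bitmap
bitmap.set_used(3)
bitmap.set_used(9)
```

The structure is compact: as a worked example, a 1-Gbyte disk with 4-Kbyte blocks has 262,144 blocks, so its bitmap occupies 32 Kbytes.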

Disk Allocation Policies
  • Contiguous allocation
  • Linked-list allocation
  • Segment-based allocation
  • Indexed allocation
  • Multilevel indexed allocation
  • Hashed allocation

Contiguous Allocation
  • File blocks are stored contiguously on disk
  • To allocate a file,
      • Specify the file size
      • Search the disk allocation bitmap for consecutive free blocks
  [Figure: file header; labels: data block location, number of blocks]

Pros and Cons of Contiguous Allocation
  + Fast sequential access
  + Ease of computing random file locations
      • Adding an offset to the first disk block location
  - External fragmentation
  - Difficulty in growing files
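
Both the allocation search and the offset computation for contiguous allocation are short enough to sketch. The code assumes the free map is a plain list of booleans and uses a first-fit search; the slides do not say which search policy is used, and the names `find_contiguous` and `block_of` are hypothetical.

```python
def find_contiguous(free, count):
    """Return the first index of `count` consecutive free blocks, or None (first fit)."""
    run_start, run_len = 0, 0
    for i, is_free in enumerate(free):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == count:
                return run_start
        else:
            run_len = 0
    return None


def block_of(first_block, byte_offset, block_size=4096):
    """Random access: add an offset to the first disk block location."""
    return first_block + byte_offset // block_size


# True = free. Five blocks are free in total, but the longest run is three.
free = [False, True, True, False, True, True, True, False]
start = find_contiguous(free, 3)
```

With this map, a request for four blocks fails even though five blocks are free, which is external fragmentation in miniature.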

Linked-List Allocation
  • Each file block on a disk is associated with a pointer to the next block
      • A special marker to indicate the end of the file
      • e.g., MS-DOS file system
          • File allocation table (FAT)
  [Figure: file header; labels: data block location, next block entry]

Pros and Cons of Linked-List Allocation
  + Files can grow dynamically with incremental allocation of blocks
  - Sequential access may suffer
      • Blocks may not be contiguous
  - Horrible random accesses
      • May involve multiple sequential searches
  - Unreliable
      • A corrupted pointer can lead to loss of the remaining file
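
A FAT-style chain can be sketched as a table that maps each block to the next one. This is a simplified model, not the on-disk MS-DOS format: the table is a Python dict, and the `EOF` marker value and the function names are assumptions made for the example.

```python
EOF = -1  # hypothetical end-of-file marker; real FAT variants use reserved values


def chain(fat, first_block):
    """Follow next-block pointers from the first block to the end of the file."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, first_block, n):
    """Random access to logical block n requires n pointer hops."""
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == EOF:
            raise IndexError("logical block is past the end of the file")
    return block


# A three-block file scattered over disk blocks 7 -> 2 -> 9.
fat = {7: 2, 2: 9, 9: EOF}
```

Reaching the last block means walking every earlier entry first, and a single corrupted entry cuts off everything after it.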

Indexed Allocation
  • Uses a preallocated index to directly track the file block locations
  [Figure: file header; label: data block location]

Pros and Cons of Indexed Allocation
  + Fast lookups and random accesses
  - File blocks may be scattered all over the disk
      • Poor sequential access
      • Needs defragmenter
  - Needs to reallocate index as the file size increases
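
With an index, a random access is a single table lookup. The sketch below assumes the index is a Python list of disk block numbers stored in the file header; `locate` and the sample block numbers are made up for illustration.

```python
def locate(index, byte_offset, block_size=4096):
    """Map a byte offset to (disk block, offset within that block) in one lookup."""
    logical_block = byte_offset // block_size
    return index[logical_block], byte_offset % block_size


# A four-block file whose blocks are scattered over the disk.
index = [31, 4, 17, 8]
where = locate(index, 9000)
```

The lookup cost does not depend on the position in the file, but the sample index also shows the downside: consecutive logical blocks sit at unrelated disk locations.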

Segment-Based Allocation
  • Needs a segment table to allocate multiple, contiguous regions of blocks
  [Figure: file header; label: begin, end blocks]

Pros and Cons of Segment-Based Allocation
  + Relax the requirements for large contiguous disk regions
  - Fragmentation
      • At 100% fragmentation, segment-based allocation behaves like indexed allocation
  - Random accesses not as fast as pure contiguous allocation
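
A segment table lookup can be sketched as a walk over (begin, end) pairs. The code assumes both ends are inclusive disk block numbers, which the slide's "begin, end blocks" label suggests but does not state; `locate` and the sample table are hypothetical.

```python
def locate(segments, logical_block):
    """Translate a logical block number to a disk block via the segment table."""
    for begin, end in segments:
        length = end - begin + 1          # both ends inclusive (assumed)
        if logical_block < length:
            return begin + logical_block  # contiguous within a segment
        logical_block -= length           # skip this segment
    raise IndexError("logical block is past the end of the file")


# A six-block file stored as three contiguous regions.
segments = [(10, 12), (40, 41), (5, 5)]
```

The loop explains why random access is slower than in pure contiguous allocation. If every segment shrinks to a single block, the table holds one entry per block, just like an index.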

Multilevel Indexed Allocation
  • Certain index entries point to index blocks, as opposed to data blocks (e.g., Linux ext2)
  [Figure: file header; labels: data block location (12 entries), index block location]

Multilevel Indexed Allocation
  • A single indirect block contains pointers to data blocks
  • A double indirect block contains pointers to single indirect blocks
  • A triple indirect block contains pointers to double indirect blocks

Pros and Cons of Multilevel Indexed Allocation
  + Optimized for small and large files
      • Small files accessed through the first 12 pointers
      • Large files can grow incrementally
  - Multiple disk accesses to fetch a data block under triple indirect block
  - Largest file size capped by the number of pointers
  - Arbitrary file size boundaries among levels
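
The level boundaries and the file size cap follow from simple arithmetic. The sketch below assumes 12 direct pointers (from the slide), 4-Kbyte blocks, and 4-byte block pointers, so that one index block holds 1024 pointers; the pointer size and all names are assumptions. The access count assumes the file header is already in memory.

```python
BLOCK_SIZE = 4096
DIRECT = 12                  # direct pointers in the file header
PTRS = BLOCK_SIZE // 4       # pointers per index block, assuming 4-byte pointers


def level_of(logical_block):
    """Return (level name, disk accesses needed to reach the data block)."""
    n = logical_block
    if n < DIRECT:
        return ("direct", 1)
    n -= DIRECT
    if n < PTRS:
        return ("single indirect", 2)
    n -= PTRS
    if n < PTRS ** 2:
        return ("double indirect", 3)
    n -= PTRS ** 2
    if n < PTRS ** 3:
        return ("triple indirect", 4)
    raise ValueError("beyond the largest file size")


def max_file_blocks():
    """Largest file, in blocks, reachable through all four levels."""
    return DIRECT + PTRS + PTRS ** 2 + PTRS ** 3
```

Under these assumptions a file of up to 12 blocks (48 Kbytes) needs no index block at all, while a block under the triple indirect pointer costs three index reads before the data read.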

Hashed Allocation
  • Allocates a disk block by hashing the block content to a disk location
  [Figure: old file header and new file header; label: data block location]

Pros and Cons of Hashed Allocation
  + File blocks of the same content can share the same disk block to save storage
      • e.g., empty blocks
  + Good for backups and archival
      • Small modifications to a large file result in only additional storage of the changes
  - Poor disk performance
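
The sharing behavior can be illustrated with a small content-addressed store. This is a sketch under assumptions, not the scheme of any specific system: a Python dict keyed by a SHA-256 digest stands in for the mapping from block content to disk location, and `HashedStore` and `store_file` are hypothetical names.

```python
import hashlib


class HashedStore:
    """Blocks are stored under the hash of their content, so equal blocks share a slot."""

    def __init__(self):
        self.blocks = {}  # digest -> block content (stand-in for a disk location)

    def put(self, data):
        key = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(key, data)  # identical content is stored only once
        return key

    def get(self, key):
        return self.blocks[key]


def store_file(store, blocks):
    """Store a file and return its header: the list of block keys."""
    return [store.put(block) for block in blocks]


store = HashedStore()
# Old version: one data block followed by two empty blocks.
old_header = store_file(store, [b"A" * 4096, bytes(4096), bytes(4096)])
# New version: only the middle block changed.
new_header = store_file(store, [b"A" * 4096, b"B" * 4096, bytes(4096)])
```

Six logical blocks across the two versions occupy three stored blocks here: the two empty blocks collapse into one, and the new version adds only its changed block.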