File Systems and Disk Management
Andy Wang
Operating Systems COP 4610 / CGS 5765

Design Goals of File Systems

Physical reality                              File system abstraction
Block-oriented                                Byte-oriented
Physical sectors                              Named files
No protection                                 Users protected from one another
Data might be corrupted if machine crashes    Robust to machine failures

File System Components
• Storage management organizes storage blocks into files
• Naming provides file names and directories to users, instead of tracks and sector numbers (e.g., Takashita)
• Protection keeps information secure from other users
• Reliability protects against information loss due to system crashes

User vs. System View of a File
• User level: individual files
• System call level: collection of bytes
• Operating system level:
  • A block is a logical transfer unit
    • Even for getc() and putc()
    • 4 Kbytes under UNIX
  • A sector is a physical transfer unit
    • 4-Kbyte sectors on disks
  • File: a named collection of blocks

User vs. System View of a File
• A process
  • Reads bytes 2 to 12
• OS
  • Fetches the block containing those bytes
  • Returns those bytes to the process

User vs. System View of a File
• A process
  • Writes bytes 2 to 12
• OS
  • Fetches the block containing those bytes
  • Modifies those bytes
  • Writes out the block
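
The read and write paths above can be sketched as follows. This is a minimal illustration, not how any particular kernel does it: BlockDevice, read_bytes, and write_bytes are hypothetical names, the disk is an in-memory list, and the requested bytes are assumed to fit inside a single block.

```python
BLOCK_SIZE = 4096  # 4 Kbytes, the block size quoted for UNIX above


class BlockDevice:
    """Hypothetical in-memory disk that only transfers whole blocks."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read_block(self, n):
        return self.blocks[n]

    def write_block(self, n, data):
        assert len(data) == BLOCK_SIZE
        self.blocks[n] = bytes(data)


def read_bytes(dev, start, end):
    """Return bytes start..end (inclusive); assumes they lie in one block."""
    block_no, offset = divmod(start, BLOCK_SIZE)
    block = dev.read_block(block_no)  # fetch the whole block
    return block[offset:offset + (end - start + 1)]


def write_bytes(dev, start, data):
    """Read-modify-write: fetch the block, patch it, write it back."""
    block_no, offset = divmod(start, BLOCK_SIZE)
    assert offset + len(data) <= BLOCK_SIZE  # single-block assumption
    block = bytearray(dev.read_block(block_no))
    block[offset:offset + len(data)] = data
    dev.write_block(block_no, block)
```

Even an 11-byte write moves a full 4-Kbyte block in each direction, which is why the block, not the byte, is the unit the OS reasons about.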

Ways to Access a File
• People use file systems
• Design of file systems involves understanding how people use file systems
  • Sequential access: bytes are accessed in order
  • Random access (direct access): bytes are accessed in any order
  • Content-based access: bytes are accessed according to constraints on byte contents
    • e.g., return 100 bytes starting with “aye caramba”
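
The three access styles can be contrasted with a small sketch. It uses Python's io.BytesIO as a stand-in for an open file; the helper names are hypothetical, and the content-based version is a naive scan, since ordinary file APIs offer no such call directly.

```python
import io


def sequential_read(f, chunk=4):
    """Read the file front to back in fixed-size chunks."""
    f.seek(0)
    chunks = []
    while True:
        data = f.read(chunk)
        if not data:
            break
        chunks.append(data)
    return chunks


def random_read(f, offset, length):
    """Jump straight to an offset and read from there."""
    f.seek(offset)
    return f.read(length)


def content_read(f, pattern, length):
    """Return `length` bytes starting at the first match of `pattern`."""
    f.seek(0)
    data = f.read()
    pos = data.find(pattern)
    return None if pos < 0 else data[pos:pos + length]
```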

File Usage Patterns
• Most files are small, and most references are to small files
  • e.g., .login and .c files
• Large files use up most of the disk space
  • e.g., mp4 files
• Large files account for most of the bytes transferred between memory and disk
• Bad news for file system designers

File System Design Constraints
• High performance
  • Efficient access of small files
    • Many small files
    • Used frequently
  • Efficient access of large files
    • Consume most disk space
    • Account for most of the data movement

Some Definitions
• A file contains a file header, which associates the file with its disk sectors
[Figure: file header; name; data block location]

Some Definitions
• A file system needs an allocation bitmap to track free space on the disk, one bit per block
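
A one-bit-per-block bitmap might look like the sketch below. The class and method names are hypothetical, and the convention that a set bit means "allocated" is an assumption; real file systems differ in layout and polarity.

```python
class AllocationBitmap:
    """One bit per disk block; a set bit means the block is allocated."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)  # round up to whole bytes

    def is_free(self, block):
        return ((self.bits[block // 8] >> (block % 8)) & 1) == 0

    def allocate(self, block):
        self.bits[block // 8] |= 1 << (block % 8)

    def free(self, block):
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def first_free(self):
        """Linear scan for the lowest-numbered free block, or None."""
        for block in range(self.num_blocks):
            if self.is_free(block):
                return block
        return None
```

The space cost is small: with 4-Kbyte blocks, one bitmap block of 32,768 bits tracks 128 Mbytes of disk.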

Disk Allocation Policies
• Contiguous allocation
• Linked-list allocation
• Segment-based allocation
• Indexed allocation
• Multi-level indexed allocation
• Hashed allocation

Contiguous Allocation
• File blocks are stored contiguously on disk
• To allocate a file,
  • Specify the file size
  • Search the disk allocation bitmap for consecutive free blocks
[Figure: file header; data block location; number of blocks]

Pros and Cons of Contiguous Allocation
+ Fast sequential access
+ Ease of computing random file locations
  • Adding an offset to the first disk block location
- External fragmentation
- Difficulty in growing files
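
Both the allocation search and the offset computation are short enough to sketch. Here the bitmap is simplified to a list of booleans (True means free), the function names are hypothetical, and a first-fit search is assumed; the slides do not say which fit policy is used.

```python
def find_free_run(free, count):
    """First-fit: index of the first run of `count` free blocks, or None.

    Assumes count >= 1.
    """
    run_start, run_len = 0, 0
    for i, is_free in enumerate(free):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == count:
                return run_start
        else:
            run_len = 0
    return None


def block_for_offset(first_block, num_blocks, byte_offset, block_size=4096):
    """Random access: add the block index to the first disk block location."""
    index = byte_offset // block_size
    if index >= num_blocks:
        raise ValueError("offset past end of file")
    return first_block + index
```

External fragmentation shows up directly: a disk can have plenty of free blocks in total and still have no run long enough, in which case find_free_run returns None.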

Linked-List Allocation
• Each file block on a disk is associated with a pointer to the next block
  • A special marker to indicate the end of the file
• e.g., MS-DOS file system
  • File allocation table (FAT)
[Figure: file header; data block location; next block entry]

Pros and Cons of Linked-List Allocation
+ Files can grow dynamically with incremental allocation of blocks
- Sequential access may suffer
  • Blocks may not be contiguous
- Horrible random accesses
  • May involve multiple sequential searches
- Unreliable
  • A corrupted pointer can lead to loss of the remaining file
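
A FAT-style chain can be modeled as a table mapping each block to its successor. The sketch below is a simplification with hypothetical names: the table is a Python dict and END_OF_FILE is an illustrative marker, not the value any real FAT variant uses.

```python
END_OF_FILE = -1  # hypothetical end-of-chain marker


def file_blocks(fat, first_block):
    """Sequential access: follow next-block entries until the end marker."""
    blocks = []
    block = first_block
    while block != END_OF_FILE:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, first_block, n):
    """Random access: reaching block n costs n pointer hops from the start."""
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == END_OF_FILE:
            raise IndexError("file has fewer blocks than requested")
    return block
```

The loop in nth_block is the "horrible random access" cost, and any corrupted entry in the table cuts off every block after it.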

Indexed Allocation
• Uses a preallocated index to directly track the file block locations
[Figure: file header; data block location]

Pros and Cons of Indexed Allocation
+ Fast lookups and random accesses
- File blocks may be scattered all over the disk
  • Poor sequential access (for disks)
  • Needs defragmenter
- Needs to reallocate index as the file size increases
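
A rough sketch of an indexed file header follows. IndexedFile is a hypothetical name, and doubling the index when it fills is an assumed growth policy chosen for illustration.

```python
class IndexedFile:
    """File header holding a preallocated index of disk block locations."""

    def __init__(self, index_capacity):
        self.index = [None] * index_capacity  # preallocated slots
        self.length = 0                       # number of blocks in use

    def append_block(self, disk_block):
        if self.length == len(self.index):
            # Index is full: reallocate a larger one (doubling is an assumption).
            self.index = self.index + [None] * max(1, len(self.index))
        self.index[self.length] = disk_block
        self.length += 1

    def lookup(self, n):
        """Random access is a single array lookup."""
        if not 0 <= n < self.length:
            raise IndexError("block number out of range")
        return self.index[n]
```

Lookups are constant time, but nothing keeps consecutive entries near each other on disk, which is the source of the poor sequential access noted above.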

Segment-Based Allocation
• Needs a segment table to allocate multiple, contiguous regions of blocks
[Figure: file header; begin, end blocks]

Pros and Cons of Segment-Based Allocation
+ Relax the requirements for large contiguous disk regions
- Fragmentation
[Figure: 100%; segment-based allocation; indexed allocation]
- Random accesses not as fast as pure contiguous allocation
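
Looking up a block through a segment table can be sketched as below. The table is assumed to be a list of (begin, end) disk block pairs with inclusive ends, stored in file order; the function name is hypothetical.

```python
def segment_lookup(segments, n):
    """Map file block number n to a disk block by walking the segment table."""
    for begin, end in segments:
        length = end - begin + 1
        if n < length:
            return begin + n  # offset within this contiguous region
        n -= length           # skip past this segment
    raise IndexError("block number past end of file")
```

The walk is why random access is slower than with pure contiguous allocation, and if every segment shrinks to a single block the table holds one entry per block, much like an index.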

Multilevel Indexed Allocation
• Certain index entries point to index blocks, as opposed to data blocks (e.g., Linux ext3)
[Figure: file header; data block location; 12; index block location]

Multilevel Indexed Allocation
• A single indirect block contains pointers to data blocks
• A double indirect block contains pointers to single indirect blocks
• A triple indirect block contains pointers to double indirect blocks

Pros and Cons of Multilevel Indexed Allocation
+ Optimized for small and large files
  • Small files accessed via the first 12 pointers
  • Large files can grow incrementally
- Multiple disk accesses to fetch a data block under triple indirect block
- File size capped by the number of pointers
- Arbitrary file size boundaries among levels
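
The level boundaries and the size cap can be made concrete with a sketch. The 12 direct pointers come from the slides; the 4-Kbyte block and 4-byte pointer sizes (hence 1024 pointers per index block) are assumptions, and locate is a hypothetical helper.

```python
DIRECT = 12                 # direct pointers in the file header
PTRS_PER_BLOCK = 4096 // 4  # assumed: 4-Kbyte blocks, 4-byte pointers

# Largest file, in blocks, that the pointer structure can address.
MAX_BLOCKS = DIRECT + PTRS_PER_BLOCK + PTRS_PER_BLOCK**2 + PTRS_PER_BLOCK**3


def locate(n):
    """Return (level, indices to follow) for file block number n."""
    if n < DIRECT:
        return ("direct", [n])
    n -= DIRECT
    if n < PTRS_PER_BLOCK:
        return ("single", [n])
    n -= PTRS_PER_BLOCK
    if n < PTRS_PER_BLOCK**2:
        return ("double", list(divmod(n, PTRS_PER_BLOCK)))
    n -= PTRS_PER_BLOCK**2
    if n < PTRS_PER_BLOCK**3:
        i, rest = divmod(n, PTRS_PER_BLOCK**2)
        j, k = divmod(rest, PTRS_PER_BLOCK)
        return ("triple", [i, j, k])
    raise ValueError("beyond maximum file size")
```

Each index in the returned list is one extra index block to read before the data block itself, so a block under the triple indirect pointer costs three additional disk accesses when nothing is cached.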

Extent-tree-based Allocation
• Uses a tree of segment tables to allocate multiple, contiguous regions of blocks
[Figure: file header; extent index; data block location; begin, end blocks]

Pros and Cons of Extent-tree-based Allocation
+ Reduces per-block pointers
  • E.g., indirect blocks
+ Maximum file size no longer limited by the number of pointers
- External fragmentation
- Complexity
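
A lookup through a tree of segment tables might be sketched as follows. The Leaf and Index classes and the (file_block, length, disk_block) extent layout are illustrative assumptions, not the on-disk format of any particular file system.

```python
from bisect import bisect_right


class Leaf:
    """Segment table: sorted extents of (file_block, length, disk_block)."""

    def __init__(self, extents):
        self.extents = extents
        self.first = extents[0][0]  # lowest file block covered


class Index:
    """Interior node pointing at child nodes sorted by first file block."""

    def __init__(self, children):
        self.children = children
        self.first = children[0].first


def extent_lookup(node, n):
    """Descend the tree, then map file block n to a disk block."""
    while isinstance(node, Index):
        keys = [child.first for child in node.children]
        i = bisect_right(keys, n) - 1
        if i < 0:
            raise IndexError("block number not mapped")
        node = node.children[i]
    keys = [extent[0] for extent in node.extents]
    i = bisect_right(keys, n) - 1
    if i >= 0:
        file_block, length, disk_block = node.extents[i]
        if n < file_block + length:
            return disk_block + (n - file_block)
    raise IndexError("block number not mapped")
```

One extent entry covers an arbitrarily long contiguous run, which is how the scheme avoids a pointer per block.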

Hashed Allocation
• Allocates a disk block by hashing the block content to a disk location
[Figure: old file header; new file header; data block location]

Pros and Cons of Hashed Allocation
+ File blocks of the same content can share the same disk block to save storage
  • e.g., empty blocks
+ Good for backups and archival
  • Small modifications to a large file result in only additional storage of the changes
- Poor disk performance
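
The sharing behavior can be sketched with a content-addressed store. ContentStore and store_file are hypothetical names, a dict keyed by digest stands in for disk locations, and SHA-256 is an assumed choice of hash; the slides do not name one.

```python
import hashlib


class ContentStore:
    """Blocks are stored under the hash of their contents."""

    def __init__(self):
        self.blocks = {}  # digest -> block contents

    def put(self, data):
        key = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(key, data)  # identical content is stored once
        return key

    def get(self, key):
        return self.blocks[key]


def store_file(store, data, block_size=4096):
    """Return a file header: the list of block digests, in file order."""
    return [store.put(data[i:i + block_size])
            for i in range(0, len(data), block_size)]
```

Storing a new version of a file produces a new header that reuses every unchanged block, so only the modified blocks take additional space. The dict hides the cost noted above: on a disk, hashing scatters logically adjacent blocks across unrelated locations.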