Operating Systems ECE 344
File System Design
Ashvin Goel, ECE, University of Toronto

- Slides: 31

File System Design
- OS needs to store and retrieve files and directories
  - Needs to maintain information about where they are stored
- Needs to store files durably, i.e., ensure that files exist after machine reboot
- Needs to survive machine crashes
  - On a crash, the OS stops suddenly, perhaps in the middle of a file system operation
  - On restart, the file system should be able to recover data and bring the file system back to a good or consistent state

Recall: Key resources
- Storage media: blocks
- Files
- Directories
- Buffer cache
- Open file structures

File Block Allocation
- Block allocation
  - Maps potentially non-contiguous disk blocks to the file
- Options
  - Contiguous allocation
  - Linked list allocation
  - File allocation table (FAT)
  - I-node based allocation
[Figure: a file mapped to disk blocks]

Contiguous Allocation
- All blocks in a file are contiguous on the disk
- File is identified by the index of its first block plus its size
[Figure: disk layout after deleting files D and F]

When to Use Contiguous Allocation
- Advantages
  - Performance is good for sequential reading
- Disadvantages
  - File growth requires copying
  - Disk becomes fragmented after deletion
  - Will need periodic compaction
- Good for CD-ROMs
  - All file sizes are known in advance
  - Files are never deleted
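The fragmentation problem can be sketched as follows. This is a minimal illustration, assuming a hypothetical free map (one boolean per disk block) and a first-fit search; neither is prescribed by the slides.

```python
def first_fit(free_map, size):
    """Return the start of the first run of `size` free blocks, or None.

    free_map is a hypothetical list with one boolean per disk block
    (True means the block is free).
    """
    run = 0
    for i, free in enumerate(free_map):
        run = run + 1 if free else 0   # length of the current free run
        if run == size:
            return i - size + 1
    return None


# After some deletions the disk has four free blocks, but in two separate holes.
free_map = [False, True, True, False, True, True, False]
```

Here a 2-block file still fits, but a 3-block file does not, even though four blocks are free in total. That is why contiguous allocation needs periodic compaction.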

Linked List Allocation
- Each file is a linked list of blocks
- First word in a block contains the number of the next block
- Disadvantage
  - Random accesses to file data are slow
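A minimal sketch of why random access is slow, assuming a hypothetical toy disk held in a dictionary: reaching block n of a file means reading every block before it, because each next pointer lives inside the previous block.

```python
# Hypothetical toy disk: block number -> (next block number, payload).
# A next pointer of None marks the end of the file (an assumption of this sketch).
disk = {
    4: (7, b"A"),
    7: (2, b"B"),
    2: (10, b"C"),
    10: (None, b"D"),
}


def read_nth_block(first_block, n):
    """Return (payload, disk_reads) for the n-th block (0-based) of a file."""
    block, reads = first_block, 0
    for _ in range(n):
        next_block, _payload = disk[block]  # must read the block to learn its successor
        reads += 1
        if next_block is None:
            raise IndexError("offset past end of file")
        block = next_block
    _next, payload = disk[block]
    reads += 1
    return payload, reads
```

Reading the last block of this four-block file costs four disk reads, so the cost of a random access grows linearly with the offset.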

File Allocation Table (FAT)
- Keep linked list information in memory
- Uses an index table with one entry per disk block
  - Each entry contains the address of the next block
- Advantages
  - Random access still needs a linked-list search, but the search is now in memory
- Disadvantages
  - Entire table is stored in memory
  - Doesn't scale to large file systems (e.g., a 1 TB disk needs a 1 GB FAT, assuming 4 KB blocks and 4-byte entries)
[Figure: FAT entries chained per file, ending in an end-of-file marker]
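A small sketch of the FAT idea, assuming a hypothetical end-of-file marker of -1 and a FAT held in a dictionary. The chain is followed entirely in memory, and the size helper reproduces the 1 TB to 1 GB estimate under the assumed 4 KB blocks and 4-byte entries.

```python
END_OF_FILE = -1  # hypothetical end-of-chain marker


def fat_chain(fat, first_block):
    """Follow the in-memory FAT and return the file's block numbers in order."""
    blocks = []
    block = first_block
    while block != END_OF_FILE:
        blocks.append(block)
        block = fat[block]   # table lookup in memory, no disk read
    return blocks


def fat_size_bytes(disk_bytes, block_bytes=4096, entry_bytes=4):
    """Size of a FAT with one entry per disk block (sizes are assumptions)."""
    return (disk_bytes // block_bytes) * entry_bytes


fat = {4: 7, 7: 2, 2: 10, 10: END_OF_FILE}
```

With these inputs, `fat_chain(fat, 4)` walks blocks 4, 7, 2, 10, and `fat_size_bytes(2**40)` evaluates to 2**30 bytes.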

Inode Based Allocation
- Linked list allocation spreads index information on disk, slowing random access
- FAT keeps linked-list index information in memory, but that limits the size of the file system
- Idea
  - Store index information for locating the blocks of a file close together on disk
  - Cache this information in memory when the file is opened
  - This approach avoids the problems above
- Problem with the idea
  - The index information may grow with file growth
  - It cannot be stored contiguously

Inode Based Allocation
- Use a tree to store index information
  - Tree structure allows growth of index information, without spreading this information too much
- Root of tree is called inode (index node)
  - Inode is stored on disk
  - There is one inode per file or directory

Inode Structure
- Twelve direct block pointers
  - Point directly to file data blocks (called direct or data blocks)
- One indirect pointer
  - Points to an indirect block that contains pointers to direct blocks
- One double indirect pointer
  - Points to a double indirect block that contains pointers to indirect blocks
- One triple indirect pointer
  - Points to a triple indirect block that contains pointers to double indirect blocks
- Why this allocation strategy?
[Figure: inode with direct, indirect, and double indirect pointers; the double indirect pointer reaches 1 K * 1 K = 1 M blocks. The triple indirect block is not shown.]
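The pointer tree can be sketched as a function that maps a logical block index within a file to the pointer level that covers it. This is an illustration only: it assumes 1024 pointers per block (4 KB blocks, 4-byte pointers), and the function name and return shape are hypothetical.

```python
NUM_DIRECT = 12
PTRS_PER_BLOCK = 1024  # assumed: 4 KB block / 4-byte pointer


def pointer_level(logical_block):
    """Return (level, offsets) for a logical block index within a file.

    level is 0 for direct, 1 for indirect, 2 for double, 3 for triple.
    offsets lists the slot to follow at each step, starting from the
    inode's direct slot or from the top-level indirect block.
    """
    n = logical_block
    if n < NUM_DIRECT:
        return 0, [n]
    n -= NUM_DIRECT
    span = PTRS_PER_BLOCK              # blocks reachable at the current level
    for level in (1, 2, 3):
        if n < span:
            offsets = []
            for _ in range(level):     # peel off one slot index per tree level
                offsets.append(n % PTRS_PER_BLOCK)
                n //= PTRS_PER_BLOCK
            return level, offsets[::-1]
        n -= span
        span *= PTRS_PER_BLOCK
    raise ValueError("block index exceeds maximum file size")
```

Small files are served by direct pointers alone, with no extra disk reads; only large files pay for one, two, or three additional index block reads.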

Inodes
- Each inode has the same size
- Inodes are organized in an array (on disk)
- The index into the array, called the i-number, is used to identify the inode and correspondingly the file
  - It is effectively the "low-level" name of the file

Maximum File Size
- Say block size is 4 KB and block pointer size is 4 bytes
- So each block can hold 1024 block pointers
- Max number of blocks in a file is the sum of:
  - 12 direct blocks
  - 1024 blocks via the indirect block pointer
  - 1024 * 1024 blocks via the double indirect block pointer
  - 1024 * 1024 * 1024 blocks via the triple indirect block pointer
- Max file size
  - $(12 + 1024 + 1024^2 + 1024^3) \times 4\,\text{KB} \approx 2^{10 \cdot 3} \times 4 \times 2^{10}\ \text{bytes} = 2^{42}\ \text{bytes} = 4\,\text{TB}$
- Problem: 8 TB disks available today!
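The arithmetic can be checked with a short sketch. The function names are hypothetical; the defaults are the 4 KB block size and 4-byte pointer size stated on the slide.

```python
def max_file_blocks(block_bytes=4096, ptr_bytes=4, num_direct=12):
    """Blocks addressable through direct, single, double and triple indirect pointers."""
    p = block_bytes // ptr_bytes          # pointers per index block
    return num_direct + p + p**2 + p**3


def max_file_bytes(block_bytes=4096, ptr_bytes=4, num_direct=12):
    """Maximum file size in bytes for the given block and pointer sizes."""
    return max_file_blocks(block_bytes, ptr_bytes, num_direct) * block_bytes
```

The triple indirect term dominates the sum, which is why the result is only slightly above 2**42 bytes.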

Outline
- Storage media: blocks
- Files
- Directories
- Buffer cache
- Open file structures

Directory Management
- A directory contains zero or more entries
  - One entry per file or sub-directory that resides in the directory
  - Directory entries are kept in directory data blocks
  - Unix: directory stored in a (special) file with an inode
- Entry maps a file name to the location of the file's starting block; it has
  - File name, file attributes
  - Block number of the first block of the file
[Figure: directory entries for Kernel.C, Kernel.h, and os, each holding attributes and a block number that points to the file's data blocks]

Unix Directories
- In Unix, each entry has (file name, inode number)
  - Inode number helps locate the inode of the file
  - Inode contains file attributes

File Names
- Short, fixed-length names
  - MS-DOS/Windows
    - FILE3.BAK (8+3)
    - Name has 11 bytes
  - Original Unix
    - Name has 14 bytes

File Names
- Variable-length names
- E.g., Unix
  - Each path can be up to 4096 bytes (a single name is typically limited to 255 bytes, e.g., on Linux)
  - Size of directory entry is variable
- Options
  - Entries are allocated contiguously
    - Each entry has the length of the entry and then the name of the file
    - Fragmentation occurs when files are removed
  - Allocate a set of pointers to file names at the beginning of the directory
    - Use a heap at the end to store names
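The first option (contiguous entries, each carrying its own length) can be sketched as below. The byte layout is hypothetical, chosen only for illustration: a 4-byte inode number, a 2-byte entry length, a 1-byte name length, then the name.

```python
import struct

# Hypothetical on-disk header: inode number, entry length, name length.
HEADER = struct.Struct("<IHB")


def pack_entry(inode, name):
    """Encode one variable-length directory entry."""
    raw = name.encode("utf-8")
    length = HEADER.size + len(raw)
    return HEADER.pack(inode, length, len(raw)) + raw


def parse_entries(block):
    """Walk a directory block, using each entry's length to find the next one."""
    entries, offset = [], 0
    while offset < len(block):
        inode, length, name_len = HEADER.unpack_from(block, offset)
        start = offset + HEADER.size
        name = block[start:start + name_len].decode("utf-8")
        entries.append((name, inode))
        offset += length               # skip to the next entry
    return entries
```

Because the scan relies on the stored entry length rather than the name length, a removed entry could be absorbed by enlarging the length of its predecessor. That is one common way to handle the fragmentation noted above, though the slides do not say which scheme is used.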

Path Lookup in Unix FS
- Say file F located in directory /D1/D2 has to be read
- What blocks need to be read from disk?
  - Super block (provides location of inode blocks area)
    - Normally this block is read when the file system code performs initialization, and it is then cached in memory
  - Inode of the / directory (from the inode blocks area)
  - Data blocks of the / directory (provides directory entry for D1)
  - Inode of the D1 directory
  - Data blocks of the D1 directory (provides directory entry for D2)
  - Inode of the D2 directory
  - Data blocks of the D2 directory (provides directory entry for F)
  - Inode of file F
  - Data blocks of file F
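The sequence of reads can be traced with a toy model. Everything here is hypothetical: the i-numbers, block numbers, and the dictionaries standing in for the inode area and the data blocks. The super block read is left out, since it is assumed to be cached already.

```python
# Hypothetical toy file system: i-number -> inode, block number -> contents.
ROOT_INUMBER = 2  # assumption for this sketch
inodes = {
    2: {"blocks": [100]},    # /
    5: {"blocks": [101]},    # /D1
    9: {"blocks": [102]},    # /D1/D2
    14: {"blocks": [103]},   # /D1/D2/F
}
blocks = {
    100: {"D1": 5},          # directory data: name -> i-number
    101: {"D2": 9},
    102: {"F": 14},
    103: b"file contents",
}


def read_file(path):
    """Return (data, trace), where trace lists every inode and block read."""
    trace = []
    inumber = ROOT_INUMBER
    for name in [p for p in path.split("/") if p]:
        trace.append(f"inode {inumber}")
        for b in inodes[inumber]["blocks"]:
            trace.append(f"block {b}")
            entries = blocks[b]
            if name in entries:
                inumber = entries[name]
                break
        else:
            raise FileNotFoundError(path)
    trace.append(f"inode {inumber}")
    data = b""
    for b in inodes[inumber]["blocks"]:
        trace.append(f"block {b}")
        data += blocks[b]
    return data, trace
```

Reading /D1/D2/F touches four inodes and four data blocks in this model, alternating between the two, which matches the list above.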

Outline
- Storage media: blocks
- Files
- Directories
- Buffer cache
- Open file structures

Buffer Cache Management
- Notice each file access requires many block accesses
- File operations often access the same disk block
  - E.g., block containing contents of root (/) directory
- Caching disk blocks in memory can reduce disk I/O
- Traditionally, the block cache is called a buffer cache
- Cache operations
  - Block lookup
    - If block is in memory, return data from buffer
  - Block miss
    - Read disk block into buffer, update buffer cache
  - Block flush
    - If buffer is modified, write it back to disk block

Buffer Cache Organization
- Many blocks can be cached in memory
  - With a 16 GB machine, say 8 GB for buffer cache
  - Block size = 4 KB, number of blocks cached = 2 M
- Use a hash table to look up blocks in memory efficiently
[Figure: hash table keyed by (device, block #) pointing to disk blocks in memory]
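A minimal sketch of the lookup and miss paths, using a Python dictionary as the hash table keyed by (device, block number). The class and attribute names are hypothetical, and replacement and write-back are deliberately left out.

```python
class BufferCache:
    """Toy buffer cache: a hash table from (device, block_nr) to block data."""

    def __init__(self, disk):
        self.disk = disk        # hypothetical backing store: (device, block_nr) -> bytes
        self.buffers = {}       # the in-memory hash table
        self.hits = 0
        self.misses = 0

    def get_block(self, device, block_nr):
        key = (device, block_nr)
        if key in self.buffers:                   # block lookup: served from memory
            self.hits += 1
        else:                                     # block miss: read from disk, then cache
            self.misses += 1
            self.buffers[key] = self.disk[key]
        return self.buffers[key]


# Sizing from the slide: 8 GB of cache with 4 KB blocks.
CACHED_BLOCKS = (8 * 2**30) // (4 * 2**10)        # 2**21, about 2 M buffers
```

With about two million buffers, a linear scan per access would be far too slow; the hash table makes lookup cost roughly constant.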

Buffer Cache Write Policy
- When an application writes to a file, the corresponding block is updated in the buffer cache
- When is the disk block updated?
  - Immediately (synchronously)
    - Write-through cache
    - Very slow
  - Later (asynchronously)
    - Write-back cache
    - Fast, but what if the system crashes?
    - File system can become inconsistent because some blocks in memory are not on disk
    - We discuss this problem in detail later
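The two policies can be compared with a toy cache that counts disk writes. This is a sketch under simplifying assumptions: the class name is hypothetical, the disk is a dictionary, and the asynchronous flush is modeled as an explicit `flush()` call.

```python
class WriteCache:
    """Toy cache that is either write-through or write-back."""

    def __init__(self, write_through):
        self.write_through = write_through
        self.buffers = {}       # block_nr -> latest data in memory
        self.dirty = set()      # blocks modified in memory but not yet on disk
        self.disk = {}          # hypothetical disk contents
        self.disk_writes = 0

    def write(self, block_nr, data):
        self.buffers[block_nr] = data
        if self.write_through:
            self._write_to_disk(block_nr)     # synchronous: every write hits the disk
        else:
            self.dirty.add(block_nr)          # write-back: just mark the buffer dirty

    def flush(self):
        for block_nr in sorted(self.dirty):
            self._write_to_disk(block_nr)
        self.dirty.clear()

    def _write_to_disk(self, block_nr):
        self.disk[block_nr] = self.buffers[block_nr]
        self.disk_writes += 1
```

Three writes to the same block cost three disk writes under write-through but only one under write-back. Until `flush()` runs, though, the write-back disk holds none of the updates, which is exactly the crash window the slide warns about.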

Buffer Cache Issues
- Buffer cache typically has limited size, so we need replacement algorithms
  - Typically, LRU is used
- Buffer cache competes with the virtual memory system
  - How many frames to allocate for buffer cache vs. virtual memory?
  - Some systems use a unified memory cache for buffer cache and virtual memory pages
    - The blocks of the buffer cache and pages in the page cache are part of a unified caching scheme
    - However, if a program reads a large file, it can evict the pages of programs that are not accessing files much
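LRU replacement can be sketched with `collections.OrderedDict`, which keeps keys in insertion order and lets a hit move a key to the most-recently-used end. The class name is hypothetical, and the sketch ignores dirty buffers, which a real cache would have to flush before eviction.

```python
from collections import OrderedDict


class LRUBufferCache:
    """Toy fixed-size buffer cache with least-recently-used replacement."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk                    # hypothetical backing store: block_nr -> bytes
        self.buffers = OrderedDict()        # oldest entry first, newest last

    def get_block(self, block_nr):
        if block_nr in self.buffers:
            self.buffers.move_to_end(block_nr)      # hit: mark as most recently used
            return self.buffers[block_nr]
        if len(self.buffers) >= self.capacity:
            self.buffers.popitem(last=False)        # evict the least recently used block
        self.buffers[block_nr] = self.disk[block_nr]
        return self.buffers[block_nr]
```

With a capacity of two, the access sequence 0, 1, 0, 2 evicts block 1 rather than block 0, because block 0 was touched more recently.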

Read Ahead
- Applications often read files sequentially
- File system can predict that a process will request the file block after the one it is currently requesting
- File system prefetches the next block from disk
  - Also called read ahead
  - Note that the next block may not be allocated sequentially
- If the process requests the next block, it will be in the cache
  - Allows overlapping I/O with execution
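A rough sketch of one-block read ahead. The names are hypothetical, and the prefetch is done synchronously for simplicity, whereas a real file system would issue it asynchronously so that it overlaps with execution. The file's blocks are intentionally non-sequential on disk, so the prefetch goes through the file's block map rather than using "block number + 1".

```python
class ReadAheadFile:
    """Toy file reader that prefetches the next logical block on every read."""

    def __init__(self, file_blocks, disk):
        self.file_blocks = file_blocks   # logical index -> disk block number
        self.disk = disk                 # hypothetical disk: block number -> bytes
        self.cache = {}
        self.demand_misses = 0           # reads the process actually had to wait for

    def _fetch(self, index):
        block_nr = self.file_blocks[index]
        if block_nr not in self.cache:
            self.cache[block_nr] = self.disk[block_nr]

    def read(self, index):
        block_nr = self.file_blocks[index]
        if block_nr not in self.cache:
            self.demand_misses += 1
            self._fetch(index)
        if index + 1 < len(self.file_blocks):
            self._fetch(index + 1)       # read ahead: next logical block of the file
        return self.cache[block_nr]
```

In a sequential scan of a three-block file, only the first read misses; the other two are already in the cache when requested.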

Outline
- Storage media: blocks
- Files
- Directories
- Buffer cache
  - Excursion: Unix file system layout
- Open file structures

Unix File System Layout
[Figure: Unix file system layout]

Block Placement
- Block placement is the policy used by the file system for block allocation
- Original Unix file system had two placement problems
  - Data blocks allocated randomly in “aging” file systems
    - Blocks for a file are allocated sequentially when the file system is new
    - As the file system fills, blocks are allocated from deleted files
    - Deleted files may be randomly placed
    - So, blocks for new files become scattered across the disk
  - Inodes allocated far from blocks
    - All inodes are at the beginning of the disk, far from data
    - Traversing file name paths and manipulating files and directories requires going back and forth between inodes and data blocks
- Both of these problems generate many long seeks

BSD Fast File System
- BSD Unix redesigned the Unix FS
  - New FS called Fast File System (FFS)
- Disk partitioned into groups of cylinders
  - Recall, a cylinder is the same track across platters
  - Cylinder group consists of contiguous cylinders
- Placement policy: place these in the same cylinder group
  - Inode and data blocks of a file
  - Files in a directory
  - If the cylinder group is full, place in a nearby group
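The placement rule can be illustrated with a simplified chooser: prefer the cylinder group that already holds the file's inode (or its directory), and fall back to the nearest group with free space. The function and its nearest-group rule are assumptions made for this sketch; the actual FFS overflow heuristics are more elaborate.

```python
def pick_group(free_blocks, preferred):
    """Pick a cylinder group for a new block.

    free_blocks: hypothetical list of free-block counts, one per cylinder group.
    preferred:   index of the group holding the file's inode or directory.
    """
    candidates = [g for g, free in enumerate(free_blocks) if free > 0]
    if not candidates:
        raise OSError("disk full")
    # Closest group wins; ties go to the lower-numbered group.
    return min(candidates, key=lambda g: (abs(g - preferred), g))
```

Keeping related inodes and data blocks in the same or a neighboring group keeps seeks short, which addresses both placement problems of the original Unix file system.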

Think Time
- What are the benefits/drawbacks of using inodes in a Unix file system vs. the FAT file system?
- What were the problems in the Unix file system that led to the FFS design?