Chapter 14 File System Implementation Operating System Concepts


Chapter 14: File System Implementation
Operating System Concepts – 10th Edition
Silberschatz, Galvin and Gagne © 2018

Chapter 14: File System Implementation
• File-System Structure
• File-System Operations
• Directory Implementation
• Allocation Methods
• Free-Space Management
• Efficiency and Performance
• Recovery
• Example: WAFL File System

Objectives
• To describe the details of implementing local file systems and directory structures
• To describe the implementation of remote file systems
• To discuss block allocation and free-block algorithms and trade-offs

File-System Structure
• File structure
  - Logical storage unit
  - Collection of related information
• File system resides on secondary storage (disks)
• Two characteristics of disks make them convenient for file systems:
  1. A disk can be rewritten in place: it is possible to read a block from the disk, modify the block, and write it back into the same block.
  2. A disk can directly access any block of information it contains.
    * Thus, it is simple to access any file either sequentially or randomly; switching from one file to another requires the drive to move the read-write heads and wait for the media to rotate.
• Nonvolatile memory (NVM) devices are increasingly used for file storage and thus as a location for file systems.
  - They differ from hard disks in that they cannot be rewritten in place and they have different performance characteristics.

File-System Structure
• Disk provides in-place rewrite and random access
  - I/O transfers are performed in blocks of sectors (usually 512 or 4,096 bytes)
  - NVM devices usually have blocks of 4,096 bytes
• A file system
  - Provides efficient and convenient access to the disk by allowing data to be stored, located, and retrieved easily
  - Defines the interface to the user: a file and its attributes, the operations allowed on a file, and the directory structure for organizing files
  - Supplies the algorithms and data structures that map the logical file system onto physical secondary-storage devices
• File control block: storage structure consisting of information about a file
• File system is organized into layers
• Each level in the design uses the features of lower levels to create new features for use by higher levels.
• Device driver controls the physical device

Layered File System
• The I/O control level consists of device drivers and interrupt handlers that transfer information between main memory and the disk system.
• A device driver can be thought of as a translator.
• Its input consists of high-level commands, such as "retrieve block 123" or "read drive 1, cylinder 72, track 2, sector 10, into memory location 1060".
• Its output consists of low-level, hardware-specific instructions that are used by the hardware controller, which interfaces the I/O device to the rest of the system.
• The device driver usually writes specific bit patterns to special locations in the I/O controller's memory to tell the controller which device location to act on and what actions to take.

File System Layers
• Basic file system (called the "block I/O subsystem" in Linux) needs only to issue generic commands to the appropriate device driver to read and write blocks on the storage device.
  - It is also concerned with I/O request scheduling.
  - It also manages memory buffers and caches (allocation, freeing, replacement)
  - Buffers hold data in transit
  - Caches hold frequently used file-system metadata to improve performance
• File organization module understands files, logical addresses, and physical blocks
• Translates logical block # to physical block #
• Manages free space and disk allocation

File System Layers (Cont.)
• Logical file system manages metadata information
  - Metadata includes all of the file-system structure except the actual data (the contents of the files).
  - Translates a file name into a file number, file handle, and location by maintaining file control blocks (inodes in UNIX)
  - A file control block (FCB) (an inode in UNIX file systems) contains information about the file, including ownership, permissions, and location of the file contents.
  - Directory management
  - Protection
• Layering is useful for reducing complexity, redundancy, and duplication of code, but it adds overhead and can decrease performance
• The logical layers can be implemented with whatever coding method the OS designer chooses

File System Layers (Cont.)
• Many file systems exist, sometimes many within one operating system
  - Each has its own format
    * CD-ROM uses ISO 9660
    * UNIX uses the UNIX File System (UFS), based on the Berkeley Fast File System (FFS)
    * Windows has FAT, FAT32, and NTFS (Windows NT File System), as well as CD-ROM, DVD, and Blu-ray file-system formats
    * Linux supports more than 130 types; the standard Linux file system is known as the extended file system, with the most common versions being ext3 and ext4
    * There are also distributed file systems, in which a file system on a server is mounted by one or more client computers across a network
  - New ones are still arriving: ZFS, GoogleFS, Oracle ASM, FUSE

File-System Operations
• We have system calls at the API level, but how do we implement their functions?
  - On-storage and in-memory structures
• Boot control block (per volume) contains the information needed by the system to boot an OS from that volume
  - Needed if the volume contains an OS; usually the first block of the volume
  - In UFS, it is called the boot block. In NTFS, it is the partition boot sector.
• Volume control block (per volume) contains volume details
  - Total # of blocks, # of free blocks, block size, free-block pointers or array, and a free-FCB count and FCB pointers.
  - In UFS, this is called a superblock. In NTFS, it is stored in the master file table.
• A directory structure (per file system) is used to organize the files.
  - In UFS, this includes file names and associated inode numbers.
  - In NTFS, it is stored in the master file table.

File-System Operations (Cont.)
• A per-file file control block (FCB) contains many details about the file
  - Inode number, permissions, size, dates
  - NTFS stores this information in the master file table using relational database structures
• To create a new file, a process calls the logical file system.
  - The logical file system knows the format of the directory structures.
  - It allocates a new FCB.
  - The system then reads the appropriate directory into memory, updates it with the new file name and FCB, and writes it back to the file system.
[Figure: File Control Block (FCB)]

In-Memory File System Structures
• The in-memory information is used for both file-system management and performance improvement via caching.
• The data are loaded at mount time, updated during file-system operations, and discarded at dismount.
• An in-memory mount table contains information about each mounted volume.
• An in-memory directory-structure cache holds the directory information of recently accessed directories.
• The system-wide open-file table contains a copy of the FCB of each open file, as well as other information.
• The per-process open-file table contains pointers to the appropriate entries in the system-wide open-file table, as well as other information, for all files the process has open.
• Buffers hold file-system blocks when they are being read from or written to a file system.

In-Memory File System Structures
• UNIX treats a directory exactly the same as a file, one with a "type" field indicating that it is a directory.
• Windows implements separate system calls for files and directories and treats directories as entities separate from files.
• The logical file system can call the file-organization module to map the directory I/O into storage block locations, which are passed on to the basic file system and I/O control system.
• The open() call passes a file name to the logical file system.
  - The system-wide open-file table is searched to see whether the file is already in use.
  - If it is, a per-process open-file table entry is created pointing to the existing system-wide open-file table entry.
  - If the file is not already open, the directory structure is searched for the given file name.
  - The FCB of the file is copied into the system-wide open-file table in memory, which also tracks the number of processes that have the file open.
  - Parts of the directory structure are usually cached in memory.

In-Memory File System Structures
  - Next, an entry is made in the per-process open-file table, with a pointer to the entry in the system-wide open-file table and some other fields.
  - The open() call returns a pointer to the appropriate entry in the per-process file-system table.
  - UNIX systems refer to the entry as a file descriptor; Windows refers to it as a file handle.
• When a process closes the file, the per-process table entry is removed, and the system-wide entry's open count is decremented.
• When all users that have opened the file close it, any updated metadata are copied back to the disk-based directory structure, and the system-wide open-file table entry is removed.
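The open/close bookkeeping described on these slides can be sketched in a few lines of Python. This is an illustrative model only, not how any real kernel stores its tables: the class name `OpenFileTables`, the dictionary layout, and the use of a file name as the system-wide key are all assumptions made for the example.

```python
class OpenFileTables:
    """Toy model of the system-wide and per-process open-file tables."""

    def __init__(self, directory):
        self.directory = directory   # name -> FCB dict (stand-in for the on-disk directory)
        self.system_wide = {}        # name -> {"fcb": ..., "open_count": n}
        self.per_process = {}        # pid -> list of names; the list index is the descriptor

    def open(self, pid, name):
        entry = self.system_wide.get(name)
        if entry is None:
            # File not open yet: search the directory and copy its FCB into memory.
            entry = {"fcb": dict(self.directory[name]), "open_count": 0}
            self.system_wide[name] = entry
        entry["open_count"] += 1
        table = self.per_process.setdefault(pid, [])
        table.append(name)           # per-process entry points at the system-wide entry
        return len(table) - 1        # descriptor returned to the caller

    def close(self, pid, fd):
        name = self.per_process[pid][fd]
        self.per_process[pid][fd] = None          # remove the per-process entry
        entry = self.system_wide[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            # Last closer: copy updated metadata back, then drop the system-wide entry.
            self.directory[name] = dict(entry["fcb"])
            del self.system_wide[name]


tables = OpenFileTables({"notes.txt": {"size": 0}})
fd_a = tables.open(pid=1, name="notes.txt")
fd_b = tables.open(pid=2, name="notes.txt")
count_while_shared = tables.system_wide["notes.txt"]["open_count"]
tables.close(1, fd_a)
tables.close(2, fd_b)
```

Two processes opening the same file share one system-wide entry, and that entry disappears only after the second close.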

In-Memory File System Structures
[Figure: In-memory file-system structures. (a) File open. (b) File read.]

Directory Implementation
• Linear list of file names with pointers to the data blocks
  - Simple to program
  - Time-consuming to execute
    * Linear search time
    * Could keep the list ordered alphabetically via a linked list, or use a B+ tree
  - To create a new file: search the directory, then add a new entry at the end of the directory.
  - To delete a file: search the directory for the named file and then release the space allocated to it.
  - To reuse the directory entry, do one of the following:
    * Mark the entry as unused by assigning it a special name (such as an all-blank name), assigning it an invalid inode number (such as 0), or including a used-unused bit in each entry
    * Attach it to a list of free directory entries
    * Copy the last entry in the directory into the freed location and decrease the length of the directory
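A minimal sketch of the linear-list directory, using the third reuse strategy from the slide (copy the last entry into the freed slot and shrink the directory). The class and method names are hypothetical and the directory is just an in-memory Python list, so this shows the logic rather than any on-disk format.

```python
class LinearDirectory:
    """Directory kept as an unordered list of (name, inode_number) pairs."""

    def __init__(self):
        self.entries = []

    def lookup(self, name):
        # Linear search: cost grows with the number of entries.
        for index, (entry_name, _) in enumerate(self.entries):
            if entry_name == name:
                return index
        return -1

    def create(self, name, inode_number):
        if self.lookup(name) != -1:
            raise FileExistsError(name)
        self.entries.append((name, inode_number))   # new entry goes at the end

    def delete(self, name):
        index = self.lookup(name)
        if index == -1:
            raise FileNotFoundError(name)
        # Copy the last entry into the freed slot, then shorten the directory.
        self.entries[index] = self.entries[-1]
        self.entries.pop()


d = LinearDirectory()
d.create("a", 1)
d.create("b", 2)
d.create("c", 3)
d.delete("a")
```

After the delete, entry "c" has moved into the slot that "a" occupied, so the list stays compact but loses its original order.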

Directory Implementation
• Hash table: linear list with a hash data structure
  - Decreases directory search time
  - Collisions: situations where two file names hash to the same location
  - Only good if entries are fixed size.
    * A larger hash table requires a new hash function that maps file names to the larger range, and the existing directory entries must be reorganized to reflect their new hash-function values.
  - Chained-overflow method: each hash entry can be a linked list instead of an individual value
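The chained-overflow idea can be illustrated as follows. The hash function here (sum of the name's bytes modulo the table size) is a deliberately simple assumption chosen so that collisions are easy to produce; real file systems use better functions.

```python
class HashDirectory:
    """Directory hash table where each bucket is a chain of (name, inode) pairs."""

    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, name):
        # Toy hash: names with the same letters in any order collide.
        return self.buckets[sum(name.encode()) % len(self.buckets)]

    def insert(self, name, inode_number):
        chain = self._bucket(name)
        for i, (entry_name, _) in enumerate(chain):
            if entry_name == name:
                chain[i] = (name, inode_number)   # overwrite an existing entry
                return
        chain.append((name, inode_number))        # collision: extend the chain

    def lookup(self, name):
        for entry_name, inode_number in self._bucket(name):
            if entry_name == name:
                return inode_number
        return None


h = HashDirectory()
h.insert("ab", 10)
h.insert("ba", 20)   # same byte sum as "ab", so it lands in the same bucket
```

Both names remain retrievable even though they share a bucket; the cost is a short linear walk along the chain.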

Allocation Methods - Contiguous
• An allocation method refers to how disk blocks are allocated for files
• Contiguous allocation: each file occupies a set of contiguous blocks
  - Best performance in most cases
  - The number of disk seeks and the seek time are minimal
  - Simple: only the starting location (block #) and length (number of blocks) are required
  - Fast sequential access, easy direct (random) access
  - Problems include finding space for a file, knowing the file size in advance, external fragmentation, and the need for compaction off-line (downtime) or on-line (performance penalty)
  - File size is difficult to estimate; even when it is known, pre-allocation can be inefficient because it may lead to internal fragmentation
  - Modified scheme: initial allocation, then add extents
• Example: file a (base=1, len=3), file b (base=5, len=2). What happens if file c needs 2 sectors?

Contiguous Allocation
• Mapping from logical to physical: divide the logical address LA by the block size
  - Q = LA div 512, R = LA mod 512
  - Block to be accessed = Q + starting address
  - Displacement into block = R
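As a worked instance of the contiguous mapping, the short function below computes Q and R with `divmod`. The function name and the sample starting block are illustrative assumptions; the block size of 512 comes from the slide.

```python
BLOCK_SIZE = 512  # block size used on the slide


def contiguous_map(logical_address, start_block, block_size=BLOCK_SIZE):
    """Return (physical block, displacement) for a contiguously allocated file."""
    q, r = divmod(logical_address, block_size)
    return start_block + q, r


# A file starting at block 5: logical address 1300 falls in its third block.
block, offset = contiguous_map(1300, start_block=5)
```

Here 1300 = 2 × 512 + 276, so the access goes to block 5 + 2 = 7 at displacement 276.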

Extent-Based Systems
• Many newer file systems (e.g., Veritas File System) use a modified contiguous allocation scheme
• Extent-based file systems allocate disk blocks in extents
• An extent is a contiguous set of disk blocks
  - Extents are the unit of allocation for files
  - A file consists of one or more extents

Allocation Methods - Linked
• Linked allocation: each file is a linked list of blocks
  - File ends at a nil pointer
  - No external fragmentation
  - Each block contains a pointer to the next block
  - No compaction needed
  - Free-space management system is called when a new block is needed
  - Efficiency can be improved by clustering blocks into groups, but this increases internal fragmentation
  - Reliability can be a problem: lose a block and the rest of the file is lost
  - Locating a block can take many I/Os and disk seeks
  - Files can grow dynamically, and the free list is managed the same way as a file
  - Sequential access: seek between each block; random access: horrible
• Example: file a (base=1), file b (base=5). How do you find the last block in a?

Example: DOS FS (simplified)
• Uses a linked list, but instead of embedding the links in the data blocks, it keeps them in a separate structure called the File Allocation Table (FAT).
• The FAT has an entry for each block on the disk, and the entries corresponding to the blocks of a particular file are linked up.

Directory (stored in block 5): a: 6, b: 2

FAT (16-bit entries):
  Entry  Value
  0      free
  1      eof
  2      1
  3      eof
  4      3
  5      eof
  6      4
  ...

File a: 6 -> 4 -> 3
File b: 2 -> 1
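Following a file's chain through the FAT can be sketched as below. The table and directory values are the ones shown on the slide; representing them as Python dictionaries, and the names `fat`, `directory`, and `file_blocks`, are assumptions for illustration.

```python
EOF, FREE = "eof", "free"

# FAT contents from the slide: entry i names the block that follows block i.
fat = {0: FREE, 1: EOF, 2: 1, 3: EOF, 4: 3, 5: EOF, 6: 4}

# Directory entries map a file name to its first block.
directory = {"a": 6, "b": 2}


def file_blocks(name):
    """Return the list of blocks of a file by walking its FAT chain."""
    blocks = []
    block = directory[name]
    while True:
        blocks.append(block)
        next_block = fat[block]
        if next_block == EOF:
            return blocks
        block = next_block


blocks_a = file_blocks("a")
blocks_b = file_blocks("b")
```

Because the links live in the table rather than in the data blocks, the whole chain can be followed without touching the file's data.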

FAT discussion
• Space overhead of the FAT is trivial:
  - 2 bytes / 512-byte block = ~0.4% (compare to UNIX)
• Reliability: how to protect against errors?
  - Create duplicate copies of the FAT on disk.
  - State duplication is a very common theme in reliability
• Bootstrapping: where is the root directory?
  - Fixed location on disk. Layout (MS-DOS): FAT | (opt) FAT | root dir | …

Linked Allocation
• Each file is a linked list of disk blocks; blocks may be scattered anywhere on the disk (block = pointer + data)
• Mapping
  - Q = LA div 511, R = LA mod 511
  - Block to be accessed is the Qth block in the linked chain of blocks representing the file.
  - Displacement into block = R + 1
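The linked mapping differs from the contiguous one only in the divisor and the offset. The sketch below assumes, as the numbers 511 and R + 1 on the slide suggest, that each 512-unit block spends its first unit on the next-block pointer; the function name is hypothetical.

```python
BLOCK_SIZE = 512
POINTER_SIZE = 1                          # assumed: one unit per block holds the link
DATA_PER_BLOCK = BLOCK_SIZE - POINTER_SIZE  # 511 usable units per block


def linked_map(logical_address):
    """Return (index of block in the chain, displacement inside that block)."""
    q, r = divmod(logical_address, DATA_PER_BLOCK)
    return q, r + POINTER_SIZE            # skip over the pointer at the block start


chain_index, offset = linked_map(1000)
```

For logical address 1000, 1000 = 1 × 511 + 489, so the data sits in chain block 1 at displacement 490. Reaching that block still requires following the links from the start of the file.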

Linked Allocation
[Figure: linked allocation]

File-Allocation Table
[Figure: file-allocation table]

Allocation Methods - Indexed
• Indexed allocation
  - Each file has its own index block(s) of pointers to its data blocks
• Logical view
[Figure: logical view of an index table]

Example of Indexed Allocation
[Figure: example of indexed allocation]

Indexed files (Nachos, VMS)
• The system allocates a file header to hold an array of pointers big enough to point to file-size number of blocks.
[Figure: file header with an array of pointers to disk blocks; unused pointers are null]
• Pros & Cons:
  - + Can easily grow up to the space allocated for the descriptor
  - + Random access is fast
  - – Clumsy to grow a file bigger than the table size
  - – Still lots of seeks: blocks can be spread all over the disk, so sequential access is slow.

Indexed Allocation (Cont.)
• Need index table
• Random access
• Dynamic access without external fragmentation, but with the overhead of an index block
• Mapping from logical to physical in a file of maximum size of 256K bytes and block size of 512 bytes. We need only 1 block for the index table
  - Q = LA div 512, R = LA mod 512
  - Q = displacement into index table
  - R = displacement into block

Indexed Allocation – Mapping (Cont.)
• Mapping from logical to physical in a file of unbounded length (block size of 512 words)
• Linked scheme: link blocks of the index table (no limit on size)
  - Q1 = LA div (512 × 511), R1 = LA mod (512 × 511)
  - Q1 = block of index table
  - R1 is used as follows: Q2 = R1 div 512, R2 = R1 mod 512
  - Q2 = displacement into block of index table
  - R2 = displacement into block of file
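A small sketch of the linked-scheme arithmetic follows. It assumes, per the 512 × 511 divisor on the slide, that each index block holds 511 data-block pointers plus one link to the next index block; the function name is made up for the example.

```python
BLOCK_WORDS = 512          # words per block (from the slide)
POINTERS_PER_INDEX = 511   # one slot in each index block links to the next index block


def linked_index_map(logical_address):
    """Return (index block number, slot within that index block, offset in data block)."""
    q1, r1 = divmod(logical_address, BLOCK_WORDS * POINTERS_PER_INDEX)
    q2, r2 = divmod(r1, BLOCK_WORDS)
    return q1, q2, r2


index_block, slot, offset = linked_index_map(300_000)
```

With 512 × 511 = 261,632 words covered per index block, address 300,000 lands in index block 1, slot 74, offset 480.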

Indexed Allocation – Mapping (Cont.)
• Two-level index (4,096-byte blocks could store 1,024 four-byte pointers in the outer index -> 1,048,576 data blocks and a file size of up to 4 GB)
  - Q1 = LA div (512 × 512), R1 = LA mod (512 × 512)
  - Q1 = displacement into outer index
  - R1 is used as follows: Q2 = R1 div 512, R2 = R1 mod 512
  - Q2 = displacement into block of index table
  - R2 = displacement into block of file
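The two-level mapping and the 4 GB capacity figure can both be checked with a few lines of arithmetic. The function name and sample address are illustrative; the constants are taken from the slide.

```python
BLOCK_WORDS = 512  # block size used in the mapping formulas


def two_level_map(logical_address):
    """Return (outer-index slot, inner-index slot, offset in data block)."""
    q1, r1 = divmod(logical_address, BLOCK_WORDS * BLOCK_WORDS)
    q2, r2 = divmod(r1, BLOCK_WORDS)
    return q1, q2, r2


outer, inner, offset = two_level_map(300_000)

# Capacity check for the 4,096-byte-block example on the slide.
pointers_per_block = 4096 // 4                 # 1,024 four-byte pointers
data_blocks = pointers_per_block ** 2          # outer index x inner index
max_file_bytes = data_blocks * 4096
```

Address 300,000 maps to outer slot 1, inner slot 73, offset 480, and the capacity works out to 1,048,576 blocks or exactly 4 GiB.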

Indexed Allocation – Mapping (Cont.)
[Figure: two-level index mapping]

Multilevel indexed (UNIX 4.1)
• Key idea: efficient for small files, but still allow big files
• File header contains 13 pointers
  - Fixed-size table; pointers are not all equivalent
  - The header is called an "inode" in UNIX
• First 10 are pointers to data blocks.
  - If the file is small enough, some pointers will be NULL.

Multilevel indexed (UNIX 4.1)
[Figure: file header with 13 pointers; the first pointers reference data blocks directly, and the last pointers reference indirect blocks of 256 pointers each]

Multilevel indexed (UNIX 4.1)
• What if you allocate an 11th block?
  - Pointer to an indirect block: a block of pointers to data blocks.
  - Gives us 256 blocks, + 10 (from file header) = 1/4 MB
• What if you allocate a 267th block?
  - Pointer to a doubly indirect block: a block of pointers to indirect blocks (each, in turn, a block of pointers to data blocks).
  - Gives us about 64K blocks => 64 MB
• What if you want a file bigger than this?
  - One last pointer: what should it point to?
  - A triply indirect block: a block of pointers to doubly indirect blocks (which are ...)

Multilevel indexed (UNIX 4.1)
• Thus, the file header is:
  - Pointers 1-10: data block pointers, each pointing to one block, so 10 blocks
  - Pointer 11: indirect block pointer, points to 256 blocks
  - Pointer 12: doubly indirect block pointer, points to 64K blocks
  - Pointer 13: triply indirect block pointer, points to 16M blocks
1. Bad news: there is still an upper limit on file size, about 16 GB.
2. Pointers get filled in dynamically: an indirect block needs to be allocated only when the file grows beyond 10 blocks.
3. If the file is small, no indirection is needed.
• How many disk accesses to reach block #23? (2)

Multilevel indexed (UNIX 4.1)
• How about block #5? (1)
• How about block #340? (3)
• UNIX Pros & Cons:
  - + Simple (more or less)
  - + Files can easily expand (up to a point)
  - + Small files are particularly cheap and easy
  - – Very large files spend lots of time reading indirect blocks
  - – Lots of seeks
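The access counts quoted in these slides can be reproduced with a short calculation. The sketch assumes 1-based block numbers, 10 direct pointers, 256 pointers per indirect block, and a file header that is already in memory (so reading it costs no disk access); the function name is hypothetical.

```python
DIRECT_POINTERS = 10
POINTERS_PER_BLOCK = 256


def disk_accesses(block_number):
    """Disk reads needed to fetch a file's Nth block (1-based), inode already cached."""
    if block_number < 1:
        raise ValueError("block numbers start at 1")
    remaining = block_number - DIRECT_POINTERS
    if remaining <= 0:
        return 1                                  # direct pointer: read the data block
    for level in (1, 2, 3):                       # single, double, triple indirection
        span = POINTERS_PER_BLOCK ** level
        if remaining <= span:
            return level + 1                      # one read per index level plus the data
        remaining -= span
    raise ValueError("block number exceeds the maximum file size")


# Total blocks addressable through all 13 pointers.
max_blocks = DIRECT_POINTERS + sum(POINTERS_PER_BLOCK ** k for k in (1, 2, 3))
```

Block 5 needs one access, block 23 needs two, and block 340 needs three, matching the answers given on the slides. The triply indirect pointer alone covers 256³ = 16M blocks, which is where the upper limit comes from.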