File System Implementations CS4513 Distributed Systems Hugh C
File System Implementations CS-4513 Distributed Systems Hugh C. Lauer Slides include materials from Modern Operating Systems, 3 rd ed. , by Tannenbaum, Operating System Concepts, 7 th ed. , by Silbershatz, Galvin, & Gagne, Distributed Systems: Principles & Paradigms, 2 nd ed. By Tanenbaum and Van Steen, and Distributed Systems: Concepts and Design, 4 th ed. , by Coulouris, et. al. CS-4513, B-Term 2010 File System Implementations 1
This topic — Implementation of Files • Create file abstraction using physical disk devices and disk blocks – Efficient in time, space, use of disk resources – Fast enough for application requirements • Must be scalable to a wide variety of file sizes – Many small files • < 1 page – Huge files • 100’s of gigabytes, terabytes, spanning disks – Everything in between CS-4513, B-Term 2010 File System Implementations 2
Reading Assignment • Tanenbaum § 4. 3 CS-4513, B-Term 2010 File System Implementations 3
Overview • File Allocation methods • Sequential • Linked and FAT • Indexed • • Free blocks & bad blocks Scalability Mounting & Virtual File Systems Directory implementation notes CS-4513, B-Term 2010 File System Implementations 4
Definition — Volume • The fundamental unit of a file system • Physical volume may be a • Physical disk storage device • Physical partition of a single disk (aka minidisk) • Logical Volume is • A physical volume • A combination of other volumes Common block addressing scheme. – Usually similar in size and characteristics • Volume is also used (loosely) to denote • The part of the file system implemented on a physical or logical volume CS-4513, B-Term 2010 File System Implementations 5
File Allocation Schemes • Contiguous – Blocks of file stored in consecutive disk sectors – Directory points to first entry & specifies length • Linked – Blocks of file scattered across disk, as linked list – Directory points to first and last entries • Indexed – Blocks of file scattered across disk – Separate index block contains pointers to file blocks – Directory points to index block CS-4513, B-Term 2010 File System Implementations 6
File Allocation Schemes (continued) • The allocation scheme is an attribute of a file system, not of individual files within a system. • All files within a file system follow same allocation model CS-4513, B-Term 2010 File System Implementations 7
File Allocation Schemes • Contiguous – Blocks of file stored in consecutive disk sectors – Directory points to first entry • Linked – Blocks of file scattered across disk, as linked list – Directory points to first entry • Indexed – Separate index block contains pointers to file blocks – Directory points to index block CS-4513, B-Term 2010 File System Implementations 8
Contiguous Allocation • Ideal for large, static files – Databases, fixed system structures, OS code – Multi-media video and audio – CD-ROM, DVD • Simple address calculation – Directory entry points to first sector – File block i disk sector address • Fast multi-block reads and writes – Minimize seeks between blocks CS-4513, B-Term 2010 File System Implementations 9
Contiguously Allocated Files CS-4513, B-Term 2010 File System Implementations 10
Block-to-sector Calculation • To find disk sector containing block i of file f – Starting_block(f) + i • Starting block of each file is named in – Directory, or – File metadata CS-4513, B-Term 2010 File System Implementations 11
File Creation (Contiguous File System) • Search for an empty sequence of blocks – First-fit – Best-fit • Prone to fragmentation when … • Files come and go • Files change size • Similar to physical memory allocation in base-limit type of virtual memory CS-4513, B-Term 2010 File System Implementations 12
What To Do If No Block Large Enough? CS-4513, B-Term 2010 File System Implementations 13
Contiguous Allocation – Extents • Extent: a contiguously allocated subset of a file • Directory entry points to – the extent itself (for file with one extent) – pointer to an extent block describing multiple extents (for file with multiple extents) • Advantages – Speed, ease of address calculation of contiguous file – Avoids (some of) the fragmentation issues – Can be adapted to support files across multiple disks • … CS-4513, B-Term 2010 File System Implementations 14
Contiguous Allocation – Extents • … • Disadvantages – Too many extents degenerates to indexed allocation • As in Unix-like systems, but not so well • Popular in 1960 s & 70 s – OS/360, other systems for commercial data processing • Currently used for large files in NTFS • Rarely mentioned in textbooks • Silbershatz, 7 th ed. , § 11. 4. 1 & 22. 5. 1 • Tanenbaum, p. 276 (solution to DVD problem) CS-4513, B-Term 2010 File System Implementations 15
Digression: Bad Block Management • Bad blocks on disks are inevitable • Part of manufacturing process (less than 1%) • Most are detected during formatting • Occasionally, blocks become bad during operation • Manufacturers typically add extra tracks to disks • Physical capacity = (1 + x%) * rated_capacity • Who handles bad blocks? • Disk controller: – Bad block list maintained internally – Automatically substitutes good blocks • Formatter: – Re-organize track to avoid bad blocks • OS: – Bad block list maintained by OS, bad blocks never used CS-4513, B-Term 2010 File System Implementations 16
Bad Block Management in Contiguous Allocation File Systems • Bad blocks must be concealed • Foul up the block-to-sector calculation • Methods • Look-aside list of bad sectors – Check each sector request against hash table – If present, substitute a replacement sector behind the scenes • Spare sectors in each track, remapped by formatting • Handling • Disk controller, invisible to OS • Lower levels of OS; concealed from higher layers of file system and from application CS-4513, B-Term 2010 File System Implementations 17
Questions? CS-4513, B-Term 2010 File System Implementations 18
File Allocation Schemes • Contiguous – Blocks of file stored in consecutive disk sectors – Directory points to first entry • Linked – Blocks of file scattered across disk, as linked list – Directory points to first entry • Indexed – Separate index block contains pointers to file blocks – Directory points to index block CS-4513, B-Term 2010 File System Implementations 19
Linked Allocation • Blocks scattered across disk • Each block contains pointer to next block • Directory points to first and last blocks • Sector header: 10 16 25 01 – Pointer to next block – ID and block number of file CS-4513, B-Term 2010 File System Implementations 20
Linked Allocation • Advantages – No external fragmentation of file space! – Easy to create, extend files – Ideal for lots of small files • Disadvantages – Lots of disk arm movement – Space taken up by links – Sequential access only! • Random access simulated by caching links • Used in Xerox Alto file system CS-4513, B-Term 2010 File System Implementations 21
Bad Block Management – Linked File Systems • In OS: – format all sectors of disk • Don’t reserve any spare sectors • Allocate bad blocks to a hidden “file” for the purpose • If a block becomes bad, “append” to the hidden file • Advantages • Very simple • No look-aside or sector remapping needed • Totally transparent without any hidden mechanism CS-4513, B-Term 2010 File System Implementations 22
Linked File System – Limitation • To access the ith block, it is necessary to physically read the first (i 1) blocks from disk! • Serious problem for large files! CS-4513, B-Term 2010 File System Implementations 23
Solution – File Allocation Table (FAT) • Instead of link on each block, put all links in one table — the FAT • Each entry corresponds to physical block in disk • ith entry contains link for block i – Each entry points to next Eliminates need for header block on each sector block (or EOF) • Directory points to first & last blocks of file CS-4513, B-Term 2010 File System Implementations 24
FAT File Systems • Advantages – Advantages of Linked File System – FAT can be cached in memory – Searchable at CPU speeds pseudo-random access • Disadvantages – Limited size, not suitable for very large disks – FAT cache describes entire disk, not just open files! – Not fast enough for large databases • Used in MS-DOS, early Windows systems – Also USB Flash drives, floppy disks, etc. CS-4513, B-Term 2010 File System Implementations 25
Bad Block Management – FAT File Systems • Same as Linked File Systems • I. e. , format all sectors of disk • Don’t reserve any spare sectors • Allocate bad blocks to a hidden file for the purpose • If a block becomes bad, append to the hidden file • Same advantages and disadvantages CS-4513, B-Term 2010 File System Implementations 26
Disk Defragmentation • Re-organize blocks in disk so that file is (mostly) contiguous • Link or FAT organization preserved • Purpose: – To reduce disk arm movement during sequential accesses – To permit contiguous reads or writes in one disk operation • Does not change the linked structure of the file system! CS-4513, B-Term 2010 File System Implementations 27
Exam Question (a few years ago) • You have a humongous database stored in a file on a 4 GB flash drive with a FAT file system. What must the file system do to locate block n of the database? • Assume that database has not been defragmented, so that its blocks are likely to be scattered randomly across the flash drive. • Given that the file system has found the location of block n, what must it do to find the location of block n+1? block n-1? CS-4513, B-Term 2010 File System Implementations 28
Questions? Linked and FAT File Systems CS-4513, B-Term 2010 File System Implementations 29
File Allocation Schemes • Contiguous – Blocks of file stored in consecutive disk sectors – Directory points to first entry • Linked – Blocks of file scattered across disk, as linked list – Directory points to first entry • Indexed – Separate index block contains pointers to file blocks – Directory points to index block CS-4513, B-Term 2010 File System Implementations 30
Problems and Issues • Contiguous file systems not suitable for files that come and go rapidly & frequently • Linked file systems not suitable for very large disks or a lot of random access • Can we do better? CS-4513, B-Term 2010 File System Implementations 31
Indexed Allocation • i-node: – Contains file metadata – Lists sector address of each block of file • Advantages – True random access – Only i-nodes of open files need to be cached – Supports small and large files CS-4513, B-Term 2010 File System Implementations 32
Unix/Linux i-nodes • Direct blocks: – Pointers to first n sectors • Single indirect table: – Extra block containing pointers to blocks n+1. . n+m • Double indirect table: – Extra block containing single indirect blocks • … CS-4513, B-Term 2010 File System Implementations 33
Indexed Allocation • Access to every block of file is via i-node • Bad block management – Similar to Linked/FAT systems • Advantages – Faster than Linked or FAT systems – Extremely flexible – Supports large and small files well – Supports files that come and go a lot CS-4513, B-Term 2010 File System Implementations 34
Indexed Allocation (continued) • Advantages (continued) – Especially suited to multi-user environments • Universities; software development teams • Servers with lots of temporary files • Disadvantage – Not as fast as contiguous allocation for large databases and static files • Requires reference to i-node for every access vs. • Simple calculation of block to sector address CS-4513, B-Term 2010 File System Implementations 35
Indexed Allocation (continued) • Widely used in Unix, Linux, Windows NTFS • Robust • Has withstood the test of time • Many variations CS-4513, B-Term 2010 File System Implementations 36
Questions? CS-4513, B-Term 2010 File System Implementations 37
Free Block Management in File Systems • Bitmap – Very compact on disk – Expensive to search – Supports contiguous allocation • Free list – Linked list of free blocks • Each block contains pointer to next free block – Only head of list needs to be cached in memory – Very fast to search and allocate – Contiguous allocation very difficult CS-4513, B-Term 2010 File System Implementations 38
Free Block Management Bit Vector 0 1 2 n-1 … bit[i] = 0 block[i] free 1 block[i] occupied Free block number calculation (number of bits per word) * (number of 0 -value words) + offset of first 1 bit CS-4513, B-Term 2010 File System Implementations See also Fig 4 -22 39
Free Block Management Bit Vector (continued) • Bit map must be kept both in memory and on disk • I. e. , memory copy is a cache of disk copy • Copy in memory and disk may differ • Cache may be newer than disk • Cannot allow block[i] to have bit[i] = 1 in memory & bit[i] = 0 on disk Why? CS-4513, B-Term 2010 File System Implementations 40
Free Block Management Bit Vector (continued) • Solution: – Set bit[i] = 1 in disk – Allocate block[i] – Set bit[i] = 1 in memory – Similarly for set of contiguous blocks • Potential for lost blocks in event of crash! – Discussion: – How do we solve this problem? CS-4513, B-Term 2010 File System Implementations 41
Free Block Management Linked List • Linked list of free blocks – Not necessarily in order! • Cache first few free blocks in memory • Head of list must be stored both – On disk – In memory • Each block must be written to disk when freed • Potential for losing blocks? CS-4513, B-Term 2010 File System Implementations 42
Free Block Management – Linked List (continued) • Can also be implemented in FAT • Can also be implemented in Indexed File system CS-4513, B-Term 2010 File System Implementations 43
Reading Assignment • Tanenbaum § 4. 3 CS-4513, B-Term 2010 File System Implementations 44
Scalability of File Systems • Question: How large can a file be? • Answer: limited by – Number of bits in length field in metadata – Size & number of block entries in FAT or i-node • Question: How large can file system be? • Answer: limited by – Size & number of block entries in FAT or i-node CS-4513, B-Term 2010 File System Implementations 45
MS-DOS & Windows • FAT-12 (primarily on floppy disks): • 4096 512 -byte blocks • Only 4086 blocks usable! • FAT-16 (early hard drives): • 64 K blocks; block sizes up to 32 K bytes • 2 GBytes max per partition, 4 partitions per disk • FAT-32 (Windows 95) • 228 blocks; up to 2 TBytes per disk • Max size FAT requires 232 bytes in RAM! CS-4513, B-Term 2010 File System Implementations 46
MS-DOS File System (continued) • Maximum partition for different block sizes • Empty boxes are forbidden combinations CS-4513, B-Term 2010 File System Implementations 47
Classical Unix • Maximum number of i-nodes = 64 K! • How many files in a modern PC? • I-node structure allows very large files, but … • Limited by size of internal fields • Limited by # of entries in triple-indirect block CS-4513, B-Term 2010 File System Implementations 48
Modern Operating Systems • Need much larger, more flexible file systems • • Many terabytes per system Multi-terabyte files Suitable for both large and small Cache only open files in RAM CS-4513, B-Term 2010 File System Implementations 49
Examples of Modern File Systems • Windows NTFS • Tanenbaum § 11. 8 • Linux ext 2 and ext 3 • Tanenbaum § 10. 6. 3 • Other file systems … • Consult your favorite system documentation • Find out what Android OS offers! CS-4513, B-Term 2010 File System Implementations 50
New Topic CS-4513, B-Term 2010 File System Implementations 51
Mounting mount –t type device pathname • Attach device (which contains a file system of type) to the directory at pathname • File system implementation for type gets loaded and connected to the device • Anything previously below pathname becomes hidden until the device is un-mounted again • The root of the file system on device is now accessed as pathname • E. g. , mount –t iso 9660 /dev/cdrom /my. CD CS-4513, B-Term 2010 File System Implementations 52
Mounting (continued) • OS automatically mounts devices in mount table at initialization time • /etc/fstab in Linux • Users or applications may mount devices at run time, explicitly or implicitly — e. g. , • Insert a floppy disk • Plug in a USB flash drive • Type may be implicit in device • Windows equivalent • Map drive CS-4513, B-Term 2010 File System Implementations 53
Problem • All the file systems that one might want to mount are different • E. g. , • Contiguous • Linked or FAT • Indexed • E. g. , • NTFS • Ext 3 • … CS-4513, B-Term 2010 • Many kinds of media and implementations • • • CD & DVD Hard drive Flash drive Floppy disk Network files … How to provide a uniform way of accessing files on them? File System Implementations 54
Virtual File System • Virtual File Systems (VFS) provide objectoriented way of implementing file systems. • VFS allows same system call interface to be used for different types of file systems. • The file API is to the VFS interface, rather than any specific type of file system. CS-4513, B-Term 2010 File System Implementations 55
View of Virtual File System VFS is logically like an abstract class Each local file system is like a concrete subclass However, CCS-4513, does. B-Term not 2010 support classes and subclasses Therefore, everything has to 56 be implemented in longhand File System Implementations
Virtual File System (continued) • Mounting: formal mechanism for attaching a file system to the Virtual File interface mount –t type device pathname • I. e. , a mechanism for instantiating an object of type and device … • … and connecting it to pathname CS-4513, B-Term 2010 File System Implementations 57
Linux Virtual File System (VFS) • A generic file system interface provided by the kernel • Common object framework – superblock: a specific, mounted file system – i-node object: a specific file in storage – d-entry object: a directory entry – file object: an open file associated with a process CS-4513, B-Term 2010 File System Implementations 58
Linux Virtual File System (continued) • VFS operations – super_operations: • read_inode, sync_fs, etc. – inode_operations: • create, link, etc. – d_entry_operations: • d_compare, d_delete, etc. – file_operations: • read, write, seek, etc. CS-4513, B-Term 2010 File System Implementations 59
Linux Virtual File System (continued) • Individual file system implementations conform to this architecture. • May be linked to kernel or loaded as modules • Linux kernel 2. 6 supports over 50 file systems in official version • E. g. , minix, ext 2, ext 3, iso 9660, msdos, nfs, smb, … CS-4513, B-Term 2010 File System Implementations 60
Reading references • Tanenbaum § 4. 3, 10. 6. 3, 11. 8 • Esp. § 4. 3. 7 Also: – • Robert Love, Linux Kernel Development, 2 nd edition, Chapter 12 CS-4513, B-Term 2010 File System Implementations 61
Questions? CS-4513, B-Term 2010 File System Implementations 62
Tanenbaum, § 4. 3. 3 Implementation of Directories • A list of [name, information] pairs • Must be scalable from very few entries to very many • Name: • User-friendly, variable length • Any language • Fast access by name • Information: • • • File metadata (itself) Pointer to file metadata block (or i-node) on disk Pointer to first & last blocks of file Pointer to extent block(s) … CS-4513, B-Term 2010 File System Implementations 63
Very Simple Directory name 1 attributes name 2 name 3 name 4 … attributes … • Short, fixed length names • Attribute & disk addresses contained in directory • MS-DOS, etc. CS-4513, B-Term 2010 File System Implementations 64
Simple Directory i-node name 1 i-node name 2 name 3 name 4 … i-node Data structures containing attributes • Short, fixed length names • Attributes in separate blocks (e. g. , i-nodes) • Attribute pointers are disk addresses (or i-node numbers) • Older Unix versions, MS-DOS, etc. CS-4513, B-Term 2010 File System Implementations 65
More Interesting Directory attributes … … name 1 longer_na me 3 very_long_n ame 4 name 2 … • Variable length file names – Stored in heap at end • Modern Unix, Windows • Linear or logarithmic search for name • Compaction needed after – Deletion, Rename See also Fig 4 -15 CS-4513, B-Term 2010 File System Implementations 66
Very Large Directories • Hash-table implementation • Each hash chain like a small directory with variable-length names • Must be sorted for listing CS-4513, B-Term 2010 File System Implementations 67
Questions? CS-4513, B-Term 2010 File System Implementations 68
- Slides: 68