Disks and Files
Vivek Pai
Princeton University


Why Files
  • Physical reality
      - Block oriented
      - Physical sector #s
      - No protection among users of the system
      - Data might be corrupted if machine crashes
  • Filesystem model
      - Byte oriented
      - Named files
      - Users protected from each other
      - Robust to machine failures

File Structures
  • Byte sequence
      - Read or write a number of bytes
      - Unstructured or linear
  • Record sequence
      - Fixed or variable length
      - Read or write a number of records
  • Tree
      - Records with keys
      - Read, insert, delete a record (typically using B-tree)

File Structures Today
  • Stream of bytes
      - Simplest to implement in kernel
      - Easy to manipulate in other forms
      - Little performance loss
  • More complicated structures
      - Hardware assist fell out of favor
      - Special-purpose hardware slower, costly

File Types
  • ASCII – plain text
  • A Unix executable file
      - Header: magic number, sizes, entry point, flags
      - Text (code)
      - Data
      - Relocation bits
      - Symbol table
  • Devices
  • Everything else in the system

So What Makes Filesystems Hard?
  • Files grow and shrink in pieces
  • Little a priori knowledge
  • 6 orders of magnitude in file sizes
  • Overcoming disk performance behavior
  • Desire for efficiency
  • Coping with failure

File System Components
  • Disk management
      - Arrange collection of disk blocks into files
  • Naming
      - User gives file name, not track or sector number, to locate data
  • Security
      - Keep information secure
  • Reliability/durability
      - When system crashes, lose stuff in memory, but want files to be durable
  [Figure labels: User, File naming, File access, Disk management, Security, Disk drivers]

Some Definitions
  • File descriptor (fd) – an integer used to represent a file; easier than using names
  • Metadata – data about data; bookkeeping data used to eventually access the “real” data
  • Open file table – system-wide list of descriptors in use
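
To make the file descriptor definition concrete, here is a small sketch using Python's `os` module, which exposes the underlying descriptor calls. The file name `example.txt` and its contents are made up for the illustration.

```python
import os
import tempfile

# Create a scratch file so the example does not depend on any existing path.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")   # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"hello, disk")

    fd = os.open(path, os.O_RDONLY)   # the name is looked up once, here
    fd_is_int = isinstance(fd, int)   # the descriptor is just an integer
    first = os.read(fd, 5)            # later calls use the fd, not the name
    size = os.fstat(fd).st_size       # metadata is reached through the same fd
    os.close(fd)
```

After the `os.open` call, every operation goes through the integer rather than the path, which is the convenience the definition refers to.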

Kinds of Metadata
  • inode – index node, or a specific set of information kept about each file
      - Two forms – on disk and in memory
  • Directory – names and location information for files and subdirectories
      - Note: stored in files in Unix
  • Superblock – contains information to describe the file system, disk layout
      - Information about free blocks/inodes on disk

Contents of an Inode
  • Disk inode:
      - File type, size, blocks on disk
      - Owner, group, permissions (r/w/x)
      - Reference count
      - Times: creation, last access, last mod
      - Inode generation number
      - Padding & other stuff
  • 128 bytes on classic Unix

Directories in Unix
  • Stored like regular files
      - Contents are file names and inode #s
      - Names are nul-terminated strings
  • Logic
      - Separates file from location in tree
      - File can appear in multiple places
  • What are the drawbacks?
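
The idea that a directory only maps names to inode numbers, so one file can appear in several places, can be sketched with plain dictionaries. Everything here (the inode number 12, the file names, the `link` and `unlink` helpers) is a hypothetical stand-in for on-disk structures, not an actual Unix interface.

```python
# Hypothetical in-memory stand-ins for on-disk structures.
inodes = {12: {"ref_count": 0, "data": b"report"}}
root = {}      # a directory: file name -> inode number
backup = {}    # a second directory elsewhere in the tree


def link(directory, name, ino):
    """Add a name for an existing inode and bump its reference count."""
    directory[name] = ino
    inodes[ino]["ref_count"] += 1


def unlink(directory, name):
    """Remove one name; free the inode only when no names remain."""
    ino = directory.pop(name)
    inodes[ino]["ref_count"] -= 1
    if inodes[ino]["ref_count"] == 0:
        del inodes[ino]


link(root, "report.txt", 12)
link(backup, "old-report.txt", 12)   # same file, second place in the tree
unlink(root, "report.txt")           # file survives: one name still refers to it
```

This is where the inode's reference count earns its keep: the file's contents go away only when the last name pointing at the inode is removed.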

Effects of Corruption
  • inode – file gets “damaged”
      - Maybe some “free” block gets viewed
  • Directory – “lose” files/directories
      - Might get to read deleted files
  • Superblock – can’t figure out anything
      - This is why we replicate the superblock

Data Structures for A Typical File System
  [Figure: Process control block (open file pointer array), open file table (systemwide), memory inode, disk inode]

Opening A File
  • File name lookup and authentication
  • Copy the file metadata into the in-memory data structure, if it is not in yet
  • Create an entry in the open file table (system wide) if there isn’t one
  • Create an entry in PCB
  • Link up the data structures
  • Return a pointer to user
  [Figure: fd = open(FileName, access); PCB; allocate & link up data structures; open file table; file name lookup & authenticate; metadata; file system on disk]
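
The steps above can be sketched as a toy model. The class names (`MemInode`, `OpenFile`, `Process`, `ToyFS`) are invented for this illustration; authentication is omitted, and the sketch simplifies one step by creating a new open file table entry on every open instead of checking for an existing one.

```python
from dataclasses import dataclass, field


@dataclass
class MemInode:
    number: int
    ref_count: int = 0


@dataclass
class OpenFile:            # one entry in the system-wide open file table
    inode: MemInode
    access: str
    offset: int = 0


@dataclass
class Process:             # stands in for the PCB's open file pointer array
    fds: list = field(default_factory=list)


class ToyFS:
    def __init__(self, directory):
        self.directory = directory    # file name -> inode number ("on disk")
        self.inode_cache = {}         # inode number -> MemInode (in memory)
        self.open_file_table = []     # system-wide

    def open(self, proc, name, access):
        ino = self.directory[name]    # name lookup (KeyError if missing)
        # Copy metadata into memory only if it is not there yet.
        inode = self.inode_cache.setdefault(ino, MemInode(ino))
        inode.ref_count += 1
        entry = OpenFile(inode, access)
        self.open_file_table.append(entry)
        proc.fds.append(entry)        # entry in the PCB, linked to the table
        return len(proc.fds) - 1      # the fd is an index into the PCB array


fs = ToyFS({"foo": 7})
proc = Process()
fd0 = fs.open(proc, "foo", "r")
fd1 = fs.open(proc, "foo", "r")
```

Opening the same file twice yields two descriptors and two open file table entries (each with its own offset) that share a single in-memory inode.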

Reading And Writing
  What happens when you…
  • read 10 bytes from a file?
  • write 10 bytes into an existing file?
  • write 1024 bytes into a file?
  Disk works on blocks (sectors)
  • Can have temporary (ephemeral) buffers
  • Longer lasting buffers = disk cache
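
One way to explore these questions is a toy disk that only accepts whole-block transfers. The sketch below assumes 512-byte blocks and a file stored contiguously from block 0; `ToyDisk` and `write_bytes` are hypothetical names, and there is no cache, so every partial-block write pays for a read first.

```python
BLOCK_SIZE = 512  # assumed sector size


class ToyDisk:
    """A disk that can only read or write whole blocks."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.reads = 0
        self.writes = 0

    def read_block(self, n):
        self.reads += 1
        return self.blocks[n]

    def write_block(self, n, data):
        assert len(data) == BLOCK_SIZE
        self.writes += 1
        self.blocks[n] = bytes(data)


def write_bytes(disk, offset, data):
    """Write data at a byte offset using only whole-block I/O."""
    pos = 0
    while pos < len(data):
        block_no, start = divmod(offset + pos, BLOCK_SIZE)
        n = min(BLOCK_SIZE - start, len(data) - pos)
        if n == BLOCK_SIZE:
            buf = bytearray(BLOCK_SIZE)                 # full overwrite: no read
        else:
            buf = bytearray(disk.read_block(block_no))  # partial: read first
        buf[start:start + n] = data[pos:pos + n]
        disk.write_block(block_no, buf)
        pos += n


small = ToyDisk(4)
write_bytes(small, 100, b"x" * 10)     # 10 bytes inside one block

large = ToyDisk(4)
write_bytes(large, 0, b"y" * 1024)     # exactly two aligned blocks
```

Under these assumptions the 10-byte write costs one read plus one write, while the aligned 1024-byte write costs two writes and no reads.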

Reading A Block
  [Figure: path of a read call]
  • read(fd, userBuf, size)
  • PCB, open file table, metadata
  • Logical to physical block mapping
  • Buffer cache: get physical block to sysBuf, copy to userBuf
  • read(device, phyBlock, size)
  • Disk device driver
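
The read path can be sketched as two pieces: a logical-to-physical block map and a buffer cache in front of the device. The names `BufferCache` and `read`, the 512-byte block size, and the dictionary standing in for the device are assumptions made for this illustration.

```python
BLOCK_SIZE = 512  # assumed block size


class BufferCache:
    def __init__(self, device):
        self.device = device      # physical block number -> bytes
        self.cache = {}
        self.device_reads = 0

    def get_block(self, phys):
        """Return the block from the cache, reading the device on a miss."""
        if phys not in self.cache:
            self.device_reads += 1
            self.cache[phys] = self.device[phys]
        return self.cache[phys]


def read(cache, block_map, offset, size):
    """Copy `size` bytes starting at byte `offset` into a user buffer."""
    user_buf = bytearray()
    while size > 0:
        logical, start = divmod(offset, BLOCK_SIZE)
        if logical >= len(block_map):
            break                                       # end of file
        sys_buf = cache.get_block(block_map[logical])   # logical -> physical
        chunk = sys_buf[start:start + size]
        user_buf += chunk
        offset += len(chunk)
        size -= len(chunk)
    return bytes(user_buf)


device = {40: b"A" * BLOCK_SIZE, 41: b"B" * BLOCK_SIZE}
cache = BufferCache(device)
block_map = [41, 40]        # file block 0 lives in physical block 41, etc.
straddle = read(cache, block_map, 510, 4)
again = read(cache, block_map, 0, 10)
```

The first call crosses a block boundary and touches the device twice; the second is served entirely from the cache.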

A Disk Layout for A File System
  [Figure: disk regions in order: Boot block, Super block, File metadata (i-node in Unix), File data blocks]
  • Superblock defines a file system
      - Size of the file system
      - Size of the file descriptor area
      - Free list pointer, or pointer to bitmap
      - Location of the file descriptor of the root directory
      - Other meta-data such as permission and various times
  • For reliability, replicate the superblock

File Usage Patterns
  • How do users access files?
      - Sequential: bytes read in order
      - Random: read/write element out of middle of arrays
      - Whole file or partial file
  • How are files used?
      - Most files are small
      - Large files use up most of the disk space
      - Large files account for most of the bytes transferred
  • Bad news
      - Need everything to be efficient

Data Structures for Disk Management
  • A “header” for each file (part of the file meta-data)
      - Disk sectors associated with each file
  • A data structure to represent free space on disk
      - Bit map
          · 1 bit per block (sector)
          · Blocks numbered in cylinder-major order, why?
      - Linked list
      - Others?
  • How much space does a bit map need for a 4G disk?
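
The closing question has no single answer because it depends on the block size, which the slide does not give. The sketch below works it out for two assumed sizes (512-byte sectors and 4 KiB blocks), reading "4G" as 4 GiB, and adds a minimal bitmap; `bitmap_bytes` and `FreeBitmap` are hypothetical names.

```python
GIB = 1 << 30


def bitmap_bytes(disk_bytes, block_bytes):
    """Bytes needed for a bitmap with one bit per block."""
    blocks = disk_bytes // block_bytes
    return (blocks + 7) // 8          # round up to whole bytes


class FreeBitmap:
    """One bit per block: 1 = in use, 0 = free."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_used(self, n):
        return bool(self.bits[n >> 3] & (1 << (n & 7)))

    def mark_used(self, n):
        self.bits[n >> 3] |= 1 << (n & 7)

    def mark_free(self, n):
        self.bits[n >> 3] &= ~(1 << (n & 7)) & 0xFF


with_sectors = bitmap_bytes(4 * GIB, 512)    # 8 Mi blocks -> 1 MiB of bitmap
with_4k = bitmap_bytes(4 * GIB, 4096)        # 1 Mi blocks -> 128 KiB of bitmap
```

Either way the bitmap is tiny relative to the disk: about 0.02% of capacity with 512-byte sectors, and one eighth of that with 4 KiB blocks.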

Linked Files (Alto)
  • File header points to 1st block on disk
  • Each block points to next
  • Pros
      - Can grow files dynamically
      - Free list is similar to a file
  • Cons
      - Random access: horrible
      - Unreliable: losing a block means losing the rest
  [Figure: file header pointing to a chain of blocks ending in null]
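
Why random access is "horrible" shows up as soon as you count disk reads. In this sketch the disk is a dictionary from block number to a `(data, next_block)` pair; the block numbers and the helper `read_nth_block` are invented for the illustration.

```python
def read_nth_block(disk, first_block, n):
    """Return (data, disk_reads) for the n-th block (0-based) of a linked file."""
    block = first_block
    reads = 0
    while block is not None:
        data, next_block = disk[block]   # one disk read per hop
        reads += 1
        if n == 0:
            return data, reads
        n -= 1
        block = next_block
    raise IndexError("file has fewer blocks than requested")


# A three-block file scattered over the disk: 9 -> 4 -> 17 -> null.
disk = {9: (b"a", 4), 4: (b"b", 17), 17: (b"c", None)}
first = read_nth_block(disk, 9, 0)
last = read_nth_block(disk, 9, 2)
```

Reaching block n takes n + 1 reads, and if any block in the chain is lost, every block after it becomes unreachable.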

Contiguous Allocation
  • Request in advance for the size of the file
  • Search bit map or linked list to locate a space
  • File header
      - First sector in file
      - Number of sectors
  • Pros
      - Fast sequential access
      - Easy random access
  • Cons
      - External fragmentation
      - Hard to grow files
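
A minimal sketch of the search step, using first fit over a list of free/used flags. First fit is only one possible policy and is an assumption here, as are the names `first_fit` and `sector_of` and the 512-byte sector size.

```python
def first_fit(free, length):
    """Return the start of the first run of `length` free sectors, or None.

    `free` is a list of booleans (True = free); `length` must be at least 1.
    """
    run_start, run_len = 0, 0
    for i, is_free in enumerate(free):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == length:
                return run_start
        else:
            run_len = 0
    return None


def sector_of(first_sector, byte_offset, sector_size=512):
    """Random access is one division: no pointers to follow."""
    return first_sector + byte_offset // sector_size


free = [True, False, True, True, False, True, True, True]
two = first_fit(free, 2)
three = first_fit(free, 3)
four = first_fit(free, 4)    # six sectors are free, but no run of four
```

The last call illustrates external fragmentation: the request fails even though enough free sectors exist in total.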

Single-Level Indexed Files or Extent-based Filesystems
  • A user declares max size
  • A file header holds an array of pointers to point to disk blocks
  • Pros
      - Can grow up to a limit
      - Random access is fast
  • Cons
      - Clumsy to grow beyond limit
      - Periodic cleanup of new files
      - Up-front declaration a real pain
  [Figure: file header with pointers to disk blocks]

File Allocation Table (FAT)
  • Approach
      - A section of disk for each partition is reserved
      - One entry for each block
      - A file is a linked list of blocks
      - A directory entry points to the 1st block of the file
  • Pros
      - Simple
  • Cons
      - Always go to FAT
      - Wasting space
  [Figure: FAT indexed from 0; directory entry foo points to block 217; entry 217 holds 619, entry 619 holds 399, entry 399 holds EOF]
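
Following a FAT chain can be sketched in a few lines. The block numbers below are read off the slide's flattened figure (foo, 217, 619, 399, EOF), so the exact linkage is an interpretation; the `EOF` sentinel value and the helper `file_blocks` are hypothetical.

```python
EOF = -1   # hypothetical end-of-chain marker

# One FAT entry per block: entry i holds the number of the block after i.
fat = {217: 619, 619: 399, 399: EOF}
directory = {"foo": 217}     # directory entry points to the first block


def file_blocks(directory, fat, name):
    """Walk the FAT from the file's first block until EOF."""
    blocks = []
    block = directory[name]
    while block != EOF:
        blocks.append(block)
        block = fat[block]   # every hop is a FAT lookup ("always go to FAT")
    return blocks


foo_blocks = file_blocks(directory, fat, "foo")
```

Compared with linked files, the chain lives in the table rather than inside the data blocks, so the walk touches only the FAT until the wanted block number is known.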

Multi-Level Indexed Files (Unix)
  • 13 pointers in a header
      - 10 direct pointers
      - 11: 1-level indirect
      - 12: 2-level indirect
      - 13: 3-level indirect
  • Pros & Cons
      - In favor of small files
      - Can grow
      - Limit is 16G and lots of seek
  • What happens to reach block 23, 5, 340?
  [Figure: header pointers 1, 2, …, 11, 12, 13 leading to data blocks]
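
The closing question can be worked through in code. The slide does not state block or pointer sizes; the sketch assumes 1 KiB blocks and 4-byte pointers (256 pointers per indirect block) because that combination reproduces the stated 16G limit. Block numbers are taken as 0-based, and `locate` is a hypothetical helper.

```python
BLOCK_SIZE = 1024                 # assumed block size in bytes
PTR_SIZE = 4                      # assumed pointer size in bytes
PTRS = BLOCK_SIZE // PTR_SIZE     # 256 pointers per indirect block
NDIRECT = 10


def locate(block_no):
    """Return (level, indexes) describing how to reach a file block."""
    if block_no < NDIRECT:
        return ("direct", [block_no])
    n = block_no - NDIRECT
    if n < PTRS:
        return ("single", [n])
    n -= PTRS
    if n < PTRS ** 2:
        return ("double", [n // PTRS, n % PTRS])
    n -= PTRS ** 2
    if n < PTRS ** 3:
        return ("triple", [n // PTRS ** 2, (n // PTRS) % PTRS, n % PTRS])
    raise ValueError("block number beyond maximum file size")


max_blocks = NDIRECT + PTRS + PTRS ** 2 + PTRS ** 3
max_bytes = max_blocks * BLOCK_SIZE

where_5 = locate(5)       # direct pointer, no extra reads
where_23 = locate(23)     # slot 13 of the single indirect block
where_340 = locate(340)   # double indirect: slot 0, then slot 74
```

Under these assumptions block 5 needs no indirect-block reads, block 23 needs one, and block 340 needs two before the data block itself can be fetched, which is the "lots of seek" cost for large files.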

Challenges
  • Unix filesystem has great flexibility
  • Extent-based filesystems have speed
  • Seeks kill performance – locality
  • Bitmaps show contiguous free space
  • Linked lists easy to search
  • How do you perform backup/restore?

Bigger, Faster, Stronger
  • Making individual disks larger is hard
  • Throw more disks at the problem
      - Capacity increases
      - Effective access speed may increase
      - Probability of failure also increases
  • Use some disks to provide redundancy
      - Generally assume a fail-stop model
      - Fail-stop versus Byzantine failures

RAID (Redundant Array of Inexpensive Disks)
  • Main idea
      - Store the error correcting codes on other disks
      - General error correcting codes are too powerful
      - Use XORs or single parity
      - Upon any failure, one can recover the entire block from the spare disk (or any disk) using XORs
  • Pros
      - Reliability
      - High bandwidth
  • Cons
      - The controller is complex
  [Figure: RAID controller with XOR unit]
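
Single-parity recovery can be sketched directly: the parity block is the XOR of the data blocks, and XORing the survivors with the parity rebuilds a lost block. The block contents and the helper `xor_blocks` are made up for the illustration, and a fail-stop failure of exactly one known disk is assumed.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data disks, one block each, plus one parity block.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(data)

lost = 1                                   # pretend disk 1 has failed
survivors = [blk for i, blk in enumerate(data) if i != lost] + [parity]
rebuilt = xor_blocks(survivors)            # equals the lost block
```

This works because XOR is its own inverse; it recovers one missing block per stripe, which is why the failed disk must be identifiable.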

Synopsis of RAID Levels
  • RAID Level 0: Non redundant (JBOD)
  • RAID Level 1: Mirroring
  • RAID Level 2: Byte-interleaved, ECC
  • RAID Level 3: Byte-interleaved, parity
  • RAID Level 4: Block-interleaved, parity
  • RAID Level 5: Block-interleaved, distributed parity

Did RAID Work?
  • Performance: yes
  • Reliability: yes
  • Cost: no
      - Controller design complicated
      - Fewer economies of scale
      - High-reliability environments don’t care
  • Now also software implementations