Filesystems – Metadata, Paths, & Caching
Vivek Pai, Princeton University


Diskgedanken
• Assuming you back up and restore files, what factors affect the time involved?
• How are these factors changing?
• What issues affect the rates of change?
• How is total backup time changing over the years?
• What is Occam’s razor?

Today’s Overview
• Finish up metadata, reliability
• A little discussion of mounting, etc.
• Move on to performance
• Quiz 1 not graded
• My project 1 solution available

Occam’s Razor
• From William of Occam (philosopher)
• “entities should not be multiplied unnecessarily”
• Often reduced to other statements
    • “one should not increase, beyond what is necessary, the number of entities required to explain anything”
    • “Make as few assumptions as possible”
    • “once you have eliminated all other possible explanations, what remains must be the answer”

A Reasonable Approach
• Disk size: 40 GB (20-80 GB common)
• File size: 10 KB (5-20 KB common)
• Access time: 10 ms (5-20 ms common)
• Assume 1 seek per file (reasonable)
• 100 files = 1 MB, each access 0.01 sec, so about 1 MB is read per second
• So, 40 GB at 1 MB/s = 40 K sec = 11+ hours
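
The estimate on this slide can be checked with a few lines of arithmetic. This is a minimal sketch using decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes); the function name `file_backup_seconds` is made up for illustration.

```python
def file_backup_seconds(disk_bytes, file_bytes, access_seconds):
    """Time to back up a full disk when every file costs one access (seek)."""
    n_files = disk_bytes / file_bytes
    return n_files * access_seconds

GB = 10**9   # decimal units, matching the round numbers on the slide
KB = 10**3

# Slide values: 40 GB disk, 10 KB files, 10 ms per access.
seconds = file_backup_seconds(40 * GB, 10 * KB, 0.010)
hours = seconds / 3600
print(f"{seconds:,.0f} s = {hours:.1f} hours")
```

Four million files at 10 ms each gives the 40 K seconds quoted above. Note that the seek cost dominates: the size of each file hardly matters at this scale.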

Changes Over Time
• Disk density doubling each year
• Seek time dropping < 10%
• File size growing slowly
• Results
    • # of files grows faster than access time reduction
    • Backup time increases
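
To see why backup time increases, the trends can be projected forward. The sketch below rests on simplifying assumptions that are not in the slides: capacity exactly doubles each year, access time improves by a full 10% each year (the slide says less than 10%, so this is optimistic), and file size stays constant. The function name and defaults are illustrative only.

```python
def projected_backup_hours(years, disk_gb=40.0, file_kb=10.0, access_ms=10.0,
                           capacity_growth=2.0, access_improvement=0.10):
    """Return [(year, hours)] for a file-by-file backup of a full disk.

    Assumed model: capacity multiplies by capacity_growth each year,
    access time shrinks by access_improvement each year, file size is fixed.
    """
    results = []
    for year in range(years + 1):
        n_files = disk_gb * 1e9 / (file_kb * 1e3)
        hours = n_files * (access_ms / 1e3) / 3600
        results.append((year, hours))
        disk_gb *= capacity_growth
        access_ms *= (1 - access_improvement)
    return results

for year, hours in projected_backup_hours(3):
    print(f"year {year}: {hours:.1f} hours")
```

Under these assumptions the backup time grows by a factor of 2 × 0.9 = 1.8 per year, because the number of files grows much faster than the per-file access time shrinks.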

Most Common Answer
• Disk size / maximum transfer rate
• In other words, read sectors, not files
• Can this be done?
    • Yes, if you have access to “raw” disk
    • Which means that you have “root” permission
    • And that the system has raw disk support
• Faster than file-based dump/restore
    • No concept of files, however
    • What happens if you restore to a disk with a different geometry?
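
For comparison, here is the "disk size / maximum transfer rate" estimate next to the file-based one. The slides do not give a transfer rate, so the 20 MB/s below is purely a hypothetical figure chosen for illustration; the function names are also made up.

```python
def raw_backup_seconds(disk_bytes, transfer_bytes_per_second):
    """Sector-by-sector copy: disk size divided by sustained transfer rate."""
    return disk_bytes / transfer_bytes_per_second

def file_backup_seconds(disk_bytes, file_bytes, access_seconds):
    """File-by-file copy with one access (seek) per file."""
    return (disk_bytes / file_bytes) * access_seconds

DISK = 40 * 10**9            # 40 GB, as on the earlier slide
ASSUMED_RATE = 20 * 10**6    # hypothetical 20 MB/s; NOT a figure from the slides

raw = raw_backup_seconds(DISK, ASSUMED_RATE)
by_file = file_backup_seconds(DISK, 10 * 10**3, 0.010)
print(f"raw: {raw / 60:.0f} min, file-based: {by_file / 3600:.1f} h, "
      f"ratio: {by_file / raw:.0f}x")
```

Whatever the real transfer rate, the raw copy avoids the per-file seek, which is where the file-based approach spends nearly all of its time.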

Namespace
• Basically, the filesystem hierarchy
• Provides a convenient way of accessing things
    • Files
    • Devices
    • Pseudo-“filesystems”
• In Unix, a nice, consistent namespace
    • No “drive names”

A Sample File Tree
/
    bin/
    boot/
    proc/
    usr/
        local/
    home/
        mariah/
        vivek/

What If You Have Two Disks?
[Figure: the file tree /, bin/, boot/, proc/, usr/, home/, local/, mariah/, vivek/]

As Mariah’s Files Grow?
[Figure: the file tree /, bin/, boot/, proc/, usr/, home/, local/, mariah/, vivek/]

Mount Points
[Figure: the file tree /, bin/, boot/, proc/, usr/, home/, local/, mariah/, vivek/]

Mount Points
• Original directories get “hidden”
• Traversal is transparent to user
• OS keeps track of various disks (devices)
• But what happens with big disks?
    • Partition (split) them into several logical devices – easier to manage, safer, etc.
    • Home directories in one partition, startup-related files/programs in another, etc.
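
One way to picture how the OS "keeps track of various disks" is a mount table that maps mount points to devices, consulted by longest matching prefix. This is a toy sketch, not how any particular kernel implements it; the device names and the `find_mount` helper are hypothetical.

```python
import posixpath

def find_mount(path, mount_table):
    """Return (mount_point, device, path_within_device) for an absolute path.

    Assumes mount_table contains an entry for "/" and that path is absolute.
    The longest mount point that is a prefix of the path wins, which is why
    anything in the original directory underneath a mount point is "hidden".
    """
    path = posixpath.normpath(path)
    best = None
    for mount_point in mount_table:
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    rest = path[len(best):].lstrip("/")
    return best, mount_table[best], rest

# Hypothetical layout: two partitions on one disk, plus a second disk for mariah.
mounts = {
    "/": "disk0-part1",
    "/home": "disk0-part2",
    "/home/mariah": "disk1",
}

print(find_mount("/home/mariah/music/a.mp3", mounts))
print(find_mount("/home/vivek/notes.txt", mounts))
print(find_mount("/usr/local/bin", mounts))
```

The user only ever sees one path; which device serves it is decided by the table, so the traversal stays transparent.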

Paths
• Each process has “current directory”
    • Convenient shorthand
• Paths that start with “/” are absolute
• Paths that do not start with “/” are relative to current directory
• Path lookup is potentially expensive
    • It’s also repetitive
    • Amenable to caching
    • Metadata cache from assigned reading
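
The lookup cost and the benefit of caching can be sketched with a toy namespace. Everything here is hypothetical (the inode numbers, the `DIRECTORIES` table, the `PathResolver` class); it only illustrates that each path component costs one directory lookup and that a repeated path can be answered from a cache.

```python
import posixpath

# Toy namespace: each directory inode maps names to inode numbers.
DIRECTORIES = {
    2: {"home": 10, "usr": 11},        # inode 2 plays the role of "/"
    10: {"vivek": 20, "mariah": 21},
    11: {"local": 30},
    20: {"notes.txt": 40},
    21: {},
    30: {},
}
ROOT_INODE = 2

class PathResolver:
    def __init__(self):
        self.cache = {}       # absolute path -> inode number
        self.dir_reads = 0    # number of directory lookups performed

    def resolve(self, path, cwd="/"):
        # join() keeps an absolute path as-is and prefixes cwd otherwise.
        absolute = posixpath.normpath(posixpath.join(cwd, path))
        if absolute in self.cache:
            return self.cache[absolute]
        inode = ROOT_INODE
        for name in absolute.strip("/").split("/"):
            if name:
                self.dir_reads += 1          # one directory read per component
                inode = DIRECTORIES[inode][name]
        self.cache[absolute] = inode
        return inode

resolver = PathResolver()
first = resolver.resolve("notes.txt", cwd="/home/vivek")   # relative path
reads_after_first = resolver.dir_reads
second = resolver.resolve("/home/vivek/notes.txt")         # absolute, cached
reads_after_second = resolver.dir_reads
print(first, second, reads_after_first, reads_after_second)
```

A real cache also has to be invalidated when directories change; that is left out of this sketch.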

Finding Paths
• In Unix, directory contains inode #
• If two directories contain same #, file is accessible via different paths (and names)
• Adding another name into the filespace is called “linking” (via ‘ln’ command)
• But the directory is a file
• What happens if a directory gets linked?
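
Linking can be observed directly from Python. This is a small sketch assuming a POSIX-style filesystem that supports hard links; the file names are arbitrary. `os.link` does the same job as the `ln` command.

```python
import os
import tempfile

def hard_link_demo():
    """Create a file, give it a second name, and inspect the shared inode."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "report.txt")
        alias = os.path.join(d, "same-report.txt")
        with open(original, "w") as f:
            f.write("hello")

        os.link(original, alias)   # like: ln report.txt same-report.txt

        same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
        link_count = os.stat(original).st_nlink

        os.remove(original)        # removes one name, not the file itself
        with open(alias) as f:
            survived = f.read()
        return same_inode, link_count, survived

print(hard_link_demo())
```

Both names refer to the same inode, and the data stays reachable until the last name is removed.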

Consider The Following
[Figure: the file tree /, bin/, boot/, proc/, usr/, home/, local/, mariah/, vivek/]

Various Solutions
• Only allow “root” to link to directory
    • Can still be useful
    • Hopefully root knows when to do it
• Limit the number of iterations
    • Pick some “large” maximum
    • Terminate traversal after that
• Detect loops
    • Cost? Utility?
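
The last two strategies can be compared on a toy directory graph. This is a sketch only: the graph, the directory ids, and both function names are hypothetical, and a real traversal would work on inodes rather than a Python dict.

```python
def walk_limited(tree, node, path="", depth=0, max_depth=8):
    """Strategy: terminate the traversal after some 'large' maximum depth."""
    if depth > max_depth:
        return []
    found = [path or "/"]
    for name, child in sorted(tree.get(node, {}).items()):
        found += walk_limited(tree, child, path + "/" + name, depth + 1, max_depth)
    return found

def walk_detecting_loops(tree, node, path="", ancestors=()):
    """Strategy: refuse to re-enter a directory already on the current path."""
    if node in ancestors:
        return []                      # loop detected, stop here
    found = [path or "/"]
    for name, child in sorted(tree.get(node, {}).items()):
        found += walk_detecting_loops(tree, child, path + "/" + name,
                                      ancestors + (node,))
    return found

# Hypothetical graph: directory 3 contains a link "back" to the root (directory 1).
TREE = {1: {"home": 2}, 2: {"vivek": 3}, 3: {"back": 1}}

limited = walk_limited(TREE, 1, max_depth=4)
detected = walk_detecting_loops(TREE, 1)
print(limited)
print(detected)
```

The depth limit is cheap but visits the same directories repeatedly before giving up; loop detection visits each directory once per path but has to remember the ancestors, which is the cost the slide asks about.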

Does It “Do What You Want”?
• I create ~vivek/work/cal/now/mtgs
• I create a link to it via ~vivek/mtgs
• The month advances, and ~vivek/work/cal/now/mtgs becomes ~vivek/cal/Sep 01/mtgs
• Create new ~vivek/work/cal/now/mtgs
• To what does ~vivek/mtgs point?

Symbolic Link
• Created via “ln -s” command
• Dynamically interpreted each use
• Does not create a standard directory entry pointing to the target. Instead:
    • Link is a file containing the target file/path
    • May be stored in inode if link is short
• Standard looping rules apply
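
The calendar scenario from the previous slide can be replayed with both kinds of link to see the difference. This is a sketch assuming a POSIX-style filesystem where hard links and symbolic links are both permitted; the directory and file names are simplified stand-ins for the paths on the slide, and `mtgs` is a regular file here.

```python
import os
import tempfile

def link_demo():
    """Return what a hard link and a symbolic link see after 'the month advances'."""
    with tempfile.TemporaryDirectory() as home:
        now = os.path.join(home, "now")
        old = os.path.join(home, "Sep01")
        target = os.path.join(now, "mtgs")

        os.mkdir(now)
        with open(target, "w") as f:
            f.write("September meetings")

        hard = os.path.join(home, "mtgs-hard")
        soft = os.path.join(home, "mtgs-soft")
        os.link(target, hard)       # like: ln now/mtgs mtgs-hard
        os.symlink(target, soft)    # like: ln -s now/mtgs mtgs-soft

        os.rename(now, old)         # the month advances: now/ becomes Sep01/
        os.mkdir(now)
        with open(target, "w") as f:
            f.write("October meetings")

        with open(hard) as f:
            via_hard = f.read()
        with open(soft) as f:
            via_soft = f.read()
        return via_hard, via_soft

print(link_demo())
```

The hard link still names the original inode, so it follows the old file into its new location. The symbolic link stores only the path, which is reinterpreted on each use, so it finds the newly created file.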