Software Systems File Systems and Storage Emery Berger
Software Systems File Systems and Storage Emery Berger and Mark Corner University of Massachusetts Amherst UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science
Files
§ Associate names with data
§ Usually stored on persistent media (disks)
File Names
§ Hierarchical directory structure
  – Absolute, or relative to the current directory
§ Windows names = location + dir
File Systems
§ Organized set of data types
  – organize data
  – point to where data is stored
  – searchable database of files
§ LOTS of file systems
  – AFS, BFS, CFS, DFS, EFS, FFS, GFS, HFS, etc.
  – Distributed, local, encrypted, different OSs
Directories
§ Directory – just a special file
  – Contains metadata, filenames
    • pointers to inodes
§ Typically hierarchical tree
  – odd exposure of data structure to user
Blocks
§ Storage organized as a sequence of blocks
  – Unit of reading and writing
  – Read, modify, write sequence
§ File system tracks free and full blocks
  – typically stored in a bitmap
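The free-block bitmap can be sketched in a few lines of Python. This is a minimal illustration, not any particular file system's on-disk format: the class name `BlockBitmap`, the lowest-free-block allocation policy, and the bit layout are all assumptions made for the example.

```python
class BlockBitmap:
    """Hypothetical free-block bitmap: bit = 1 means the block is in use."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        # One bit per block, rounded up to whole bytes.
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_used(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self) -> int:
        """Mark the lowest-numbered free block as used and return its number."""
        for block in range(self.num_blocks):
            if not self.is_used(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise OSError("no free blocks")

    def free(self, block: int) -> None:
        """Clear the bit for a block so it can be allocated again."""
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
```

A real file system would usually search near the file's other blocks rather than from block 0, to keep files sequential on disk.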
Inodes
§ On-disk data structure
  – Describes where all the bits of a file (dir) are
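One way to picture "where all the bits of a file are" is an inode that maps byte offsets to block numbers. The sketch below is illustrative only: the block size, the number of direct pointers, and the single indirect list are assumed values, not taken from the slides or from a specific file system.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4096   # assumed block size in bytes
NUM_DIRECT = 12     # assumed number of direct block pointers


@dataclass
class Inode:
    """Hypothetical inode: file size plus pointers to data blocks."""
    size: int = 0
    direct: list = field(default_factory=lambda: [None] * NUM_DIRECT)
    # Stands in for an on-disk block full of further block pointers.
    indirect: list = field(default_factory=list)

    def block_for_offset(self, offset: int):
        """Return the disk block number that holds the given byte offset."""
        if not 0 <= offset < self.size:
            raise ValueError("offset outside file")
        index = offset // BLOCK_SIZE
        if index < NUM_DIRECT:
            return self.direct[index]
        return self.indirect[index - NUM_DIRECT]
```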
Storage
§ Lots of forms of permanent storage
  – Disk drives, flash storage, tape, CDs, DVDs
Storage
§ Disks
  – Seek latency & rotational latency
  – High bandwidth
  – One of two moving parts in a PC
Storage
§ Flash memory
  – Predictable & low latency (including random)
  – Lower bandwidth
  – Larger erase blocks, wears out, energy
Prediction: all PC storage Flash-based in 10 years
Locality
§ File systems use directory structure to improve locality
  – More important for disks than Flash
  – E.g., ext2: all files in same directory clustered in same region of disk
  – Try to make all blocks of same file sequential
  – Move directories apart for expansion
Caching
§ Disk blocks, inodes, directories all cached
§ 1/3 to 1/2 of memory is disk cache
§ Disk drive has a cache too!
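A block cache can be sketched as follows. The slides do not say which replacement policy is used; least-recently-used (LRU) is assumed here for illustration, and `BlockCache` and `read_block` are hypothetical names.

```python
from collections import OrderedDict


class BlockCache:
    """Hypothetical in-memory cache of disk blocks with LRU eviction."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block   # function: block number -> bytes (the slow path)
        self.blocks = OrderedDict()    # block number -> data, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, block_no):
        if block_no in self.blocks:
            # Cache hit: mark the block as most recently used.
            self.blocks.move_to_end(block_no)
            self.hits += 1
            return self.blocks[block_no]
        # Cache miss: go to the disk, then evict the oldest block if over capacity.
        self.misses += 1
        data = self.read_block(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return data
```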
Poor Man’s Database
§ Because files & directories are easy to use, they get used as de facto databases
  – e.g., Internet Explorer web cache
    • ~1000 files in each hash subdirectory
  C:\Documents and Settings\Emery\Local Settings\Temporary Internet Files\Content.IE5> ls -ltra
  total 1873
  -rwx------+ 1 Emery None      67 Jan 10 17:31 desktop.ini
  drwx------+ 2 Emery None       0 Jan 17 22:42 0NDWKTYT
  drwx------+ 7 Emery None       0 Feb 19 19:53 .
  drwx------+ 7 Emery None       0 Apr 20 14:45 ..
  drwx------+ 2 Emery None       0 May  1 21:41 8HZD6WS6
  drwx------+ 2 Emery None       0 May  1 21:54 I4F15DOK
  drwx------+ 2 Emery None       0 May  1 22:03 XM0N4Q4W
  -rwx------+ 1 Emery None 1916928 May  3 12:21 index.dat
  drwx------+ 2 Emery None       0 May  3 12:21 S0RKZRFZ
  C:\Documents and Settings\Emery\Local Settings\Temporary Internet Files\Content.IE5>
File Systems Abstraction
§ File system manages files
  – Traditionally: file system maps files to disk
§ But: files are a convenient abstraction; use the same, easy interface (read, write)
  – Block devices (/dev/scsi0)
    • Disk drives – transfer in blocks
  – Character devices (/dev/tty)
    • Console, printer
  – Proc filesystem (/proc/mem)
  – FIFO (named pipes)
Device files
§ Unix devices live in /dev, act like ordinary files
  elnux14> echo "foo" > /dev/tty
  foo
/proc filesystem
§ Normal file access to kernel internals
  elnux14> ls -l /proc/30917/
  total 0
  (per-entry permissions, sizes, and times omitted; owner emery, group fac, all dated May 3)
  attr
  auxv
  cmdline
  cwd -> /nfs/elsrv4/users5/fac/emery
  environ
  exe -> /bin/tcsh
  fd
  loginuid
  maps
  mem
  mounts
  root -> /
  statm
  status
  task
  wchan
File Metadata
§ Files have a lot of associated “metadata”; ex.: Unix (from stat)
  – Dates: last status change (ctime; not a creation date), last modified, last accessed
  – Size (bytes)
  – User & group ID of file’s owner
  – File type (not content type)
    • Directory
    • Regular file
    • Block / character device (disk drive, screen)
    • FIFO
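The same metadata can be read programmatically. The sketch below uses Python's standard `os.stat` and `stat` modules; the helper name `describe` and the shape of the returned dictionary are choices made for this example.

```python
import os
import stat


def describe(path):
    """Return a small dictionary of Unix-style metadata for a path."""
    st = os.stat(path)
    if stat.S_ISDIR(st.st_mode):
        kind = "directory"
    elif stat.S_ISREG(st.st_mode):
        kind = "regular file"
    elif stat.S_ISBLK(st.st_mode):
        kind = "block device"
    elif stat.S_ISCHR(st.st_mode):
        kind = "character device"
    elif stat.S_ISFIFO(st.st_mode):
        kind = "FIFO"
    else:
        kind = "other"
    return {
        "type": kind,                 # file type, not content type
        "size": st.st_size,           # bytes
        "uid": st.st_uid,             # owner's user ID
        "gid": st.st_gid,             # owner's group ID
        "modified": st.st_mtime,      # last modification time
        "accessed": st.st_atime,      # last access time
        "changed": st.st_ctime,       # last status change on Unix
    }
```

For example, `describe("/dev/tty")["type"]` would be expected to report a character device on a typical Unix system.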
Untyped Files
§ Unix, Windows – file contents untyped
  – Stream of bytes
  – Type implied by convention (extensions)
    • .ppt, .pdf, …
§ Mac: file types stored in metadata
Access Control
§ Unix: each file has associated bits that control access (& other stuff)
  – Read
  – Write
  – Execute
§ Can specify for three “users”
  – User (file owner)
  – Group (set of users)
  – Other (everyone else)
Access Control - chmod
§ Can read bits via ls, set bits via chmod
  elnux14> ls -l ack.scm
  -rw-r----- 1 emery fac 197 Feb 25 15:19 ack.scm
  elnux14> chmod -r ack.scm
  elnux14> ls -l ack.scm
  --w------- 1 emery fac 197 Feb 25 15:19 ack.scm
  elnux14> cat ack.scm
  cat: ack.scm: Permission denied
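The same bit manipulation can be done from a program. This sketch assumes a POSIX system and behaves like `chmod a-r` (it clears read permission for user, group, and other regardless of the umask); the function name `remove_read_permission` is made up for the example.

```python
import os
import stat


def remove_read_permission(path):
    """Clear all three read bits and return the new mode as an ls-style string."""
    mode = stat.S_IMODE(os.stat(path).st_mode)            # current permission bits
    read_bits = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH
    os.chmod(path, mode & ~read_bits)                     # keep everything except read
    return stat.filemode(os.stat(path).st_mode)
```

Starting from mode `rw-r-----` as in the transcript, the returned string would be `--w-------`. Note that the superuser can still read such a file.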
Access Control Lists (ACLs)
§ ACLs are more expressive
  – Specify different rights per user or group
  – Opinion: one of the biggest UNIX problems
What’s Wrong with One Disk?
Distributed File Systems
§ Numerous drawbacks of local file systems
  – Inconvenient
  – Administrative overhead
  – Single point-of-failure
§ Solution: distributed file systems
  – FS appears local, but data remote
  – Two major implementations:
    • Windows (CIFS, SAMBA)
    • NFS (Sun’s Network File System)
§ Lots of manual DFSs (rsync, svn, USB keys)
Complications
§ Complexity and design tradeoffs
  – Naming – absolute vs. relative (to server)
  – Remote access vs. caching
  – Stateless or stateful server
  – Single image or replication
Naming & Transparency
§ Issues
  – How are files named?
  – Do filenames reveal location?
  – Do filenames change if file moves?
  – Do filenames change if user moves?
Location naming
§ Location transparency
§ Use indirection!
  – filename does not reveal storage location
  – Normal in Unix
  – Compare to Windows: C:\foobar
§ Name may still change
  – if storage location changes
  – transparent, not independent!
What parts are transparent?
§ Windows
  – Remote: //computer/share/…/directory/file
    • Remote files are explicit!
  – Local: …/directory/file
§ UNIX:
  – Remote: /…/mountpoint/directory/file
    • Remote files look like any other file
  – Local: /…/directory/file
§ Neither reveals all of storage location
  – Windows reveals machine, UNIX does not
NFS Example
URLs Viewed as File System
§ Uniform Resource Locator names: an increasingly standard way to access data
  protocol://machine/path/to/file
§ Good? Bad?
§ Looks like Windows… same?
File Caching
§ Cache information from file server locally
§ Local disk:
  – Reduces access time (compared to remote)
  – Safe if node fails
  – Requires client to have disk (…)
§ Local memory:
  – Quick
  – Works without disks
  – Smaller cache size
  – Not fault-tolerant
Remote File Access & Caching
§ Caching issues:
  – Performance:
    • Where & when to cache file blocks?
  – Correctness:
    • When to propagate updates back to remote file?
    • What happens with multiple clients sharing?
Sharing with Others
When do changes get written?
§ User A opens a file, changes a file
  – When does it write it to file server?
  – If another user opens file does it see the changes?
§ Unix/one-copy semantics
  – Immediate
    • keep in mind UI issues
§ Session semantics
  – After close
§ Transaction semantics
  – Defined by program
  – Uncommon in FS
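Session semantics can be illustrated with a toy model in which a client works on a private copy and pushes it to the server only on close. The classes `FileServer` and `SessionFile` are hypothetical and stand in for a real client and server; no networking is involved.

```python
class FileServer:
    """Toy server: just a dictionary from file name to contents."""

    def __init__(self):
        self.files = {}


class SessionFile:
    """Toy client handle with session semantics: writes reach the server on close()."""

    def __init__(self, server, name):
        self.server = server
        self.name = name
        # Opening the file takes a private snapshot of the server's copy.
        self.local = server.files.get(name, "")

    def read(self):
        return self.local

    def write(self, data):
        # Only the private copy changes; other clients cannot see this yet.
        self.local = data

    def close(self):
        # Changes become visible to clients that open the file after this point.
        self.server.files[self.name] = self.local
```

Under Unix/one-copy semantics, by contrast, `write` would update the server immediately, so a reader that already has the file open would see the change.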
How is client informed?
§ Client-initiated consistency
  – client contacts server and checks consistency
    • every access
    • at given intervals
    • only upon opening a file
§ Server-initiated consistency
  – server detects potential conflicts, invalidates caches
  – Server needs to know:
    • which clients have cached which parts of which files, plus
    • which clients are readers & which are writers
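Client-initiated consistency with a check on open can be sketched as below. Using a per-file version counter is an assumption made for this example (real systems may compare modification times instead), and the `Server` and `Client` classes are hypothetical.

```python
class Server:
    """Toy file server that bumps a version number on every write."""

    def __init__(self):
        self.data = {}
        self.version = {}

    def write(self, name, contents):
        self.data[name] = contents
        self.version[name] = self.version.get(name, 0) + 1

    def get_version(self, name):
        return self.version.get(name, 0)

    def fetch(self, name):
        return self.data[name], self.version[name]


class Client:
    """Toy client that validates its cached copy each time a file is opened."""

    def __init__(self, server):
        self.server = server
        self.cache = {}      # name -> (contents, version)
        self.fetches = 0     # number of full transfers from the server

    def open(self, name):
        cached = self.cache.get(name)
        # Cheap check: compare versions, and refetch only if the copy is stale.
        if cached is None or cached[1] != self.server.get_version(name):
            self.cache[name] = self.server.fetch(name)
            self.fetches += 1
        return self.cache[name][0]
```

Checking only on open keeps traffic low but means a client may read stale data while the file stays open; checking on every access trades more traffic for fresher data.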
Conflicts
§ Simplest kind: Read-Write Conflicts
  – Two people read same thing:
    • “The cat is red”
  – Both write:
    • “The cat is brown”, “The cat is purple”
  – Which is right?
§ Can this happen locally?
  – Yes! Try it with an editor
§ Worse with DFS, not obvious to user why
RAID, NAS, SAN Storage
§ Redundant Array of Inexpensive Disks
  – Multiple disks attached to controller
  – Disks each carry part of data
    • Redundancy, error detection, parallel transfer
§ Network Attached Storage
  – Box w/network port and storage (e.g., XRAID)
§ Storage Area Network
  – Specialized network of NAS (e.g., XSAN)
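The redundancy idea can be made concrete with XOR parity, the scheme used by parity-based RAID levels. The slides do not name a RAID level, so this is an assumed example; the helper names are hypothetical and every block is assumed to have the same length.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity block stored on an extra disk."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity_block):
    """Recover the one lost data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity_block])
```

Because XOR is its own inverse, XOR-ing the parity with all surviving data blocks yields the missing one, so the array tolerates the loss of any single disk.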
The “Near”-Future
§ Parallel File Systems (pNFS, GFS)
§ Separate meta-data and data
§ Store data chunks on different machines
The End
Atomic Updates
§ Shadowing
§ Logs
§ Explain!
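Shadowing can be sketched at the application level: write the new version to a separate (shadow) file, then switch names in one atomic step. This is an analogy to what a file system does internally with shadow blocks, not a description of any particular implementation; `atomic_write` is a hypothetical helper, and atomicity of the rename is assumed to come from the underlying POSIX file system.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Replace the contents of path so readers see either the old or the new version."""
    directory = os.path.dirname(os.path.abspath(path))
    # The shadow copy lives in the same directory so the rename stays on one file system.
    fd, shadow = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())       # force the new data to stable storage first
        os.replace(shadow, path)       # atomic switch from old version to new
    except BaseException:
        if os.path.exists(shadow):
            os.unlink(shadow)          # a failed update leaves the old file untouched
        raise
```

A log (journal) takes the other route: it first records the intended change in an append-only area, then applies it in place, and replays the record after a crash.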
Named Pipes (FIFO)
§ Special file: acts like unnamed pipe
§ E.g., cat file | wc -l
  elnux14> mkfifo thePipe
  elnux14> ls -ld thePipe
  prw-r----- 1 emery fac 0 May 3 14:00 thePipe
  elnux14> cat simplesocket.h > thePipe &
  [1] 32242
  elnux14> wc -l < thePipe
  155
  [1] Done cat simplesocket.h > thePipe
  elnux14>
§ Useful when a program cannot do redirection
§ Especially for compression
Named Pipes (FIFO)
§ Exercise:
  – Program named “joe” outputs file “joe.out”
    • Huge (~3 GB)
  – Compress it automagically using gzip -c & named FIFO to “joe.out.gz”
  elnux14> mkfifo joe.out
  elnux14> gzip -c < joe.out > joe.out.gz &
  [1]
  elnux14> joe
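The same trick can be sketched in Python for a POSIX system: a background thread plays the role of `gzip -c`, reading from the FIFO and writing a compressed file, while the producer writes to the FIFO as if it were an ordinary output file. The function name `compress_through_fifo` and the `produce` callback are hypothetical names for this example.

```python
import gzip
import os
import shutil
import threading


def compress_through_fifo(fifo_path, gz_path, produce):
    """Run produce(fifo_path) while a background thread gzips whatever it writes."""
    os.mkfifo(fifo_path)

    def compressor():
        # Opening the FIFO for reading blocks until the producer opens it for writing.
        with open(fifo_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
            shutil.copyfileobj(src, dst)   # stream until the producer closes the FIFO

    # Daemon thread: if produce() fails before opening the FIFO, exit is not blocked.
    worker = threading.Thread(target=compressor, daemon=True)
    worker.start()
    produce(fifo_path)     # the producer believes it is writing a regular file
    worker.join()          # wait for the compressed output to be complete
    os.unlink(fifo_path)
```

The uncompressed data never lands on disk, which is the point of the exercise when the output is several gigabytes.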