UNIX Internals: The New Frontiers, Chapters 8 & 9
- Slides: 67
UNIX Internals: The New Frontiers, Chapters 8 & 9: File Systems

Contents
- The User Interface to Files
- File System Framework
- The Vnode/VFS Architecture
- Implementation Overview
- File-System-Dependent Objects
- Mounting a File System
- Operations on Files
- The System V File System (s5fs)
- s5fs Kernel Organization

8.2 The User Interface
- Key abstractions: files, directories, file descriptors, file systems
- Files & Directories
  - File: logically, a container for data
  - A hierarchical, tree-structured name space
  - Pathname: all the components on the path from the root to the node, separated by "/"
  - "." & "..": the directory itself and its parent
  - Link: a directory entry for a file

Directory tree (figure)

Operations on directories
    dirp = opendir(const char *filename);
    direntp = readdir(dirp);
    rewinddir(dirp);
    status = closedir(dirp);

    struct dirent {
        ino_t d_ino;
        char  d_name[NAME_MAX + 1];
    };

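
The directory calls above can be exercised with a short sketch. It assumes a POSIX system; `count_entries` is a hypothetical helper name introduced here for illustration, not something from the slides.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/*
 * Count the entries in a directory (hypothetical helper).
 * "." and ".." are counted when the file system reports them.
 * Returns -1 if the directory cannot be opened.
 */
int count_entries(const char *path)
{
    DIR *dirp;
    struct dirent *direntp;
    int n = 0;

    assert(path != NULL);
    dirp = opendir(path);
    if (dirp == NULL)
        return -1;
    while ((direntp = readdir(dirp)) != NULL)
        n++;                    /* direntp->d_name holds the entry name */
    closedir(dirp);
    return n;
}
```

A caller would typically print `direntp->d_name` inside the loop; counting keeps the sketch free of output.
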
File Attributes
- Kept in the inode (index node)
- File attributes:
  - File type
  - Number of hard links
  - File size
  - Device ID
  - Inode number
  - User and group IDs of the owner of the file
  - Timestamps
  - Permissions and mode flags

Permissions and mode flags
- Owner, group, others (3 x 3 bits)
- Read, write, execute (3 bits each)
- Mode flags (apply to executable files):
  - suid, sgid: set the user's effective UID (or GID) to that of the owner (or group) of the file
  - sticky: retain the program in the swap area

System calls
- link, unlink: create and delete hard links
- utimes: change the access and modification timestamps
- chown: change the owner UID and GID
- chmod: change permissions and mode flags

File Descriptors
- fd = open(path, oflag, mode);
- fd is a per-process object.

File descriptors (figure)

File I/O
- Random and sequential access
  - lseek: random access
- nread = read(fd, buf, count);
- write has similar semantics
- Operations are serialized
- In append mode, the offset pointer is set to the end of the file before each write

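
A minimal sketch of random access with `lseek` followed by `read`, assuming a POSIX system. `read_file_at` is a hypothetical helper name; only `open`, `lseek`, `read`, and `close` are real calls.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to `count` bytes of `path`, starting at byte `offset`
 * (hypothetical helper). Returns the number of bytes read, or -1.
 */
ssize_t read_file_at(const char *path, off_t offset, void *buf, size_t count)
{
    ssize_t nread;
    int fd;

    assert(path != NULL && buf != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {   /* move the offset pointer */
        close(fd);
        return -1;
    }
    nread = read(fd, buf, count);    /* read advances the offset by nread */
    close(fd);
    return nread;
}
```
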
Scatter-Gather I/O
- nbytes = writev(fd, iov, iovcnt);

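
A small gather-write sketch, assuming a POSIX system: two separate buffers go out in a single `writev` call. `write_pair` is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Write two strings with one writev() call (hypothetical helper).
 * Returns the total number of bytes written, or -1 on error.
 */
ssize_t write_pair(int fd, const char *a, const char *b)
{
    struct iovec iov[2];

    assert(a != NULL && b != NULL);
    iov[0].iov_base = (void *)a;    /* writev only reads from the buffers */
    iov[0].iov_len  = strlen(a);
    iov[1].iov_base = (void *)b;
    iov[1].iov_len  = strlen(b);
    return writev(fd, iov, 2);      /* iovcnt = 2 */
}
```

The kernel gathers the buffers in array order, so the receiver sees one contiguous byte stream.
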
File Locking
- Individual read and write calls are atomic.
- Advisory locks: protect only against cooperating processes; flock() in 4BSD
- Mandatory locks: enforced by the kernel; in SVR3, the file must first be enabled for locking through chmod
- SVR4: read/write locks
- C library function lockf

8.3 File Systems
- Mount-on
  - A directory is covered by the mounted file system.
  - Mount table (original) & vfs list (modern)
- Restrictions
  - A file cannot span file systems.
  - Each file system must reside on a single logical disk.

Logical Disks
- A logical disk is a storage abstraction that the kernel sees as a linear sequence of fixed-size, randomly accessible blocks.
- newfs, mkfs
- Traditional: a partition holds the physical storage of a file system
- Modern configurations:
  - Volume (several disks combined)
  - Disk mirroring
  - Stripe sets
  - RAID (Redundant Array of Inexpensive Disks)

Special files
- Generalization of the file abstraction to include all kinds of I/O-related objects: directories, symbolic links, hardware devices (disks, terminals, printers), pseudodevices such as the system memory, and communications abstractions such as pipes and sockets
- Problems with hard links: they may not span file systems, hard links to directories can be created by the superuser only, and they raise ownership problems

Special files (continued)
- Symbolic link: a special file that points to another file (the linked-to file). The data portion of the file contains the pathname of the linked-to file, which may be stored in the inode of the symbolic link (more on this in Practical UNIX Programming, pp. 90-96).
- Pipes: created by the pipe system call, deleted by the kernel automatically
- FIFOs: created by the mknod system call, must be deleted explicitly

8.5 File System Framework
- Traditional UNIX cannot support more than one type of file system.
- New developments (DOS, file sharing, RFS, NFS) required the framework to change.
- AT&T: file system switch
- Sun Microsystems: vnode/vfs
- DEC: gnode
- SVR4: (AT&T + vnode/vfs + NFS) -> de facto standard

8.6 The Vnode/Vfs Architecture
- Objectives
  - Support several file system types simultaneously.
  - Different disk partitions may contain different types of file systems.
  - Support sharing files over a network.
  - Vendors should be able to create their own file system types and add them to the kernel.

Lessons from Device I/O
- Devices: block & character
- Character device switch:
    struct cdevsw {
        int (*d_open)();
        int (*d_close)();
        int (*d_read)();
        int (*d_write)();
    } cdevsw[];
- Major device number: used as the index into cdevsw[]

read system call (in traditional UNIX)
1) Use the file descriptor to get to the open file object.
2) Check the entry to see if the file is open for read.
3) Get the pointer to the in-core inode from this entry.
4) Lock the inode so as to serialize access to the file.
5) Check the inode mode field and find that the file is a character device file.
6) Use the major device number to index into a table of character devices and obtain the cdevsw entry for this device.
7) From the cdevsw, obtain the pointer to the d_read routine for this device.
8) Invoke the d_read operation to perform the device-specific processing of the read request.
9) Unlock the inode and return to the user.

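
Steps 6 to 8 of the read path can be sketched as a table lookup followed by an indirect call. This is a user-space toy, not kernel code: the device-number encoding (`toy_dev_t`, `TOY_MAJOR`, `TOY_MAKEDEV`), the two drivers, and `chr_read` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int toy_dev_t;   /* assumed layout: major in bits 8-15, minor in bits 0-7 */
#define TOY_MAJOR(d)        (((d) >> 8) & 0xff)
#define TOY_MAKEDEV(ma, mi) ((toy_dev_t)(((ma) << 8) | (mi)))

struct cdevsw {
    int (*d_open)(toy_dev_t dev);
    int (*d_close)(toy_dev_t dev);
    int (*d_read)(toy_dev_t dev, char *buf, int count);
    int (*d_write)(toy_dev_t dev, const char *buf, int count);
};

/* Toy drivers: "null" returns end-of-file, "zero" fills the buffer with zeros. */
static int any_open(toy_dev_t dev)  { (void)dev; return 0; }
static int any_close(toy_dev_t dev) { (void)dev; return 0; }
static int any_write(toy_dev_t dev, const char *buf, int count)
{ (void)dev; (void)buf; return count; }
static int null_read(toy_dev_t dev, char *buf, int count)
{ (void)dev; (void)buf; (void)count; return 0; }
static int zero_read(toy_dev_t dev, char *buf, int count)
{
    int i;
    (void)dev;
    for (i = 0; i < count; i++)
        buf[i] = 0;
    return count;
}

static struct cdevsw cdevsw[] = {
    { any_open, any_close, null_read, any_write },   /* major 0 */
    { any_open, any_close, zero_read, any_write },   /* major 1 */
};
#define NCHRDEV (sizeof cdevsw / sizeof cdevsw[0])

/* Index the switch by major number, then invoke d_read through the pointer. */
int chr_read(toy_dev_t dev, char *buf, int count)
{
    unsigned int maj = TOY_MAJOR(dev);

    if (maj >= NCHRDEV)
        return -1;                                   /* no such device */
    return (*cdevsw[maj].d_read)(dev, buf, count);
}
```

The caller never names a driver function directly, which is the property the vnode/vfs design later generalizes to file systems.
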
Lessons from Device I/O (continued)
- It is necessary to separate the file subsystem code into file-system-independent code and file-system-dependent code.
- The interface between these two parts is defined by a set of generic functions that are called by the file-system-independent code.

Object-Oriented Design (figure)

Overview of the Vnode/Vfs Interface
- A vnode represents a file in the UNIX kernel.
- A vfs represents a file system.

Base class data and operations pointers
- v_data: file-system-dependent private data: inode (s5fs), rnode (NFS), tmpnode (tmpfs)
- v_op: points to the vnodeops vector
- Example: to close the file associated with the vnode
    #define VOP_CLOSE(vp, …) (*((vp)->v_op->vop_close))(vp, …)

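
The macro-based dispatch can be tried out in user space. The sketch below trims the vnode down to the two pointers discussed here; the "toy" file system, its `toy_inode`, and `toy_vnode_init` are hypothetical and exist only to give the macros something to call.

```c
#include <assert.h>
#include <stddef.h>

struct vnode;

struct vnodeops {
    int (*vop_open)(struct vnode *vp);
    int (*vop_close)(struct vnode *vp);
};

struct vnode {
    struct vnodeops *v_op;    /* file-system-dependent operations */
    void            *v_data;  /* file-system-dependent private data */
};

/* File-system-independent code calls through these macros only. */
#define VOP_OPEN(vp)  ((*((vp)->v_op->vop_open))(vp))
#define VOP_CLOSE(vp) ((*((vp)->v_op->vop_close))(vp))

/* Hypothetical file system whose private object just counts opens. */
struct toy_inode { int open_count; };

static int toy_open(struct vnode *vp)
{
    struct toy_inode *ip = vp->v_data;
    return ++ip->open_count;
}

static int toy_close(struct vnode *vp)
{
    struct toy_inode *ip = vp->v_data;
    return --ip->open_count;
}

static struct vnodeops toy_vnodeops = { toy_open, toy_close };

/* Bind a vnode to the toy file system and its private data. */
void toy_vnode_init(struct vnode *vp, struct toy_inode *ip)
{
    assert(vp != NULL && ip != NULL);
    ip->open_count = 0;
    vp->v_op = &toy_vnodeops;
    vp->v_data = ip;
}
```
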
VFS base class (figure)

8.7 Implementation Overview
- Objectives
  - Each operation must be carried out on behalf of the current process.
  - Certain operations may need to serialize access to the file.
  - The interface must be stateless and reentrant.
  - File system implementations should be allowed to use global resources, such as the buffer cache.
  - The interface should be usable by the server side of a remote file system.
  - The use of fixed-size static tables must be avoided.

Vnodes and Open Files
- The vnode is the fundamental abstraction that represents an active file in the kernel.
- Access to a vnode:
  - by a file descriptor
  - by file-system-dependent data structures

Data structures and reference counts (figure)

The Vnode
    struct vnode {
        u_short          v_flag;
        u_short          v_count;
        struct vfs      *v_vfsmountedhere;
        struct vnodeops *v_op;
        struct vfs      *v_vfsp;
        …
    };  // p. 242

Vnode Reference Count
- It determines how long the vnode must remain in the kernel.
- Reference versus lock
- A reference is acquired when:
  - a file is opened
  - a process holds a reference to its current directory
  - a new file system is mounted
  - the pathname traversal routine visits a directory
- A file is physically deleted only when the reference count becomes zero.

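
A toy sketch of hold/release counting on `v_count`. The function names `vn_hold` and `vn_rele` and the `inactive_calls` field are assumptions for illustration; in a real kernel the last release would hand the vnode to the file system rather than bump a counter.

```c
#include <assert.h>
#include <stddef.h>

struct vnode {
    unsigned short v_count;         /* reference count */
    int            inactive_calls;  /* toy stand-in for "file system was told the vnode is idle" */
};

/* Take a reference: the vnode must now stay in memory. */
void vn_hold(struct vnode *vp)
{
    assert(vp != NULL);
    vp->v_count++;
}

/* Drop a reference: the last release makes the vnode reclaimable. */
void vn_rele(struct vnode *vp)
{
    assert(vp != NULL && vp->v_count > 0);
    if (--vp->v_count == 0)
        vp->inactive_calls++;
}
```

Note that a reference only pins the object in memory; it does not serialize access the way a lock does.
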
The Vfs Object
    struct vfs {
        struct vfs    *vfs_next;
        struct vfsops *vfs_op;
        struct vnode  *vfs_vnodecovered;
        int            vfs_fstype;
        caddr_t        vfs_data;
        dev_t          vfs_dev;
        …
    };  // p. 243

8.8 File-System-Dependent Objects
- The Per-File Private Data
- The vnode is an abstract object.

The vnodeops Vector
    struct vnodeops {
        int (*vop_open)();
        int (*vop_close)();
        …
    };  // p. 245
For ufs:
    struct vnodeops ufs_vnodeops = {
        ufs_open,
        ufs_close,
        …
    };  // p. 246

File-System-Dependent Parts of the Vfs Layer
    struct vfsops {
        int (*vfs_mount)();
        int (*vfs_unmount)();
        int (*vfs_root)();
        int (*vfs_statvfs)();
        int (*vfs_sync)();
        …
    };  // p. 246

8.9 Mounting a File System
- mount(spec, dir, flags, type, dataptr, datalen)  // SVR4
- Virtual File System Switch: a global table containing one entry for each file system type.
    struct vfssw {
        char *vsw_name;
        int (*vsw_init)();
        struct vfsops *vsw_vfsops;
        …
    } vfssw[];

mount Implementation
- Adds the vfs structure to the linked list headed by rootvfs.
- Sets the vfs_op field to the vfsops vector specified in the switch entry.
- Sets the vfs_vnodecovered field to point to the vnode of the mount point directory.

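
The bookkeeping listed above can be sketched in user space as follows. Everything here is a simplified assumption: the structures carry only the fields mentioned in the slides, `toy_mount` and `toy_vfs_mount` are hypothetical names, the list starts empty, and the order of the steps is illustrative rather than the SVR4 order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vfs;

struct vfsops { int (*vfs_mount)(struct vfs *vfsp); };
struct vnode  { struct vfs *v_vfsmountedhere; };
struct vfs {
    struct vfs    *vfs_next;
    struct vfsops *vfs_op;
    struct vnode  *vfs_vnodecovered;
    int            vfs_fstype;
};
struct vfssw { const char *vsw_name; struct vfsops *vsw_vfsops; };

/* Toy file-system-dependent mount routine: always succeeds. */
static int toy_vfs_mount(struct vfs *vfsp) { (void)vfsp; return 0; }

static struct vfsops s5_vfsops  = { toy_vfs_mount };
static struct vfsops ufs_vfsops = { toy_vfs_mount };

static struct vfssw vfssw[] = {
    { "s5",  &s5_vfsops  },
    { "ufs", &ufs_vfsops },
};
#define NFSTYPE ((int)(sizeof vfssw / sizeof vfssw[0]))

static struct vfs *rootvfs;   /* head of the list of mounted file systems */

/* Mount a file system of the named type on directory vnode `dir`. */
int toy_mount(const char *type, struct vnode *dir, struct vfs *vfsp)
{
    struct vfs **pp;
    int i;

    assert(type != NULL && dir != NULL && vfsp != NULL);
    for (i = 0; i < NFSTYPE; i++)              /* find the switch entry */
        if (strcmp(vfssw[i].vsw_name, type) == 0)
            break;
    if (i == NFSTYPE)
        return -1;                             /* unknown file system type */

    vfsp->vfs_fstype = i;
    vfsp->vfs_op = vfssw[i].vsw_vfsops;        /* from the switch entry */
    vfsp->vfs_vnodecovered = dir;              /* the mount point directory */
    vfsp->vfs_next = NULL;
    if ((*vfsp->vfs_op->vfs_mount)(vfsp) != 0) /* file-system-dependent part */
        return -1;

    for (pp = &rootvfs; *pp != NULL; pp = &(*pp)->vfs_next)
        ;                                      /* walk to the end of the list */
    *pp = vfsp;
    dir->v_vfsmountedhere = vfsp;              /* the directory is now covered */
    return 0;
}
```
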
VFS_MOUNT processing
- Verify permissions for the operation.
- Allocate and initialize the private data object of the file system.
- Store a pointer to it in the vfs_data field of the vfs object.
- Access the root directory of the file system and initialize its vnode in memory.

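8.10 Operations on Files
Pathname Traversal
lookuppn(): starting from u_cdir (for a relative pathname), loop over the pathname components:
1. Check that v_type is a directory.
2. ".." at the system root: move on to the next component.
3. ".." at the root of a mounted file system: access the mount point.
4. Call VOP_LOOKUP.
5. Component not found: if it is the last one, return success; otherwise return error ENOENT.
6. The component is a mount point: go to the root of the mounted vfs.
7. The component is a symbolic link: translate it and append the rest of the pathname.
8. Release the directory.
9. Go back to the top of the loop.
10. On termination, do not release the reference to the final vnode.  // p. 250
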
Opening a file
fd = open(pathname, mode)
1. Allocate a descriptor.
2. Allocate an open file object.
3. Call lookuppn().
4. Check the vnode for permissions.
5. Check that the requested operations are allowed.
6. If the file does not exist: with O_CREAT, call VOP_CREATE; otherwise return ENOENT.
7. Call VOP_OPEN.
8. If O_TRUNC is set, call VOP_SETATTR.
9. Initialize the open file object.
10. Return the index of the file descriptor.  // p. 252

Other topics
- File I/O
- File attributes
- User credentials
- Analysis
- Drawbacks of the SVR4 Implementation
- The 4.4BSD Model

Chapter 9: File System Implementations

9.2 The System V File System (s5fs)
- Layout of an s5fs partition: boot area (B) | superblock (S) | inode list | data blocks
- Directories: an s5fs directory is a special file containing a list of files and subdirectories.

Inodes
- The inode contains administrative information, or metadata.
- The inode list contains all the inodes.
- On-disk inode: see Table 9-1
- In-core inodes have more fields.

Inode Fields (table)

di_mode Bit-fields (figure)

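Block array of the inode: di_addr (sizes assume 1 KB blocks)
- Direct: 10 blocks, 10 KB
- Single indirect: 256 blocks, 256 KB
- Double indirect: 256*256 = 65,536 blocks, 64 MB
- Triple indirect: 256*256*256 = 16,777,216 blocks, 16 GB
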
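
The arithmetic behind these limits can be checked with a short sketch that reports which level of indirection a logical block number falls under. It assumes 10 direct entries and 256 block numbers per indirect block, as in the figures above; `s5_indirection_level` is a hypothetical name, not an s5fs routine.

```c
#include <assert.h>

#define NDIRECT   10UL    /* direct entries in di_addr */
#define NINDIRECT 256UL   /* block numbers held by one indirect block (assumed) */

/*
 * Return 0 for a direct block, 1/2/3 for single/double/triple
 * indirection, or -1 if the block lies beyond the maximum file size.
 */
int s5_indirection_level(unsigned long lbn)
{
    unsigned long span = NINDIRECT;   /* blocks reachable at the current level */
    int level;

    if (lbn < NDIRECT)
        return 0;
    lbn -= NDIRECT;
    for (level = 1; level <= 3; level++) {
        if (lbn < span)
            return level;
        lbn -= span;                  /* skip past this level's range */
        span *= NINDIRECT;            /* 256, then 256^2, then 256^3 */
    }
    return -1;
}
```

For example, block 10 is the first block reached through the single indirect block, and block 266 (10 + 256) is the first one that needs double indirection.
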
The superblock
- Size in blocks of the file system
- Size in blocks of the inode list
- Number of free blocks and inodes
- Free block list
- Free inode list

Free block list (figure)

9.3 s5fs Kernel Organization
- In-core inodes hold additional fields:
  - The vnode
  - Device ID
  - Inode number of the file
  - Flags for synchronization and cache management
  - Pointers to keep the inode on a free list
  - Pointers to keep the inode on a hash queue
  - Block number of the last block read

Allocating and Reclaiming Inodes
- Inode table (LRU) containing the active inodes
- When the reference count of a vnode reaches 0, the inode is reclaimed as free.
- iget(): allocates an inode

Inode lookup
- s5lookup()
- Checks the directory name lookup cache
- On a cache miss, reads the directory one block at a time, searching the entries for the specified file name
- If the file is in the directory, gets the inode number and uses iget() to locate the inode
- If the inode is already in the table, iget() returns it; otherwise it allocates a new inode, initializes it, copies in the on-disk inode, puts it on the hash queue, and initializes the vnode (v_op, v_data, v_vfsp)
- Returns the pointer to the inode

File I/O (1)
- Read (to a user buffer address)
- fd -> the open file object, verify mode -> vnode -> get the rw-lock -> call s5read()
- Offset -> block number & offset within the block -> uiomove() -> call copyout()
- Page not in memory? Page fault -> the handler -> s5getpage() -> call bmap()
- Logical-to-physical mapping; search the vnode's page list; if the page is not there, allocate a free page and call the disk driver to read the data from disk
- Sleeps until the I/O completes. Before copying to user data space, verifies that the user has access
- s5read() returns, unlocks, advances the offset, and returns the number of bytes read

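
The "offset -> block number & offset" step is plain integer division. A minimal sketch, assuming a 1 KB block size; `struct blkpos` and `offset_to_block` are hypothetical names.

```c
#include <assert.h>

#define S5_BSIZE 1024L   /* assumed block size in bytes */

struct blkpos {
    long lbn;   /* logical block number within the file */
    long off;   /* byte offset within that block */
};

/* Split a file offset into (logical block, offset in block). */
struct blkpos offset_to_block(long offset)
{
    struct blkpos pos;

    assert(offset >= 0);
    pos.lbn = offset / S5_BSIZE;
    pos.off = offset % S5_BSIZE;
    return pos;
}
```

With these assumptions, file offset 2500 lands in logical block 2 at byte 452, since 2 * 1024 + 452 = 2500.
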
File I/O (2)
- Write:
  - Data is not written to disk immediately
  - May increase the file size
  - May require the allocation of data blocks
  - A partial-block write reads the entire block, modifies the relevant data, and writes back the whole block

Allocating and reclaiming inodes
- When the reference count drops to 0 …
- When a file becomes inactive …
- It is better to reuse inodes …

Analysis of s5fs
- Reliability concern: a single superblock
- Performance:
  - 2 disk I/Os (inode first, then data)
  - Blocks randomly located
  - Block size: 512 (SVR2), 1024 (SVR3)
- Name: 14 characters
- Inode limit: 65535

The Berkeley Fast File System
- Hard disk structure
- On-disk organization
  - Blocks and fragments
  - Allocation policy
- FFS functionality enhancements
  - Long file names
  - Symbolic links
  - Other enhancements
- Analysis

Other file systems
- Temporary file systems (RAM disk, mfs, tmpfs)
- The Specfs File System
- The /proc File System

Linux Virtual File System
- Uniform file system interface to user processes
- Represents any conceivable file system's general features and behavior
- Assumes files are objects that share basic properties regardless of the target file system

Primary Objects in VFS
- Superblock object: represents a specific mounted file system
- Inode object: represents a specific file
- Dentry object: represents a specific directory entry
- File object: represents an open file associated with a process