
Colleague System Maintenance
File System Administration & Maintenance – Part Two
Cinda Goff, Chuck Hauser
2005-10-30

Presentation Conventions
  • Names (files, users, daemons) are usually in bold: /etc/syslog.conf
  • System-dependent or variable items are usually in italics: /var/sadm/patchnumber/log
  • File entries and output are in mono-spaced type:
        > root 8036 c Tue Apr 26 23:59:00 2005
        < root 8036 c Tue Apr 26 23:59 2005
  • Ä marks a line wrapped to fit on the slide:
        mv Solaris_8_Recommended_log ÄSolaris_8_Recommended_log.yyyymmdd
  • ð marks a horizontal tab (09 hex)
  • Reference OE is Solaris 8
  • Reference UniData is 6.0

Unix File System Maintenance
  • Log File Maintenance
  • Cleaning the File System (files that do not need to be retained)
  • File System Maintenance (fsck & UFS Logging)

Solaris 8 (and Earlier) Automatically Maintained Logs
  Log                      Maintained by
  /var/adm/messages        /usr/lib/newsyslog
  /var/cron/log            /etc/cron.d/logchecker
  /var/log/syslog          /usr/lib/newsyslog
  /var/lp/logs/lpsched     lp crontab entry
  /var/lp/logs/requests    lp crontab entry
  /var/spool/lps/adm       EasySpooler lpsd daemon

Solaris 9 System Log Rotation
Starting with Solaris 9, logadm is used to handle log rotation for:
  • /var/adm/messages
  • /var/cron/log
  • /var/log/syslog
  • /var/lp/logs/lpsched
  • /var/lp/logs/requests

Some Logs That Need Manual Maintenance
These files will grow forever:
  • /var/adm/authlog
  • /var/adm/loginlog
  • /var/adm/sulog
  • /var/adm/wtmpx
  • /var/sadm/install_data/Solaris_8_Recommended_log
  • /var/sadm/patchnumber/log
  • /var/spool/lps/adm/activity_log (EasySpooler)

Trimming Logs
  • Most logs can be cleared by copying /dev/null over the log:
        cp /dev/null logfile
    (or simply: >logfile)
  • Or save the file, then clear it:
        mv logfile logfile.save.name && >logfile
  • Or keep only the latest entries in the log with the tail command:
        tail -50 logfile >logfile.tmp && mv logfile.tmp logfile
  • To stop EasySpooler logging, remove the log file:
        rm /var/spool/lps/adm/activity_log
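The tail-based trim above can be wrapped in a small helper. This is only a sketch, not part of the original slides: the function name trim_log and the scratch file are hypothetical, and it copies the trimmed data back with cat rather than mv so the log keeps its inode, owner, and permissions.

```shell
#!/bin/sh
# trim_log FILE N: keep only the last N lines of FILE (hypothetical helper).
trim_log() {
    log=$1
    keep=$2
    tmp="$log.tmp.$$"
    # Write the newest entries to a temporary file, then copy them back.
    # Using cat instead of mv keeps the original inode, so a daemon that
    # holds the log open keeps writing to the same file.
    tail -"$keep" "$log" >"$tmp" && cat "$tmp" >"$log"
    status=$?
    rm -f "$tmp"
    return $status
}

# Demonstration on a scratch file rather than a real system log.
demo=/tmp/trim_demo.$$
i=1
while [ $i -le 100 ]; do
    echo "entry $i"
    i=`expr $i + 1`
done >"$demo"

trim_log "$demo" 50
lines=`wc -l <"$demo" | tr -d ' '`
first=`head -1 "$demo"`
echo "$lines lines kept, first is: $first"
rm -f "$demo"
```

Run as written, the demonstration leaves the last 50 of 100 entries, so the first surviving line is "entry 51".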

Patch Cluster Logs Recommendation
  • If a Solaris_8_Recommended_log file already exists, the next cluster installation will append new entries at the end of the existing log.
  • You may want to save each log individually:
        mv Solaris_8_Recommended_log Solaris_8_Recommended_log.yyyymmdd
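One way to generate the yyyymmdd suffix automatically is with date +%Y%m%d. The sketch below is illustrative only: save_dated is a hypothetical helper, and it works in a scratch directory instead of /var/sadm/install_data.

```shell
#!/bin/sh
# save_dated FILE: rename FILE to FILE.yyyymmdd (hypothetical helper).
save_dated() {
    stamp=`date +%Y%m%d`
    target="$1.$stamp"
    # Refuse to overwrite a copy that was already saved today.
    if [ -f "$target" ]; then
        echo "$target already exists" >&2
        return 1
    fi
    mv "$1" "$target" && echo "$target"
}

# Demonstration in a scratch directory.
dir=/tmp/patchlog_demo.$$
mkdir "$dir" || exit 1
echo "sample entry" >"$dir/Solaris_8_Recommended_log"

saved=`save_dated "$dir/Solaris_8_Recommended_log"`
listing=`ls "$dir"`
echo "directory now contains: $listing"
rm -rf "$dir"
```

Refusing to overwrite an existing dated copy is a design choice made here so that two cluster installations on the same day cannot silently destroy the first log.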

Maintaining wtmpx
  • Can be maintained by hand or by script:
        cp /var/adm/wtmpx /var/adm/cis.wtmpx.2004
        cp /dev/null /var/adm/wtmpx
    (or simply: >/var/adm/wtmpx)
  • Note: /var/adm/utmpx holds information about who is currently logged into the system. It is used by the who, whodo, w, users, and finger commands. Don’t mess with this file.

Files That May Need Manual Removal
  • core files
  • preserve files
  • backing-store snapshot files

Core Files
  • Core files are memory images of a user process, written to disk when the process is terminated by certain signals.
  • The coreadm command will configure or show settings (but won’t prevent core dumps).
  • To prevent core dumps, use the Bourne shell ulimit command with -c set to zero:
        ulimit -c 0

Removing Core Files
  • As root, find and remove all core files:
        find / -name core -type f -exec rm -f {} \;
  • Note: Be sure to use the ‘-type f’ option; patches for third-party apps such as Apache sometimes have a directory named core.
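A cautious variant is to list the matches first and delete only after reviewing them. The sketch below shows the effect of -type f on a scratch tree; all paths in it are made up for the demonstration.

```shell
#!/bin/sh
# Build a scratch tree containing a core *file* and a *directory* named core.
top=/tmp/core_demo.$$
mkdir -p "$top/app/core" "$top/home/user" || exit 1
echo "fake dump" >"$top/home/user/core"
echo "keep me" >"$top/app/core/module.c"

# Step 1: list what would be removed. -type f skips the directory.
found=`find "$top" -name core -type f -print`
echo "would remove: $found"

# Step 2: remove only after reviewing the list.
find "$top" -name core -type f -exec rm -f {} \;

# The directory named core and its contents are untouched.
remaining=`find "$top" -name core -type f -print`
kept=`cat "$top/app/core/module.c"`
echo "left in app/core: $kept"
rm -rf "$top"
```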

Crash Dump Files
  • A crash dump file is a disk copy of the computer’s physical memory at the time of a fatal system error.
  • Usually not a problem, unless the machine has crashed.
  • The dump is usually saved to a swap partition; on reboot, the savecore command copies the dump to the savecore directory.

Removing Crash Dump Files
  • dumpadm will show crash dump settings, including the savecore directory:
        # dumpadm
              Dump content: kernel pages
               Dump device: /dev/dsk/c2t0d0s3 (swap)
        Savecore directory: /var/crash/cis
          Savecore enabled: yes
  • Check the savecore directory for a crash dump file (vmcore.n), or use a find command with: -name "vmcore.*".

Preserve Files
  • Should the vi editor (or the system) crash, a copy of the user’s file is stored in /var/preserve/username.
  • vi -r filename allows recovery of the work file.
  • It is unlikely that there are any files there, but you may want to check and remove them:
        find /var/preserve -type f -exec rm -f {} \;

Snapshot File Cleanup
  • If the fssnap command is used without the unlink option, the backing-store file will still exist on the system and must be removed manually after issuing the fssnap -d filesystem command.
  • If there are no active snapshots (check with fssnap -i), either:
        rm /backing-store-path/snapshot#
    or:
        find /backing-store-path -name "snapshot*" -exec rm {} \;

Unix Disk-Based File Systems
  • Disk drives consist of slices (partitioned using the format command).
  • Each slice is either raw or contains a file system (constructed by newfs, a front end to mkfs).
  • Disk file systems consist of a number of blocks.
  • The default Solaris disk file system is UFS (Unix File System).

VTOC (Volume Table of Contents)
  • The first cylinder of the disk contains a VTOC describing the disk’s slices: slice number, tag, starting sector, size, & last sector.
  • Use the prtvtoc command to save the VTOC of a disk in case the entire disk fails:
        prtvtoc /dev/rdsk/c#t#d0s2 >c#t#d0s2.vtoc
  • Use this file with the fmthard command to recreate the VTOC on a replacement disk:
        fmthard -s c#t#d0s2.vtoc /dev/rdsk/c#t#d0s2

UFS File System Structure
  • Formatting a UFS file system divides the disk slice into cylinder groups.
  • Cylinder groups have four basic types of blocks: boot block, superblock, inode, and storage (or data) block.
  • For further information, see the fs_ufs(4) man page.
  • The structure is documented in /usr/include/sys/fs/ufs_fs.h

Cylinder Group Layout
(Graphic from Sun Microsystems, Inc.)

UFS Areas and Block Types
  Type                   Stores
  Boot Block             Used to boot the system
  Superblock             Information about the file system
  Inode                  128 bytes with all information about a file, except the name(s), which directories store
  Storage (Data block)   File or directory data
  Cylinder Group Map     A bitmap in a UFS file system that stores information about block use and availability within each cylinder
  Indirect Block         Data block that, instead of data, stores either direct or other indirect block addresses
  Free blocks            Blocks not in use

Boot Block
  • Used for booting the system; holds bootstrap programs.
  • Only appears in the first cylinder group (blocks 0-15).
  • Left blank if the file system isn’t used for booting.
  • The firmware boot program loads and executes bootblk, which then loads and executes /ufsboot.
  • Installed by the installboot command.

Superblock
  • Critical data about the file system; a copy is replicated in each cylinder group.
  • The sync command forces all file systems’ superblocks to be written to disk.
  • The shutdown command calls sync.
  • The structure is documented in /usr/include/sys/fs/ufs_fs.h
  • The superblock contains flags about file system state, including fs_clean.

Superblock fs_clean Flags
  Name       Value      Meaning
  FSACTIVE   0          File system (FS) mounted and modified.
  FSCLEAN    1          FS was unmounted properly; no need to check the file system when the system is booted.
  FSSTABLE   2          FS not changed since last sync or fsflush; fsck can be skipped before mounting.
  FSBAD      0xff (-1)  Root file system mounted read-only because its state was not FSCLEAN or FSSTABLE.
  FSSUSPEND  0xfe (-2)  Operations on the file system temporarily suspended.
  FSLOG      0xfd (-3)  FS mounted with UFS logging; not checked when the system is booted.
  FSFIX      0xfc (-4)  FS being repaired while mounted.

Some Inode (or I-node) Fields
  • File type (regular, directory, link, etc.)
  • File mode (read/write/execute permissions)
  • Hard link count
  • UID of owner, GID of group
  • Size (number of bytes)
  • Dates & times: last accessed, last modified, last inode change
  • Array of 15 disk-block addresses

File Addressing
(Graphic from Sun Microsystems, Inc.)
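To give a feel for what the 15 inode addresses can reach, the arithmetic below works through one common layout. The numbers are assumptions, not taken from the slides: 8 KB logical blocks, 4-byte block addresses, and the 15 addresses split into 12 direct pointers plus one single-, one double-, and one triple-indirect pointer. It also assumes a shell with $(( )) arithmetic (for example ksh or a POSIX sh), which the traditional Solaris /bin/sh lacks.

```shell
#!/bin/sh
# Assumed values (not stated on the slides).
bsize=8192       # bytes per logical block
addr_size=4      # bytes per block address
ndirect=12       # direct addresses held in the inode

# Number of block addresses that fit in one indirect block.
per_block=$((bsize / addr_size))

# Data reachable through each kind of pointer, in KB.
direct_kb=$((ndirect * bsize / 1024))
single_kb=$((per_block * bsize / 1024))
double_kb=$((per_block * per_block * bsize / 1024))
triple_kb=$((per_block * per_block * per_block * bsize / 1024))

echo "addresses per indirect block: $per_block"
echo "direct blocks:      $direct_kb KB"
echo "single indirect:    $single_kb KB"
echo "double indirect:    $double_kb KB"
echo "triple indirect:    $triple_kb KB"
```

This is addressing arithmetic only; the maximum file size a given Solaris release actually supports is set by other limits as well.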

Boot Block Problems
  • If the message ‘The file just loaded does not appear to be executable’ appears when booting, the hard disk boot block is corrupted.
  • Boot from the Solaris Software 1 of 2 CD and install a new boot block on the boot disk:
        ok> boot cdrom -s
        # installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c#t#d0s0

Causes of File System Inconsistencies
  • Unclean shutdowns
      • Stop+A executed
      • System turned off without proper shutdown
      • System unplugged or power failure
      • Disk with mounted file systems removed
      • Software error in kernel
  • Hardware
      • Failing disk or controller
      • Other major component fails

Check & Repair File Systems: fsck
  • fsck checks only file system structure; it does not verify data integrity by examining the contents of regular files’ data blocks.
  • Runs automatically at boot; can also be run manually.
  • Never use fsck on a mounted file system! (Unless you want to cause a panic and test your crash dump settings …)

fsck Superblock Checks
  • File system size is checked against the number of inodes and the number of blocks used by the superblock.
  • Free blocks are checked; blocks marked as free should not be claimed by any files.
  • The free-inode count in the summary information is compared to the actual count of free inodes.

fsck Inode Checks
  • Inodes are checked sequentially, starting at the first true inode (2).
  • Inodes are checked for:
      • Format and type
      • Link count (directory references compared to the count in the inode)
      • Duplicate blocks
      • Bad block numbers (a valid block number is >= the first block number and <= the last block number)
      • Inode size (actual number of blocks compared to the inode size field)

fsck Directory Block Checks
  • Inode number in directory entry points to an unallocated inode
  • Inode number in directory entry points beyond the inode list
  • First entry in the directory list must be the ‘.’ entry referencing itself
  • Second entry must be ‘..’ and equal to the inode of the parent directory
  • Directory must be linked to somewhere in the file system

fsck and fs_clean
  • The fsck command uses the fs_clean flag to determine whether a file system needs checking.
  • To view a file system’s fs_clean flag:
        # fstyp -v /dev/rdsk/c#t#d#s# | grep fsclean
        filesystem state is valid, fsclean is 2
    (be patient: fstyp -v returns a lot of information)
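The numeric value reported by fstyp can be mapped back to the flag names in the fs_clean table. The following is a sketch only: fsclean_name is a hypothetical helper, and because the slides do not show how fstyp prints the negative values, both the decimal and the hex forms from the table are accepted.

```shell
#!/bin/sh
# fsclean_name VALUE: map a numeric fs_clean value to its flag name.
fsclean_name() {
    case "$1" in
        0)        echo FSACTIVE ;;
        1)        echo FSCLEAN ;;
        2)        echo FSSTABLE ;;
        -1|0xff)  echo FSBAD ;;
        -2|0xfe)  echo FSSUSPEND ;;
        -3|0xfd)  echo FSLOG ;;
        -4|0xfc)  echo FSFIX ;;
        *)        echo UNKNOWN ;;
    esac
}

# Parse the line shown on the slide. On a live system the input would
# come from: fstyp -v /dev/rdsk/c#t#d#s# | grep fsclean
line="filesystem state is valid, fsclean is 2"
value=`echo "$line" | sed 's/.*fsclean is //'`
name=`fsclean_name "$value"`
echo "fsclean $value = $name"
```

For the sample line from the slide, this prints "fsclean 2 = FSSTABLE".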

fsck at Boot Time
  • At boot time the /sbin/rcS script checks the /, /usr and /var file systems (if /usr & /var are separate file systems).
  • When the system boots, fsck runs in ‘preen’ mode: inconsistencies consistent with a disorderly shutdown are repaired, and file systems are checked sequentially using the fsck pass field in /etc/vfstab.
  • If a file system cannot be repaired without operator intervention, /sbin/rcS prints either a warning or a fatal message that states ‘Run fsck manually’.
  • After non-fatal errors are repaired, the system will continue booting. After fatal errors, the system will reboot.

/ (Root) and fsck
  • If the root file system’s state isn’t FSCLEAN or FSSTABLE, then at boot time / is mounted read-only and flagged FSBAD.
  • To fix, boot from an alternate device such as a CD, then use the fsck on the CD to repair.
  • If /usr is hosed, you will usually also need to boot from CD.

UFS Logging
  • UFS logging first appeared in Solaris 7.
  • Previously (2.6 and earlier), file system logging required Solaris DiskSuite and a separate logging partition.
  • UFS logging writes all metadata changes first to the logging space; then the actual data blocks are written.
  • Metadata is the directory and inode information.
  • In other words, details of all changes to the file system are recorded in a log before the changes are actually written to the disk.

How to Log File Systems
  • To log a file system, either mount the file system with the logging option:
        mount -o logging myfilesystem
    or change the mount options field in /etc/vfstab to logging:
        #device          device            mount  FS    fsck  mount    mount
        #to mount        to fsck           point  type  pass  at boot  options
        ...
        /dev/md/dsk/d4   /dev/md/rdsk/d4   /vol1  ufs   1     yes      logging
  • Using mount without arguments will include the logging status:
        /vol1 on /dev/md/dsk/d4 read/write/setuid/intr/largefiles/logging/onerror=panic/dev=1540004 on Sat Oct 29 14:21 2005
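A quick way to audit which UFS entries still lack the logging option is to scan the seventh field of the vfstab. The sketch below runs against a scratch copy: the /vol1 line is the one from the slide, while the /vol2 line is invented for contrast. On Solaris 8 the awk program may need nawk rather than the older /usr/bin/awk.

```shell
#!/bin/sh
# Scratch file standing in for /etc/vfstab; the /vol2 entry is made up.
vfstab=/tmp/vfstab_demo.$$
cat >"$vfstab" <<'EOF'
#device           device            mount  FS    fsck  mount    mount
#to mount         to fsck           point  type  pass  at boot  options
/dev/md/dsk/d4    /dev/md/rdsk/d4   /vol1  ufs   1     yes      logging
/dev/md/dsk/d5    /dev/md/rdsk/d5   /vol2  ufs   2     yes      -
EOF

# Field 4 is the FS type; field 7 is the comma-separated option list.
unlogged=`awk '
    /^#/ || NF < 7 { next }
    $4 == "ufs" {
        n = split($7, opts, ",")
        found = 0
        for (i = 1; i <= n; i++)
            if (opts[i] == "logging") found = 1
        if (!found) print $3
    }' "$vfstab"`

echo "UFS file systems without logging: $unlogged"
rm -f "$vfstab"
```

Splitting the option field on commas, rather than matching the substring "logging", avoids a false match on the nologging option.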

UFS Logging Space
  • The logging information is stored in the file system’s free blocks.
  • Log size is 1 MB per 1 GB of file system space, up to 64 MB.
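The sizing rule above is easy to apply by hand, but a small function makes the 64 MB cap explicit. This is a sketch under stated assumptions: ufs_log_mb is a hypothetical helper, it takes whole gigabytes only, and the 1 MB floor for file systems smaller than 1 GB is an assumption not stated on the slide.

```shell
#!/bin/sh
# ufs_log_mb SIZE_GB: estimated UFS log size in MB for a file system of
# SIZE_GB whole gigabytes (1 MB per GB, capped at 64 MB).
ufs_log_mb() {
    mb=$1
    # Assumption: treat 1 MB as the floor for file systems under 1 GB.
    if [ "$mb" -lt 1 ]; then
        mb=1
    fi
    if [ "$mb" -gt 64 ]; then
        mb=64
    fi
    echo "$mb"
}

for gb in 1 10 36 72 200; do
    echo "$gb GB file system -> `ufs_log_mb $gb` MB log"
done
```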

UFS Logging Advantages
  • Reduces the risk of file system inconsistencies: if data blocks aren’t written, metadata changes are rolled back.
  • Increases boot speed: fsck doesn’t spend time on file systems whose fs_clean flag is FSLOG.
  • Most file system operations are significantly faster.

Solaris Disk Suite (up to OE 8) and Solaris Volume Manager (OE 9 and later)
  • If using Solaris Disk Suite (SDS) or Solaris Volume Manager (SVM), periodically use the metastat command to check the status of metadevices and the hot spare pool.
  • Automate checking by using shell and awk scripts run by cron:
        # Check status of disk metadevices
        0 4,16 * * * /opt/local/sbin/dsmon.sh

dsmon.sh Shell Script
    #!/bin/sh
    # @(#)dsmon.sh 1.1 @(#)
    # Uses an awk script (dsmon.awk) to parse output of DiskSuite
    # metastat command in order to report errors.

    DSMON_SCRIPT=/opt/local/sbin/dsmon.awk
    DSMON_OUT="/tmp/dsmon.$$.out"
    DSMON_STATUS=0

    # RECIPIENTS - sendmail alias of who to notify
    RECIPIENTS=sysadmin.list

    # Remove the temporary file if interrupted (HUP, INT, QUIT, TERM)
    trap "rm -f $DSMON_OUT; exit 1" 1 2 3 15

    # awk exits non-zero when any metadevice state is not "Okay"
    if metastat | awk -f $DSMON_SCRIPT >$DSMON_OUT
    then
        :
    else
        DSMON_STATUS=$?
        mailx -s "Metastat Disk Error Report" $RECIPIENTS <$DSMON_OUT
    fi

    rm -f $DSMON_OUT
    exit $DSMON_STATUS

dsmon.awk Awk Script
    # DiskSuite Status Monitor
    # @(#)dsmon.awk 1.1 @(#)
    BEGIN { STATUS=0 }
    /State: / {
        if ($2 != "Okay" ) {
            if (prev ~ /^d/)
                print prev, $0
            STATUS=9
        }
    }
    { prev = $0 }
    END { exit STATUS }
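The awk logic can be tried without a real metadevice by feeding it canned input. In the sketch below the four sample lines are a hypothetical, heavily shortened imitation of metastat output (real output has many more fields), and the awk program is inlined so the example is self-contained.

```shell
#!/bin/sh
# Hypothetical metastat-style sample: one healthy and one failing submirror.
sample=/tmp/dsmon_sample.$$
cat >"$sample" <<'EOF'
d10: Submirror of d0
    State: Okay
d20: Submirror of d0
    State: Needs maintenance
EOF

# Same logic as dsmon.awk: report any State line that is not "Okay",
# prefixed by the preceding metadevice line, and exit with status 9.
report=`awk '
    BEGIN { STATUS = 0 }
    /State: / {
        if ($2 != "Okay") {
            if (prev ~ /^d/)
                print prev, $0
            STATUS = 9
        }
    }
    { prev = $0 }
    END { exit STATUS }' "$sample"`
status=$?
rm -f "$sample"

echo "exit status: $status"
echo "report: $report"
```

With this sample, only the d20 submirror is reported and the exit status is 9, which is what makes dsmon.sh take its mail branch.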