Step 1: prtdiag and /var/adm/messages.X (solaris10 # cat /var/adm/messages)



Step 1 • To identify the problem, check the output of the prtdiag command and the /var/adm/messages.X files.

solaris10 # cat /var/adm/messages
Feb 5 ssd fmd: [ID 441519 daemon.error] SUNW-MSG-ID: SUN4U-8000-35, TYPE: Fault, VER: 1, SEVERITY: Critical
Feb 5 ssd EVENT-TIME: Tue Feb 5 KST 2008
Feb 5 ssd PLATFORM: SUNW,Sun-Fire-880, CSN: -, HOSTNAME: ssd
Feb 5 ssd SOURCE: cpumem-diagnosis, REV: 1.5
Feb 5 ssd EVENT-ID: d2bbe7be-8355-6529-d33c-d03fcc400bec
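Reading the identifiers out of the log by eye is error-prone, so the step above can be sketched as a small filter. This is a minimal sketch, not part of the original slides: the function name `fmd_ids` is hypothetical, it assumes a POSIX shell, and the patterns assume the `SUNW-MSG-ID: <id>,` and `EVENT-ID: <uuid>` layout shown in the log excerpt.

```shell
# Hypothetical helper (not a Solaris command): print the SUNW-MSG-ID and
# EVENT-ID values found in a syslog-style file, one "KEY VALUE" pair per line.
# The file defaults to /var/adm/messages.
fmd_ids() {
    file=${1:-/var/adm/messages}
    # -n suppresses normal output; each "p" prints only lines that matched.
    sed -n \
        -e 's/.*SUNW-MSG-ID: \([^,]*\),.*/MSG-ID \1/p' \
        -e 's/.*EVENT-ID: \([0-9a-f-]*\).*/EVENT-ID \1/p' \
        "$file"
}
```

The same function can be pointed at a rotated log, for example `fmd_ids /var/adm/messages.0`.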


Article for Message ID: SUN4U-8000-35
Memory module errors exceeded acceptable levels

Type: Fault
Severity: Critical
Description: The Solaris Fault Manager has determined that one or more uncorrectable (multibit) memory errors indicating a fault which requires repair action is present.
Automated Response: The system will attempt to remove the affected physical memory page from service after restart of one or more specific services, or the entire system.
Impact: This error will cause either a system panic or restart of one or more user processes, with resulting interruption in service. After restart, the performance of the system may be minimally impacted as a result of removing the physical memory page from operation.
Suggested Action for System Administrator: Schedule a repair procedure to replace the affected memory DIMM module, whose identity can be determined using fmdump -v -u <EVENT-ID>
For example: EVENT-ID: d05a9f16-e969-4988-d340-dea1b54bd307
Details
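The suggested action, `fmdump -v -u <EVENT-ID>`, can be wrapped so that a mistyped ID is rejected before the command runs. The following is only a sketch under stated assumptions: `show_fault` and the `FMDUMP` override variable are hypothetical names introduced here, a POSIX shell is assumed, and the check accepts only the lowercase 8-4-4-4-12 hexadecimal layout seen in the log.

```shell
# Hypothetical wrapper around: fmdump -v -u <EVENT-ID>
# FMDUMP is an assumed override so the function can be tried on a host
# that has no fmdump binary; by default the real fmdump is invoked.
show_fault() {
    id=$1
    # Anchored match on the 8-4-4-4-12 layout; the leading X keeps expr
    # from misreading IDs that look like operators or options.
    pattern='X[0-9a-f]\{8\}-[0-9a-f]\{4\}-[0-9a-f]\{4\}-[0-9a-f]\{4\}-[0-9a-f]\{12\}$'
    if ! expr "X$id" : "$pattern" >/dev/null; then
        echo "invalid EVENT-ID: $id" >&2
        return 2
    fi
    ${FMDUMP:-fmdump} -v -u "$id"
}
```

On the affected host this would be called as `show_fault d2bbe7be-8355-6529-d33c-d03fcc400bec`, using the EVENT-ID reported in /var/adm/messages.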


Step 3 • Check the current status with the fmdump command.

solaris10 # fmdump -v
TIME UUID SUNW-MSG-ID
Jan 07. 3516 78b628f1-11da-65a3-86b4-b82a1767ef6b SUN4U-8000-35
  95% fault.memory.bank
    Problem in: mem:///unum=Slot, B: J2900, J2901, J3000
    Affects: mem:///unum=Slot, B: J2900, J2901, J3000
    FRU: mem:///unum=Slot, B: J2900, J2901, J3000
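The line of interest in this output is the FRU entry, since it names the part to replace. As a minimal sketch (the function name `fru_list` is hypothetical, and the `FRU:` label layout is assumed from the output above), the FRU entries can be filtered from `fmdump -v` like this:

```shell
# Hypothetical helper: read `fmdump -v` output on stdin and print each
# distinct FRU entry once, with the "FRU:" label and indentation removed.
fru_list() {
    sed -n 's/^ *FRU: *//p' | sort -u
}
```

A typical invocation would be `fmdump -v | fru_list`.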


Step 4 • Identify the part to replace with the fmadm command.

solaris10 # fmadm faulty
STATE RESOURCE / UUID
--------------------------------
degraded mem:///unum=Slot, B: J2900, J2901, J3000
         78b628f1-11da-65a3-86b4-b82a1767ef6b
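The UUID printed under each faulty resource is what the later repair step needs. The sketch below pulls those UUIDs out of the listing; `faulty_uuids` is a hypothetical name, and it assumes each UUID sits alone on its line as in the output above, which may not hold for every Solaris release.

```shell
# Hypothetical helper: read `fmadm faulty` output on stdin and print every
# line that consists solely of a UUID in the 8-4-4-4-12 hexadecimal layout.
faulty_uuids() {
    h='[0-9a-f]'
    # Strip surrounding spaces and print only lines that are a bare UUID.
    sed -n "s/^ *\($h\{8\}-$h\{4\}-$h\{4\}-$h\{4\}-$h\{12\}\) *\$/\1/p"
}
```

For example, `fmadm faulty | faulty_uuids` would list one UUID per outstanding fault.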


Step 6 • Clear the fault (repair) with the fmadm command.

solaris10 # fmadm repair 78b628f1-11da-65a3-86b4-b82a1767ef6b
solaris10 # fmadm faulty
STATE RESOURCE / UUID
---------------------------------
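The repair and the follow-up check can be combined so that a fault that is still listed is reported rather than overlooked. This is a sketch under assumptions, not the presenter's procedure: `repair_and_verify` and the `FMADM` override variable are hypothetical names, and a POSIX shell is assumed. Only the `fmadm repair <UUID>` and `fmadm faulty` subcommands shown in the steps above are used.

```shell
# Hypothetical helper: mark a fault as repaired, then confirm that its UUID
# no longer appears in the `fmadm faulty` listing.
# FMADM is an assumed override for trying the function without fmadm.
repair_and_verify() {
    uuid=$1
    fmadm_cmd=${FMADM:-fmadm}
    $fmadm_cmd repair "$uuid" || return 1
    # A UUID that is still listed means the fault was not cleared.
    if $fmadm_cmd faulty | grep "$uuid" >/dev/null; then
        echo "still faulty: $uuid" >&2
        return 1
    fi
    echo "repaired: $uuid"
}
```

It should be run only after the physical part has been replaced, for example `repair_and_verify 78b628f1-11da-65a3-86b4-b82a1767ef6b`.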