Module 7: XtremIO Troubleshooting and Upgrades
Upon completion of this module, you should be able to:
• Perform basic troubleshooting
• Describe Field Replaceable Unit (FRU) procedures
• Perform upgrades
• Identify support resources
Copyright © 2014 EMC Corporation. All Rights Reserved.

Lesson 1: Troubleshooting Considerations
This lesson covers the following topic:
• Basic troubleshooting

Troubleshooting with the GUI
Levels of Severity (Color Codes)
• Critical
• Major
• Minor
• Information

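When reviewing a list of alerts, it can help to work from the most severe level down. The following is a minimal sketch, assuming the four levels above are listed in decreasing order of severity. The `Severity` enum, the `triage` helper, and the sample alert texts are hypothetical illustrations and are not part of the XtremIO GUI or CLI.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels from the slide; lower value = more severe (assumed order)."""
    CRITICAL = 0
    MAJOR = 1
    MINOR = 2
    INFORMATION = 3


def triage(alerts):
    """Return (severity, text) pairs sorted from most to least severe."""
    # sorted() is stable, so alerts of equal severity keep their original order.
    return sorted(alerts, key=lambda alert: alert[0])


# Hypothetical alerts, used only to demonstrate the ordering.
sample_alerts = [
    (Severity.MINOR, "example minor alert"),
    (Severity.CRITICAL, "example critical alert"),
    (Severity.INFORMATION, "example informational alert"),
    (Severity.MAJOR, "example major alert"),
]

for severity, text in triage(sample_alerts):
    print(f"{severity.name:<12} {text}")
```
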
Check Alerts
• The show-alerts command displays a list of pre-defined alerts with details
• Examine the output and run more commands in the area of concern

Check Cluster Status
• The show-clusters command displays general cluster information
   - Example state: the XMS is currently disconnected from the cluster

Check BBUs
• The show-bbus command displays battery backup unit information
   - Example state: Disconnected

Check Initiator Connectivity
• The show-initiators-connectivity command displays initiator-port connectivity and the number of connected targets
• Syntax: show-initiators-connectivity target-details
• Mapping: Initiator to Target

Check FC Errors
• The show-targets-fc-error-counters command displays Fibre Channel error counters per target

Check InfiniBand Counters
• The show-storage-controllers-infiniband-counters command displays InfiniBand errors and link down states

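The commands covered in this lesson can be kept together as a simple checklist. The sketch below only collects the command names and descriptions given on the preceding slides and prints them as a numbered list; it does not connect to an XMS or run anything. The `CHECKS` list and `render_checklist` helper are hypothetical names chosen for illustration, and the order is simply the order in which the slides present the commands.

```python
# (command, what it displays) pairs, taken from the lesson slides.
CHECKS = [
    ("show-alerts", "pre-defined alerts with details"),
    ("show-clusters", "general cluster information"),
    ("show-bbus", "battery backup unit information"),
    ("show-initiators-connectivity",
     "initiator-port connectivity and number of connected targets"),
    ("show-targets-fc-error-counters",
     "Fibre Channel error counters per target"),
    ("show-storage-controllers-infiniband-counters",
     "InfiniBand errors and link down states"),
]


def render_checklist(checks):
    """Format the checks as numbered, column-aligned text lines."""
    width = max(len(command) for command, _ in checks)
    return [
        f"{number}. {command.ljust(width)}  {purpose}"
        for number, (command, purpose) in enumerate(checks, start=1)
    ]


for line in render_checklist(CHECKS):
    print(line)
```
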
Lesson 1: Summary
This lesson covered the following topic:
• Basic troubleshooting

Lesson 2: Field Replaceable Units (FRU)
This lesson covers the following topic:
• Field replaceable units procedure review

XtremIO FRU Replacement Procedures
• Procedures to replace the following components
   - Storage Controllers
      - Power supplies
   - DAE
      - SSDs
      - Chassis
      - Controller
      - Power supply
   - BBUs
   - XMS
   - Software re-installation
   - Generation Log Bundle

Storage Controller, BBU, and InfiniBand Tolerance
Cluster               | Single SC Failure in same X-Brick | Double SC Failure in same X-Brick | BBU Failure                            | InfiniBand Switch Failure
Single X-Brick Cluster | Performance degradation           | Data                              | Loss of service                        | N/A
Two X-Brick Cluster    | Performance degradation           | Loss of Service                   | Loss of service if more than half failed | Loss of service if both failed
Four X-Brick Cluster   | Performance degradation           | Loss of Service                   | Loss of service if more than half failed | Loss of service if both failed

DAE Tolerance
# of failures in same X-Brick | Outcome
One SSD                       | Performance degradation until rebuild is complete
Two SSDs (concurrent)         | Data loss
Six SSDs                      | Degraded state with only a single parity protection
Seven SSDs                    | Loss of service
• Insufficient SSD space in an X-Brick might prevent the cluster from rebuilding the XDP group
   - Degraded state where the data has only a single parity protection
• DAE chassis failure results in loss of service

Cluster Management – Start/Stop
• Detailed in the Storage Array Operations Guide
• May be required for maintenance procedures
• Impacts data availability
• There is a pre-shutdown procedure

Command       | Description
start-cluster | Starts a stopped cluster and enables it to respond to host I/Os and process data
stop-cluster  | Stops an active cluster and disables data processing in an orderly manner
shutdown      | Shuts down a Storage Controller, a defined set of Storage Controllers, or all the Storage Controllers of a specified cluster

Storage Controller – Power On/Off
• Understand the ramifications prior to running these commands
• Powering off disconnects all connected hosts from the paths to this SC
• When the entire cluster is powered off, it does not respond to host I/O requests
• Use the GUI or CLI commands to determine the issues

Command     | Description
power-off   | Shuts down a Storage Controller
power-on    | Powers up a Storage Controller
power-cycle | Powers down and up a Storage Controller

Field Replaceable Units
• FRUs should always be escalated
• Perform FRU replacement when required
   - SC
   - DAE
   - IB
   - BBU
• Process is detailed and specific
   - Always check support.emc.com
[Figure: Table of Contents from FRU Guide]

FRU – SC Replacement
• Gather the defective Storage Controller configuration data
• Power down the defective Storage Controller
   - Verify that the deactivation process is complete
   - Enabled-state value should be user_disabled
• Disconnect all power and I/O cables from the back of the server
   - Make sure that all cables are clearly labeled before disconnecting them from the Storage Controllers
• Replace the Storage Controller
   - Connect only the two power cables and the network cable to the Storage Controller MGMT port

FRU – SC Replacement (Cont.)
• Configure the new Storage Controller (Easy-CLI)
• Register the new Storage Controller
   - replace-storage-controller sc-id=<ID>, where ID is the index of the defective Storage Controller
• No need to record new WWNs or rezone
   - WWN spoofing is performed

FRU – Storage Controller Power Supply
• Identify the defective Storage Controller power supply
   - show-storage-controllers-psus
• Replace the defective Storage Controller power supply
• Configure the replaced Storage Controller power supply
   - replace-storage-controller-psu sc-psu-id=<ID>
• Verify that the status is Healthy
[Figure: PS removal from FRU Guide]

FRU – XMS Replacement
• Identify the defective XMS
• Replace the defective XMS
   - A physical XMS requires hardware replacement
   - A virtual XMS requires OVA redeployment
• Configure the replaced XMS
   - Upload the software image to the XMS
   - Invoke the xms-recovery command
   - Invoke the xms-restart command
   - Configure ESRS, DNS, and NTP

FRU – SSD Replacement
• Identify the defective SSD
• Replace the defective SSD
   - Remove the defective SSD entry from the cluster database
      - remove-ssd ssd-id=<Name or Index>
   - Add the new SSD to the XDP Group
      - add-ssd brick-id=<Brick ID> ssd-UID=<SSD Index or Name>
• Verify
   - show-ssds

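The order of the SSD replacement commands matters: remove the old entry, add the new SSD, then verify. The sketch below only assembles the three command strings from the slide in that order so they can be reviewed before anyone types them; it does not run them. The `ssd_replacement_commands` helper and the sample values passed to it are hypothetical, and the parameter names are copied from the slide as written.

```python
def ssd_replacement_commands(defective_ssd, brick_id, new_ssd):
    """Return the slide's SSD replacement commands in the order they are run."""
    return [
        # 1. Remove the defective SSD entry from the cluster database.
        f"remove-ssd ssd-id={defective_ssd}",
        # 2. Add the new SSD to the XDP group of the given X-Brick.
        f"add-ssd brick-id={brick_id} ssd-UID={new_ssd}",
        # 3. Verify the result.
        "show-ssds",
    ]


# Placeholder values for illustration only; real IDs come from the cluster.
for step, command in enumerate(ssd_replacement_commands("5", "1", "5"), start=1):
    print(f"Step {step}: {command}")
```
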
FRU – BBU Replacement
• Identify the defective BBU
• Replace the defective BBU
   - replace-bbu bbu-id=<ID>
• Confirm the BBU is in an operational state
   - show-bbus
[Figure: Inserting a BBU from FRU Guide]

FRU – InfiniBand Replacement
• Identify the defective InfiniBand switch
   - show-infiniband-switches
• Replace the defective InfiniBand switch
• Configure the replaced InfiniBand switch
   - replace-infiniband-switch ibswitch-id=<id>

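Several of the FRU procedures in this lesson end with a replace command that takes the ID of the defective component. The sketch below gathers those command templates in one lookup table. It is an illustration only: the `REPLACE_COMMANDS` table and `replace_command` helper are hypothetical, the component keys are invented labels, and the PSU parameter is written as `sc-psu-id` on the assumption that the slide text was wrapped mid-word. Always confirm the exact syntax in the FRU Guide.

```python
# Command templates as given on the FRU slides; "{id}" marks the component ID.
REPLACE_COMMANDS = {
    "storage-controller": "replace-storage-controller sc-id={id}",
    # Assumption: the slide's parameter reads "sc-psu-id" once the wrap is joined.
    "storage-controller-psu": "replace-storage-controller-psu sc-psu-id={id}",
    "bbu": "replace-bbu bbu-id={id}",
    "infiniband-switch": "replace-infiniband-switch ibswitch-id={id}",
}


def replace_command(component, component_id):
    """Build the replace command string for a component, or raise ValueError."""
    try:
        template = REPLACE_COMMANDS[component]
    except KeyError:
        known = ", ".join(sorted(REPLACE_COMMANDS))
        raise ValueError(f"unknown component {component!r}; expected one of: {known}")
    return template.format(id=component_id)


print(replace_command("bbu", 2))
```
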
Lesson 2: Summary
This lesson covered the following topic:
• Field replaceable units procedure review

Lesson 3: Upgrades
This lesson covers the following topics:
• Hardware upgrades
• Disruptive vs non-disruptive software upgrades

Hardware Upgrades
• Cluster Hardware Upgrade is a disruptive procedure
• Requires software installation and reinitialization
• Existing data and configuration on the cluster will be erased
• Procedures for adding X-Bricks to a cluster are similar to those for installing a cluster from a mini-rack
[Figure: Mini-Rack box]

Software Upgrade Options
• Disruptive (cold) Upgrade
   - Performed when I/O traffic interferes with the upgrade process
   - Data is unavailable to the user
   - I/O traffic resumes only after the cluster upgrade is completed
• Non-Disruptive Upgrade (NDU)
   - Upgrades the software on a running cluster
   - The upgrade process is performed while the service is online
   - Performance is degraded during the upgrade
• Software installation, following a hardware upgrade
   - Re-installing an updated software package on the entire cluster

Disruptive (Cold) Upgrade
• 21-step procedure in the Software Installation and Upgrade Guide
• Contact support
• Many verification steps
• Disable ESRS during the procedure
• Upload the software package
• Performed as Tech user
[Figure: XtremIO Software Installation and Upgrade Guide]

Non-Disruptive Upgrade (NDU)
• 14-step procedure in the Software Installation and Upgrade Guide
• Contact support
• Many verification steps
• Disable ESRS during the procedure
• Upload the software package
• Performed as Tech user
[Figure: XtremIO Software Installation and Upgrade Guide]

Lesson 3: Summary
This lesson covered the following topics:
• Hardware upgrades
• Disruptive vs non-disruptive software upgrades

Lesson 4: Support Resources
This lesson covers the following topics:
• Product support page
• SolVe Desktop
• Release notes

Product Support Page
• support.emc.com
• Installation-related guides
   - Site preparation guide
   - Hardware installation guide
   - Operations guide
   - Software installation guide
   - User guide
• Software
   - XMS OVA file
   - Xtremapp Code

SolVe Desktop
• An alternative to support.emc.com
• Provides procedures for common tasks, technical reviews, and alerts
• Procedure Generator

Current Release Notes Review
• XtremIO is a new product and currently maintains a high-frequency update cycle
• The latest version of code might have already been discussed in the course
• This course might be behind the current GA version
• Review current release notes and discuss changes

Lesson 4: Summary
This lesson covered the following topics:
• SolVe Desktop
• Release notes
• Support resources

Module 7: Summary
Key points covered in this module:
• Troubleshooting
• Field Replaceable Units (FRU)
• Support resources and updates

Course Summary
Key points covered in this course:
• XtremIO hardware components
• Software installation procedures
• Using the XtremIO GUI and CLI for management
• Provisioning from an XtremIO array
• Monitoring an XtremIO storage array
• Maintenance procedures and troubleshooting methods

Thank You!