Cisco UCS Hardware Monitoring
BMC ProactiveNet Performance Management for Cisco UCS
Sentry’s Hardware and Storage Monitoring solution
Service Assurance Monitoring solution
- Runs “within” BPPM
- Complements BMC BSM
BMC exclusive technology
- PATROL, Performance Manager, ProactiveNet
© Copyright 12/6/2020 BMC Software, Inc
Monitoring the Hardware of a Server
Critical devices
• Processors
• Memory modules
• Network cards
  • Link monitoring
  • Traffic
Disks
• Controllers
• Physical disks
• RAIDs
Environment
• Temperature
• Cooling
• Power supplies
• Energy usage
Features
Inventory
- Discover all of the internal components of servers, disk arrays, fiber switches and tape libraries.
- Perform an inventory with detailed information about each device’s characteristics.
Monitoring
- Disks: RAID controllers, hard disks, RAIDs, failure prediction, availability of the volumes.
- Environment: temperature, internal voltages, power supplies, fans.
- Critical components: processors, memory modules, ECC errors, failure prediction.
- Network links: network adapters, link loss, negotiated speed, data traffic, bandwidth utilization.
Diagnosis
- Provides details about each monitored component to facilitate its replacement should a failure occur (vendor, model, serial number, part number, FRU number, location in the chassis).
- Full hardware health reports display detailed information regarding failures, their consequences and how to fix them.
Reporting
- Ethernet traffic report: visualize the network traffic on each port, in MB/sec, or the total amount of data that transited, in and out, in GB per hour or per day.
- SAN traffic report: visualize the SAN traffic from the fiber switch, for each FC port, in MB/sec, or the total amount of data that transited, in and out, in GB per hour or per day.
- Capacity report: a convenient report detailing the capacity of the monitored system: number of physical CPUs, amount of memory, overall size of disks and volumes, number of connected ports.
Monitoring Cisco UCS C-Series (rack-mount)
Instrumentation
- IPMI
  § Environment
  § Processors, memory modules
  § LEDs
  § Power consumption
  § Disks
- WMI/SNMP
  § Network cards and traffic
Prerequisites
- UCS C-Series running Windows
  § Microsoft’s IPMI provider for WMI
  § WMI or Windows SNMP MIB-2 Agent
- UCS C-Series running Linux
  § ipmitool
  § Linux commands or Linux SNMP MIB-2 Agent
- Out-of-band monitoring
  § Through Cisco Integrated Management Controller (IMC) with remote IPMI
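To make the IPMI path above more concrete, here is a minimal Python sketch of how sensor readings could be obtained with ipmitool, either locally or out-of-band. It is not the product's implementation. The host, credentials, sensor names and sample output are made up for illustration; the parsing assumes the usual three-column `name | reading | status` layout printed by `ipmitool sdr`.

```python
def build_ipmitool_command(host=None, user=None, password=None):
    """Return the argv for `ipmitool sdr`: local (in-band) or remote (out-of-band)."""
    cmd = ["ipmitool"]
    if host:
        # lanplus = IPMI v2.0 over LAN, the interface a BMC usually exposes
        cmd += ["-I", "lanplus", "-H", host, "-U", user, "-P", password]
    return cmd + ["sdr"]


def parse_sdr(text):
    """Parse `name | reading | status` lines into a list of dicts."""
    sensors = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        name, reading, status = parts
        sensors.append({"name": name, "reading": reading, "status": status})
    return sensors


def failing(sensors):
    """Keep sensors whose status is neither 'ok' nor 'ns' (no reading)."""
    return [s for s in sensors if s["status"] not in ("ok", "ns")]


# Hand-written sample output; sensor names are hypothetical.
SAMPLE = """\
CPU1 Temp        | 45 degrees C      | ok
FAN1             | 5400 RPM          | ok
PSU2 Status      | 0x00              | cr
DIMM_A2          | no reading        | ns
"""

if __name__ == "__main__":
    for sensor in failing(parse_sdr(SAMPLE)):
        print(sensor["name"], sensor["status"])
```

On a real system the argv returned by `build_ipmitool_command` could be passed to `subprocess.run(..., capture_output=True, text=True)` and its standard output fed to `parse_sdr`.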
Cisco UCS C-Series running Windows
Cisco UCS C-Series running Linux
Cisco UCS C-Series Out-of-Band
Cisco UCS C-Series running VMware ESXi
Monitoring Cisco UCS B-Series (Blades)
Instrumentation
- Through the Fabric Interconnect Switch
- Native UCS XML API
Blade enclosure
- Powering, cooling, temperature sensors
- Overall power consumption
- Status of each blade server
Fabric Interconnect Switch
- Powering, cooling, temperature
- Power consumption
- Ethernet and fiber links (status, speed)
- Traffic monitoring (in, out, MB/sec and GB/day)
Blade servers (in the chassis)
- Very much like regular servers
- Without power supplies, network cards
- Instrumented like a UCS C-Series server (Windows, Linux or VMware ESXi)
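As an illustration of what talking to the UCS XML API through the Fabric Interconnect can look like, the sketch below builds a login request and a class query, then reads blade states from a response. It is a hedged example, not the product's code: the method and attribute names (`aaaLogin`, `outCookie`, `configResolveClass`, `computeBlade`, `operState`) follow the UCS Manager XML API as commonly documented, while the credentials, cookie and sample response are hand-written assumptions. No network call is made.

```python
import xml.etree.ElementTree as ET


def login_request(user, password):
    """Build the XML body of a login call."""
    elem = ET.Element("aaaLogin", {"inName": user, "inPassword": password})
    return ET.tostring(elem, encoding="unicode")


def parse_cookie(response_xml):
    """Extract the session cookie from a login response."""
    return ET.fromstring(response_xml).get("outCookie")


def resolve_class_request(cookie, class_id):
    """Build a query that asks for every object of one class."""
    elem = ET.Element(
        "configResolveClass",
        {"cookie": cookie, "classId": class_id, "inHierarchical": "false"},
    )
    return ET.tostring(elem, encoding="unicode")


def blade_states(response_xml):
    """Map each blade's distinguished name to its reported operational state."""
    root = ET.fromstring(response_xml)
    return {b.get("dn"): b.get("operState") for b in root.iter("computeBlade")}


# Hand-written sample responses (illustrative values only).
SAMPLE_LOGIN = '<aaaLogin response="yes" outCookie="example-cookie"/>'
SAMPLE_BLADES = """\
<configResolveClass cookie="example-cookie" response="yes" classId="computeBlade">
  <outConfigs>
    <computeBlade dn="sys/chassis-1/blade-1" operState="ok"/>
    <computeBlade dn="sys/chassis-1/blade-2" operState="inoperable"/>
  </outConfigs>
</configResolveClass>
"""

if __name__ == "__main__":
    cookie = parse_cookie(SAMPLE_LOGIN)
    print(resolve_class_request(cookie, "computeBlade"))
    print(blade_states(SAMPLE_BLADES))
```

In a live setup the request bodies would be sent by HTTP POST to the UCS Manager endpoint; that transport step is deliberately left out here.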
Cisco UCS B-Series (Blade Servers)
How it works internally
Initialization
• Checks protocol availability with specified credentials
• SNMP, WMI, WBEM, IPMI, UCS XML API
Platform detection
• Tests each of the connectors against the monitored system
• B-Series, C-Series
• Windows, Linux, VMware
• Builds a “detected connectors” list
Discovery
• Discovers hardware pieces
• Detects “missing” components
• Sets alert thresholds on all parameters
• Activates/deactivates parameters depending on available information
Collection
• Collects the value of each parameter
• Executes “Alert Actions” when a threshold is breached
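The four phases above can be sketched as a small pipeline. This is only a toy model under stated assumptions: the `Connector` class, the function names, the simulated `system` dictionary and the threshold values are all hypothetical and do not reflect the product's internal API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Connector:
    name: str
    protocol: str
    matches: Callable[[dict], bool]            # platform detection test
    read: Callable[[dict], Dict[str, float]]   # parameter name -> current value


def initialize(system, protocols):
    """Initialization: keep only the protocols that answer on this system."""
    return [p for p in protocols if p in system["reachable"]]


def detect(system, connectors, usable):
    """Platform detection: build the 'detected connectors' list."""
    return [c for c in connectors if c.protocol in usable and c.matches(system)]


def discover(system, detected, default_thresholds):
    """Discovery: activate each parameter that has a threshold, deactivate the rest."""
    monitored = {}
    for c in detected:
        for param in c.read(system):
            if param in default_thresholds:
                monitored[(c.name, param)] = default_thresholds[param]
    return monitored


def collect(system, detected, monitored, alert_action):
    """Collection: read each value and fire the alert action on a breach."""
    for c in detected:
        for param, value in c.read(system).items():
            limit = monitored.get((c.name, param))
            if limit is not None and value > limit:
                alert_action(c.name, param, value, limit)


# Simulated monitored system and connectors (illustrative values).
system = {"reachable": {"IPMI", "SNMP"}, "os": "Linux",
          "sensors": {"cpu_temp_c": 82.0, "fan_rpm": 5400.0}}
connectors = [
    Connector("ipmi-linux", "IPMI", lambda s: s["os"] == "Linux", lambda s: s["sensors"]),
    Connector("wmi-windows", "WMI", lambda s: s["os"] == "Windows", lambda s: {}),
]

alerts = []
usable = initialize(system, ["SNMP", "WMI", "WBEM", "IPMI", "UCS XML API"])
detected = detect(system, connectors, usable)
monitored = discover(system, detected, {"cpu_temp_c": 75.0})
collect(system, detected, monitored, lambda *args: alerts.append(args))
```

Keeping detection separate from collection means a connector is tested once per discovery cycle rather than on every poll, which is one plausible reason for the split shown on the slide.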
Use Case: Monitoring the Hardware of a Server
Same module for all servers
- Cisco UCS B-Series, Chassis and Interconnect
- Windows, Linux, VMware
- Same classes and parameters
- Easy integration
Comprehensive
- Temperature sensors, fans, power supplies, controllers, processors, memory modules, network cards, disks, HBAs, etc.
Versatile
- SNMP, WMI, WBEM, Telnet, SSH, command lines
Upon hardware failure, a standard alert is generated
The alert contains a full text description of the problem
- Short description, status reported by the device
- Value of the various parameters and thresholds
- Possible consequences and recommended action
- Help to identify and replace the faulty device
Use Case: Instrumentation Failures
Our product relies on various protocols and instrumentation layers
If one protocol or instrumentation layer fails…
- The associated “connector” goes into alarm
- The objects monitored through this protocol/instrumentation layer go “offline”
- No hardware alert is generated
Helps you sort out real hardware problems from monitoring problems
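The rule described above (a failed instrumentation layer raises a connector alarm and marks its objects offline, without raising hardware alerts) can be expressed in a few lines. The function name, status strings and component names below are assumptions made for this sketch.

```python
def evaluate(connector_ok, component_statuses):
    """Separate monitoring failures from hardware failures.

    connector_ok: whether the protocol/instrumentation layer answered.
    component_statuses: component name -> last status reported by the device.
    """
    if not connector_ok:
        # The layer itself failed: no component reading can be trusted,
        # so nothing is reported as a hardware problem.
        return {
            "connector": "ALARM",
            "components": {name: "OFFLINE" for name in component_statuses},
            "hardware_alerts": [],
        }
    return {
        "connector": "OK",
        "components": dict(component_statuses),
        "hardware_alerts": [n for n, s in component_statuses.items() if s != "ok"],
    }


if __name__ == "__main__":
    statuses = {"fan-1": "ok", "psu-2": "failed"}
    print(evaluate(True, statuses))
    print(evaluate(False, statuses))
```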
Use Case: Hardware Inventory
Our product discovers and reports on the real hardware components
- Real number of CPUs (no cores, no hyper-threading)
- Real number of memory modules
- Real number of physical disks (not only what is seen by the OS)
- Real number of network interfaces (not only the ones configured)
- Link speed for network interfaces
- Other additional information
Benefits
- True hardware inventory (not just what is seen by the OS)
- Licensing based on the number of CPUs
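To show why the number of physical CPUs differs from what the operating system lists, the sketch below counts sockets versus logical processors in Linux `/proc/cpuinfo`-style text. This only illustrates the distinction; the product is described as reading hardware instrumentation, not this file. The sample text is hand-written and trimmed to the two relevant fields.

```python
def count_cpus(cpuinfo_text):
    """Return (physical_sockets, logical_processors) from /proc/cpuinfo text."""
    logical = 0
    sockets = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "processor":
            logical += 1                 # one entry per logical CPU (core or thread)
        elif key == "physical id":
            sockets.add(value.strip())   # same id = same physical package
    return len(sockets), logical


# Hand-written sample: 2 sockets, each exposing 2 logical processors.
SAMPLE_CPUINFO = """\
processor\t: 0
physical id\t: 0

processor\t: 1
physical id\t: 0

processor\t: 2
physical id\t: 1

processor\t: 3
physical id\t: 1
"""

if __name__ == "__main__":
    sockets, logical = count_cpus(SAMPLE_CPUINFO)
    print(f"{sockets} physical CPUs, {logical} logical processors")
```

A license counted per physical CPU would use the first number, which is why the two must not be confused.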
Use Case: Monitoring the Traffic on the Interconnect
Monitoring of the Ethernet and FC traffic
- Internal and external
For each port
- Status of the SFP
- Link speed and status
- Received and transmitted packets, error percentage
- Traffic (received and transmitted)
- Bandwidth utilization
Reporting
- MB/sec
- Total amount of data in GB per day
Benefits
- Identify big users (servers)
- Analyze the impact of the nightly backups, the mirroring
- Analyze the impact of the deployment of a new application
- Diagnose multi-pathing issues
- Identify disk arrays under hard pressure
- Etc.
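The per-port figures listed above (MB/sec, bandwidth utilization, GB per day) are typically derived from cumulative byte counters sampled at a fixed interval. The sketch below shows that arithmetic under explicit assumptions: decimal units (1 MB = 10^6 bytes, 1 GB = 10^9 bytes), link speed given in Mbit/s, and counters that wrap at most once between two samples. All function names are hypothetical.

```python
def byte_delta(prev, curr, counter_bits=64):
    """Bytes transferred between two counter samples, tolerating one wrap-around."""
    if curr >= prev:
        return curr - prev
    return curr + (1 << counter_bits) - prev


def rate_mb_per_sec(prev, curr, interval_s, counter_bits=64):
    """Average throughput over the interval, in MB/sec (10^6 bytes)."""
    return byte_delta(prev, curr, counter_bits) / interval_s / 1_000_000


def utilization_percent(mb_per_sec, link_speed_mbps):
    """Share of the negotiated link speed in use (8 bits per byte)."""
    return mb_per_sec * 8 / link_speed_mbps * 100


def total_gb(deltas):
    """Total data moved over a reporting period (for example one day), in GB."""
    return sum(deltas) / 1_000_000_000


if __name__ == "__main__":
    rate = rate_mb_per_sec(0, 125_000_000, interval_s=1)
    print(rate, "MB/sec")
    print(utilization_percent(rate, link_speed_mbps=1000), "% of a 1 Gbit/s link")
```

Handling the wrap explicitly matters for 32-bit counters, which can roll over within minutes on a fast link.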
Use Case: Reporting the Power Consumption in the Data Center
Live graph
- In Watts
Report
- In kWh
- Per hour
- Per day
- Lets you calculate the actual energy cost of any system
Benefits
- “If you can’t measure it, you can’t manage it”
- Identify power-hungry devices
- Estimate the cost reduction provided by virtualization, upgrades, etc.
- Charge back application owners
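Turning the live Watts readings into the kWh and cost figures mentioned above is a simple integration over time. The sketch below assumes evenly spaced samples and a flat electricity price; the function names, sample values and price are illustrative only.

```python
def energy_kwh(samples_watts, interval_s):
    """Energy used over a series of evenly spaced power samples, in kWh."""
    # watts * seconds = joules; 1 kWh = 3,600,000 joules
    return sum(samples_watts) * interval_s / 3_600_000


def energy_cost(kwh, price_per_kwh):
    """Cost of that energy at a flat price per kWh."""
    return kwh * price_per_kwh


if __name__ == "__main__":
    one_hour = [500.0] * 60            # a steady 500 W, sampled once a minute
    kwh = energy_kwh(one_hour, interval_s=60)
    print(kwh, "kWh")
    print(energy_cost(kwh, price_per_kwh=0.12), "per hour at an assumed 0.12/kWh")
```

Summing the hourly values gives the per-day report, and the same per-device totals can feed a charge-back calculation.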
Use Case: Warming the Data Center
Cooling costs = 50% of the electricity costs in the data center
- P.U.E. = 2.0+
Measure the temperature
- Internal
- CPUs
- Ambient
Compare to the alert thresholds
Lets you find the optimal temperature in the data center
- Higher temperature means less cooling
- 1 degree warmer means 5% cooling costs reduction
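The two figures on this slide (cooling is 50% of the electricity bill, and each extra degree saves 5% of cooling costs) combine into a quick estimate. The sketch below takes both numbers from the slide and assumes, for simplicity, that the saving is linear in the number of degrees; the annual bill used in the example is made up.

```python
def cooling_savings(total_electricity_cost, degrees_warmer,
                    cooling_share=0.5, reduction_per_degree=0.05):
    """Estimated saving from raising the data center temperature.

    cooling_share and reduction_per_degree default to the slide's figures.
    The linear model is an assumption of this sketch.
    """
    cooling_cost = total_electricity_cost * cooling_share
    return cooling_cost * reduction_per_degree * degrees_warmer


if __name__ == "__main__":
    # Hypothetical annual electricity bill of 1,000,000 (any currency).
    print(cooling_savings(1_000_000, degrees_warmer=2))
```

Under these assumptions each degree is worth 2.5% of the total electricity bill, which is why comparing measured temperatures with the alert thresholds is worth the effort.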
Not only Cisco… List of Supported Servers
Cisco
• UCS B-Series
• UCS C-Series
Dell
• PowerEdge (Win, Linux)
• Blades
HP
• ProLiant (Win, Linux)
• Integrity (Win, Linux, HP-UX)
• HP 9000 (HP-UX)
• NetServer (Win)
• SuperDome (HP-UX)
• BladeSystem
• AlphaServer (Tru64)
• OpenVMS
IBM
• pSeries (AIX)
• eServer p5 (AIX)
• Netfinity (Win, Linux)
• xSeries (Win, Linux)
• BladeCenter
Sun
• SPARC (sun4u)
• SPARC T1/T2 (sun4v)
• X64 (Solaris, Linux)
• Sun Fire F12K, F15K, etc.
• Sun Fire M4000, M9000, etc.
• Blades
Fujitsu-Siemens
• PRIMERGY (Win, Linux)
• PRIMEPOWER (Solaris)
• Blade BX
And also SAN Devices
Disk arrays
• EMC Symmetrix
• EMC Clariion
• HP EVA, HP StorageWorks XP, VA, EMA, MSA
• IBM DS3000, 4000, 6000, 8000 series
• Hitachi USP, AMS
Filers
• NetApp
Fiber switches
• Brocade Silkworm
• McData
• Cisco MDS
Tape libraries
• Quantum/ADIC
• IBM
• HP
• StorageTek
HBA
• Emulex
• QLogic
Why BMC and Sentry
Goal
- Improve uptime, optimize performance
- Lower IT costs
- Manage energy costs
Where it Hurts
- Missed hardware failures
- Long time to resolve problems
- Integrating the monitoring of a new platform is complex and time-consuming
- Energy expenses keep climbing every year
- Frustrating not to know the culprits and what to do about it
  • Hardware-related problems
  • SAN-related problems
- Problems involve sysadmins, network admins and SAN admins. Hard to arbitrate
Root Cause
- Lack of visibility on server and SAN hardware health
- Lack of visibility on SAN perf.
- Hardware instrumentation is vendor-specific and sometimes even lacking
- No per-device visibility on power consumption
Solved by
- Sentry’s Hardware and Storage Monitoring Solution
How
- Monitors the hardware
  • Servers, disk arrays, SAN switches, tape libraries
  • Disks, RAIDs, power supplies, NICs, HBAs, processors, etc.
  • Discovery, Inventory, Monitoring, Diagnosis, Reporting, Data traffic
- Single solution for all
  • Cisco B-Series, C-Series, all OSes
- Monitors the power consumption
  • On each server and SAN device, in Watts and kWh
  • Works on 100% of IT