Lessons Learned While Deploying vSAN Two-Node Clusters

Lessons Learned While Deploying vSAN Two-Node Clusters
Wes Milliron – Systems Engineer | @wesmilliron | blog.wesmilliron.com

Overview
• What vSAN is and how it works
• vSAN Requirements and Design Considerations
• Configuration Summary with Gotchas
• Unexpected Outcomes
• Resources

What is vSAN?
• Software-defined storage solution
• Pools local storage from hosts into a single shared datastore
• Policy-driven (SPBM) performance and protection control (see the sketch below)
• No hardware RAID
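As a rough illustration of policy-driven (SPBM) control, the default policy an ESXi host applies when an object has no explicit policy can be inspected from the ESXi shell. A minimal sketch; output fields vary by vSAN version:

  esxcli vsan policy getdefault   # shows the default per-object-class policy (e.g. hostFailuresToTolerate, stripeWidth)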

Types of vSAN Deployments
• vSAN 2-Node Cluster – supports 1 failure
• vSAN 3-Node Cluster – supports 1 failure
• vSAN 4+ Node Cluster – supports 2+ failures

Disk Groups (Example: Hybrid vSAN Host)
• Logical grouping of physical disks on a host
• Each disk group has 1 SSD for cache, and 1 or more capacity disks
• At least 2 disk groups recommended per host (see the sketch below)
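For illustration, disk groups can also be inspected or built from the ESXi shell. A hedged sketch; the NAA IDs below are placeholders for the host's actual cache SSD and capacity devices:

  esxcli vsan storage list                          # shows each claimed disk, its role, and its disk group
  esxcli vsan storage add -s naa.5000c5008c0bcafe \ # -s: SSD to use as the cache device (placeholder ID)
                          -d naa.5000c5008c0bbeef   # -d: capacity disk to place behind that cache device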

Objects and Components
• vSAN is object-based distributed storage
• Objects are split into components, which are distributed across the hosts in the cluster
• Component count is determined by object size and policy:
  • Failures to Tolerate (FTT)
  • "RAID" type
  • Disk stripes
• Maximum component size of 255 GB
• More than 50% of an object's components (votes) must be available for the object to stay accessible (worked example below)
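As a rough worked example of how these rules interact: a 500 GB VMDK with FTT=1 (mirroring) and a stripe width of 1 splits each replica into ceil(500 / 255) = 2 components, so the object ends up with roughly 2 replicas × 2 components + 1 witness component = 5 components. This is an approximation for illustration; the actual layout varies with the policy and free space.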

What is vSAN 2-Node for?
• ROBO = Remote Office/Branch Office = license type
• When HA is required, but the VM count for a site is low
• Less expensive alternative to other HA ROBO solutions
• Requires a witness host

Requirements | Hardware Compatibility
• vSAN leans heavily on hardware functionality
• Use the vSAN Hardware Compatibility List (HCL) while planning
• Other options:
  • VxRail from Dell EMC is an appliance-type solution
  • Certified vSAN ReadyNodes from the OEM partner of choice
• NEW: vSAN Hardware Compatibility Checker Fling

Requirements | Networking
Between hosting nodes:
• Max latency: 1 ms RTT
• Bandwidth (hybrid): 1 Gbps
• Bandwidth (all-flash): 10 Gbps
Hosting site to witness:
• Maximum of 500 ms latency is supported, but sub-200 ms is recommended (a quick latency check is sketched below)
• Witness bandwidth required is 2 Mbps per 1,000 components
• Each host has a 9,000-component maximum, so witness bandwidth won't exceed 18 Mbps
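One quick way to sanity-check latency on a given path is vmkping from the ESXi shell. A hedged sketch; the vmkernel interface and witness IP are placeholders:

  vmkping -I vmk0 -s 1472 -d 192.168.50.10   # -I: source vmkernel port, -s/-d: full-size packet with no fragmentation
                                              # round-trip times in the output should stay within the limits above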

vSphere and vSAN Versions
• For vSAN 6.6 and later, unicast is fully supported for vSAN traffic (see the check below)
• vSAN 6.6 requires at least ESXi 6.5.0d
• Try to avoid earlier versions
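On vSAN 6.6 and later, each host keeps a unicast agent list instead of relying on multicast. A hedged way to confirm the cluster is communicating over unicast from the ESXi shell:

  esxcli vsan cluster unicastagent list   # lists the other cluster members (and witness) this host reaches via unicast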

Requirements | Cluster Witness
• One witness per cluster
• Maintains quorum and data accessibility of the cluster
• Contains witness components
• Does not contribute CPU/RAM/storage to the cluster
• Witness OVA is used exclusively for 2-node and stretched-cluster vSAN

Licensing Considerations
• Two licenses required:
  • One for vSphere (hosts)
  • One for vSAN (cluster)
• ROBO licenses have a 25-VM maximum
• DRS requires a vSphere Enterprise license
vSAN License Options

Design | Limiting Factors
• Determine your limiting factors:
  • Budget
  • Site networking options
  • Witness location and networking
• Capacity
• Hardware
• Networking

Design | Capacity Planning
• Determine your limiting factors
• Capacity:
  • Hybrid or all-flash configuration
  • 2-node clusters are limited to mirrored storage policies
  • Storage multiplier for mirrored policies is 2x (worked example below)
• Hardware
• Networking
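As a hedged sizing example: roughly 3 TB of provisioned VM data under an FTT=1 mirrored policy consumes about 3 TB × 2 = 6 TB of raw vSAN capacity, and a commonly cited guideline is to keep an additional ~25-30% of slack space free for rebuilds and maintenance.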

Design | Hardware Considerations
• Determine your limiting factors
• Capacity
• Hardware:
  • Cache drive should be at least 10% of consumed storage
  • Write buffer is limited to 600 GB
  • SAS SSD vs. M.2 vs. SD card
  • Separate RAID controller for the OS volume
• Networking

Design | Networking Approach
• Determine your limiting factors
• Capacity
• Hardware
• Networking:
  • Distributed Switch vs. Standard vSwitch
  • NIOC
  • Link Aggregation Groups (LAG)
  • vCenter portability

Build Summary
• Build ESXi hosts
• Configure DVS
• Deploy vSAN Witness
• Create Cluster
• Configure vSAN

Host Configuration
• Build ESXi hosts using standard practices
• RAID controllers:
  • Passthrough mode
  • Write caching disabled
• Enable the vMotion service on each host
• Don't forget NTP and syslog! (see the sketch below)
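A hedged per-host sketch of the last two items from the ESXi shell; the vmkernel number and syslog target are placeholders for illustration:

  esxcli network ip interface tag add -i vmk0 -t VMotion                      # enable vMotion on the chosen vmkernel port
  esxcli system syslog config set --loghost='udp://syslog.example.local:514'  # point syslog at a remote collector
  esxcli system syslog reload                                                 # apply the new syslog configuration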

Networking Configuration | Distributed vSwitch Creation
• Number of uplinks
• NIOC

Networking Configuration | Management Port Group and vSAN Port Group

Networking Configuration | vSAN VMkernel (VMK) Creation
• Dedicated for vSAN traffic
• Remember to enable the vSAN service on the VMkernel port! (see the sketch below)
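A hedged sketch of tagging a dedicated vmkernel port for vSAN traffic and verifying it; vmk1 is an assumption:

  esxcli vsan network ip add -i vmk1   # enable vSAN traffic on vmk1
  esxcli vsan network list             # confirm the interface and its traffic type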

Witness Deployment
• Separate subnets for the management and vSAN VMKs
• Set management IP, hostname, etc.
• Add the witness host to vCenter
• Modify the witness vSAN VMK
• Override the default gateway (a common alternative is sketched below)
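Where overriding the default gateway on the witness's vSAN VMK is not practical, a static route toward the hosting site's vSAN network is a commonly used alternative. A hedged sketch with placeholder network and gateway addresses:

  esxcli network ip route ipv4 add -n 172.16.10.0/24 -g 192.168.100.1   # route the hosting site's vSAN subnet via the local gateway
  esxcli network ip route ipv4 list                                     # verify the route was added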

vSAN Cluster Creation
1. Create the cluster object in vCenter with DRS and HA disabled
2. Enable the vSAN service on the cluster
   • Enable dedupe/compression if applicable
3. Claim vSAN disks for cache and capacity roles
   • Code to link NAA IDs with drive bays (see Resources; a quick check is sketched below)
4. Select the deployed witness appliance as the cluster witness
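A hedged sketch for identifying which NAA ID belongs to which physical disk while claiming cache and capacity devices:

  esxcli storage core device list   # lists NAA IDs with vendor, model, and size to help identify each physical disk
  esxcli vsan storage list          # after claiming, shows each disk's cache/capacity role and disk group membership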

Required Extra Step for Direct-Connect Clusters
• Witness traffic must be re-routed out of the management VMK
• Run from each VM host (not the witness):
  esxcli vsan network ip add -i vmk0 -T=witness
• Information on Witness Traffic Separation
• Troubleshoot with vmkping -I vmk1 (verification sketched below)
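After tagging witness traffic, a hedged way to verify which vmkernel interfaces carry vsan vs. witness traffic and to confirm the direct-connect path; the IP below is a placeholder:

  esxcli vsan network list               # shows each vmk and its traffic type (vsan or witness)
  vmkping -I vmk1 -s 1472 -d 172.16.10.2 # test the direct-connect vSAN path without fragmentation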

Finishing Touches
• Assign license to the cluster
• Enable High Availability
  • Admission Control set to 50% reserved for CPU and Memory
  • Datastore heartbeating is not enabled by default on vSAN datastores
Patching
• Without DRS, VMs need to be manually migrated
• Hosts and witnesses must be at the same patch level
• Upgrade the vSAN on-disk format version after the entire cluster is patched

vSAN Health Check
• Able to drill down into potential issues (see the sketch below)
• Automatically remediate problems
• Online health checks
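The same health checks surfaced in the vSphere client can also be queried from an ESXi host, assuming the vSAN health esxcli namespace is present on your build; the test name below is illustrative:

  esxcli vsan health cluster list                          # summary of all health checks and their current status
  esxcli vsan health cluster get -t "vSAN object health"   # drill into a single check by name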

Deployment Strategies
Build and Ship:
• Stage environment in-house
• Change IPs/DNS before shutdown
• Deliver to remote site
Ship and Build:
• Ship to remote site
• Build remotely
• Remote hands follow runbook

Documentation

Unexpected Outcomes
• Organizational culture shift
• Aligning design and deployment strategies
• Do it once – do it right

Resources
• Storage Hub - VMware
• Virtual Blocks Blog
• Cormac Hogan's Blog
• Wes Milliron's Blog
  • Building a 2-Node Direct Connect vSAN Cluster
  • Associating NAA ID with Physical Drive Bay
Use the vCommunity!

Thank You
@wesmilliron | blog.wesmilliron.com