Your Service Better than Azure Platform Better than Physical Assets
[Diagram: availability features by scenario, with cost increasing from lower to higher]
Scenarios: Single Instance, Availability Sets, VMSS, Zone spanning VM/VMSS
Features: Planned Maintenance, Single Instance SLA, Scheduled Events, VM Backup & DR, Load Balancing, Managed Disks, HA SLA, Sync Storage Replication, Async Storage Replication, Traffic Manager, VM BC/DR (ASR)
• 99.9% SLA guarantee for VMs backed by Premium Storage
• First global public cloud to offer a single-VM SLA
• Move legacy apps to the cloud and stay compliant
• Superior storage availability and performance
[Chart: Average VM Availability by month; y-axis from 99.985% to 100.000%]
Your Service Better than Azure Platform Better than Physical Assets
• “Gray Failure: The Achilles’ Heel of Cloud-Scale Systems”, Peng Huang et al., HotOS ’17, May 8-10, 2017, Whistler, BC, Canada
• “Bravo to Microsoft Research and the Azure folks for publishing this paper. It's nice to know that MS has some very smart people minding the store.” - Robin Harris, ZDNet, as of July 24, 2017
TBD: Screenshots
Overview
• Scheduled backups
• On-demand backups
• Geo-redundant or locally redundant
TBD: Screenshots
Overview
Azure Site Recovery
Non-disruptive drills
Recent Announcements
https://azure.microsoft.com/en-us/blog/announcing-disaster-recovery-for-azure-iaas-vms-using-asr/
Challenges
• Stateful workloads are vulnerable to frequent outages across multiple instances
• Others are impacted by a single outage
• No visibility into what led to a VM outage
Azure Scheduled Events
• Surface upcoming events from within the VM
  • A local endpoint with a simple REST API
  • Instance metadata service
• Support graceful shutdown
  • Provides a NotBefore time for the workload to initiate graceful shutdown
  • Acknowledge completion to expedite
• Cover all maintenance scenarios
  • Platform initiated (e.g. host OS rollout)
  • User policy initiated (e.g. guest OS update)
  • Interactive calls (e.g. restart a VM)
  • Predictable hardware failures
URI
  curl -H Metadata:true "http://169.254.169.254/metadata/scheduledevents?api-version=2017-03-01"
Event Response:
  {
    Events: [
      {
        EventId: "92ac957f-f99a-4e8c-9578-39a0f6518cd2",
        EventType: "Reboot | Redeploy | Freeze",
        ResourceType: "Virtual Machine",
        Resources: [ {vmName} ],
        Status: "Started | Scheduled",
        NotBefore: "2016-04-20T12:30:29+0100"
      }
    ]
  }
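The polling and acknowledgement flow on this slide can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the service's reference client: the endpoint uses the instance metadata address and the api-version from the slide, the function names (`fetch_events`, `pending_events`, `build_ack`, `acknowledge`) are hypothetical, and the `StartRequests` acknowledgement body should be checked against the Scheduled Events documentation for the API version in use. The status is read from either `Status` (as on the slide) or `EventStatus`.

```python
import json
import urllib.request

# Assumed endpoint: the instance metadata address plus the api-version shown on the slide.
ENDPOINT = "http://169.254.169.254/metadata/scheduledevents?api-version=2017-03-01"


def fetch_events(url=ENDPOINT, timeout=5):
    """GET the scheduled-events document; only reachable from inside an Azure VM."""
    request = urllib.request.Request(url, headers={"Metadata": "true"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def pending_events(document, vm_name):
    """Return events still in 'Scheduled' status that list vm_name in Resources."""
    selected = []
    for event in document.get("Events", []):
        # The slide names the field Status; EventStatus is accepted as well.
        status = event.get("EventStatus", event.get("Status"))
        if status == "Scheduled" and vm_name in event.get("Resources", []):
            selected.append(event)
    return selected


def build_ack(event_ids):
    """Build the JSON body (assumed shape) that approves an early start for these events."""
    body = {"StartRequests": [{"EventId": event_id} for event_id in event_ids]}
    return json.dumps(body).encode("utf-8")


def acknowledge(event_ids, url=ENDPOINT, timeout=5):
    """POST the acknowledgement once the workload has shut down gracefully."""
    request = urllib.request.Request(
        url, data=build_ack(event_ids), headers={"Metadata": "true"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Inside a VM, a workload would typically call `pending_events(fetch_events(), "myVM")` on a timer, drain or fail over before the `NotBefore` time, and then pass the handled event IDs to `acknowledge` so the platform does not wait out the full notice period.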
https://github.com/zivraf/ScheduledEvents
Better Visibility
• Longer notification time
• Visibility to the maintenance window
• Additional recipients
• Additional channels
More Control
• Proactive redeploy
• Longer control window
• Automate failover with Scheduled Events
Reduced Impact
• Memory-preserving maintenance (when possible)
• Live migration when possible
• Keep ephemeral disks if possible
Notification
• Email/SMS/WebHook
• Programmatically
• Azure Portal
T0: Pre-emptive Window (30 days)
• Starts with a notification to the user
  • One notification per subscription
  • Sent to the subscription’s admin
  • Customers can add recipients
  • Customers can add more channels (SMS, webhook)
• Maintenance window is discoverable
  • Discover which VMs are going to be impacted in this wave
  • Azure maintenance dashboard
  • Azure VM list blade, VM Details blade, API, PowerShell, CLI
• Enable customers to start maintenance on their VMs
T1 T2 T3: Scheduled Maintenance Window
• Maintenance window is discoverable
• In-VM Scheduled Events raised 15 minutes prior to the actual impact
[Diagram: availability features by scenario, with cost increasing from lower to higher]
Scenario: Single Instance (not highly available)
Features: Planned Maintenance, Single Instance SLA, Scheduled Events, VM Backup & DR
• Provides fault domains and upgrade domains
• Tied to a role in your application
• Required for 99.95% SLA
[Diagram labels: AVSet, Subnet, Virtual Network]
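To put the 99.95% availability-set SLA next to the 99.9% single-instance SLA quoted earlier in the deck, the short calculation below converts each percentage into a monthly downtime budget. It is a back-of-the-envelope sketch that assumes a 30-day month; the function name is hypothetical and the actual SLA terms define their own measurement period.

```python
from decimal import Decimal

# Assumption: a 30-day month, i.e. 43,200 minutes.
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60


def allowed_downtime_minutes(sla_percent):
    """Maximum downtime per 30-day month that still meets the given SLA percentage."""
    unavailable_fraction = (Decimal(100) - Decimal(sla_percent)) / Decimal(100)
    return unavailable_fraction * MINUTES_PER_30_DAY_MONTH


single_instance = allowed_downtime_minutes("99.9")    # single VM on Premium Storage
availability_set = allowed_downtime_minutes("99.95")  # VMs in an availability set
print(single_instance, availability_set)
```

Under that assumption the single-instance SLA allows about 43.2 minutes of downtime per month, and the availability-set SLA halves that to about 21.6 minutes.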
Managed Availability Sets
✓ Simple - Abstracts storage accounts from customers
✓ Granular access control - Top level ARM resource, apply Azure RBAC
✓ Storage account limits do not apply - No throttling due to storage account IOPS limits
✓ Big scale - 20,000 disks per region per subscription
✓ Better storage resiliency - Prevents single points of failure due to storage
✓ Support VM level disk encryption - Secure data at rest
[Diagram: FD 0, FD 1, FD 2 mapped to Managed Storage accounts 1, 2, 3 in Storage FD 0, Storage FD 1, Storage FD 2]
[Diagram: availability features by scenario, with cost increasing from lower to higher]
Scenarios: Single Instance, Availability Sets, VMSS
Features: Planned Maintenance, Single Instance SLA, Scheduled Events, VM Backup & DR, Load Balancing, Managed Disks, HA SLA
What is it
Region
• +100 mi apart
• One or more data centers
• Paired regions
Data Center
• A ‘Building’
Cluster of Racks
• Not surfaced to customers
• Deployment & management
Rack
• Multiple nodes
• PDU
• TOR
40 Azure Regions
Recently announced:
• France: France Central and France South
• Korea: Korea Central and Korea South
• DoD East and Central
• Australia gov
• South Africa
What is it
Region
• +100 mi apart
• One or more data centers
• Paired regions
Availability Zone
• Isolated locations within a region
• 3 offered per region
Data Center
• A ‘Building’
Cluster of Racks
• Not surfaced to customers
• Deployment & management
Rack
• Multiple nodes
• PDU
• TOR
Availability Zones
• Availability Zones (AZ) are physically separated locations within an Azure region
• Each AZ has independent power, network, and cooling
• AZ locations are chosen based on a per-region risk assessment
• Reduce single points of failure in the platform
• Close enough to provide regional latency for synchronous data replication
Zone redundant Load Balancer
• Assign VMs to specific zones
• Assign scale sets to a specific zone
• Zone redundant load balancer
• Public IP per VM
• Rolling preview:
  Regions: West Europe & East US 2
  Sizes: Av2, DSv2
[Diagram: VM scale sets in Zone 1, Zone 2, Zone 3 inside a VNet, behind a zone redundant load balancer]
[Diagram: VMs in Zone 1, Zone 2, Zone 3 inside a VNet, behind a zone redundant load balancer]
ARM template examples – Minimal change
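The template examples from this slide are not reproduced in the text. As a hedged illustration of what a minimal change can look like, the Python sketch below adds a `zones` property to a VM resource held as a dictionary. The helper name `pin_to_zone` and the trimmed sample resource are hypothetical, and the exact schema and api-version should be taken from the ARM template reference.

```python
import copy
import json


def pin_to_zone(resource, zone):
    """Return a copy of an ARM resource with a single-zone 'zones' property added."""
    pinned = copy.deepcopy(resource)
    pinned["zones"] = [str(zone)]  # zone identifiers are written as strings, e.g. ["1"]
    return pinned


# Hypothetical, heavily trimmed VM resource; a real template carries many more properties.
vm = {
    "type": "Microsoft.Compute/virtualMachines",
    "name": "myVM",
    "location": "eastus2",
    "properties": {},
}

print(json.dumps(pin_to_zone(vm, 1), indent=2))
```

The original resource is left untouched, so the same base definition can be pinned to zones 1, 2, and 3 to spread instances across a region.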
Early access preview
• Single end point for VMs spread across zones
• Zone failure resilient
• Auto Scaling
• High level of isolation (FDs within zones)
[Diagram: VM scale set inside a VNet, behind a zone redundant load balancer]
Zone spanning scale set
https://github.com/Azure/vm-scale-sets/blob/master/preview/upgrade/zonesmanualrolling.json
Synchronous data replication across Azure Availability Zones within a region
✓ Read/write resilience to single cluster/datacenter unavailability
✓ Support for Blob, Table, File, Queue Storage
✓ Public Preview in Q4 CY2017 in multiple regions; GA in H1 CY2018
LRS: Lowest cost storage
ZRS: Resilience to single cluster/datacenter outage
GRS: Resilience to regional outage
RA-GRS: Read accessible secondary
[Diagram: availability features by scenario, with cost increasing from lower to higher]
Scenarios: Single Instance, Availability Sets, VMSS, Zone spanning VM/VMSS
Features: Planned Maintenance, Single Instance SLA, Scheduled Events, VM Backup & DR, Load Balancing, Managed Disks, HA SLA, Sync Storage Replication
Region Pairs
https://docs.microsoft.com/en-us/azure/best-practices-availability-paired-regions
Cosmos DB, SQL, MySQL, PostGres, Blob Storage, VM to VM
[Diagram: availability features by scenario, with cost increasing from lower to higher]
Scenarios: Single Instance, Availability Sets, VMSS, Zone spanning VM/VMSS
Features: Planned Maintenance, Single Instance SLA, Scheduled Events, VM Backup & DR, Load Balancing, Managed Disks, HA SLA, Sync Storage Replication, Async Storage Replication, Traffic Manager, VM BC/DR (ASR)
https://myignite.microsoft.com/evaluations
https://aka.ms/ignite.mobileapp