Introduction: Types of planned maintenance

The majority of updates have no impact on hosted VMs. However, in some cases updates to the hosting infrastructure interfere with running VMs:

• VM-preserving maintenance (also known as PHU, Preserving Host Updates)
  • The virtual machine is placed into a "paused" state
  • RAM, network connections, and open files are preserved
  • The virtual machine is resumed within 30 seconds
• VM-restarting maintenance
  • VM Reboot – the VM is restarted. Memory, open files, and network connections are lost
  • VM Redeploy – the VM is moved to another host. The ephemeral drive is lost as well

Planned Maintenance: Goals

Better communication
• Longer notification time
• Per-VM status
• Configure recipients (not just admin)
• Multiple channels
  • Email/SMS/Webhook
  • Programmatically
  • Azure Portal

More control
• User-initiated maintenance
• 14-30 days control window
• In-VM Scheduled Events notification

Reduced impact
• A single VM-restarting maintenance per year
• Tighter maintenance window
• Keep ephemeral disks if possible
• Innovate in order to reduce impact (Live Migration)

The planned maintenance cycle

• A maintenance cycle contains multiple iterations (waves)
• Each iteration has a separate schedule and scope
• Safe deployment
  • Initial iterations have a smaller scope: EUAP regions (Canary), then a first region (Pilot), then broad rollout
• Region pairs
  • Enable customers to have BC/DR schemes across paired regions (within a geo)
  • An iteration impacts only one region in a pair
  • See the list of Azure region pairs
• UD safe
  • Cloud Services, Availability Sets, and VM Scale Sets all use Update Domains (UDs)
  • Only a single update domain is impacted at any given time
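The "UD safe" rule can be illustrated with a small Python sketch. It only shows the idea that VMs are handled one update domain at a time; the function name, the VM names, and the ascending UD order are assumptions made for the example, not a description of how the platform actually sequences updates.

```python
from collections import defaultdict


def update_domain_batches(vm_to_ud):
    """Group VM names by update domain and return one batch per UD.

    `vm_to_ud` maps a VM name to its update domain number (hypothetical input).
    Batches are returned in ascending UD order, which is an assumption here.
    """
    by_ud = defaultdict(list)
    for vm_name, ud in vm_to_ud.items():
        by_ud[ud].append(vm_name)
    return [sorted(by_ud[ud]) for ud in sorted(by_ud)]


# Made-up availability set with three update domains.
vms = {"web-0": 0, "web-1": 1, "web-2": 0, "web-3": 2}

for batch in update_domain_batches(vms):
    # Only the VMs in this batch would be impacted at the same time.
    print(batch)
```

Spreading instances of the same role across update domains is what keeps part of the deployment available while one batch is being maintained.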

The planned maintenance iteration (wave)

T0
• Pre-emptive window (30 days)
  • Starts with a notification to the user
    • One notification per subscription
    • Sent to the subscription's admin
    • Customers can add recipients
    • Customers can add more channels (SMS, webhook)
  • Maintenance window is discoverable
    • Discover which VMs are going to be impacted in this wave
    • Azure maintenance dashboard
    • Azure VM list blade, VM details blade, API, PowerShell, CLI
  • Enables customers to start maintenance on their VMs

T1, T2, T3
• Scheduled maintenance window
  • Maintenance window is discoverable
  • In-VM Scheduled Events are raised 15 minutes prior to the actual impact

Should I start maintenance myself?

• Initiate maintenance when:
  • You need to communicate an exact maintenance window
  • You need to control/orchestrate the impact
  • You need more than 30 minutes between any two VMs
  • You have a large state stored on a local (ephemeral) disk
  • You are running a single-instance VM for a production workload
• Let Azure initiate maintenance when:
  • You have a large deployment (Availability Set, VMSS) where a single VM going down does not affect the overall availability
  • You are using Scheduled Events to proactively drain or fail over VMs

Impact:
• User initiated: the VM moves to another node (ephemeral drive is lost)
• Platform initiated: the VM is rebooted (ephemeral drive is kept)

Note: VM sizes that occupy a whole node (D15, DS15, GS5, L32) are maintained in place.
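The checklist on this slide can be written down as a small decision helper. This is a minimal Python sketch; the `Workload` fields and the function name are hypothetical and simply mirror the "initiate maintenance when" bullets, so treat it as a memory aid rather than an official rule engine.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a deployment, one flag per bullet on the slide."""
    needs_exact_window: bool = False          # must communicate an exact window
    needs_orchestration: bool = False         # must control/orchestrate the impact
    needs_long_gap_between_vms: bool = False  # more than 30 minutes between two VMs
    large_ephemeral_state: bool = False       # large state on the local disk
    single_instance_production: bool = False  # single-instance production VM


def should_self_initiate(workload: Workload) -> bool:
    """Return True if any of the slide's self-initiation criteria applies."""
    return any(
        [
            workload.needs_exact_window,
            workload.needs_orchestration,
            workload.needs_long_gap_between_vms,
            workload.large_ephemeral_state,
            workload.single_instance_production,
        ]
    )


# A scale set that tolerates losing one VM: let Azure initiate.
print(should_self_initiate(Workload()))
# A single production VM: start maintenance yourself.
print(should_self_initiate(Workload(single_instance_production=True)))
```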

Scheduled Events: Surfacing Upcoming Events Inside a VM

• Surface upcoming maintenance events from within the VM to improve availability
  • A local endpoint with a simple REST API
  • Visibility into upcoming events across all instances: VMs, Cloud Services, Availability Sets, VMSS
  • A NotBefore time (10-15 minutes of notice)
  • Acknowledge completion to expedite
• Potential use cases:
  • Graceful shutdown – save state, drain node, suspend jobs
  • Proactive failover – faster failover (skip detection)
  • Adjust thresholds – avoid failover in the case of VM-preserving maintenance
• Covers all maintenance scenarios
  • Platform initiated (e.g. Host OS rollout)
  • Interactive calls (e.g. restart a VM)
  • Predictable hardware failures

    curl -H "Metadata: true" "http://169.254.169.254/metadata/scheduledevents?api-version=2017-03-01"

    {
      "DocumentIncarnation": {IncarnationID},
      "Events": [
        {
          "EventId": {eventID},
          "EventType": "Reboot" | "Redeploy" | "Freeze",
          "ResourceType": "VirtualMachine",
          "Resources": [{resourceName}],
          "EventStatus": "Scheduled" | "Started",
          "NotBefore": {timeInUTC}
        }
      ]
    }
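The query on this slide can also be issued from code running inside the VM. The following is a minimal Python sketch, not an official client: it assumes the Azure Instance Metadata Service address 169.254.169.254 and the `Metadata: true` header, and the helper names `fetch_scheduled_events` and `events_for_resource` are hypothetical. The sample document is hand-written to match the response schema on the slide, and its values are made up.

```python
import json
import urllib.request

# Assumed endpoint; it is only reachable from inside an Azure VM.
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2017-03-01"
)


def fetch_scheduled_events(url=SCHEDULED_EVENTS_URL, timeout=5):
    """Query the local endpoint and return the decoded JSON document."""
    request = urllib.request.Request(url, headers={"Metadata": "true"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def events_for_resource(document, resource_name):
    """Return the events whose Resources list contains `resource_name`."""
    return [
        event
        for event in document.get("Events", [])
        if resource_name in event.get("Resources", [])
    ]


# Hand-written sample shaped like the schema on the slide (values are made up).
SAMPLE_DOCUMENT = {
    "DocumentIncarnation": 1,
    "Events": [
        {
            "EventId": "event-1",
            "EventType": "Reboot",
            "ResourceType": "VirtualMachine",
            "Resources": ["vm-a", "vm-b"],
            "EventStatus": "Scheduled",
            "NotBefore": "placeholder-time",
        },
        {
            "EventId": "event-2",
            "EventType": "Freeze",
            "ResourceType": "VirtualMachine",
            "Resources": ["vm-c"],
            "EventStatus": "Started",
            "NotBefore": "placeholder-time",
        },
    ],
}

pending = events_for_resource(SAMPLE_DOCUMENT, "vm-a")
print([event["EventType"] for event in pending])
```

Inside a VM, `fetch_scheduled_events()` would replace `SAMPLE_DOCUMENT`; the filtering step matters because the endpoint reports events for all instances in the deployment, not only the local VM.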

Learn more

• Azure Friday – Planned Maintenance (Video)
• Planned Maintenance concepts
• Planned Maintenance – How-To for Windows and Linux
• Planned Maintenance – Blog post

• Set an activity log alert with Type = Maintenance and Status = ALL
• Add recipients and channels to your alert

New ‘Planned Maintenance Dashboard’

See Impacted resources

Alternatively, use the VM List view

Maintenance information is surfaced per VM as well

Proactive-Redeploy to start maintenance

VM Experience – CRP

Initiate Maintenance – CRP

Planned Maintenance & Scheduled Events

Scheduled Events - API

URI: curl -H "Metadata: true" "http://169.254.169.254/metadata/scheduledevents?api-version=2017-03-01"
Description: Retrieve all events for the current tenant

Response:

    {
      "DocumentIncarnation": {IncarnationID},
      "Events": [
        {
          "EventId": {eventID},
          "EventType": "Reboot" | "Redeploy" | "Freeze",
          "ResourceType": "VirtualMachine",
          "Resources": [{resourceName}],
          "EventStatus": "Scheduled" | "Started",
          "NotBefore": {timeInUTC}
        }
      ]
    }

https://docs.microsoft.com/en-us/azure/virtual-machines-scheduled-events

Event Type
• Freeze – the VM is paused: CPU is suspended, resources are maintained (e.g. active connections)
• Reboot – the VM is rebooted, memory is wiped
• Redeploy – the VM is moved to another host, the ephemeral drive is lost

Event Status
• Started – events already approved and started
• Scheduled – events to be approved next
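One way to act on these fields is to map each event type to one of the use cases named earlier in the deck (adjust thresholds, graceful shutdown, proactive failover). The Python sketch below assumes, as in the Azure Scheduled Events documentation, that `Freeze` denotes a brief pause and `Redeploy` a move to another host. The action labels and the function name are hypothetical and are not part of the Scheduled Events API.

```python
# Hypothetical reaction per EventType value from the response schema.
ACTIONS = {
    "Freeze": "adjust_thresholds",    # VM is paused briefly; avoid a needless failover
    "Reboot": "graceful_shutdown",    # memory is lost; save state and drain first
    "Redeploy": "proactive_failover", # VM moves hosts; ephemeral drive is lost
}


def plan_actions(document):
    """Return (event_id, action) pairs for events still in the Scheduled state."""
    plan = []
    for event in document.get("Events", []):
        if event.get("EventStatus") != "Scheduled":
            continue  # Started events are already in progress
        action = ACTIONS.get(event.get("EventType"), "ignore_unknown_type")
        plan.append((event.get("EventId"), action))
    return plan


# Made-up document in the shape of the response above.
sample = {
    "DocumentIncarnation": 7,
    "Events": [
        {"EventId": "e1", "EventType": "Freeze", "EventStatus": "Scheduled"},
        {"EventId": "e2", "EventType": "Redeploy", "EventStatus": "Started"},
        {"EventId": "e3", "EventType": "Reboot", "EventStatus": "Scheduled"},
    ],
}

print(plan_actions(sample))
```

Skipping `Started` events is a design choice for this sketch: by the time an event has started, there is no notice period left in which to drain or fail over.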

https://myignite.microsoft.com/evaluations
https://aka.ms/ignite.mobileapp
