Single VM SLA

By now you’ve probably heard the news: Azure became the first public cloud to offer an SLA on a single VM.

This was announced on Monday, November 21st.

In this article, I’ll quickly explore what that means.

Multi-VM SLA

Before that announcement, to have an SLA on connectivity to compute, we needed 2 or more VMs in an Availability Set.

This was, and still is, the High Availability solution. It gives an SLA of 99.95% availability, measured monthly.

There is no constraint on the storage used (Standard or Premium), and the SLA covers planned maintenance as well as any failures. So basically, we put 2+ VMs in an availability set and we’re good all the time.

Single-VM SLA

The new SLA has a few constraints.

So it is important to state that it isn’t a simple extension of the existing SLA to a single VM. But it is very useful nonetheless.

Planned maintenance

I just wanted to expand a bit on planned maintenance.

What is planned maintenance? Once in a while, Azure needs maintenance that requires hosts to be shut down. Either the host itself gets updated (software / hardware) or it gets decommissioned altogether. In those cases, the VMs running on the host are shut down, the host is rebooted (or decommissioned, in which case the VMs get relocated) and then the VMs are booted back up.

This means downtime for the VM.

With a Highly Available configuration, i.e. 2+ VMs in an Availability Set, the downtime of one VM doesn’t affect the availability of the availability set, since there is a guarantee that there will always be one VM available.

Without a Highly Available configuration, there is no such guarantee. For that reason, I suppose, this downtime isn’t covered by the SLA. Remember, 99.9% on a monthly basis means about 43 minutes of downtime per month. A planned maintenance would easily take a few minutes of downtime, taking into account the shutdown of the VMs (all of the VMs on the host), the host restart and the VM boot. That isn’t negligible compared to the 43 minutes of margin the SLA gives.

If it were counted, it would leave very little room for manoeuvre for potential hardware / software failures during the month.
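To make the arithmetic concrete, here is a minimal Python sketch of the monthly downtime budget. It assumes a 30-day month, and the 5 minutes per planned maintenance is a figure I made up for illustration, not a number published by Azure.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(sla_percent: float, days_in_month: int = 30) -> float:
    """Return the downtime (in minutes) an availability SLA tolerates in one month."""
    total_minutes = days_in_month * MINUTES_PER_DAY
    return total_minutes * (100.0 - sla_percent) / 100.0


# Budgets for the two SLAs discussed in this article.
single_vm_budget = downtime_budget_minutes(99.9)
availability_set_budget = downtime_budget_minutes(99.95)

# Assumption for illustration only: one planned maintenance costs 5 minutes.
assumed_maintenance_minutes = 5.0
remaining_budget = single_vm_budget - assumed_maintenance_minutes

print(f"99.9%  SLA budget: {single_vm_budget:.1f} minutes")         # 43.2
print(f"99.95% SLA budget: {availability_set_budget:.1f} minutes")  # 21.6
print(f"Left after one assumed maintenance: {remaining_budget:.1f} minutes")
```

Even with that optimistic assumption, a single maintenance event would eat more than a tenth of the month's budget.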

Now, that isn’t the end of the world. For quite a few months now, we have had the “redeploy me now” feature in Azure. This feature redeploys the VM to a new host. If a planned maintenance is under way in the data center, the new host should already be an updated one, in which case our VM won’t need a reboot anymore.

Planned maintenance follows a workflow where a notification is sent to the subscription owner a week in advance (see https://docs.microsoft.com/en-us/azure/virtual-machines/virtual-machines-windows-planned-maintenance and https://docs.microsoft.com/en-us/azure/virtual-machines/virtual-machines-linux-planned-maintenance). We can then trigger a redeploy at our earliest convenience (during a maintenance window).

Alternatively, we can trigger a redeploy every week during a maintenance window and ignore the notification emails.

High Availability

The previous section should have convinced you that the single-VM SLA isn’t a replacement for a Highly Available (HA) configuration.

On top of Azure planned maintenance being outside the SLA, the maintenance of our own solution will also affect the SLA of the solution.

In an HA configuration, we can take an instance down, update it, put it back, then upgrade the next one.

With a single VM we cannot do that: maintenance of the solution incurs downtime and should therefore be done inside a maintenance window (of the solution).
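The difference can be illustrated with a toy simulation. This is only a sketch: the VM names are hypothetical, and the "update" is modelled as nothing more than removing an instance from the set that serves traffic and putting it back.

```python
from typing import List, Set


def rolling_update(instances: List[str]) -> List[Set[str]]:
    """Update instances one at a time.

    Returns, for each update step, the set of instances still serving traffic.
    """
    serving = set(instances)
    timeline = []
    for name in instances:
        serving.discard(name)          # take the instance down to update it
        timeline.append(set(serving))  # record who is serving meanwhile
        serving.add(name)              # put it back before touching the next one
    return timeline


def has_downtime(timeline: List[Set[str]]) -> bool:
    """True if at some step no instance at all was serving traffic."""
    return any(len(step) == 0 for step in timeline)


# Hypothetical VM names, for illustration only.
ha_timeline = rolling_update(["vm-1", "vm-2"])
single_timeline = rolling_update(["vm-1"])

print("HA configuration has downtime:", has_downtime(ha_timeline))
print("Single VM has downtime:", has_downtime(single_timeline))
```

With two instances there is always one left serving during each step; with one instance the serving set is empty while it is being updated.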

For those reasons, I still recommend that customers use an HA configuration if HA is a requirement.

Enabled scenarios

What the single-VM SLA brings isn’t a cheaper HA configuration. Instead, it enables non-HA configurations with an SLA in Azure.

Until now, there were two modes.  Either we took the HA route or we lived without an SLA.

Often, no SLA is OK. For dev & test scenarios, for instance, an SLA is rarely required.

Often, HA is required. For most production scenarios I deal with, in the enterprise and consumer-facing spaces anyway, HA is a requirement.

Sometimes, though, HA doesn’t make business sense and having no SLA isn’t acceptable either.  HA might not make business sense when

For those scenarios, the single-VM SLA might hit the sweet spot.

