Azure Managed Disk–Overview


Microsoft released Azure Managed Disks two weeks ago.  Let’s look at them!

What did we have until then?  The virtual hard disk (.vhd file) was stored as a page blob in an Azure Storage account.

That worked fine, and managed disks are only a little more than that:  a thin layer of abstraction.  But because Azure now knows the blob is a disk, it can optimize for it.

Issues with unmanaged disks

That’s right, our good old page blob VHD is now an unmanaged disk.  This sounds like 2001, when Microsoft released .NET and managed code and you learned that all the code you had been writing until then was unmanaged, unruly!

Let’s look at the different issues.

The first that comes to mind is Input / Output Operations per Second (IOPS).  A storage account tops out at 20,000 IOPS.  An unmanaged standard disk can reach 500 IOPS.  That means that beyond 40 disks in a storage account, assuming the account holds nothing but disks, we’ll start to get throttled.  This doesn’t sound too bad if we plan to run 2-3 VMs, but for larger deployments we need to be careful.  Of course, we could put each VHD in a different storage account, but a subscription is limited to 100 storage accounts and this also adds management overhead (managing the domain names & access keys of 100 accounts, for instance).
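The arithmetic behind that 40-disk ceiling can be sketched in a few lines of Python.  This is only an illustration:  the two limits are the figures quoted above, while the function names and the 1,000-disk deployment are made up for the example.

```python
# Limits quoted in the paragraph above (figures as of early 2017).
STORAGE_ACCOUNT_MAX_IOPS = 20_000
STANDARD_DISK_MAX_IOPS = 500


def max_disks_before_throttling(account_iops: int, disk_iops: int) -> int:
    """Number of disks that can run at full IOPS within one storage account."""
    return account_iops // disk_iops


def accounts_needed(disk_count: int, disks_per_account: int) -> int:
    """Storage accounts required to host disk_count disks without throttling."""
    # Ceiling division: a partially filled account still counts as one.
    return -(-disk_count // disks_per_account)


disks_per_account = max_disks_before_throttling(
    STORAGE_ACCOUNT_MAX_IOPS, STANDARD_DISK_MAX_IOPS
)
print(disks_per_account)                         # 40
print(accounts_needed(1000, disks_per_account))  # 25
```

So a hypothetical deployment of 1,000 standard disks would already need 25 storage accounts, a quarter of the per-subscription limit mentioned above.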

Another one is access rights.  If we put more than one disk in a storage account, we can’t give different people different access to different disks:  if somebody is a contributor on the storage account, they have access to all disks in the account.

A painful one is around custom images.  Say we customize a Windows or Linux image and have our generalized VHD ready to fire up VMs.  That VHD needs to be in the same storage account as the VHDs of the VMs created from it.  Given the IOPS ceiling above, that means we can really only create about 40 VMs from it.  That’s where the limitation for VM scale sets with custom images comes from.

A side effect of being in a storage account is that the VHD is publicly accessible.  You still need a SAS token or an access key.  But that’s the thing.  For industries with strict regulation, compliance or audit requirements, the idea that “if somebody walked out with your access key, even if they got fired and their logins no longer work, they can still download and even change your VHD” is a deal breaker.

Finally, one that few people are aware of:  reliability.  Storage accounts are highly available and keep 3 synchronous copies.  They have an SLA of 99.9%.  The problem arises when we match them with VMs.  We can set up high availability for a set of VMs by defining an availability set:  this gives some guarantees on how our VMs are affected during planned / unplanned downtime.  Now 2 VMs can be placed in two different fault domains, i.e. they are deployed on different hosts and don’t share any critical hardware (e.g. power supply, network switch, etc.) but…  their VHDs might be on the same storage stamp (or cluster).  So if a storage stamp goes down for some reason, two VMs in different fault / update domains could go down at the same time.  If those are our only two VMs in the availability set, the whole set goes down.
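To make that failure mode concrete, here is a toy model in Python.  It is a sketch under assumptions, not a description of how Azure places disks:  the `Vm` class, the `shared_stamp` helper and the stamp labels are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Vm:
    name: str
    fault_domain: int   # compute-side fault (failure) domain
    storage_stamp: str  # hypothetical label for the stamp holding the VHD


def shared_stamp(vms: List[Vm]) -> Optional[str]:
    """Return the stamp whose failure alone stops every VM, or None."""
    stamps = {vm.storage_stamp for vm in vms}
    # Exactly one distinct stamp means it is a single point of failure.
    return next(iter(stamps)) if len(stamps) == 1 else None


# Two VMs on separate fault domains whose VHDs landed on the same stamp.
same_stamp = [Vm("vm1", 0, "stamp-a"), Vm("vm2", 1, "stamp-a")]
# The same two VMs with their VHDs spread over two stamps.
spread = [Vm("vm1", 0, "stamp-a"), Vm("vm2", 1, "stamp-b")]

print(shared_stamp(same_stamp))  # stamp-a
print(shared_stamp(spread))      # None
```

In the first case the compute side is properly separated, yet one storage failure still takes out the whole availability set.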

Managed Disks

Managed disks are simply page blobs stored in a Microsoft-managed storage account.  On the surface, not much of a change, right?

Well…  let’s address each issue we’ve identified:

Other differences

Besides the obvious advantages, here is a list of differences from unmanaged disks:

Also, quite importantly, Managed Disks do not support Storage Service Encryption at the time of this writing (February 2017).  It is supposed to come very soon though and Managed Disks do support encrypted disks.

Summary

Managed Disks bring a couple of goodies with them.  The most significant one is reliability, but other features will clearly make our lives easier.

In future articles, I’ll do a couple of hands-on walkthroughs with Azure Managed Disks.


2 responses

  1. Anonymous 2017-08-01 at 01:42

    hi vincent ..its really so nice of you to publish article with such a great deep dive …really very helpful and the way you collate all the additional information at one stop. Thanks!! I need little more understanding on RTO and RPO in the context of ASR as in what azure offers as that is always a deciding factor to customers when comparing DR using traditional methods.

  2. Vincent-Philippe Lauzon 2017-08-01 at 07:14

    You can have a look at https://docs.microsoft.com/en-us/azure/site-recovery/site-recovery-overview for ASR. They mention the RPO could be as low as replications every 30 seconds.

    Basically ASR replicates on a continuous basis. An agent in the VM intercepts every IO call and feeds it to replication.

    RTO would basically be the time it takes your ops team to acknowledge a service outage in the primary region + the time to failover. Failover time is linked to the time to create disks from replication vault & boot the VMs. I would definitely test it to get a number.

    Have a look at https://vincentlauzon.com/2016/07/11/disaster-recovery-with-azure-virtual-machines/ to see different options you have for DR in Azure.
