Shared Storage Options in Azure: Part 2 – IaaS Storage Server

Posted on Updated on

Reading Time: 9 minutes

Recently, I posted the “Shared Storage Options in Azure: Part 1 – Azure Shared Disks” blog post, the first in the 5-part series. Today I’m posting Part 2 – IaaS Storage Server. While this post will be fairly rudimentary insofar as Azure technical complexity, this is most certainly an option when considering shared storage options in Azure and one that is still fairly common with a number of configuration options. In this scenario, we will be looking at using a dedicated Virtual Machine to provide shared storage through various methods. As I write subsequent posts in this series, I will update this post with the links to each of them.


Virtual Machine Configuration Options:


While it may not seem vitally important, the VM SKU you choose can impact your ability to provide storage capabilities in areas such as Disk Type, Capacity, IOPS, or Network Throughput. You can view the list of VM SKUs available on Azure at this link. As an example, I’ve clicked into the General Purpose, Dv3/Dvs3 series and you can see there are two tables that show upper limits of the SKUs in that family.

VM SKU Sizes in Azure

VM Performance Metrics in Azure

In the limits for each VM you can see there are differences between Max Cached and Temp Storage Throughput, Max Burst Cached and Temp Storage Throughput, Max uncached Disk Throughput, and Max Burst uncached Disk Throughput. All of these represent very different I/O patterns, so make sure to look carefully at the numbers.

Below are a few links to read more on disk caching and bursting:


You’ll notice when you look at VM SKUs that there is an L-Series which is “storage optimized”. This may not always be the best fit for your workload, but it does have some amazing capabilities. The outstanding feature of the L-Series VMs are the locally mapped NVMe drives which as of the time of writing this post on the L80s_v2 SKU can offer 19.2TB of storage at 3.8 Million IOPS / 20,000 MBPS.

Azure LS Series VM Specs

The benefits of these VMs are the extremely low latency, and high throughput local storage, but the caveat to that specific NVMe storage is that it is ephemeral. Data on those disks does not persist a reboot. This means it’s incredibly good at serving from a local cache, tempdb files, etc. though its not storage that you can use for things like a File Server backend (without some fancy start-up scripts, please don’t do this…). You will note that the maximum uncached throughput is 80,000 IOPS / 2,000 MBPS for the VM, which is the same as all of the other high spec VMs. As I am writing this, no Azure VM allows for more than that for uncached throughput – this includes Ultra Disks (more on that later).

For more information on the LSv2 series, you can read more here: Lsv2-series – Azure Virtual Machines | Microsoft Docs

Additional Links on Azure VM Storage Design:


Networking capabilities of the Virtual Machine are also important design decisions when considering shared storage, both in total throughput and latency. You’ll notice in the VM SKU charts I posted above when talking about performance there are two sections for networking, Max NICs and Expected network bandwidth Mbps. It’s important to know that these are VM SKU limitations, which may influence your design.

Expected network bandwidth is pretty straight forward, but I want to clarify that the number of Network Interfaces you mount to a VM does not change this number. For example, if your expected network bandwidth is 3200 Mbps and you have an SMB share running on that single NIC, adding a second NIC and using SMB multi-channel WILL NOT increase the total bandwidth for the VM. In that case you could expect each NIC to potentially run at 1,600 Mbps.

The last networking feature to take into consideration is Accelerated Networking.  This feature allows for SR-IOV (Single Root I/O Virtualization), which by bypassing the host CPU and offloading the network traffic directly to the Network Interface can dramatically increase performance by reducing latency, jitter, and CPU utilization.  

Accelerated Networking Comparison in Azure

Image Reference: Create an Azure VM with Accelerated Networking using Azure CLI | Microsoft Docs  

Accelerated Networking is not available on every VM though, which makes it an important design decision. It’s available on most General Purpose VMs now, but make sure to check the list of supported instance types. If you’re running a Linux VM, you’ll also need to make sure it’s a supported distribution for Accelerated Networking.  


In an obvious step, the next design decision is the storage that you attach to your VM. There are two major decision types when selecting disks for you VM – disk type, and disk size.

Disk Types:

Azure VM Disk Types

Image Reference:  

As the table above shows, there are three types of Managed Disks ( ) in Azure. At the time of writing this, Premium/Standard SSD and Standard HDD all have a limit of 32TB per disk. The performance characteristics are very different, but I also want to point out the difference in the pricing model because I see folks make this mistake very often.  

Disk Type: Capacity Cost: Transaction Cost:
Standard HDD Low Low
Standard SSD Medium Medium
Premium SSD High None
Ultra SSD Highest (Capacity/Throughput) None

Transaction costs can be important on a machine whose sole purpose is to function as a storage server. Make sure you look into this before a passing glance shows the price of a Standard SSD lower than a Premium SSD. For example, here is the Azure Calculator output of a 1 TB disk across all four types that averages 10 IOPS * ((10*60*60*24*30)/10,000) = 2,592 transaction units.

Sample Standard Disk Pricing:

Azure Calculator Disk Pricing  


Sample Standard SSD Pricing:

Azure Calculator Disk Pricing  

Sample Premium SSD Pricing:

Azure Calculator Disk Pricing

Sample Ultra Disk Pricing:

Azure Calculator Disk Pricing


The above example is just an example, but you get the idea. Pricing gets strange around Ultra Disk due to the ability to configure performance (more on that later). Though there is a calculable break-even point for disks that have transaction costs versus those that have a higher provisioned cost.

For example, if you run an E30 (1024 GB) Standard SSD at full throttle (500 IOPS) the monthly cost will be ~$336, compared to ~$135 for a P30 (1024 GB) Premium SSD, with which you get x10 the performance. The second design decision is disk capacity. While this seems like a no-brainer (provision the capacity needed, right?) it’s important to remember that with Managed Disks in Azure, the performance scales with, and is tied to, the capacity of the disk.

Image Reference:

You’ll note in the above image the Disk Size scales proportionally with both the Provisioned IOPS and Provisioned Throughput. This is to say that if you need more performance out of your disk, you scale it up and add capacity.

The last note on capacity is this, if you need more than 32TB of storage on a single VM, you simply add another disk and use your mechanism for combining that storage (Storage Spaces, RAID, etc.). This same method can be used to further tweak your total IOPS, but make sure you take into consideration the cost of each disk, capacity, and performance before doing this – most often it’s an insignificant cost to simply scale-up to the next size disk. Last but not least, I want to briefly talk about Ultra Disks – these things are amazing!

Ultra Disk Configuration in Azure

Unlike with the other disk types, this configuration allows you to select your disk size and performance (IOPS AND Throughput) independently! I recently worked on a design where the customer needed 60,000 IOPS, but only needed a few TB of capacity, this is the perfect scenario for Ultra Disks. They were actually able to get more performance, for less cost compared to using Premium SSDs.

To conclude this section, I want to note two design constraints when selecting disks for your VM.

  1. The VM SKU is still limited to a certain number of IOPS, Throughput and Disk Count. Adding together the total performance of your disks, cannot exceed the maximum performance of the VM. If the VM SKU supports 10,000 IOPS and you add 3x 60,000 IOPS Ultra Disks, you will be charged for all three of those Ultra Disks at their provisioned performance tiers but will only be able to get 10,000 IOPS out of the VM.
  2. All of the hardware performance may still be subject to the performance of the access protocol or configuration, more on this in the next section.

Additional Reading on Storage:


Software Configuration and Access Protocols:

As we come to the last section of this post, we get to the area that aligns with the purpose of this blog series – shared storage. In this section I’m going to cover some of the most common configurations and access types for shared storage in IaaS. This is by no means an exhaustive list, rather what I find most common.

Scale-Out File Server (SoFS):

First up is Sale-Out File Server, this is a software configuration inside Windows Server that is typically used with SMB shares. SoFS was introduced in Windows 2012, uses Windows Failover Clustering, and is considered a “converged” storage deployment. It’s also worth noting that this can run on S2S (Storage Space Direct), which is the method I recommend using with modern Windows Server Operating Systems. Scale-Out File Server is designed to provide scale-out file shares that are continuously available for file-based server application storage. It provides the ability to share the same folder from multiple nodes of the same cluster. It can be deployed in two configuration options, for Application Data or General Purpose. See the additional reading below for the documentation on setup guidance.

Additional reading:

SMB v3:

Now into the access protocols – SMB has been the go-to file services protocol on Windows for quite some time now. In modern Operating Systems, SMB v3.* is an absolutely phenomenal protocol. It allows for incredible performance using things like SMB Direct (RDMA), Increasing MTU, and SMB Multichannel which can use multiple NICs simultaneously for the same file transfer to increase throughput. It also has a list of security mechanisms such as Pre-Auth Integrity, AES Encryption, Request Signing, etc. There is more information on the SMB v3 protocol below, if you’re interested, or still think of SMB in the way we did 20 years ago – check it out. The Microsoft SQL Server team even supports SQL hosting databases on remote SMB v3 shares.

Additional reading:


NFS has been a similar staple as a file server protocol for a long while also, and whether you’re running Windows or Linux can be used in your Azure IaaS VM for shared storage. For organizations that prefer an IaaS route compared to PaaS, I’ve seen many use this as a cornerstone configuration for their Azure Deployments. Additionally, a number of HPC (High Performance Compute) workloads, such as Azure CycleCloud (HPC orchestration) or the popular Genomics Workflow Management System, Cromwell on Azure prefer the use of NFS.

Additional Reading:


While I would not recommend the use of custom block storage on top of a VM in Azure if you have a choice, some applications do still have this requirement in which case iSCSI is also an option for shared storage in Azure.

Additional Reading:

That’s it! We’ve reached the end of Part 2. Okay, here we go with the Pros and Cons for using an IaaS Virtual Machine for your shared storage configuration on Azure.

Pros and Cons:



  • More control, greater flexibility of protocols and configuration.
  • Depending on the use case, potentially greater performance at a lower cost (becoming more and more unlikely).
  • Ability to migrate workloads as-is and use existing storage configurations.
  • Ability to use older, or more “traditional” protocols and configurations.
  • Allows for the use of Shared Disks.


  • Significantly more management overhead as compared to PaaS.
  • More complex configurations, and cost calculations compared to PaaS.
  • Higher potential for operational failure with the higher number of components.
  • Broader attack surface, and more security responsibilities.

  Alright, that’s it for Part 2 of this blog series – Shared Storage on IaaS Virtual Machines. Please reach out to me in the comments, on LinkedIn, or Twitter with any questions about this post, the series, or anything else!


DNS Load Balancing in Azure

Posted on Updated on

Reading Time: 3 minutes

This post won’t be too long, but I wanted to expand a bit on the recent repo that I published to Github for Azure Load Balanced DNS Servers. I’ve been working in Azure the better part of a decade and the way we’ve typically approached DNS is in one of two ways. Either use (a pair of) IaaS Domain Controllers or use Azure-Provided DNS resolution. In the last year or so there have been an increasing number of architectural patterns that require private DNS resolution where it we may not necessarily care about the servers themselves.

This pattern has become especially popular with the requirements for Azure Private Link in hybrid scenarios where on-premises systems need to communicate with Azure PaaS services over private link.


The only thing the DNS forwarder is providing here is very basic DNS forwarding functionality. This is not to say that it can’t be further configured, but the same principles still apply. DNS isn’t something that needs any sort of complex failover during patch windows, but since it has to be referenced by IP we have to be careful about taking DNS servers down if there aren’t alternates configured. With a Web Server we would just put it behind a Load Balancer, but there don’t seem to be configurations published for a similar setup with DNS servers (other than using a Network Virtual Appliance) since UDP isn’t a supported health probe by Azure Load Balancers. How then do we configure a pair of “zero-touch” private DNS functionality in Azure?

When asked “What port does DNS use?”, the overwhelming majority of IT Professionals will say “UDP 53”. While that is correct, it also uses TCP 53. UDP Packets can’t be larger than 512 Bytes, and while this suffices in most cases for DNS there are certain scenarios where it does not. For example, DNS Zone Transfers (AFXR/IFXR), DNSSEC, and EDNS all have response sizes larger than 512 bytes, which is why they use TCP. This is why the DNS Service does (be default) listen on TCP 53, which is what we can use as the health probe in the Azure Load Balancer.

The solution that I’ve published on Github (, contains the template to deploy this solution which has the following configuration.

  • Azure Virtual Network
  • 2x Windows Core Servers:
    • Availability Set
    • PowerShell Script to Configure Servers with DNS
    • Forwarder set to Azure Multicast DNS Resolver
  • Azure Load Balancer:
    • TCP 53 Health Probe
    • UDP/TCP 53 Listener
Azure DNS Load Balanced Solution

This template does not include patch management, but I would highly recommend using Azure Update Management, this way you can setup auto-patching and an alternate reboot schedule. If this is enabled, this solution would be a zero-touch, highly-available, private DNS solution for ~$55/mo (assuming D1 v2 VM, which can be lower if a cheaper SKU is chosen).

Since I’m talking about DNS here, the last recommendation that I’ll make is to go take a look at Azure Defender for DNS which monitors, and can alert you to suspicious activity in your DNS queries.

Alright, that’s it! I hope this solution will be helpful, and if there are options or configurations you’d like to see available in the Github repository please feel free to submit an issue or a PR! If you want to deploy it right from here, click the button below!

Deploy to Azure


If you have any questions, comments, or suggestions for future blog posts please feel free to comment blow, or reach out on LinkedIn or Twitter. I hope I’ve made your day a little bit easier!

Azure Site-to-Site VPN with a Palo Alto Firewall

Posted on Updated on

Reading Time: 9 minutes

In the past, I’ve written a few blog posts about setting up different types of VPNs with Azure.

Since the market is now full of customers who are running Palo Alto Firewalls, today I want to blog on how to setup a Site-to-Site (S2S) IPSec VPN to Azure from an on-premises Palo Alto Firewall. For the content in this post I’m running PAN-OS on a VM-50 in Hyper-V, but the tunnel configuration will be more or less the same across deployment types (though if it changes in a newer version of PAN-OS let me know in the comments and I’ll update the post).

Alright, let’s jump into it! The first thing we need to do is setup the Azure side of things, which means starting with a virtual network (vnet). A virtual network is a regional networking concept in Azure, which means it cannot span multiple regions. I’m going to use “East US” below, but you can use whichever region makes the most sense to your business since the core networking capabilities shown below are available in all Azure regions.

With this configuration I’m going to use as the overall address space in the Virtual Network, I’m also going to configure two subnets. The “hub” subnet is where I will host any resources. In my case, I’ll be hosting a server there to test connectivity across the tunnel. The “GatewaySubnet” is actually a required name for a subnet that will later house our Virtual Network Gateway (PaaS VPN Appliance). This subnet could be created later in the portal interface for the Virtual Network (I used this method in my PFSense VPN blog post), but I’m creating it ahead of time. Note that this subnet is name and case sensitive. The gateway subnet does not need a full /24, (requirements for the subnet here), it will do for my quick demo environment.

Now that we have the Virtual Network deployed, we need to create the Virtual Network Gateway. You’ll notice that once we choose to deploy it in the “vpn-vnet” network that we created, it will automatically recognize the “GatewaySubnet” and will deploy into that subnet. Here we will choose a VPN Gateway type, and since I’ll be using a route-based VPN, select that configuration option. I won’t be using BGP or an active-active configuration in this environment so I’ll leave those disabled. Validate, and create the VPN Gateway which will serve as the VPN appliance in Azure. This deployment typically takes 20-30 minutes so go crab a cup of coffee and check those dreaded emails.

Alright, now that the Virtual Network Gateway is created we want to create “connection” to configure the settings needed on the Azure side for the site-to-site VPN.

Here we’ll name the connection, set the connection type to “Site-to-Site (IPSec)”, set a PSK (please don’t use “SuperSecretPassword123″…) and set the IKE Protocol to IKEv2. You’ll notice that you need to set a Local Network Gateway, we’ll do that next.

Let’s go configure a new Local Network Gateway, the LNG is a resource object that represents the on-premises side of the tunnel. You’ll need the public IP of the Palo Alto firewall (or otherwise NAT device), as well as the local network that you want to advertise across the tunnel to Azure.

Once that’s complete we can finish creating the connection, and see that it now shows up as a site-to-site connection on the Virtual Network Gateway, but since the other side isn’t yet setup the status is unknown. If you go to the “Overview” tab, you’ll notice it has the IP of the LNG you created as well as the public IP of the Virtual Network Gateway – you will want to copy this down as you’ll need it when you setup the IPSec tunnel on the Palo Alto.

Alright, things are just about done now on the Azure side. The last thing I want to do is kick off the deployment of a VM in the “hub” subnet that we can use to test the functionality of the tunnel. I’m going to deploy a cheap B1s Ubuntu VM. It doesn’t need a public IP and a basic Network Security Group (NSG) will do since there is a default rule that allows all from inside the Virtual Network (traffic sourced from the Virtual Network Gateway included).

Now that the test VM is deploying, let’s go deploy the Palo Alto side of the tunnel. The first thing you’ll need to do is create a Tunnel Interface (Network –> Interfaces –> Tunnel –> New). In accordance with best practices, I created a new Security Zone specifically for Azure and assigned that tunnel interface. You’ll note that it will deploy a sub interface that we’ll be referencing later. I’m just using the default virtual router for this lab, but you should use whatever makes sense in your environment.

Next we need to create an IKE Gateway. Since we set the Azure VNG to use IKEv2, we can use that setting here also. You want to select the interface that is publicly-facing to attach the IKE Gateway, in my case it is ethernet 1/2 but your configuration may vary. Typically you’ll have the IP address of the interface as an object and you can select that in the box below, but in my case my WAN interface is using DHCP from my ISP so I leave it as “none”.

It is important to point out though, that if your Palo Alto doesn’t have a public IP and is behind some other sort of device providing NAT, you’ll want to use the uplink interface and select the “local IP address” private IP object of that interface. I suspect this is an unlikely scenario, but I’ll call it out just in case.

The peer address is the public IP address of the Virtual Network Gateway of which we took note a few steps prior, and the PSK is whatever we set on the connection in Azure. Lastly, make sure the Liveness Check is enabled on the Advanced Options Screen.

Next we need an IPSec Crypto Profile. AES-256-CBC is a supported algorithm for Azure Virtual Network Gateways, so we’ll use that along with sha1 auth and set the lifetime to 8400 seconds which is longer than lifetime of the Azure VNG so it will be the one renewing the keys.

Now we put it all together, create a new IPSec Tunnel and use the tunnel interface we created, along with the IKE Gateway and IPSec Crypto Profile.

Now that the tunnel is created, we need to make appropriate configurations to allow for routing across the tunnel. Since I’m not using dynamic routing in this environment, I’ll go in and add a static route to the virtual router I’m using to advertise the address space we created in Azure to send out the tunnel interface.

Great! Now at this point I went ahead and grabbed the IP of the Ubuntu VM I created earlier (which was and did a ping test. Unfortunately they all failed, what’s missing?

Yes yes, I did commit the changes (which always seems to get me) but after looking at the traffic logs I can see the deny action taking place on the default interzone security policy. Yes I could have not mentioned this, but hey, now if it doesn’t work perfectly for the first time for you – you can be assured you’re in good company.

Alright, if you recall we created the tunnel interface in its own Security Zone so I’ll need to create a Security Policy from my Internal Zone to the Azure Zone. You can use whatever profiles you need here, I’m just going to completely open interzone communication between the two for my lab environment. If you want machines in Azure to be able to initiate connections as well remember you’ll need to modify the rule to allow traffic in that direction as well.

Here we go, now I should have everything in order. Let’s go kick off another ping test and check a few things to make sure that the tunnel came up and shows connected on both sides of things. It looks like the new Allow Azure Security Policy is working, and I see my ping application traffic passing!

Before I go pull up the Windows Terminal screen I want to quickly check the tunnel status on both sides.

Success!!! Before I call it, I want to try a two more things so I’ll SSH into the Ubuntu VM, install Apache, edit the default web page and open it in a local browser.

At this point I do want to call out the troubleshooting capabilities for Azure VPN Gateway. There is a “VPN Troubleshoot” functionality that’s a part of Azure Network Watcher that’s built into the view of the VPN Gateway. You can select the gateway on which you’d like to run diagnostics, select a storage account where it will store the sampled data, and let it run. If there are any issues with the connection this will list them out for you. It will also list some specifics of the connection itself so if you want to dig into those you can go look at the files written to the blob storage account after the troubleshooting action is complete to get information like packets, bytes, current bandwidth, peak bandwidth, last connected time, and CPU utilization of the gateway. For further troubleshooting tips you can also visit the documentation on troubleshooting site-to-site VPNs with Azure VPN Gateways.

That’s it, all done! The site-to-site VPN is all setup. The VPN Gateway in Azure makes the process very easy and the Palo Alto side isn’t too bad either once you know what’s needed for the configuration.

If you have any questions, comments, or suggestions for future blog posts please feel free to comment blow, or reach out on LinkedIn or Twitter. I hope I’ve made your day a little bit easier!

Shared Storage Options in Azure: Part 1 – Azure Shared Disks

Posted on Updated on

Reading Time: 4 minutes

In an IaaS world, shared storage between virtual machines is a common ask. “What is the best way to configure shared storage?”, “What options do we have for sharing storage between these VMs?”, both are questions I’ve answered several times, so let’s go ahead and blog some of the options! The first part in this blog series titled “Shared Storage Options in Azure”, will cover Azure Shared Disks.

As I write subsequent posts in this series, I will update this post with the links to each of them.


When shared disks were announced in July of 2020, there was quite a bit of excitement in the community. There are so many applications that still leverage shared storage for things like Windows Server Failover Clustering, on which many applications are built like SQL Server Failover Cluster Instances. Also, while I highly recommend using a Cloud Witness, many customers migration workloads to Azure still rely on a shared disk for quorum as well. Additionally, many Linux applications leverage shared storage that were previously configured to use a shared virtual disk, or even RAW LUN mappings, for applications such as GFS2 or OCFS2.

Additional sample workloads for Azure Shared Disks can be found here: Shared Disk Sample Workloads.

There are a few limitations of shared disks, the list of which is constantly getting smaller. For now, though, let’s just go ahead and jump into it and see how to deploy them. After which, we’ll do a quick “Pros” and “Cons” list before moving on to the other shared storage options. I deployed Shared Disks in my lab using the portal first (screenshots below), but also created a Github Repository ( with the Azure PowerShell script and an ARM template to deploy a similar environment – feel free to use those if you’d like!

As a prerequisite (not pictured below) I created the following resources:

  • A Resource Group in the West US region
  • A Virtual Network with a single subnet
  • 2x D2s v3, Windows Server 2016 Virtual Machines (VM001, VM002) each with a single OS disk

Now that those are created, I deployed a Managed Disk (named “sharedDisk001”) just like you would if you were deploying a typical data disk.

On the “advanced” tab you will see the ability to configure the managed disk as a “shared disk”, here is where you set the max shares which specifies the maximum number of VMs that can attach that particular disk type.

After the disk is finished deploying, we head over to the first VM and attach an existing disk. You’ll note that the disk shows up as a “shared disk” and shows the number of shares left available on that disk. Since this is the first time it’s being mounted it shows 0.

After attaching the disk to the first VM, we head over and do the same thing on VM002. You’ll note that the number of shares has increased by 1 since we have now mounted the disk on VM001.

Great, now the disk is attached to both VMs! Heading over to the managed disk itself you’ll notice that the overview page looks a bit different from typical managed disks, showing information like “Managed by” and “Max Shares”.

In the properties of the disk, we can see the VM owners of that specific disk, which is exactly what we wanted to see after mounting it on each of the VMs.

Although I setup this configuration using Windows machines, you’ll notice I didn’t go into the OS. This is to say that the process, from an Azure perspective, is the same with Linux as it is with Windows VMs. Of course, it will be different within the OS, but there is nothing Azure-specific from that aspect.

Okay, here we go the Pros and Cons:


  • Azure Shared disks allows for the use of what is considered to be “legacy clustering technology” in Azure.
  • Can be leveraged by familiar tools such as Windows Failover Cluster Manger, Scale-out File Server, and Linux Pacemaker/Corosync.
  • Premium and Ultra Disks are supported so performance shouldn’t be an issue in most cases.
  • Supports SCSI Persistent Reservations.
  • Fairly simple to setup.


  • Does not scale well, similar to what would be expected with a SAN mapping.
  • Only certain disk types are supported.
  • ReadOnly host caching is not available for Premium SSDs with maxShares >1.
  • When using Availability Sets and Virtual Machine Scale sets, storage fault domain alignment with the VMs are not enforced on the shared data disk.
  • Azure Backup not yet supported.
  • Azure Site Recovery not yet supported.

Alright, that’s it for Azure Shared Disks! Go take a look at my Github Repository and give shared disks a shot!

Please reach out to me in the comments, LinkedIn, or Twitter with any questions or comments about this blog post or this series.  

Delete Azure Recovery Vault Backups Immediately

Posted on Updated on

Reading Time: 3 minutes

If you’re like many others, over the past few months you’ve noticed that if you configure Azure Backup, you can’t delete the vault for 14 days until after you stop backups. This is due to Soft Delete for Azure Backup. It doesn’t cost anything to keep those backups during that time, and it’s honestly a great safeguard to accidentally deleting backups and gives the option to “undelete”. Though, in some cases (mostly in lab environments) you just want to clear it out (or as was affectionately noted by a colleague of mine, “nuke it from orbit”). Let’s walk though how to do that real quickly.

When you go and stop backups and delete the data you’ll get the warning “The restore points for this backup item have been deleted and retained in a soft delete state” and you have to wait 14 days to fully delete those backups, you’ll also get an email alert letting you know.

To remove these backups immediately we need to disable soft delete, which is a configuration setting in the Recovery Services Vault. DO NOT DO THIS UNLESS YOU ABSOLUTELY MUST. As previously noted, this is a greats safeguard to have in place, and I would also suggest using ARM Resource Locks in production environments in addition to soft delete. If you’re sure though, we can go turn it off.

Alright, now that we’ve disabled Soft Delete for the vault, we have to commit the delete operation again. This means first we’ll need to “undelete” the backup, then delete it again which this time won’t be subject to the soft delete policy.

Now we can go delete it again, after which we can find that there are no backup items in the vault.

Success!!! The backup is fully deleted. So long as there are no other dependencies (policies, infrastructure, etc.) you can now delete the vault.

If you have any questions or suggestions for future blog posts feel free to comment below, or reach out to me via email, twitter, or LinkedIn.


Azure Web Apps with Cost Effective, Private and Hybrid Connectivity (The ASE Killer!)

Posted on Updated on

Reading Time: 10 minutesCaution: this blog post may save you significant time and money, and has been affectionately dubbed “The ASE Killer!”.

Note: If you prefer a video walk-through, you can now view my video building this solution on YouTube:

With the advent of the cloud came the ever so attractive PaaS service model. The first time I heard about this, I was sold. Host my application without having to manage the infrastructure, the OS, patching, scaling and all the other things that I really don’t want to do anyways – sign me up! The “catch” is though that (to misquote all those before me) “with less responsibility comes less power”. When relinquishing responsibility (the positive side of PaaS), you also relinquish some control and ability for customization. This is a conversation I have with customers at least once a week – how do I control networking for my PaaS application.

One my peers, Steve Loethen who focuses more on the application development aspects in Azure, and I were speaking and noted that it would be great if we could leverage Azure Private Link Endpoints and Azure Application Gateway to get both private connectivity from the internal network and leverage a cloud-native Web Application Firewall for internet traffic. Traditionally using the “isolated” tier of Azure App Services called an Azure Application Service Environment (which provides a completely isolated and dedicated app service environment) would be the way to go. Unfortunately the nature of a private stamp of a PaaS service inherently comes with a pretty heft price tag. There are still some great features to be had with the ASE, but using Private Link we can get to a point where all external access is blocked unless its using the WAF – but can be still be accessed over the internet network. I’ll document that configuration below.

To setup this environment we will need to:

  • Setup the Web App
  • Create a Virtual Network in Azure
  • Setup a Site-to-Site VPN
  • Setup a Private Link Endpoint for the Web App
  • Restrict Network Access to the Web App to only the Private Link Endpoint
  • Setup a Public Application Gateway

*Note* As of 3/12/20 Private Link Endpoints for App Service Web Apps is in Public Preview.

As of 10/6/20 Private Link Endpoints for App Service Web Apps is Generally Available

Let’s get started! First, we’ll create the App Service. I’m going to be using the resource group named “pri-webapp-rg” and the app service name “” in the West US region.

*Note* At the time of writing this post (5/18/20) Private Endpoints require the App Service to be a Pv2.

I am not going to use it in this lab, but I am going to leave App Insights enabled. If you’ve not used App Insights yet as an APM tool I highly suggest you look at instrumenting your app with this tooling.

Go ahead and add any tags if you need any for your environment, then create that App Service. Next we’ll go and create our Azure Virtual Network. This will allow us to facilitate the private network connectivity. I’m going to create this in West US, I’ll note that Private Link Endpoints do not need to be deployed in the same region as the resource, but I want to reduce latency here so I’ll deploy the vnet (and subsequent Private Link Endpoints in the following steps) as the same region as the App Service.

I’m going to use the address space, and initially create two subnets. The “WebApp” subnet will be for the Private Link Endpoint and the “AppGateway” subnet will be designated for the Application Gateway.

I’m going to leave the rest of the deployment settings with their default options selected for now and create the vnet.

Once the vnet has finished creating, we need to go create a “Gateway Subnet” which will be used for the VPN Gateway to be used for the Site-to-Site VPN for hybrid connectivity. By default, it will pre-select a subnet that is available in your address space. Once that subnet is created we have our finished vnet setup.

Now that the network in Azure is setup, we need to get the VPN Gateway configured. When the “West US” region is selected on the deployment screen you’ll get a list of available virtual networks in that region. I selected the one we just created, and it will validate that there is a “GatewaySubnet” created in that vnet and will select that as the deployment subnet for the VPN Gateway. I’m going to be using a Route-based VPN so I’ll select that as my VPN type then create a new public IP address and leave the Active-Active and BGP options disabled for this lab.

The VPN gateway does take 20-30 minutes to deploy, so go grab a cup of coffee. Once that’s done we’ll need to add the “connection” for the site-to-site VPN.

For this setup we’re using a Site-to-Site IPSec VPN Connection, and this is where you’ll also set your PSK and IKE protocol. You’ll also need to create a local network gateway, which is an ARM resource used for representing the on-premises network.

The “IP Address” field is where you’ll enter the public IP address of the appliance that’s going to terminate the VPN on-premises. You’ll also need to add which networks are on-prem, this will add add the local network address space to the route table of the VPN Gateway so it knows to use that link for traffic bound for those addresses.

Once that’s created, we’ll need to go to the overview page for the VPN Gateway to get its public IP address. That will be needed to setup the on-premises side of the VPN. I’m going to use a PFSense appliance in home lab network to accomplish this setup. If you want to test this just in Azure you can also use just a vnet peered network and create an emulated “client” machine, alternatively you could also setup a point-to-site VPN for just your local machine. I won’t be showing that process here, but I have another post that discusses the setup of PFSense S2S VPN with an Azure VPN Gateway and another that uses PaloAlto for S2S VPN to Azure.

Once the S2S VPN is setup, we can now go and setup the Private Link Endpoint for the App Service.

When choosing a private link endpoint you need to choose the resource type, so here I’m using the “Microsoft.Web/sites” resource type.

As noted earlier on, we created the “WebApp” subnet to hold the private link endpoint and we’ll select that here. A very important component of private link is DNS. As is stated so eloquently in my favorite Haku:

“It’s not DNS

  There’s no way it’s DNS

   It was DNS”. 

Before deploying private link you have to consider the DNS scenario. Clients cannot call private link endpoints by their IP addresses, it has to be by their DNS names. I highly recommend some of the Microsoft Documentation on the topic, or Daniel Mauser has an amazing article on Private Link DNS Integration Scenarios. I’m going to go ahead and allow the private endpoint to create an Azure Private DNS Zone, but ultimately for this lab environment I’ll just be using a host file entry.

*Note* You can also check out my post on automated IaaS DNS Load Balancing to help with the private, hybrid DNS integration scenarios that may be required for Private Link here: DNS Load Balancing in Azure.

Alright, the private link endpoint is all setup. Now to make sure we restrict network traffic to that which only originates from the private link endpoint. App Services can use virtual network service endpoints to restrict traffic originating from vnets in Azure quite easily. Though in our case we want to make sure that on-premises traffic can also access that app service so rather than allowing access at the vnet level we’ll just be allowing the single IP of the private endpoint.

After adding the access restriction rule, you’ll see that the “Allow All” rule switches to a “Deny All” rule and is given the lowest priority on the ACL. Above that we have the single IP of the Private Link Endpoint that is allowed. As a result, the only IP that’s allowed to call this app service is that of the private link endpoint.

Great! Okay, now we’ve setup my app service, my site-to-site VPN, private link, and access restrictions. We should be all set to test connectivity from on-premises now, let’s go give that a shot. As noted above, since this is just a lab environment I’m just going to be using a host file entry on my test machine on my lab network, so lets set that up real quick.

Great, now let’s validate this in a browser and with powershell to both make sure the page loads and that it’s using the private endpoint. We can see that the page loads using the “public” DNS name, but the address is the address that is assigned to the private link endpoint. We can also see that on a machine that’s just using the typical internet path is given a 403, since the source traffic is not coming from the private endpoint – perfect!

If we go back and look at the site-to-site VPN configuration now we’ll also see that traffic has started passing over that link which further validates the fact that the traffic is remaining private.

Now that we have confirmed the hybrid traffic is passing through the private endpoint and that all internet traffic is denied, let’s configure the Application Gateway with the Web Application Firewall SKU so that we can facilitate external traffic communicating with our private Web App (forgive my ever so descriptive naming convention) and create a new public IP address to be associated with the Application Gateway. Lastly, we’ll use the subnet that we designated earlier for the Application Gateway for the deployment.

When creating the backend pool we’ll enter the IP address of the private link endpoint. For the routing rule, since this is a lab and I don’t have a TLS certificate on-hand I’ll use an HTTP listener on the App Gateway and use the same HTTPS endpoint for the App Service that we’ve been using internally. In a production environment you’ll want to have HTTPS on the public listener. When creating the HTTP setting to use for this routing rule we want to override the host name to that of the App Service (in this case, “”) because remember that private link needs to be called using a DNS name and this way we can add that host name to the request header.

*Note* Since we deployed a Private DNS Zone, which is automatically attached to our virtual network, and the application gateway is deployed into the virtual network, we can use the DNS name of the app here because it will resolve correctly. Though, for clarity in this lab environment I’m just using the IP address.

After the Application Gateway is finished deploying I want to go add a DNS label to the public IP which is associated and confirm that it was applied so I can use that DNS name rather than the IP itself.

Okay, moment of truth! Remember the machine we used earlier that used the public path to get to the app service but got a 403 error because of the access restrictions? Let’s go ahead and try hitting the newly provisioned Application Gateway that does have a public listener and try the site now. We see that it’s using the DNS label and subsequent HTTP listener that we setup on the App Gateway, using the public interface and is routing us appropriately back to our private web app!

That’s all folks! As a retrospective, here’s what we’ve done:

  • Configured an App Service with Hybrid, Private Networking
  • Configured a scalable public endpoint that’s using a WAF in an IDS mode
  • Maintained the flexibility, scalability, and cost effectiveness of the non-ASE App Service

If you have any questions or suggestions for future blog posts feel free to comment below, or reach out to me via email, twitter, or LinkedIn.


Change Feed:

10/8/20: Private Endpoints for App Service Web Apps now GA

12/25/20: Updated with link to new blog post for “Azure Site-to-Site VPN with a PaloAlto Firewall

12/27/20: Updated with link to new blog post for “DNS Load Balancing in Azure

Custom “Virtual Network Operator” Role in Azure

Posted on Updated on

Reading Time: 4 minutes

*Edit* 6/9/20: At the time of writing this blog custom roles were not available in the Azure Portal. Since that feature now exists, I’ve added how to create this role through the portal to the end of this blog.

We all know that cloud environments are different than on premises. Development environments can be even more difficult, the right cross between giving freedom to produce business value and enough security and controls to maintain a good security posture. Recently I came across a scenario where the design was that the networking components (Virtual Network, S2S VPN, Peerings, Service Endpoints, UDRs, Subnets, etc.) were all under the control of a Networking team so that the Azure environment would be an extension of their local network. From a permissions standpoint though this caused some problems. The goal would be to empower developers to build whatever they need and just attach it to the vNet when connectivity was needed, with the caveat that they shouldn’t be able to modify any of the network settings or configurations in the shared vNet. This however, turned out not to be possible with default IAM roles.

In my testing, even though you’re not “modifying” anything per se in the vNet – when attaching a network interface card to a Subnet it does require write access. Reader access will give the user this error.

After looking through the default IAM roles, there aren’t any that do what I need them to do. Alas, a custom role is needed. For reference, please checkout this docs page for custom roles –

First, I need to see what actions are available to pack into the custom role so I run the following command in Azure Powershell and get these results.

Get-AzProviderOperation “Microsoft.Network/virtualNetworks/*” | FT OperationName, Operation, Description -AutoSize

Great, now I can see what is available. It looks like the actions that I need are the /subnet/join/action and /subnet/joinViaServiceEndpoint/Action. Based on the descriptions these two will essentially give “operator” role to the vnet. The assignee will be able to use the subnet(s) but not able to modify them.

Next, you can use one of the template references in the Microsoft Docs link ( and modify the actions.

  “Name”: “Custom – Network Operator Role”,
  “Id”: null,
  “IsCustom”: true,
  “Description”: “Allows for read access to Azure network and join actions for service endpoints and subnets.”,
  “Actions”: [
  “NotActions”: [],
  “AssignableScopes”: [

After modifying the assignable subscription IDs, save that as a .json file and use it in the following command to import the custom role.

New-AzRoleDefinition -InputFile “C:\FileFolderLocationPath\CustomNetworkOperatorRole.json”

After a portal refresh you get the custom role available as an IAM role assignment.

There you go, after assigning this role the user was able to create VMs and attach them to the Virtual Network while still leaving control of the Network configuration to the Networking Team.



EDIT 6/9/20: Now that custom roles are available in the portal I’ve added how to create this role that way below. 

Using the documentation on how to create custom IAM roles in the Azure portal you can clone any role, but I’m just going to use the “Network Contributor” role to clone. 

I am going to just “start from scratch”, though you can easily use this page to modify the permissions set of a certain role that already exists. Additionally, if you have the JSON (like we generated before with powershell) you can just upload that here.

Now go ahead and add new permissions by going to the “Microsoft Network” Category and selecting the same permissions we wrote in the JSON and used with the powershell command above.

After adding those the next page will show you the permission that you’ve added, let’s double check those and make sure we have what we need.

Looks good, now we can set a scope where this role can be applied. I have this one scoped to the whole subscription, but you can use the interface to easily change that on this page.

Alright, that’s about it – let’s take a quick look at the summary pages. You can see that the portal actually generates the JSON, which is nice if you want to assign roles using Azure Policy or any other automation tooling.

Nice! Now we can see that users can use the four above actions which will allow them to interact with, and use a shared hub-style virtual network while now allowing full contributor access.

I hope I’ve made your day at least a little bit easier!

If you have any questions or suggestions for future blog posts feel free to comment below, or reach out to me via email, twitter, or LinkedIn.


Isolated Virtual Machines in Azure

Posted on

Reading Time: 2 minutesIn a recent design session, I had someone mention how they wish that Azure had dedicated and/or isolated IaaS VMs like some of the other cloud providers. They were shocked with I said – they do! Azure has a number of isolated VM types that you can provision and leverage just the same as any other IaaS VM.

The link to Microsoft documentation below states that “Azure Compute offers virtual machine sizes that are Isolated to a specific hardware type and dedicated to a single customer. These virtual machine sizes are best suited for workloads that require a high degree of isolation from other customers for workloads involving elements like compliance and regulatory requirements.”

Yup, you read that right, dedicated hardware. This means that using one of the isolated VM types (listed below) will guarantee that your virtual machine will be the only one running on that server instance (as of Nov. 1, 2018).

  • Standard_E64is_v3
  • Standard_E64i_v3
  • Standard_M128ms
  • Standard_GS5
  • Standard_G5
  • Standard_DS15_v2
  • Standard_D15_v2


If VM isolation is what you’re after, you can also consider Nested Virtualization in Azure (link below). A few VM types in Azure support Nested Hyper-V (not suitable for production, mind you) and Hyper-V Containers using Docker.


Lastly, there is a new feature that recently went GA that’s sort of in-line with VM isolation – Confidential Compute. While this isn’t VM isolation, it is an isolated enclave for executing code in TEEs (Trusted Execution Environments) using Intel’s SGX capabilities. This allows for code to be executed in an environment that, while it’s being executed, can not be accessed by any source outside of the enclave (see diagrams in the links below).


Using physical isolation techniques in the cloud can be very different than the way we’re used to doing it on-prem. Hopefully, though, this post and its associated links will get you thinking about different ways you can design for isolation in Azure. Feel free to comment, or reach out on Twitter @MattHansen0.

Thanks, and I hope I’ve made your day at least a little bit easier.

Azure Point-to-Site VPN with RADIUS Authentication

Posted on Updated on

Reading Time: 4 minutesFor the money, it’s hard to beat the Azure VPN Gateway. Until recently though, Point-to-Site VPNs were a bit clunky because they needed mutual certificate authentication. It wasn’t bad, but it certainly wasn’t good. Thankfully, Microsoft now allows RADIUS backed authentication. This post is how you impliment said configuration.


To start off, here is my environment information I’m using to setup this configuration.

Virtual Network: “raidus-vnet”

Virtual Network Address Space:

Virtual Network VM Subnet:

Virtual Network Gateway Subnet:

VPN Gateway SKU: VpnGW1

VPN Client Address Pool:

Domain Controler/NPS Server Static IP:



Virtual Network (VNET) Setup:

You most likely already have a VNET where you will be configuring this setup, but if you don’t you need to create one with two subnets. One subnet for infrastructure, and one “Gateway Subnet”. The Gateway Subnet will be used automatically, and is required,  when you configure the VPN Gateway.


VPN Gateway Setup:

The Azure VPN Gateway is just about as easy as it gets to configure and to managed (sometimes to a fault). The only caveat you need to be aware of in this scenerio, is that RADIUS Point-to-Site authentication is only available on the SKU “VPNGW1” and above. You’ll then need to choose the vnet where you have created the VPN Gateway, and create a Public IP Address resource.


This will take anywhere from 20-45 minutes to provision, as noted. While that’s running, you can provision your NPS (Network Policy Server) VM. This being a test environment, I provisioned a VM to be both the domain controler and the NPS box. Make sure to set a static IP on the NPS box’s NIC in Azure, you’ll need a static for your VPN configuration. I used

After complete,  you will need to configure the VPN Gateway’s Point-to-Site configuration. Choose “RADIUS authentication”, enter in the static IP of the will-be NPS server, and set a Server Secret. This being a test environment, my password is obviously not as secure as I hope yours would be.


Configure NPS:

Now, go back into that VM that was created earlier and install the NPS role.


After it’s installed, you need to create a Network Policy with a condiational access clause (I used a group in AD) and tell it what security type you want to allow.





Next, you’ll need to create a Client Access Policy. Here you need the IP of the VPN Gateway you created, and the shared secret. Here is the interesting bit, you can’t view the IP of your VPN Gateway in the Gateway Subnet. If you looked at “Connected Devices” in the VNET the VPN Gateway doesn’t show any IP. I know that the VPN Gateway is deployed (behind the scenes) as an H/A pair, but I would assume they’re using a floating IP that they could surface. Anyways, there is no real way to find it – but it looks like (after testing with a dozen different deployments) it uses the 4th available IP in the subnet. This subnet being a, the 4th IP is This is the IP that goes in the address of the RADIUS Client.



Next, I configure NPS Accounting. You don’t have to do this, but I think it helps for the sake of connection logging and for troubleshooting. You can log a few different ways, and choose here just to use a text file to a subfolder I created called “AzureVPN”.




Generate VPN Client Package:

Now that everything is set, you need to generate a VPN Client Package to distribute to your users.


After it is installed, you can see the VPN Connection in the VPN list and users can logon using their domain credentials.




After logging in, we can go back and look at the accounting log which shows us the successfull authentication of that user.



There we go, connecting to an Azure VPN Gateway with RADIUS authentication using domain credentials. I think we can all thank Microsoft for this one, and not having to do cert management anymore.

I hope I’ve made your day at least a little bit easier.