Welcome to Part 3 of this 5-part Series on Shared Storage Options in Azure. In this post I’ll be covering Azure Storage Services. You may be thinking to yourself, wait a minute, what have we been talking about this whole time then? Azure Storage Services is easiest thought of being the term used for the services offered under an Azure Storage Account (Blob, File, Queue, Table). Given the context of this series, I’ll be discussing Azure Blob Storage and Azure File storage in this post. Though, I do want to add a disclaimer that technically Queue, and Table Storage can be “shared” also since multiple apps can call the same Queue or Table using the APIs. Since the focus here is more on the system-level, I’m not going to cover those two, but I’ll add some links to documentation where you can read more.
- Part 1: Azure Shared Disks
- Part 2: IaaS Storage Server
- Part 3: Azure Storage Services
- Part 4: Azure NetApp Files
- Part 5: Conclusion
Azure Blob Storage:
In the majority of cases, when people discuss “cloud storage” they’re talking about Blob – binary large object. What this service allows us to do is store massive amounts of unstructured “objects” in Azure. There are a couple ways we can Blob storage as shared storage from a system-level.
Shared Blob Storage:
As I mentioned in the introduction, all Azure Storage Services can be accessed over HTTP/S via API or using any of the client libraries. This means that they can all technically be “shared” storage, but what about system-level access? While I find most applications and solutions can be adapted using a client library, there is a project called “Blobfuse” which can be used for more traditional applications.
Blobfuse is an open-source project on GitHub which uses the libfuse library to pair together the Linux FUSE kernel module and the Azure Blob REST APIs to create a virtual filesystem. The result of this configuration is a mount point on a Linux machine directly to a Blob Storage Account. There can be certain challenges in using Blobfuse though, for example the result is NOT a POSIX-compliant filesystem and if you use mount the same Blob Storage from multiple machines you should keep those limitations in mind.
The default configuration for the setup of Blobfuse is to have your Storage Account name and Access Key in a plain-text configuration file sitting on your server, which is not ideal from a security perspective and should be noted. However, it is possible to use a Managed Service Identity with Blobfuse which significantly improves the security posture of the deployment (if you use a System Assigned Managed Identity) and something I would recommend over the default configuration. Lastly, Blobfuse is not available on Windows – Linux only.
As of the time of writing this blog post (January, 2021) NFS is not yet Generally Available (GA) on Blob Storage, but NFS 3.0 has been in preview since July 2020 . Once this goes GA I will update this post with that information, but won’t quote this as an option until that point.
Typical Use Cases:
The majority of the use cases I’ve seen that use Blob as shared storage at the system level are wanting to use consumption-based cloud storage without the overhead or limitations of a managed disk. Specifically, in applications that don’t support SMB natively and require a local mount point, that’s where Blobfuse comes into play. I have seen this with a lot of apps that are migrated into the cloud and want lower cost, higher capacity than is available from managed disks with more legacy applications where this may be the case. I’ve also seen this configuration with many HPC applications since NFS as an access protocol is not yet GA for Blob storage.
Cost, Performance, Availability and Limitations:
The cost of using Blob storage is always the same regardless of the access protocol, since as of now it all ends up going through the Azure Storage API anyways.
Blob storage is incredibly performant. There are two tiers of Blob storage, Standard and Premium. In most cases, Standard will be the appropriate tier. Premium is for storage that needs single-digit transaction times and is better suited for larger block sizes (256KiB+). Though do keep in mind, that similar to my comparison of Managed Disk Types and the cost calculation of capacity and transaction costs, in some scenarios Premium Block Blob Storage may be cheaper.
If you’re using a standard Blob storage account (not configured with a Hierarchical Namespace) which is most common, you’ll enjoy the following performance (as of January, 2021).
When configuring containers in your Blob Storage Account you’ll notice an access tier setting with the options “Hot, Cool, Archive”. I covered Archive Storage a few years back but it’s not really relevant to this topic. What is relevant though is the different between Hot and Cool storage. There seems to still be a lot of confusion around the difference between the two, but at it’s core the main different is transaction cost.
Similar to the difference between Premium and Standard SSD Managed disks, Hot Blob storage has a higher capacity cost but a lower transaction cost while Cool Blob Storage has a lower capacity cost and a higher transaction cost. If you’re storing data that is infrequently accessed but still needs to be constantly available, Cool Blob Storage is the way to go. If you’re storing data that has a lot of transactions then Hot Blob storage is your best bet. Don’t get caught up in the “per GB” sticker on each tier – this can be misleading to the resulting cost depending on your workload characteristics.
As far as durability and availability goes, Storage Accounts have a few different options depending on the storage service being used: LRS, ZRS, GRS, ZGRS, GRS-RA. There is a lot of information on these different redundancy levels, so take a look at the durability and availability table below and if you want to read more, click the link below.
Additional Reading on Understanding Azure Storage Redundancy Offerings: https://techcommunity.microsoft.com/t5/azure-storage/understanding-azure-storage-redundancy-offerings/ba-p/1431700
Lastly, I recently re-created an outdated version of an infographic on capacity limits for Azure Storage Accounts and I thought I would share that here.
Feel free to reference this image using the following: https://urls.hansencloud.com/azure-storage-limits
Azure Files is another storage service under the Azure Storage Account and has similar shared features, but some very distinct to itself as well. The primary purpose of Azure Files is to provide file-level storage services like you would get from a Network Attached Storage appliance or a Server providing those access protocols to a filesystem share. Azure Files provides SMB access (as well as HTTP/S API) to provisioned shares.
Like I mentioned, the primary access methods for Azure Files are through the API or by using SMB (NFS v4.1 is currently in preview so I won’t be considering it an option in this post as of right now, but will update it when it goes GA). Even though SMB is most typically used in with Windows machines, shares on Azure Files can be used by Windows, Linux, or even MacOS.
Something really interesting about Azure Files though is its Azure File Sync capability which allows for a centralized file share in Azure Files which can be facilitated through agents deployed on Windows Servers which then act as a cache for the Azure Files data. This is particularly interesting because it allows the Server itself to present whichever access method it would like to the client, but use the backing of a centralized Azure Files Share.
The way Azure File Sync works at a high-level is a File Share is created, then linked to what is called a “sync group”, which facilitates the registrations from any agents deployed on Windows Servers (in Azure or on-prem).
Azure Files also allows, in conjunction with typical access key authentication, Active Directory-based authentication options . The ability to use this type of AuthN directly on the Azure Files PaaS endpoint is really interesting and makes it a great choice for a solution where you want to leverage the identity systems you already have in place. It’s also worth noting that if you’re using Azure File Sync, the deployed agent is the only one communicating with the File Share directly and the access to the data locally can be controlled through whichever method you prefer (SMB ACLs with ADDS, for example).
Typical Use Cases:
I see a mix of uses with Azure Files. A mix between using it for a file-based backend for various applications and services to an environment where the data is access directly by users. A scenario I’ve run into more frequently though is when companies want to replace traditional on-prem File Servers and even things like DFS. Anywhere you want to leverage SMB in a fully managed PaaS way, Azure Files is for you.
Cost, Performance, Availability and Limitations:
Similar to Blob Storage, Azure Files has multiple Tiers to help optimize for performance and Cost.
Image Reference: https://azure.microsoft.com/en-us/pricing/details/storage/files/
Again, these tiers are priced based on capacity (provisioned or consumed) in combination with transactions and any snapshots or backups.
Performance for Azure Files is based on whether or not you use a standard storage account or a specific “Azure Files” storage account SKU which will enable “Premium” File Shares. The performance specifications for a standard storage account (eg. General Purpose v2) are the same as the limits posted for the blob storage earlier. If you’re using Premium Files though, here are the performance targets.
Keep in mind that the 100TB limit is per share, and you can create multiple (just like you would with traditional file shares) up to the limit of the Storage Account (5 PB by default, as stated earlier in this post).
Lastly, availability for Azure Files is no different than Blob since they’re both contained by a storage account and will be subject to the availability and durability of the storage account data redundancy setting.
Pros and Cons:
Okay, here we go with the Pros and Cons for using an Azure Storage Services (Blob & File) for your shared storage configuration on Azure.
- Both are PaaS and fully managed which greatly reduces operational overhead.
- Significantly higher capacity limits as compared to IaaS.
- Ability to migrate workloads as-is and use existing storage configurations when using SMB or Blobfuse.
- Ability to use Active Directory Authentication in Azure Files
- Both integrate with native backup solutions.
- Both integrate with Azure Defender for Storage.
- Blobfuse stores connection information in plain-text, by default.
- Does not support older access protocols like iSCSI.
Alright, that’s it for Part 3 of this blog series – Shared Storage on Azure Storage Services. Please reach out to me in the comments, on LinkedIn, or Twitter with any questions or comments about this post, the series, or anything else!