Azure Storage - Basics

Azure Storage

Azure Storage is used by applications which require high durability, high availability and high scalability.

Advantages of Azure Storage

Azure Storage is massively scalable, so you can store and process hundreds of terabytes of data to support big data scenarios. It scales both in the number of requests it can handle and in the amount of data it can store, and it automatically load-balances requests based on traffic. You access Azure Storage through a REST API, so your client application can be anything and can run on any platform.
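
Because access goes through REST, every stored object is addressed by an ordinary HTTPS URL. The following is a minimal sketch of how such a URL is put together for a blob; the helper name blob_url and the account, container and blob names are made up for illustration, and no request is actually sent.

```python
from urllib.parse import quote


def blob_url(account, container, blob_name):
    """Build the HTTPS URL that addresses a single blob.

    Hypothetical helper: it only formats a string and does not
    authenticate or contact the service.
    """
    return (
        f"https://{account}.blob.core.windows.net/"
        f"{quote(container)}/{quote(blob_name)}"
    )


# Spaces are percent-encoded; "/" inside the blob name is left as-is.
print(blob_url("myaccount", "logs", "2016/app log.txt"))
```

Any client that can issue HTTPS requests can work with a URL of this shape, which is why the platform of the client does not matter.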

You can have up to 100 storage accounts per subscription as of this writing. Please refer to the official documentation for up-to-date details.

Accounts - You can create two types of accounts, and each type has two tiers. The details are below.

  1. General Purpose Storage Accounts - With a General Purpose account, you get multiple storage options: Tables, Queues, Blobs and Files. Below are the two tiers that are available in the General Purpose Storage Account.

    a. Standard Performance tier - This tier is used for general-purpose Tables, Queues, Files and Blobs. You can also store Azure VHDs in this tier. All data is stored on HDDs.

    b. Premium Performance tier - This tier currently supports only Azure VHDs. Choose this tier only if you need high-performing VMs that require high IOPS. It costs more than the Standard Performance tier. All data is stored on SSDs for better performance.

  2. Blob Storage Accounts - A Blob Storage account is used to store unstructured data as blobs. Microsoft recommends Blob Storage accounts for storing block blobs and append blobs. These accounts offer the following two access tiers.

    a. Hot Access Tier - Choose this tier when you think that your objects stored in the Storage Account are frequently accessed by your applications. This allows you to store data at a lower access cost.

    b. Cool Access Tier - Choose this tier when you think that your objects are less frequently used. This allows you to store data at a lower data storage cost.
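
The hot/cool choice comes down to simple arithmetic: storage charge plus access charge. The sketch below illustrates that trade-off only. The function names are hypothetical and the prices are made-up placeholders, not actual Azure rates.

```python
def monthly_cost(gb_stored, gb_read, storage_price_per_gb, read_price_per_gb):
    """Total monthly cost = storage charge + data access charge."""
    return gb_stored * storage_price_per_gb + gb_read * read_price_per_gb


def cheaper_tier(gb_stored, gb_read, hot, cool):
    """Return "hot" or "cool", whichever costs less for this workload.

    `hot` and `cool` are (storage_price_per_gb, read_price_per_gb) tuples.
    """
    hot_cost = monthly_cost(gb_stored, gb_read, *hot)
    cool_cost = monthly_cost(gb_stored, gb_read, *cool)
    return "hot" if hot_cost <= cool_cost else "cool"


# Placeholder prices (NOT real Azure pricing): hot has pricier storage but
# cheaper access; cool is the reverse.
HOT = (0.02, 0.00)
COOL = (0.01, 0.01)

print(cheaper_tier(1000, 5000, HOT, COOL))  # frequently read data
print(cheaper_tier(1000, 100, HOT, COOL))   # rarely read data
```

With these placeholder numbers, the frequently read workload comes out cheaper on the hot tier and the rarely read one on the cool tier, which matches the guidance above.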

Azure Storage Services

Blob Storage

     
Blob Storage is used for storing unstructured data such as the following.
  • Any kind of document
  • Videos, Pictures, audio files etc.
  • Logs etc.
  • You can even store .MSI files.
Blob Storage offers the following three types of blobs:
  • Block Blobs - These are used for storing backups, media files, documents etc.
  • Append Blobs - These are similar to Block Blobs, but they are optimized for append operations. They suit purposes like logging, where data only needs to be appended to the end of the file.
  • Page Blobs - These are optimized for storing virtual machine disks. The Virtual Hard Disks (VHDs) attached to Virtual Machines are generally stored in Page Blobs.

Table Storage

Table Storage is used for storing key-value pairs, much like other NoSQL databases. It provides highly available, massively scalable storage. Other useful information about Table Storage:
  • You can manage and access Table Storage via REST APIs.
  • Table Storage also supports a subset of OData protocol.
  • It supports both JSON and AtomPub formats.
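
To make the key-value model concrete, here is a small sketch of what an entity looks like as JSON and how an OData filter selects it. PartitionKey and RowKey are the key properties Table Storage uses to identify an entity. The helper names and sample values are hypothetical, and nothing here calls the service.

```python
import json


def make_entity(partition_key, row_key, **properties):
    """Build a Table Storage-style entity as a plain dict.

    PartitionKey and RowKey together identify the entity; everything
    else is a free-form property.
    """
    entity = {"PartitionKey": partition_key, "RowKey": row_key}
    entity.update(properties)
    return entity


def _quote(value):
    """Quote a string literal for OData (single quotes are doubled)."""
    return "'" + value.replace("'", "''") + "'"


def odata_filter(partition_key, row_key):
    """Build an OData $filter expression that selects a single entity."""
    return f"PartitionKey eq {_quote(partition_key)} and RowKey eq {_quote(row_key)}"


entity = make_entity("customers-us", "0001", Name="Contoso", Active=True)
print(json.dumps(entity))                    # JSON body for the entity
print(odata_filter("customers-us", "0001"))  # filter for a point lookup
```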

Queue Storage

One pattern for designing highly scalable applications is the Queue-Centric Pattern: the code is split into multiple components hosted in a distributed environment, and those components communicate through queues (containers of messages). Queue Storage supports this kind of low-latency, high-throughput messaging workload. It gives you a reliable messaging system for asynchronous communication between applications or components. Other useful information about Queue Storage:
  • Good for applications where workflows are required.
  • Useful for storing a backlog of data that needs to be processed later, asynchronously.
  • Cross service communication
  • Messages within a queue can be up to 64 KB.
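
The queue-centric idea can be sketched without any cloud service at all. In the example below, Python's in-process queue.Queue stands in for an Azure queue, and the 64 KB message limit is enforced by hand. The function names are hypothetical, and a real producer and consumer would run as separate components.

```python
import queue

MAX_MESSAGE_BYTES = 64 * 1024  # the 64 KB per-message limit


def enqueue(q, text):
    """Producer side: put a message on the queue, rejecting oversized ones."""
    payload = text.encode("utf-8")
    if len(payload) > MAX_MESSAGE_BYTES:
        raise ValueError(
            f"message is {len(payload)} bytes; limit is {MAX_MESSAGE_BYTES}"
        )
    q.put(payload)


def drain(q):
    """Consumer side: process every message currently waiting in the queue."""
    processed = []
    while True:
        try:
            payload = q.get_nowait()
        except queue.Empty:
            break
        # Upper-casing is a stand-in for the real work a component would do.
        processed.append(payload.decode("utf-8").upper())
    return processed


orders = queue.Queue()  # in-process stand-in for an Azure queue
enqueue(orders, "order-1")
enqueue(orders, "order-2")
print(drain(orders))
```

The producer returns as soon as the message is queued, and the consumer works through the backlog at its own pace, which is the asynchronous decoupling the pattern is after.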

File Storage

File Storage is very useful when you need a centralized location for storing log files, diagnostic dumps and so on, so that multiple applications running on VMs or Cloud Services can use them via REST APIs. An application running on-premises can also access File Storage via REST APIs. Please note that you don't have to create a VM for this.

Storage Replication
  • Locally Redundant Storage (LRS)

    Replicates 3 copies of data within a single facility. This protects against hardware failures: for example, if the hardware holding one copy fails, the data is still available from the other copies. If the entire facility goes down for some reason, the data is unavailable, so LRS is not recommended when you need maximum durability. LRS is helpful for applications with data governance requirements that call for data to stay within a single region.

  • Zone Redundant Storage (ZRS)

    Replicates 3 copies of data across two or three facilities, either within a single region or across two regions. ZRS is currently available only for block blobs. Once you have created your storage account and selected ZRS, you cannot convert it to any other type of replication, or vice versa.

  • Geo-redundant Storage (GRS)

    This is the default replication option selected when you create a storage account. GRS maintains six copies of your data: three in the primary region and three in the secondary region. If there is a failure in the primary region, Azure Storage fails over to the copies in the secondary region. With GRS, writes are replicated asynchronously to the secondary region, so opting for GRS does not impact the latency of requests made against the primary region. However, because asynchronous replication involves a delay, changes that have not yet been replicated to the secondary region may be lost in a regional disaster if the data cannot be recovered from the primary region.

  • Read-access Geo-redundant Storage (RA-GRS)

    RA-GRS is similar to GRS, but it also lets you read the data in the secondary location. When you create a storage account, you select the primary region for the account. The secondary region is determined by the primary region and cannot be changed. RA-GRS maximizes availability for your storage account by providing read-only access to the data in the secondary location. When you enable read-only access to your data in the secondary region, your data is available on a secondary endpoint in addition to the primary endpoint for your storage account. The secondary endpoint is similar to the primary endpoint, but appends the suffix -secondary to the account name. For example, if your primary endpoint for the Blob service is myaccount.blob.core.windows.net, then your secondary endpoint is myaccount-secondary.blob.core.windows.net. The access keys for your storage account are the same for both the primary and secondary endpoints.

  • Other useful information - You can change the replication type after the account is created (except from any other replication type to ZRS and vice versa). There is a one-time cost if you switch from LRS to GRS / RA-GRS.
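
The replication rules in this list can be summarized in a few lines of code. This is only a sketch of the rules as stated here: the copy counts, the ZRS conversion restriction and the -secondary endpoint naming come from the text, while the dictionary layout and function names are hypothetical.

```python
# Number of copies each replication option keeps, as described above.
REPLICATION_COPIES = {
    "LRS": 3,
    "ZRS": 3,
    "GRS": 6,
    "RA-GRS": 6,
}


def can_convert(current, target):
    """An account can switch replication type unless ZRS is on either side."""
    if current not in REPLICATION_COPIES or target not in REPLICATION_COPIES:
        raise ValueError("unknown replication type")
    if current == target:
        return False  # nothing to convert
    return "ZRS" not in (current, target)


def secondary_endpoint(account, service="blob"):
    """RA-GRS read-only endpoint: '-secondary' is appended to the account name."""
    return f"{account}-secondary.{service}.core.windows.net"


print(can_convert("LRS", "GRS"))        # allowed
print(can_convert("ZRS", "LRS"))        # not allowed
print(secondary_endpoint("myaccount"))
```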


A Few Important Metrics

Metric                                          Limit
Storage accounts per subscription               100
Max allocated space per storage account         500 TB
Max number of blobs/queues/tables/files etc.    Unlimited, up to the 500 TB storage limit
Max size of a page blob                         1 TB
Max size of a table entity                      1 MB
Max size of a message in a queue                64 KB
Max size of a file share                        5 TB
Total transaction rate per second               20,000 IOPS

Bandwidth per storage account                   GRS/ZRS    LRS
Max ingress (US regions)                        10 Gbps    20 Gbps
Max egress (US regions)                         20 Gbps    30 Gbps
Max ingress (European and Asian regions)        5 Gbps     10 Gbps
Max egress (European and Asian regions)         10 Gbps    15 Gbps
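
These limits are easiest to appreciate with a quick calculation. The sketch below works out how many storage accounts a data set needs under the 500 TB per-account limit, and how long a full account would take to fill at a given ingress rate. It assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second) and a perfectly sustained transfer; the function names are hypothetical.

```python
import math

ACCOUNT_CAPACITY_TB = 500  # max allocated space per storage account


def accounts_needed(total_tb):
    """Minimum number of storage accounts needed to hold total_tb of data."""
    return math.ceil(total_tb / ACCOUNT_CAPACITY_TB)


def hours_to_ingest(total_tb, ingress_gbps):
    """Hours needed to upload total_tb at a sustained ingress rate."""
    bits = total_tb * 10**12 * 8                # TB -> bytes -> bits
    seconds = bits / (ingress_gbps * 10**9)     # Gbps -> bits per second
    return seconds / 3600


print(accounts_needed(1200))                # 1200 TB needs 3 accounts
print(round(hours_to_ingest(500, 20), 1))   # LRS in a US region: 55.6 hours
```

Even at the highest ingress rate listed, filling a single account takes more than two days, so bandwidth limits matter as much as capacity when planning a large migration.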