Introduction
Azure Data Lake is a cloud-based storage service that supports big data analytics. You can store any kind of structured, semi-structured, or unstructured data in it. It provides unlimited storage for data of any size.
Microsoft Azure Data Lake Storage (ADLS) is a fully managed, elastic, scalable, and secure file system that supports HDFS semantics and works with the Apache Hadoop ecosystem. It offers industry-standard reliability, enterprise-grade security, and unlimited storage for very large amounts of data.
Source systems include devices, the web, line-of-business (LOB) apps, social media, relational databases, and so on.
We have a separate article that gives an overview of Azure Data Lake.
Features of Azure Data Lake Storage
Unlimited Storage
One of the main characteristics of big data is its variety. Azure Data Lake provides storage for a practically unlimited amount of data. Data can be of any size and can come from many different sources, such as databases, applications, or the web. It also lets users store both relational and non-relational data, and it does not require a schema to be defined before data is loaded into the store.
- Unstructured data: data that has no predefined data model and is not organized in a specific format or standard, e.g., tweets, text messages, and so on.
- Semi-structured data: data that does not conform to the formal structure of the data models associated with relational databases or other kinds of data tables (e.g., JSON, XML).
- Structured data: data that resides in a fixed field within a record or file; the most common examples are spreadsheets and data held in a relational database.
ADLS Support for Heavy Analytic Workloads
ADLS is used for big data analytics to improve performance and reduce latency. It can process data up to petabytes in size by partitioning the data across multiple nodes, where thousands of mappers and reducers process it in parallel to return results quickly.
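The partition, map, and reduce pattern described above can be illustrated on a single machine. The following is only a minimal sketch, assuming a word-count job and a thread pool standing in for cluster nodes; the function names (`partition`, `mapper`, `reducer`, `word_count`) are hypothetical and are not part of any ADLS API.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partition(records, num_partitions):
    """Split records into roughly equal partitions (round-robin)."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def mapper(chunk):
    """Count words in one partition, independently of the other partitions."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reducer(left, right):
    """Merge two partial word counts into one."""
    return left + right


def word_count(records, num_partitions=4):
    """Partition the input, map each partition in parallel, then reduce."""
    chunks = partition(records, num_partitions)
    # Threads stand in for worker nodes here (an assumption for illustration).
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(mapper, chunks))
    return reduce(reducer, partials, Counter())


if __name__ == "__main__":
    logs = ["error disk full", "info job done", "error network down"]
    print(word_count(logs, num_partitions=2))
```

Because each mapper only sees its own partition, adding more partitions and workers is what lets this style of job scale out; a real cluster distributes the partitions across machines rather than threads.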
High Availability and Reliability
To protect data against hardware failure, Azure keeps three copies of each data object to ensure availability. Read transactions can be served from any of the three copies. As a best practice, Microsoft recommends consistently applying appropriate access policies to your data and making copies of critical data as part of your disaster recovery routines.
Security
ADLS provides rich security capabilities so that users can have peace of mind while storing their assets in ADLS. Users can monitor performance, audit usage, and control access through the integrated Azure Active Directory (AAD).
Auditing
ADLS creates audit logs for all operations performed in it. These logs can be analyzed with U-SQL scripts.
Access Control
ADLS provides access control through POSIX-compliant access control lists (ACLs) on the files and folders stored in it. It also handles authentication through its integration with AAD, based on OAuth tokens from supported identity providers. The tokens carry the user's security group information, and this information is passed through all the ADLS microservices.
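To make the POSIX-style ACL model more concrete, here is a small sketch of how such an ACL could be evaluated. It assumes the short text form `type:qualifier:permissions` (for example `user::rwx,group::r-x,other::---`) and a simplified evaluation order (owner, named user, groups, other). The helper names are hypothetical, and the sketch ignores default ACLs, superusers, and the sticky bit, so it should not be read as the service's actual implementation.

```python
def parse_acl(acl_text):
    """Parse entries such as 'user::rwx,user:alice:rw-,mask::r--,other::---'."""
    entries = []
    for item in acl_text.split(","):
        kind, qualifier, perms = item.strip().split(":")
        entries.append((kind, qualifier, set(perms) - {"-"}))
    return entries


def is_allowed(acl_text, owner, owning_group, user, groups, wanted):
    """Return True if `user` holds every permission in `wanted` (e.g. 'rw')."""
    entries = parse_acl(acl_text)
    needed = set(wanted)
    # Without a mask entry, nothing is filtered out.
    mask = next((p for k, _, p in entries if k == "mask"), set("rwx"))

    # 1. The owning user is checked against the 'user::' entry only.
    if user == owner:
        owner_perms = next(
            (p for k, q, p in entries if k == "user" and q == ""), set()
        )
        return needed <= owner_perms

    # 2. A named user entry applies next, limited by the mask.
    for kind, qualifier, perms in entries:
        if kind == "user" and qualifier == user:
            return needed <= (perms & mask)

    # 3. Any matching group entry may grant access, limited by the mask.
    group_matched = False
    for kind, qualifier, perms in entries:
        if kind != "group":
            continue
        name = owning_group if qualifier == "" else qualifier
        if name in groups:
            group_matched = True
            if needed <= (perms & mask):
                return True
    if group_matched:
        return False

    # 4. Everyone else falls through to the 'other' entry.
    other_perms = next((p for k, _, p in entries if k == "other"), set())
    return needed <= other_perms


if __name__ == "__main__":
    acl = "user::rwx,user:alice:rw-,group::r-x,mask::r--,other::---"
    print(is_allowed(acl, "bob", "analysts", "alice", set(), "r"))
```

Note how the mask caps what named users and groups can do: in the example ACL, alice's entry says `rw-`, but the `r--` mask leaves her with read access only.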
Data Encryption
ADLS encrypts data in transit and at rest, providing server-side encryption of data with keys, including customer-managed keys in Azure Key Vault.
Data Encryption Key Types
ADLS uses a Master Encryption Key (MEK), stored in Azure Key Vault, to encrypt and decrypt data. Users have the option of managing this key themselves, but there is always a risk that the data cannot be decrypted if the key is lost. ADLS also uses the following keys:
- Block Encryption Key (BEK) - these are keys generated for each block of data.
- Data Encryption Key (DEK) - these keys are encrypted by the MEK and are responsible for generating the BEKs that encrypt the data blocks.
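The relationship between the DEK and the per-block BEKs can be sketched as a key derivation step. The snippet below is an illustration only: it assumes HMAC-SHA256 over the block index as the derivation function, which is a stand-in chosen for this example and not a documented ADLS algorithm. Wrapping the DEK with the MEK is deliberately left out, since that step needs a real cipher from a vetted cryptography library.

```python
import hashlib
import hmac
import os


def derive_block_key(dek: bytes, block_index: int) -> bytes:
    """Derive a per-block key (BEK) from the DEK and the block's index.

    HMAC-SHA256 is an assumption made for illustration; the source text
    only says that the DEK is responsible for generating BEKs.
    """
    message = block_index.to_bytes(8, "big")
    return hmac.new(dek, message, hashlib.sha256).digest()


if __name__ == "__main__":
    # Hypothetical DEK; in the scheme described above it would be stored
    # encrypted under the MEK rather than generated on the spot.
    dek = os.urandom(32)
    for index in range(3):
        print(index, derive_block_key(dek, index).hex())
```

The point of the hierarchy is that every block gets its own key while only one key, the MEK, has to be protected in the key vault.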
Azure Data Lake Storage Gen2 Pricing
Microsoft has announced ADLS Gen2, which it describes as the world's most productive data lake. The information below is based on ADLS Gen2 pricing for the hierarchical namespace (file structure) in the Central India region, in Indian rupees (₹).
Data Storage Prices for LRS Redundancy
| Data stored | PREMIUM | HOT | COOL | ARCHIVE |
|---|---|---|---|---|
| First 50 terabyte (TB) / month | ₹15.12951 per GB | ₹1.44091 per GB | ₹1.09509 per GB | ₹0.14410 per GB |
| Next 450 TB / month | ₹15.12951 per GB | ₹1.38327 per GB | ₹1.09509 per GB | ₹0.14410 per GB |
| Over 500 TB / month | ₹15.12951 per GB | ₹1.32564 per GB | ₹1.09509 per GB | ₹0.14410 per GB |
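As a worked example of reading the LRS table, the sketch below estimates a monthly storage bill for the Hot tier. It assumes the tiers are graduated (the first 50 TB is billed at the first rate, the next 450 TB at the second, and so on) and that 1 TB equals 1024 GB; both are assumptions of this example, and transaction or data transfer charges are not included. The function name `monthly_storage_cost` is hypothetical.

```python
GB_PER_TB = 1024  # assumption: binary terabytes

# (tier size in GB, price in rupees per GB) taken from the Hot column
# of the LRS table above.
HOT_LRS_TIERS = [
    (50 * GB_PER_TB, 1.44091),   # first 50 TB / month
    (450 * GB_PER_TB, 1.38327),  # next 450 TB / month
    (float("inf"), 1.32564),     # over 500 TB / month
]


def monthly_storage_cost(stored_gb, tiers=HOT_LRS_TIERS):
    """Estimate the monthly storage cost in rupees with graduated tiers."""
    remaining = stored_gb
    total = 0.0
    for tier_size_gb, rate_per_gb in tiers:
        if remaining <= 0:
            break
        billed = min(remaining, tier_size_gb)  # portion that falls in this tier
        total += billed * rate_per_gb
        remaining -= billed
    return total


if __name__ == "__main__":
    # 60 TB: the first 50 TB at the first rate, the last 10 TB at the second.
    print(round(monthly_storage_cost(60 * GB_PER_TB), 2))
```

Passing a different tier list (for example, the Cool or Archive column) reuses the same calculation for the other access tiers.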
Data Storage Prices for ZRS Redundancy
| Data stored | PREMIUM | HOT | COOL | ARCHIVE |
|---|---|---|---|---|
| First 50 terabyte (TB) / month | ₹20.17268 per GB | ₹1.80114 per GB | ₹1.36886 per GB | N/A |
| Next 450 TB / month | ₹20.17268 per GB | ₹1.72909 per GB | ₹1.36886 per GB | N/A |
| Over 500 TB / month | ₹20.17268 per GB | ₹1.65705 per GB | ₹1.36886 per GB | N/A |
Data Storage Prices for GRS Redundancy
| Data stored | PREMIUM | HOT | COOL | ARCHIVE |
|---|---|---|---|---|
| First 50 terabyte (TB) / month | N/A | ₹2.88182 per GB | ₹2.19018 per GB | ₹0.36023 per GB |
| Next 450 TB / month | N/A | ₹2.76654 per GB | ₹2.19018 per GB | ₹0.36023 per GB |
| Over 500 TB / month | N/A | ₹2.65127 per GB | ₹2.19018 per GB | ₹0.36023 per GB |
How to Create an Azure Data Lake Storage Account
Below are some simple steps that you need to follow to create an Azure Data Lake Storage account.
- Log in to the Azure portal at portal.azure.com.
- Click 'Storage Account' on the home page.
- Click '+New' to create a new storage account.
- A new window opens for creating the storage account. Provide the basic details: subscription, resource group name, storage account name, and location.
- There is no need to change anything on the 'Networking' and 'Data Protection' tabs. Go to the 'Advanced' tab.
- Here you need to enable Data Lake Storage Gen2, which is disabled by default. Make sure the option is enabled, then click 'Review + create'.
- If validation passes at the review stage, click 'Create'.
Submitting the deployment takes a few seconds.
Once the deployment completes, your ADLS account has been created.
Here you can create a container, file, table, or queue. You can also preview the account in Storage Explorer.
Alternatively, you can download Azure Storage Explorer to your local system and log in there with your account.
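The same account can also be created from a script instead of the portal. The sketch below builds an Azure CLI command (`az storage account create`) in Python; it assumes the Azure CLI is installed and you have already run `az login`. The helper name `build_create_command`, the example account and resource group names, and the `Standard_LRS` / `StorageV2` choices are assumptions made for illustration. The name check reflects the rule that storage account names are 3 to 24 characters of lowercase letters and digits.

```python
import re

# Storage account names: 3-24 characters, lowercase letters and digits only.
NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")


def build_create_command(account_name, resource_group, location="centralindia"):
    """Build the argv list for creating a storage account with ADLS Gen2 enabled."""
    if not NAME_PATTERN.fullmatch(account_name):
        raise ValueError(
            "storage account name must be 3-24 lowercase letters or digits"
        )
    return [
        "az", "storage", "account", "create",
        "--name", account_name,
        "--resource-group", resource_group,
        "--location", location,
        "--sku", "Standard_LRS",   # assumed redundancy option
        "--kind", "StorageV2",     # general-purpose v2 account
        # Equivalent of enabling Data Lake Storage Gen2 on the 'Advanced' tab.
        "--enable-hierarchical-namespace", "true",
    ]


if __name__ == "__main__":
    # Hypothetical names; replace them with your own.
    command = build_create_command("mydatalake001", "my-resource-group")
    print(" ".join(command))
    # To actually run it:
    # import subprocess; subprocess.run(command, check=True)
```

Building the command as a list keeps it safe to pass to `subprocess.run` without shell quoting issues, and validating the name first avoids a round trip to Azure for an obviously invalid value.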
Summary
Azure Data Lake Storage Gen2 is very helpful for analytics and big data analysis. I hope you now understand Data Lake Storage and its purpose. Thanks for reading, and have a good day.