Build your First Azure Data Factory (ADF) Pipeline

Introduction

Many organizations struggle to manage and extract insights from vast amounts of data. Azure Data Factory can make that work more efficient and cost-effective. This article explains, step by step, how to build your first Azure Data Factory pipeline.

What is Azure Data Factory?

Azure Data Factory is a cloud-based ETL (Extract, Transform, and Load) and data integration service provided by Microsoft Azure. It lets you connect to different data sources such as Azure SQL Database, Azure Blob Storage, Azure Data Lake Storage, and on-premises databases.

Steps to Create Azure Data Factory Pipeline

Creating an Azure Data Factory pipeline takes four steps:

Step 1. Create a storage account that provides Azure Blob Storage.

Step 2. Create an Azure SQL Database.

Step 3. Create an Azure Data Factory.

Step 4. Build the first Azure Data Factory pipeline.

Problem Scenario

In this scenario, we take a CSV file as input, store it in Azure Blob Storage, and create an empty table in Azure SQL Database. We then create an Azure Data Factory pipeline.

Once the pipeline is created, published, and run, Azure Data Factory writes the data from the CSV file in Azure Blob Storage to the Azure SQL Database table.
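
The article does not specify what the CSV file contains, so the following is only a minimal sketch of how a small input file could be produced locally before it is uploaded. The column names, the sample rows, and the file name retail_sales.csv are all hypothetical.

```python
import csv
import io

# Hypothetical columns and rows; the article does not specify the CSV contents.
HEADER = ["order_id", "product", "quantity", "unit_price"]
ROWS = [
    [1, "Keyboard", 2, 24.99],
    [2, "Monitor", 1, 179.0],
    [3, "Mouse", 3, 12.5],
]


def build_csv_text(header, rows):
    """Return the rows as comma-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def write_csv(path, header, rows):
    """Write the CSV text to a local file that can later be uploaded as a blob."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(build_csv_text(header, rows))


if __name__ == "__main__":
    # Hypothetical file name.
    write_csv("retail_sales.csv", HEADER, ROWS)
```

Keeping a header row in the file is a deliberate choice here: it lets the column names in the file line up with the column names of the destination table.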

Create a Storage Account

  • Sign in to the Azure portal at https://portal.azure.com/
  • Search for Storage accounts in the global search bar.
  • First, select an Azure subscription and create a new resource group.
  • Provide the storage account name as retailstorage08.
  • Choose (US) East US as the region.
  • Select the Performance and Redundancy options in the Basics tab.
  • The remaining tabs are optional; leave them at their defaults.
  • Click the Review button.

  • Click the Create button.
  • The deployment starts and should complete successfully in a minute or two.

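  • Create a container for the retail data. Container names cannot contain spaces (only lowercase letters, digits, and hyphens are allowed), so use a name such as retailcontainer.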

  • Upload the CSV file as a blob to the container.
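
Azure rejects storage account and container names that break its naming rules, so it can help to check a name before typing it into the portal. The sketch below encodes the rules as we understand them: account names are 3 to 24 lowercase letters and digits; ordinary container names are 3 to 63 characters of lowercase letters, digits, and single hyphens, starting and ending with a letter or digit. The function names are our own, and special containers such as $root are not covered.

```python
import re

# Storage account names: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
# Container names: lowercase letters, digits, and single hyphens; must start
# and end with a letter or digit. The length (3-63) is checked separately.
_CONTAINER_RE = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def is_valid_account_name(name: str) -> bool:
    """Return True if name satisfies the storage account naming rules above."""
    return _ACCOUNT_RE.fullmatch(name) is not None


def is_valid_container_name(name: str) -> bool:
    """Return True if name satisfies the container naming rules above."""
    return 3 <= len(name) <= 63 and _CONTAINER_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("retailstorage08", "Retail_Storage"):
        print(candidate, is_valid_account_name(candidate))
    for candidate in ("retailcontainer", "retail container"):
        print(candidate, is_valid_container_name(candidate))
```

A passing check only means the name is well formed. Storage account names must also be globally unique, which only Azure itself can confirm.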

Create Azure SQL Database service

  • In the search bar, type Azure SQL Database.
  • First, select an Azure subscription and create a new resource group.
  • Provide a database name and create a new server.

  • Provide the server name and location.
  • Choose SQL authentication as the authentication method.
  • Provide the server admin login credentials.

  • In the Networking tab, choose Public endpoint and select Yes for Allow Azure services and resources to access this server.

  • In the Security tab, don't enable Microsoft Defender for SQL.

  • In the Additional settings tab, choose None as the data source.

  • Click the Create button.
  • The deployment starts and should complete successfully in a minute or two.

  • Click the Go to resource button.
  • In the Query Editor window, provide admin credentials to log in.

  • Allow your client IP address through the server firewall.

  • In the Query Editor window, create a table and a clustered index on it.

The table contains no data yet. We will insert the data into it through the Azure Data Factory pipeline.
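
The table definition itself is not given in the text, so the following is a sketch under stated assumptions rather than the statements actually used. It builds T-SQL for an empty table plus a clustered index; the table name retail_sales, the columns, and the index name are all hypothetical. The generated text could be pasted into the Query Editor.

```python
def build_table_ddl(schema, table, columns, index_column):
    """Return T-SQL that creates an empty table and a clustered index on it."""
    column_lines = ",\n".join(
        f"    [{name}] {sql_type}" for name, sql_type in columns
    )
    create_table = f"CREATE TABLE [{schema}].[{table}] (\n{column_lines}\n);"
    create_index = (
        f"CREATE CLUSTERED INDEX [IX_{table}_{index_column}] "
        f"ON [{schema}].[{table}] ([{index_column}]);"
    )
    return create_table + "\n" + create_index


# Hypothetical columns; adjust them to match the header of your CSV file.
COLUMNS = [
    ("order_id", "INT NOT NULL"),
    ("product", "NVARCHAR(100) NOT NULL"),
    ("quantity", "INT NOT NULL"),
    ("unit_price", "DECIMAL(10, 2) NOT NULL"),
]

if __name__ == "__main__":
    print(build_table_ddl("dbo", "retail_sales", COLUMNS, "order_id"))
```

Naming the columns after the CSV header is a convenience: when source and destination column names agree, the copy needs less manual column mapping.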

Create Azure Data Factory

  • In the search bar, type Azure Data Factory.
  • First, select an Azure subscription and create a new resource group.
  • Enter retailadf08 as the name and choose (US) East US as the region.
  • Specify the version as V2.

  • Click the Create button.
  • The deployment starts and should complete successfully in a minute or two.

  • Click Go to resource.
  • Click the Launch Studio button.

Build First Azure Data Factory Pipeline

  • Click New pipeline to create a pipeline.

  • In the Activities tab, click the Copy data activity.

  • Click the Source tab and, for the new source dataset, choose Azure Blob Storage.

  • Choose the format type of your data as Delimited text.

  • In the Set Properties, choose the Linked service option.

  • Click the Sink tab.

  • In the New dataset option, choose Azure SQL Database.

  • Click the Validate all button.

  • Ensure that the factory has been validated.

  • Click the Publish all button to publish the pipeline.

  • Run the pipeline (for example, with Debug or Add trigger > Trigger now); the data from Azure Blob Storage is then written to the Azure SQL Database table.

  • Finally, we can see the results by querying the table in Azure SQL Database.
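
Everything configured in the Studio steps above is stored as JSON, which the Studio can show in its code view. As a rough illustration, the sketch below assembles definitions of that general shape as Python dictionaries: two linked services, a delimited-text source dataset, an Azure SQL sink dataset, and a pipeline with one Copy activity. All resource names, the container and file names, and the connection string placeholders are hypothetical, and the property names are written from our understanding of the JSON format, so compare them with what the Studio generates for your factory.

```python
import json


def _ref(name, ref_type):
    """Build a reference to another Data Factory resource."""
    return {"referenceName": name, "type": ref_type}


def linked_service(name, service_type, connection_string):
    """Linked service: the connection information for a data store."""
    return {"name": name, "properties": {
        "type": service_type,
        "typeProperties": {"connectionString": connection_string},
    }}


def blob_csv_dataset(name, service_name, container, file_name):
    """Source dataset: a comma-delimited file with a header row in Blob Storage."""
    return {"name": name, "properties": {
        "type": "DelimitedText",
        "linkedServiceName": _ref(service_name, "LinkedServiceReference"),
        "typeProperties": {
            "location": {"type": "AzureBlobStorageLocation",
                         "container": container, "fileName": file_name},
            "columnDelimiter": ",",
            "firstRowAsHeader": True,
        },
    }}


def sql_table_dataset(name, service_name, schema, table):
    """Sink dataset: a table in Azure SQL Database."""
    return {"name": name, "properties": {
        "type": "AzureSqlTable",
        "linkedServiceName": _ref(service_name, "LinkedServiceReference"),
        "typeProperties": {"schema": schema, "table": table},
    }}


def copy_pipeline(name, source_dataset, sink_dataset):
    """Pipeline with a single Copy activity from the source to the sink."""
    return {"name": name, "properties": {"activities": [{
        "name": "CopyCsvToSql",
        "type": "Copy",
        "inputs": [_ref(source_dataset, "DatasetReference")],
        "outputs": [_ref(sink_dataset, "DatasetReference")],
        "typeProperties": {
            "source": {"type": "DelimitedTextSource"},
            "sink": {"type": "AzureSqlSink"},
        },
    }]}}


if __name__ == "__main__":
    # Every name and placeholder below is hypothetical.
    resources = [
        linked_service("BlobLinkedService", "AzureBlobStorage",
                       "<storage-connection-string>"),
        linked_service("SqlLinkedService", "AzureSqlDatabase",
                       "<sql-connection-string>"),
        blob_csv_dataset("RetailCsv", "BlobLinkedService",
                         "retailcontainer", "retail_sales.csv"),
        sql_table_dataset("RetailTable", "SqlLinkedService",
                          "dbo", "retail_sales"),
        copy_pipeline("CopyRetailData", "RetailCsv", "RetailTable"),
    ]
    for resource in resources:
        print(json.dumps(resource, indent=2))
```

The structure mirrors the portal steps: each dataset points at a linked service, and the Copy activity points at the two datasets, which is why the Source and Sink tabs each ask for a dataset and a linked service.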

Summary

In this article, we built our first Azure Data Factory pipeline end to end. Azure Data Factory is a platform for extracting and consolidating data from multiple data sources. It can serve as a data engineering platform for extracting business value from many varieties of data. For organizations dealing with huge volumes of data, Azure Data Factory can be a cost-effective way to accelerate data transformation, unlock new customer insights, and drive business value.