Creating Linked Services In Azure Data Factory

Introduction

This article walks you through creating an Azure Data Factory and adding linked services to it.

Why do we need a Data Factory?

A Data Factory lets us create pipelines that copy and transform data from one data store to another.

Requirements

  1. Microsoft Azure account

Follow the steps given below

Step 1. Log in to the Azure portal, using the link given below and you will get the home screen of the Azure portal.

Link: https://portal.azure.com

Azure portal

Step 2. Click New --> Databases --> Data Factory.

 Data Factory

You will get a new blade now for configuring your new Data Factory.

Fill in the name of the Data Factory, the Subscription, the Resource Group, and the Location, and choose whether to pin it to the dashboard. Click Create once the details are entered.

Resource Group

Your Azure Data Factory will now be deployed.

Once the deployment completes, the Data Factory opens as shown below.

Azure Data Factory

Need for Linked Services

Before we can work with pipelines, the Data Factory needs a few supporting entities. First we create Linked Services, which connect external data stores and compute resources to the Data Factory and define where the pipeline's input and output live. We will create the pipeline itself later.

Specifically, we will link an Azure Storage account and an on-demand Azure HDInsight cluster to the Data Factory. The storage account will hold the input and output data for the pipeline.

Step 3. Open the Azure Data Factory you just created and click "Author and deploy".

Open Azure Data

Now, click "New data store" and select "Azure Storage".

Azure Storage

A JSON template for creating the Azure Storage Linked Service now appears.

JSON script

Here is the editor with the JSON script.
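If the screenshot is hard to read, the generated template looks similar to the following (this is the standard Data Factory v1 Azure Storage linked service format; the account name and key are placeholders):

```json
{
    "name": "AzureStorageLinkedService",
    "properties": {
        "type": "AzureStorage",
        "typeProperties": {
            "connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"
        }
    }
}
```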

Note. You need an existing storage account; its name and key are used in the connection string below.

Step 4. In the connection string below, replace the placeholders with your storage account credentials.

"connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"

Once the connection string is configured, click "Deploy".

Connection string
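As an aside, the connection string is just a semicolon-separated list of key=value pairs. A minimal sketch in Python (the account name and key below are invented placeholders) shows how it decomposes:

```python
def parse_connection_string(conn_str):
    """Split an Azure Storage connection string into its key=value parts."""
    parts = {}
    for segment in conn_str.split(";"):
        if not segment:
            continue
        # Split only on the first '=' so base64 keys ending in '=' stay intact.
        key, _, value = segment.partition("=")
        parts[key] = value
    return parts

# Placeholder credentials, not a real account.
conn = ("DefaultEndpointsProtocol=https;"
        "AccountName=mystorageaccount;"
        "AccountKey=abc123XYZ==")
parsed = parse_connection_string(conn)
print(parsed["AccountName"])  # mystorageaccount
```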

Once the Linked Service is deployed, the Draft-1 editor disappears from the pane, and AzureStorageLinkedService appears under Linked Services on the left side of the Data Factory pane.

 Linked Service

Step 5. Next, we will add an on-demand Azure HDInsight cluster as a Linked Service in the Data Factory.

Move to the Data Factory editor and click "More" at the top right of the pane, next to "New data store".

 Azure HDInsight

Click "New compute" here, then select "On-demand HDInsight cluster".

HDInsight Cluster

Step 6. Copy the code snippet given below and paste it into the Draft-1 editor.

{
    "name": "HDInsightOnDemandLinkedService",
    "properties": {
        "type": "HDInsightOnDemand",
        "typeProperties": {
            "version": "3.2",
            "clusterSize": 1,
            "timeToLive": "00:30:00",
            "linkedServiceName": "AzureStorageLinkedService"
        }
    }
}

Drafts

The JSON above sets the on-demand cluster's properties: version (the HDInsight version), clusterSize (the number of worker nodes), timeToLive (how long the idle cluster is kept alive, expressed as a timespan such as "00:30:00"), and linkedServiceName (the storage linked service the cluster uses). Once the code is in the editor, click Deploy.

JSON properties

ClusterSize

Now, you can find two entries under Linked Services: AzureStorageLinkedService and HDInsightOnDemandLinkedService.

Follow my next article to work on pipelines in Azure Data Factory.