Merging Arrays within Synapse / Azure Data Factory Pipeline

Problem Statement

In my previous Blog Listing Unique Records Within an Array in Azure Data Factory, we identified that the UNION function merges all the arrays by removing the duplicates and listing only unique records.

Is it possible to merge arrays within the Synapse / Azure Data Factory pipeline by retaining all values (including duplicates)?

Prerequisites

Azure Data Factory / Synapse

Solution

  1. The “join” function returns a string that has all the items from an array, separated by the specified character. We can leverage this function to convert the Array into a String.
  2. The “concat” function combines two or more strings and returns the combined string. We can leverage this function to combine all values obtained in #1.
  3. The “split” function returns an array containing substrings separated by commas from a larger string based on a specified delimiter character in the original string. We can leverage this function to transform the String obtained in #2 back to an Array.
  4. Let’s say we have a list of values in an Array variable.

Variables

SplitVar

GitHub code

UsingSet Variable activity and Join Function ( Similar for JoinVar1 and Joinvar2).

User properties

ConcatVars

@concat(variables('String1'),';',variables('String2'))

MergedArray

Output

Output

Note. One can also do it by leveraging a single Set Variable Activity.

@split(concat(join(variables('Array1'),';'),';',join(variables('Array2'),';')),';' )