| Source | Reads data from sources such as Azure Blob Storage, Azure Data Lake Storage, and SQL databases. | Extract data from multiple sources for processing and analysis. |
| Derived Column | Creates new columns or modifies existing ones using expressions. | Standardize data formats or create calculated fields. |
| Filter | Filters rows based on a specified condition. | Remove unwanted data from the pipeline, such as records with null values. |
| Select | Selects specific columns from the dataset. | Reduce the dataset size by keeping only the necessary columns. |
| Aggregate | Aggregates data with operations such as sum, average, min, max, and count. | Summarize data, such as calculating total sales or average order value. |
| Join | Joins data from two streams based on a condition. | Combine data from multiple sources, such as joining customer data with order data. |
| Union | Combines data from multiple streams into a single stream. | Merge datasets from different sources. |
| Lookup | Enriches data by looking up values in another dataset. | Add data fields, such as looking up customer details by customer ID. |
| Conditional Split | Splits data into multiple streams based on conditions. | Route data to different paths based on specific conditions, such as separating transactions into different streams by type. |
| Exists | Checks whether rows from one stream exist in another stream. | Filter out records that do not exist in a reference dataset. |
| Pivot | Converts row data into columns. | Reshape data for reporting, such as pivoting sales data by region. |
| Unpivot | Converts columns into rows. | Normalize data for processing, such as converting year-wise columns into a single year column. |
| Flatten | Transforms hierarchical structures into a flat structure. | Simplify nested data, such as flattening JSON or XML data. |
| Surrogate Key | Generates unique surrogate keys for rows. | Add unique identifiers to records, such as generating IDs for new data entries. |
| Window | Applies window functions over a specified range of rows. | Perform calculations over a set of rows, such as running totals or moving averages. |
| Assert | Validates data against specified conditions. | Ensure data quality by asserting business rules, such as checking for non-negative values. |
| Cross Join | Produces a Cartesian product of two datasets (configured as a join type on the Join transformation). | Generate combinations of data, such as pairing all products with all stores. |
| Rank | Assigns an ordered rank to rows based on specified sort conditions. | Rank data, such as ranking students by their scores. |
| External Call | Invokes external services during data processing. | Call external APIs or services to enrich data or perform operations outside ADF. |
| Alter Row | Specifies row-level insert, update, delete, and upsert policies. | Apply row-level changes to databases based on conditions. |
| Sink | Writes data to destinations such as databases and data lakes. | Load transformed data into target storage or database systems. |
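
Mapping data flow transformations are configured visually rather than written as code, so the following is only a rough analogy. It is a minimal plain-Python sketch of what a few of the transformations above (Filter, Derived Column, Lookup, Surrogate Key, Window, Aggregate, Conditional Split) do to a stream of rows. The sample records, field names, the `run_flow` helper, and the `high_value` threshold are all hypothetical. None of this is the ADF API or the data flow expression language.

```python
from collections import defaultdict
from itertools import count

# Hypothetical sample data standing in for two source streams.
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": 120.0},
    {"order_id": 2, "customer_id": "C2", "amount": None},
    {"order_id": 3, "customer_id": "C1", "amount": 30.0},
    {"order_id": 4, "customer_id": "C3", "amount": 75.5},
]
customers = {"C1": "Ada", "C2": "Grace"}  # customer_id -> name


def run_flow(orders, customers, high_value=100.0):
    """Illustrative only: mimics several transformations in sequence."""
    # Filter: drop rows with a null amount (copy rows so the input is untouched).
    rows = [dict(r) for r in orders if r["amount"] is not None]

    # Derived Column: add a calculated field.
    for r in rows:
        r["amount_cents"] = round(r["amount"] * 100)

    # Lookup: enrich each row with the customer name (None when no match).
    for r in rows:
        r["customer_name"] = customers.get(r["customer_id"])

    # Surrogate Key: assign an incrementing key starting at 1.
    keys = count(1)
    for r in rows:
        r["sk"] = next(keys)

    # Window: running total of amount in row order.
    running = 0.0
    for r in rows:
        running += r["amount"]
        r["running_total"] = running

    # Aggregate: total amount per customer.
    totals = defaultdict(float)
    for r in rows:
        totals[r["customer_id"]] += r["amount"]

    # Conditional Split: route rows into two streams by a condition.
    high = [r for r in rows if r["amount"] >= high_value]
    low = [r for r in rows if r["amount"] < high_value]

    return rows, dict(totals), high, low


rows, totals, high, low = run_flow(orders, customers)
```

Each step consumes the output of the previous one, which mirrors how transformations are chained between a source and a sink in a data flow. The order matters: placing the Filter first means later steps never see the row with a null amount.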