Problem Statement
When multiple log files with a similar structure or schema are generated, or when data is extracted from a paginated API as individual files, is there a way to consume a single merged file rather than processing each file separately? JSON is one of the most widely used data formats in modern software, so this use case takes JSON files as the sample files.
Prerequisites
- Azure Data Factory / Synapse
- Azure Blob Storage
Solution
1. We will use three JSON files stored in Azure Blob Storage as the sources for the merge.
Sample
2. To merge the JSON files, we will use the Copy activity in Synapse / ADF.
a) Source Settings
Source dataset
The source dataset is of type JSON, with POC being the Azure Blob Storage container that holds the individual files.
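For readers who prefer the code view of a dataset, the source dataset described above could look roughly like the definition below. This is only a sketch: the helper `json_blob_dataset`, the dataset name `SourceJsonFiles` and the linked service name `AzureBlobStorageLS` are hypothetical, and the container is written as `poc` on the assumption that it is the POC container mentioned above (Azure container names are lowercase). No file name is set, so the dataset points at the whole container.

```python
import json


def json_blob_dataset(name, linked_service, container, folder_path=None):
    """Build a JSON dataset definition that points at a Blob Storage container."""
    location = {"type": "AzureBlobStorageLocation", "container": container}
    if folder_path:
        # Optional: narrow the dataset to one folder inside the container.
        location["folderPath"] = folder_path
    return {
        "name": name,
        "properties": {
            "type": "Json",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            "typeProperties": {"location": location},
        },
    }


# Hypothetical names; only the container is taken from the text above.
source_dataset = json_blob_dataset("SourceJsonFiles", "AzureBlobStorageLS", "poc")
print(json.dumps(source_dataset, indent=2))
```

The printed JSON can be compared with what the dataset's code view shows in the ADF / Synapse studio.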
b) Sink settings
Sink dataset
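The source and sink settings of the Copy activity can also be sketched as a pipeline definition. The version below assumes that the source picks up every `*.json` blob through a wildcard and that the sink uses the `MergeFiles` copy behavior with the `arrayOfObjects` file pattern; these choices, the helper `merge_copy_activity` and the dataset names `SourceJsonFiles` and `MergedJsonFile` are assumptions for illustration, not necessarily the exact settings used here.

```python
import json


def merge_copy_activity(name, source_dataset, sink_dataset, wildcard="*.json"):
    """Build a Copy activity definition that merges matching JSON blobs into one file."""
    return {
        "name": name,
        "type": "Copy",
        "inputs": [{"referenceName": source_dataset, "type": "DatasetReference"}],
        "outputs": [{"referenceName": sink_dataset, "type": "DatasetReference"}],
        "typeProperties": {
            "source": {
                "type": "JsonSource",
                "storeSettings": {
                    "type": "AzureBlobStorageReadSettings",
                    "recursive": True,
                    # Assumed wildcard: read every JSON blob under the dataset path.
                    "wildcardFileName": wildcard,
                },
                "formatSettings": {"type": "JsonReadSettings"},
            },
            "sink": {
                "type": "JsonSink",
                "storeSettings": {
                    "type": "AzureBlobStorageWriteSettings",
                    # MergeFiles writes all source files into a single sink file.
                    "copyBehavior": "MergeFiles",
                },
                "formatSettings": {
                    "type": "JsonWriteSettings",
                    # arrayOfObjects emits one JSON array; the default is setOfObjects.
                    "filePattern": "arrayOfObjects",
                },
            },
        },
    }


# Hypothetical activity and dataset names.
activity = merge_copy_activity("MergeJsonFiles", "SourceJsonFiles", "MergedJsonFile")
print(json.dumps(activity, indent=2))
```

With `MergeFiles`, the merged file takes the file name specified in the sink dataset; if none is specified, the service generates one.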
Output
Merged file
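To sanity-check a merged file, the same merge can be approximated locally. The sketch below is not how the Copy activity works internally; it only assumes that each input file holds either a single JSON object or an array of objects, and that the merged output is one JSON array. The function names and file names are hypothetical.

```python
import json
from pathlib import Path


def merge_records(documents):
    """Flatten parsed JSON documents (objects or arrays of objects) into one list."""
    merged = []
    for doc in documents:
        if isinstance(doc, list):
            merged.extend(doc)   # an array contributes each of its items
        else:
            merged.append(doc)   # a single object contributes itself
    return merged


def merge_json_files(paths, out_path):
    """Read every input file, merge the records, and write one JSON array."""
    documents = [json.loads(Path(p).read_text(encoding="utf-8")) for p in paths]
    merged = merge_records(documents)
    Path(out_path).write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return merged


if __name__ == "__main__":
    # Hypothetical local copies of the three source files.
    inputs = [p for p in ("file1.json", "file2.json", "file3.json") if Path(p).exists()]
    if inputs:
        records = merge_json_files(inputs, "merged_local.json")
        print(f"Merged {len(records)} records from {len(inputs)} files")
```

Comparing the record count of the local result with the merged file produced by the pipeline is a quick way to confirm that no input file was skipped.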