Problem Statement
Applications today often generate many log files that share the same structure/schema, and paginated APIs are frequently extracted as one file per page. Is there a way to consume and process a single consolidated file rather than handling each individual file separately? JSON is one of the most widely used formats in the modern software landscape, so in this use case we take JSON files as the sample files.
Prerequisites
- Azure Data Factory /Synapse
- Azure Blob Storage
Solution
1. We use three JSON files stored in Azure Blob Storage as sources for the merge process.
![JSON Files]()
Sample
![JSON File]()
![JSON File 2]()
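To make the example concrete, an individual source file might look like the following. This is a hypothetical shape; the actual fields will be whatever your logs or API pages contain, and `id`, `level`, and `message` are purely illustrative:

```json
[
  { "id": 1, "level": "INFO", "message": "Service started" },
  { "id": 2, "level": "WARN", "message": "Slow response from upstream" }
]
```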
2. To merge the JSON files, we leverage the Copy activity in Synapse/ADF.
a) Source settings
![Copy Data]()
Source dataset
Here the source dataset is of type JSON, and POC is the Azure Blob Storage container that holds the individual files.
![JSON Files]()
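As a rough sketch, the source dataset behind that screenshot could be authored in JSON along these lines. The dataset name SourceJsonFiles and linked service name AzureBlobStorageLS are assumptions for illustration; only the container (POC, lowercased here) comes from this walkthrough:

```json
{
  "name": "SourceJsonFiles",
  "properties": {
    "type": "Json",
    "linkedServiceName": {
      "referenceName": "AzureBlobStorageLS",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "poc"
      }
    }
  }
}
```

On the Copy activity's Source tab, a wildcard file path such as *.json (or the folder path with Recursively enabled) tells the activity to read every matching file in the container rather than a single one.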
b) Sink settings
In the sink settings, set the Copy behavior to Merge files so that the contents of all the source files are written into a single output file.
![Merge JSON Files]()
Sink dataset
![File Sink]()
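A comparable sketch of the sink dataset, pointing at a single output file. SinkMergedJson, the merged folder, and merged.json are assumed names; pick whatever suits your container layout:

```json
{
  "name": "SinkMergedJson",
  "properties": {
    "type": "Json",
    "linkedServiceName": {
      "referenceName": "AzureBlobStorageLS",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "poc",
        "folderPath": "merged",
        "fileName": "merged.json"
      }
    }
  }
}
```

In the Copy activity's own JSON definition, the Merge files choice on the sink tab should appear as "copyBehavior": "MergeFiles" under the sink's store settings.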
Output
![Output]()
Merged file
![Merged Files]()
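Assuming the sink's JSON file pattern is left at Set of objects, the merged file ends up with one JSON object per line, something like the following (values are illustrative, matching the hypothetical sample above):

```json
{ "id": 1, "level": "INFO", "message": "Service started" }
{ "id": 2, "level": "WARN", "message": "Slow response from upstream" }
{ "id": 3, "level": "INFO", "message": "Service stopped" }
```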