Introduction
Parameterization is useful when you want reusable code: you write it once and, for each new requirement, get the output you need by changing only the parameter values. Traditionally you declare variables with static values (see image below), but with parameterization you can use dynamic parameters throughout your program without redeclaring them in multiple places. In this article we will see how to parameterize a notebook using the Toggle parameter cell option in Azure Synapse Analytics.
Static parameters
Let’s walk through a hands-on demo that reads files from different ADLS Gen2 storage paths into a dataframe. We will also call the parameterized notebook from another Synapse notebook with different parameter values and see how it behaves.
Steps
Create a new notebook and attach it to an existing Spark pool, or create a new pool for this purpose. I have named this notebook “Parameterization_Demo”. The first cell is where we will declare our parameters, though not quite the way we would in ordinary code.
Before declaring the parameters in code, one important step is to enable the cell to accept parameters using the Toggle option. Click the ellipsis button at the top right of the cell and select Toggle parameter cell. Once it is enabled, the word “parameters” appears at the bottom of the cell. This indicates that the cell is ready and that the variables created inside it will be treated as parameters.
In the cells below, I have declared the parameters and read a CSV file from my storage location into a Spark dataframe.
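As a minimal sketch of what those two cells might look like: the storage account, container, folder, and file names here are hypothetical placeholders rather than the values used in the demo, and build_abfss_path is an illustrative helper, not part of any Synapse API. The Spark read is guarded so that it only runs where a `spark` session already exists, as it does inside a Synapse notebook.

```python
# --- Cell 1: the parameters cell (Toggle parameter cell enabled) ---
# Hypothetical default values; a caller can override any of them.
storage_account = "mystorageaccount"
container = "demo"
folder_path = "input"
file_name = "sample.csv"

# --- Cell 2: code that uses the parameters (kept in a separate cell) ---
def build_abfss_path(storage_account, container, folder_path, file_name):
    """Build an ADLS Gen2 abfss:// URI from its parts (illustrative helper)."""
    relative = "/".join(
        part for part in (folder_path.strip("/"), file_name) if part
    )
    return f"abfss://{container}@{storage_account}.dfs.core.windows.net/{relative}"

source_path = build_abfss_path(storage_account, container, folder_path, file_name)
print(source_path)

# Inside a Synapse notebook `spark` is predefined; elsewhere this is skipped.
if "spark" in globals():
    df = spark.read.option("header", "true").csv(source_path)
    df.show(5)
```

Keeping the path construction in one place means that changing a single parameter value is enough to point the same code at a different file.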
Note
Remember to use separate cells for declaring the variables and for the code that uses them. If you combine both in the same cell, the parameter logic will not work: the code will keep using the static default values, and the dynamic values you pass in will not be applied. This is because the passed-in values override the defaults only after the parameters cell has run, so any code inside that cell has already executed with the defaults.
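The effect described in this note can be imitated in plain Python. The sketch below is only a simulation, assuming that overrides are applied right after the first (parameters) cell finishes. It is not how Synapse is implemented internally, and run_notebook and the cell contents are made up for illustration.

```python
def run_notebook(cells, overrides=None):
    """Run cell sources in order, applying overrides right after the first cell."""
    namespace = {}
    for index, source in enumerate(cells):
        exec(source, namespace)
        if index == 0 and overrides:
            # Simulated injection of the caller's values after the parameters cell
            namespace.update(overrides)
    return namespace

# Parameters and logic in separate cells
separate_cells = [
    'file_name = "default.csv"',
    'message = f"reading {file_name}"',
]

# Parameters and logic combined in a single cell
combined_cell = [
    'file_name = "default.csv"\nmessage = f"reading {file_name}"',
]

overrides = {"file_name": "sales.csv"}

separate_result = run_notebook(separate_cells, overrides)["message"]
combined_result = run_notebook(combined_cell, overrides)["message"]

print(separate_result)  # reading sales.csv
print(combined_result)  # reading default.csv
```

With separate cells the override lands before the logic runs. In the combined cell the logic has already used the default by the time the override arrives.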
Next, I am going to create a new notebook and call the code saved in the “Parameterization_Demo” notebook, but with a different location and a different file name, i.e., different parameters.
Notebook2 is the new notebook, from which I am going to call the one we built earlier in this demo.
I called it from a cell using %run /<notebook_name>, and made sure both notebooks were published before running it. I passed different values as key-value pairs, where the key is the parameter name and the value is the file name or path. As you can see, even though the parameter values are different, the notebook reads the file correctly from the storage location.
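To make the shape of that call concrete, here is a small sketch that assembles the %run line as a string. The parameter names folder_path and file_name and their values are hypothetical, build_run_command is an illustrative helper, and the restriction to int, float, bool, and string values is an assumption based on my reading of the Synapse documentation, so check it against the current docs.

```python
import json

# Value types assumed to be accepted when passing parameters through %run
SUPPORTED_TYPES = (int, float, bool, str)

def build_run_command(notebook_path, parameters):
    """Return a single-line %run command followed by a JSON parameter block."""
    for name, value in parameters.items():
        if not isinstance(value, SUPPORTED_TYPES):
            raise TypeError(
                f"Parameter {name!r} has unsupported type {type(value).__name__}"
            )
    return f"%run {notebook_path} {json.dumps(parameters)}"

command = build_run_command(
    "/Parameterization_Demo",
    {"folder_path": "archive", "file_name": "sales_2023.csv"},
)
print(command)
# %run /Parameterization_Demo {"folder_path": "archive", "file_name": "sales_2023.csv"}
```

The printed line is what you would paste into a cell of the calling notebook. The keys must be spelled exactly like the variables in the parameters cell of the notebook being called.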
The same notebook can also be called from a pipeline by manually creating a parameter list for it, but the names you create there have to match the names declared in the code.
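Since a mismatched name is an easy mistake to make, a quick check like the following can help before wiring up a caller. It is a plain-Python sketch: the declared and supplied names are hypothetical, and unmatched_parameters is an illustrative helper rather than anything provided by Synapse.

```python
def unmatched_parameters(declared_names, supplied_parameters):
    """Return supplied names that have no exact match in the parameters cell."""
    declared = set(declared_names)
    return sorted(name for name in supplied_parameters if name not in declared)

# Names declared in the parameters cell of the called notebook (hypothetical)
declared_names = ["storage_account", "container", "folder_path", "file_name"]

# Names created on the calling side; note the different casing of File_Name
supplied = {"folder_path": "archive", "File_Name": "sales_2023.csv"}

problems = unmatched_parameters(declared_names, supplied)
print(problems)  # ['File_Name']
```

Python variable names are case-sensitive, so File_Name would not override file_name.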
Summary
This article and its demo show how to parameterize a notebook so that code can be reused instead of being rewritten every time. It also shows how one notebook can be called from another with different parameters stated explicitly.
Reference
Microsoft official docs