Introduction
This article explains mapping data flows and the mapping components in Talend Open Studio.
Mapping Components
Mapping components are advanced components that handle multiple input and output data flows between different sources and destinations. All ETL mapping components are in the Processing category of the Palette.
Advantages of Mapping Components
- Transform the data
- Join data from different tables or different servers
- Filter the data using constraints
- Apply conditions to the data
- Handle multiple data flows
- Route data flows
- Reject data
- Concatenate and interchange data
Types of Mapping Components in ETL
tMap Component
- tMap is the all-in-one component. All processing takes place in the Map Editor of the tMap component. To open the Map Editor, double-click the tMap component or open it from the Basic settings view. The Map Editor contains several panels. A tMap component cannot be the start or the end of a job: it must have both an input and an output component.
- There is only one main row connection; all remaining input tables are treated as lookup connections. Lookup rows come from secondary data flows. This reference data might depend directly or indirectly on the primary flow.
- The names of the input and output tables in the Map Editor are the same as the names of the row connections of the incoming and outgoing flows.
- The Basic settings view shows a preview of the Map Editor.
- The Map Editor has the following panels:
  - Input Panel
  - Variable Panel
  - Search Panel
  - Output Panel
  - Schema Editor
  - Expression Editor
Input Panel
The Input Panel is at the top left of the Map Editor. It shows a graphical representation of the Main and Lookup table data from the multiple inputs. The top table in the panel reflects the Main flow connection, and the Main table cannot be moved down in the Map Editor. The order of the Lookup tables can be changed with the up and down arrows.
tMap Settings
The tMap settings of the Input Panel define how a lookup table is loaded, matched, and joined with the main table, and where temporary data is stored. They have the following properties.
- Lookup Model
- Match Model
- Join Model
- Store temp data
1) Lookup Model
Defines how the lookup table data is loaded so that it can be joined and conditioned with the main table. There are 3 types of lookup model.
- Load Once
- Reload at each row
- Reload at each row (cache)
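The difference between the three lookup models can be sketched in plain Java. This is only an illustration, not the code Talend generates: the class name, the loadMarks() lookup source, and the sample data are hypothetical, and the sketch simply counts how often the lookup source is read for a given list of main-row keys.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LookupModelSketch {
    // Counts how often the (hypothetical) lookup source is read.
    static int loads = 0;

    // Hypothetical lookup source: student id -> mark.
    static Map<Integer, Integer> loadMarks() {
        loads++;
        Map<Integer, Integer> marks = new HashMap<>();
        marks.put(1, 85);
        marks.put(2, 72);
        return marks;
    }

    // Load Once: read the lookup a single time, then probe it for every main row.
    static int loadOnce(List<Integer> mainIds) {
        loads = 0;
        Map<Integer, Integer> marks = loadMarks();
        for (int id : mainIds) {
            marks.get(id); // result ignored; only the number of loads matters here
        }
        return loads;
    }

    // Reload at each row: read the lookup again for every main row.
    static int reloadAtEachRow(List<Integer> mainIds) {
        loads = 0;
        for (int id : mainIds) {
            loadMarks().get(id);
        }
        return loads;
    }

    // Reload at each row (cache): read the lookup only for keys not seen before.
    static int reloadAtEachRowCached(List<Integer> mainIds) {
        loads = 0;
        Map<Integer, Integer> cache = new HashMap<>();
        for (int id : mainIds) {
            if (!cache.containsKey(id)) {
                cache.put(id, loadMarks().get(id));
            }
            cache.get(id);
        }
        return loads;
    }

    public static void main(String[] args) {
        List<Integer> mainIds = Arrays.asList(1, 2, 1, 1);
        System.out.println("Load Once: " + loadOnce(mainIds));
        System.out.println("Reload at each row: " + reloadAtEachRow(mainIds));
        System.out.println("Reload at each row (cache): " + reloadAtEachRowCached(mainIds));
    }
}
```

For the main keys 1, 2, 1, 1 the sketch reads the lookup source 1, 4, and 2 times respectively, which is why Load Once suits static lookup data and the reload models suit lookups that depend on the current main row.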
2) Match Model
Defines how the lookup data is matched with the main table data. There are 3 types of match model.
- Unique match
- First match
- All matches
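The three match models differ only when a key occurs more than once in the lookup table. The following Java sketch illustrates this under one assumption taken from my reading of the Talend documentation: Unique match keeps the last matching lookup row. The LookupRow class, the method names, and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MatchModelSketch {
    // Hypothetical lookup row: a join key plus one value column.
    static final class LookupRow {
        final int studentId;
        final String subject;

        LookupRow(int studentId, String subject) {
            this.studentId = studentId;
            this.subject = subject;
        }
    }

    // Hypothetical lookup data in which student 1 appears twice.
    static List<LookupRow> sampleLookup() {
        return Arrays.asList(
            new LookupRow(1, "Maths"),
            new LookupRow(1, "Physics"),
            new LookupRow(2, "Chemistry"));
    }

    // All matches: every lookup row with the key is kept.
    static List<String> allMatches(List<LookupRow> lookup, int key) {
        List<String> found = new ArrayList<>();
        for (LookupRow row : lookup) {
            if (row.studentId == key) {
                found.add(row.subject);
            }
        }
        return found;
    }

    // First match: only the first matching lookup row is kept (null if none).
    static String firstMatch(List<LookupRow> lookup, int key) {
        List<String> found = allMatches(lookup, key);
        return found.isEmpty() ? null : found.get(0);
    }

    // Unique match (assumed: last match wins): only the last matching row is kept.
    static String uniqueMatch(List<LookupRow> lookup, int key) {
        List<String> found = allMatches(lookup, key);
        return found.isEmpty() ? null : found.get(found.size() - 1);
    }

    public static void main(String[] args) {
        List<LookupRow> lookup = sampleLookup();
        System.out.println("All matches: " + allMatches(lookup, 1));
        System.out.println("First match: " + firstMatch(lookup, 1));
        System.out.println("Unique match: " + uniqueMatch(lookup, 1));
    }
}
```

With All matches, one main row can therefore produce several output rows, while the other two models produce at most one.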
3) Join Model
Defines how the main and lookup table fields are joined. There are 2 types of join model.
- Left Outer Join
- Inner Join
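As a rough illustration of the two join models, the Java sketch below joins a list of main-row keys with a lookup map. It is a simplified model rather than Talend's implementation, and the class name, method names, and sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class JoinModelSketch {
    // Hypothetical lookup data: student id -> grade. Student 3 has no entry.
    static Map<Integer, String> sampleLookup() {
        Map<Integer, String> lookup = new HashMap<>();
        lookup.put(1, "A");
        lookup.put(2, "B");
        return lookup;
    }

    // Left Outer Join: every main row is kept; a missing lookup value becomes null.
    static List<String> leftOuterJoin(List<Integer> mainIds, Map<Integer, String> lookup) {
        List<String> out = new ArrayList<>();
        for (Integer id : mainIds) {
            out.add(id + "|" + lookup.get(id));
        }
        return out;
    }

    // Inner Join: a main row is kept only when the lookup has a matching key.
    static List<String> innerJoin(List<Integer> mainIds, Map<Integer, String> lookup) {
        List<String> out = new ArrayList<>();
        for (Integer id : mainIds) {
            if (lookup.containsKey(id)) {
                out.add(id + "|" + lookup.get(id));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> mainIds = Arrays.asList(1, 2, 3);
        System.out.println("Left Outer Join: " + leftOuterJoin(mainIds, sampleLookup()));
        System.out.println("Inner Join: " + innerJoin(mainIds, sampleLookup()));
    }
}
```

Student 3 appears in the left outer join result with a null lookup value and is dropped by the inner join.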
4) Store temp data
Defines where the lookup table data is stored. It is a boolean input.
- True - Store the lookup data on disk instead of in system memory.
- False - Store the lookup data in system memory.
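Default values for the tMap settings: Lookup Model is Load Once, Match Model is Unique match, Join Model is Left Outer Join, and Store temp data is false.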
Variable Panel
The Variable Panel is in the center of the Map Editor. We can transform data and store the results in variables for further use. Global or context variables can also be used in the output table.
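Because tMap expressions are written in Java, the idea of a variable can be sketched as ordinary Java code. In the minimal example below, the Row1 class and its firstName and lastName columns are hypothetical, and the fullName method stands in for a variable whose expression would be row1.firstName + " " + row1.lastName.

```java
class VariablePanelSketch {
    // Hypothetical main-flow row; in a real job the columns come from the input schema.
    static final class Row1 {
        final String firstName;
        final String lastName;

        Row1(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // Stands in for a variable with the expression: row1.firstName + " " + row1.lastName
    static String fullName(Row1 row1) {
        return row1.firstName + " " + row1.lastName;
    }

    // Two output columns reuse the same variable instead of repeating the expression.
    static String[] outputColumns(Row1 row1) {
        String fullName = fullName(row1);
        return new String[] { fullName, fullName.toUpperCase() };
    }

    public static void main(String[] args) {
        String[] columns = outputColumns(new Row1("Asha", "Rao"));
        System.out.println(columns[0]);
        System.out.println(columns[1]);
    }
}
```

Computing the value once and reusing it in several output columns is the main reason to define a variable rather than repeat the expression in each column.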
Search Panel
The Search Panel is above the Variable Panel. It searches the expressions and the input and output panels of the Map Editor for the text entered in the search field.
Output Panel
The Output Panel is on the right side of the Map Editor. It shows a graphical representation of the output data, mapped from the Input Panel and the Variable Panel to the output flows.
tMap Settings
The tMap settings of the Output Panel define the type of each output. They have the following properties.
- Catch output reject
- Catch lookup inner join reject
- Schema Type
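Default values for the tMap settings: Catch output reject is false, Catch lookup inner join reject is false, and Schema Type is Built-In.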
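The two reject options can be pictured as extra outputs that receive the rows the main output does not keep. The Java sketch below is a simplified model under stated assumptions: the lookup uses an inner join, the main output has a filter on a hypothetical pass mark, one output has Catch output reject enabled, and another has Catch lookup inner join reject enabled. All class, field, and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RejectRoutingSketch {
    // Holds the student ids routed to each (hypothetical) output table.
    static final class Routed {
        final List<Integer> passed = new ArrayList<>();          // main output, filter satisfied
        final List<Integer> outputReject = new ArrayList<>();    // Catch output reject = true
        final List<Integer> innerJoinReject = new ArrayList<>(); // Catch lookup inner join reject = true
    }

    // Hypothetical lookup data: student id -> mark. Student 3 has no mark.
    static Map<Integer, Integer> sampleMarks() {
        Map<Integer, Integer> marks = new HashMap<>();
        marks.put(1, 85);
        marks.put(2, 40);
        return marks;
    }

    static Routed route(List<Integer> mainIds, Map<Integer, Integer> marks, int passMark) {
        Routed routed = new Routed();
        for (Integer id : mainIds) {
            Integer mark = marks.get(id);
            if (mark == null) {
                // No lookup match: the inner join rejects the row.
                routed.innerJoinReject.add(id);
            } else if (mark >= passMark) {
                // The output filter is satisfied.
                routed.passed.add(id);
            } else {
                // The output filter rejects the row.
                routed.outputReject.add(id);
            }
        }
        return routed;
    }

    public static void main(String[] args) {
        Routed routed = route(Arrays.asList(1, 2, 3), sampleMarks(), 50);
        System.out.println("Passed: " + routed.passed);
        System.out.println("Output reject: " + routed.outputReject);
        System.out.println("Inner join reject: " + routed.innerJoinReject);
    }
}
```

With a pass mark of 50, student 1 reaches the main output, student 2 is caught as an output reject, and student 3 is caught as an inner join reject.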
Schema Editor
The Schema Editor is at the bottom of the Map Editor. It shows the data schemas of both the input and output tables. The schema is pre-filled based on the input data to the Map Editor.
Expression Editor
The Expression Editor is the editing tool for all expression keys of input/output data, variable expressions, and filtering conditions.
How tMap components work
Connect the input files to the tMap in order of priority. The first row connection between a file and the tMap is taken as the Main table. The remaining connections between files and the tMap are treated as Lookup tables.
In this job, we join data from multiple sources (here, files) into a single data set (table) using the tMap component. The input files are read by tFileInputDelimited components and the output data is printed by a tLogRow component. Student data is taken as an example. All tMap settings are kept at their default values.
Input data
In this example, the input data is read from 3 text files using tFileInputDelimited components.
- Input Files
  - Primary.txt
  - Mark.txt
  - Parent.txt
- Main table
  - The Primary file (which has student Id, Name, and Address) is treated as the main table.
- Lookup tables
  - The Mark file (which has student marks) is treated as a lookup table.
  - The Parent file (which has student-parent details) is treated as a lookup table.
Key
Student Id is taken as the primary key. The Student Id from the main table is joined with the mark table and the parent table to retrieve the corresponding student details.
Run the Job
To run the job, go to the Run view, open the Basic Run tab, and click the Run button.
In this example, a left outer join is used. So the output contains all the main table rows, along with the lookup table data whose primary key matches. If a primary key value is not in a lookup table, the columns from that lookup table are null.
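The behaviour of this job can be approximated in plain Java. The sketch below is not the code Talend generates, and the sample rows are invented for illustration (the real contents of Primary.txt, Mark.txt, and Parent.txt are not reproduced here). It joins a main table with two lookup tables on Student Id using a left outer join and separates the output columns with "|".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StudentJoinSketch {
    // Hypothetical main table rows: student id, name, address.
    static List<String[]> samplePrimary() {
        return Arrays.asList(
            new String[] { "1", "Asha", "Chennai" },
            new String[] { "2", "Ravi", "Madurai" },
            new String[] { "3", "Meena", "Salem" });
    }

    // Hypothetical mark lookup: student id -> mark. Student 3 has no mark.
    static Map<String, String> sampleMarks() {
        Map<String, String> marks = new HashMap<>();
        marks.put("1", "85");
        marks.put("2", "72");
        return marks;
    }

    // Hypothetical parent lookup: student id -> parent name. Student 2 has no entry.
    static Map<String, String> sampleParents() {
        Map<String, String> parents = new HashMap<>();
        parents.put("1", "Kumar");
        parents.put("3", "Lakshmi");
        return parents;
    }

    // Left outer join of the main table with both lookups on student id.
    static List<String> join(List<String[]> primary,
                             Map<String, String> marks,
                             Map<String, String> parents) {
        List<String> out = new ArrayList<>();
        for (String[] student : primary) {
            String id = student[0];
            // String.valueOf turns a missing lookup value into the text "null".
            out.add(String.join("|", id, student[1], student[2],
                String.valueOf(marks.get(id)), String.valueOf(parents.get(id))));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String row : join(samplePrimary(), sampleMarks(), sampleParents())) {
            System.out.println(row);
        }
    }
}
```

Every student from the main table appears in the result, and a missing mark or parent shows up as null, which matches the left outer join behaviour described above.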
Summary
In this article, we have learned about the Talend mapping component (tMap) in Talend Open Studio. For real-time examples of the tMap component, read my other article, Working with the tMap component in real-time examples.