Workflow in Informatica: Create, Task, Parameter, Reusable, Manager
A workflow is a set of instructions that tells the Integration Service in Informatica how to run tasks. The Integration Service is the entity that reads workflow information from the repository, fetches data from the sources, performs the transformations, and loads the result into the target.
Workflow – It defines how to run tasks such as session tasks, command tasks, email tasks, etc.
To create a workflow
- You first need to create tasks
- And then add those tasks to the workflow.
A workflow is like an empty container that can hold the objects you want to execute. You add the tasks you want to run to the workflow. This tutorial covers creating connections, tasks, and workflows, linking tasks, and using workflow variables and parameters.
Workflow execution can be done in two ways:
- Sequence: Tasks execute in the order in which they are defined.
- Event based: Tasks are executed based on event conditions.
Step 1 – In the Informatica Designer, click on the Workflow Manager icon.
Step 2 – This will open the Workflow Manager window. Then, in the Workflow Manager:
- We are going to connect to the repository "guru99", so double click on the folder to connect.
- Enter the user name and password, then select the "Connect" button.
Step 3 – In the Workflow Manager:
- Right click on the folder
- In the pop-up menu, select the Open option
This will open the Workflow Manager workspace.
To execute any task in the Workflow Manager, you need to create connections. The Integration Service uses these connections to connect to different objects.
For example, if your mapping has a source table in an Oracle database, you need an Oracle connection so that the Integration Service can connect to the Oracle database and fetch the source data.
The following types of connections can be created in the Workflow Manager:
- Relational Connection
- FTP Connection
The type of connection you create depends on the source and target systems you want to connect to. Most often, you will use relational connections.
Step 1 – In Workflow Manager
- Click on the Connection menu
- Select Relational Option
Step 2 – In the pop-up window
- Select Oracle as the type
- Click on the New button
Step 3 – In the new connection object definition window
- Enter the connection name (new name: guru99)
- Enter the username
- Enter the password
- Enter the connection string
- Leave the other settings as default and select the OK button
Step 4 – You will return to the previous window. Click on the Close button.
The relational connection is now set up in the Workflow Manager.
The Workflow Manager has three component tools that help in creating various objects. These tools are
- Task Developer
- Worklet Designer
- Workflow Designer
Task Developer – The Task Developer is the tool in which you create reusable objects. Reusable objects in the Workflow Manager are objects that can be reused in multiple workflows. For example, if you have created a command task in the Task Developer, you can reuse this task in any number of workflows.
Workflow Designer – The Workflow Designer is where you build workflows, which execute the tasks added to them. You can add any number of tasks to a workflow.
You can create three types of reusable tasks in task developer.
- Command task
- Session task
- Email task
Command task – A command task is used to execute Windows or Unix commands during the execution of the workflow. With this task you can run commands to create files or folders, delete files or folders, transfer files over FTP, and so on.
Session task – A session task in Informatica is required to run a mapping.
- Without a session task, you cannot execute or run a mapping.
- A session task can execute only a single mapping, so there is a one-to-one relationship between a mapping and a session.
- A session task is the object that tells Informatica how, where, and when to execute a mapping.
- Sessions cannot be executed independently; a session must be added to a workflow.
- In the session object you can configure cache properties and advanced performance optimization settings.
Email task – With an email task you can send email to defined recipients when the Integration Service runs a workflow. For example, if you want to monitor how long a session takes to complete, you can configure the session to send an email containing the session start and end times. Or, if you want the Integration Service to notify you when a workflow completes or fails, you can configure an email task for that purpose.
Step 1 – To create a command task we are going to use the Task Developer. In the Workflow Manager, open the Task Developer by clicking on the "Task Developer" tab in the menu.
Step 2 – Once the Task Developer is open, follow these steps
- Select Tasks menu
- Select Create option
Step 3 – In the create task window
- Select command as type of task to create
- Enter task name
- Select create button
This will create the command task. You now have to configure the task by adding a command to it, which we will do in the next step.
Step 4 – To configure the task, double click on the command task icon. This will open the "Edit Tasks" window. In this window
- Select the commands menu
- Click on the add new command icon
- Enter command name
- Click on the command icon to add command text
This will open a command editor box.
Step 5 – In the command editor box, enter the command "mkdir C:\guru99" (this is the Windows command to create a folder named "guru99") and select OK.
After this step you will return to the Edit Tasks window, where you will see the command you added in the command text box.
Step 6 – Click OK in the Edit Tasks window.
The command task will be created in the Task Developer under the "guru99" repository.
Note – Use the Ctrl+S shortcut to save the changes to the repository.
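To make the effect of the command text concrete, here is a minimal Python sketch of what "mkdir C:\guru99" does: it creates one folder and reports whether that worked. The function name `create_folder` is hypothetical and is not part of Informatica. Unlike the Windows command, this sketch treats an already existing folder as success, which is an assumption made to keep the example repeatable.

```python
import tempfile
from pathlib import Path


def create_folder(path: str) -> bool:
    """Create a single folder; return True if it exists afterwards.

    Assumption: an existing folder counts as success. A missing parent
    folder, or a file already occupying the path, counts as failure.
    """
    target = Path(path)
    try:
        target.mkdir(exist_ok=True)
    except OSError:
        return False
    return target.is_dir()


# Usage: a temporary directory stands in for the C: drive in this sketch.
with tempfile.TemporaryDirectory() as base:
    print(create_folder(str(Path(base) / "guru99")))
```

The True/False result plays the role of the success or failure status that the command task reports to the workflow.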
To execute the command task you have to switch to the Workflow Designer. A workflow is a parent or container object to which you can add multiple tasks; when the workflow is executed, all the added tasks are executed. To create a workflow
Step 1 – Open the workflow designer by clicking on workflow designer menu
Step 2 – In workflow designer
- Select workflows menu
- Select create option
Step 3 – In create workflow window
- Enter workflow name
- Select the OK button (leave the other options as default)
This will create the workflow.
Naming convention – Workflow names are prefixed with 'wkf_'. If you have a session named 's_m_employee_detail', the workflow for it can be named 'wkf_s_m_employee_detail'.
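The naming convention can be expressed as a small helper. This is only an illustrative sketch; the function name `workflow_name_for` is hypothetical, and the rule that an already prefixed name is left unchanged is an assumption.

```python
def workflow_name_for(session_name: str, prefix: str = "wkf_") -> str:
    """Return a workflow name that follows the 'wkf_' prefix convention."""
    # Assumption: a name that already carries the prefix is returned as-is.
    if session_name.startswith(prefix):
        return session_name
    return prefix + session_name


print(workflow_name_for("s_m_employee_detail"))  # wkf_s_m_employee_detail
```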
A newly created workflow does not contain any tasks. To execute a task in a workflow, you first have to add the task to it.
Step 4 – To add the command task that we created in the Task Developer to the Workflow Designer
- In the navigator tree, expand the tasks folder
- Drag and drop the command task onto the Workflow Designer
Step 5 – Select the "link task" option from the toolbox in the top menu. (The link task option links the tasks in a workflow to the start task, so that the order of execution of the tasks can be defined.)
Step 6 – Once you select the link task icon, you can drag a link between the start task and the command task. Select the start task and drag a link to the command task.
The workflow now has a command task ready to be executed.
Step 1 – To execute the workflow
- Select the Workflows option from the menu
- Select the Start Workflow option
This will open the Workflow Monitor window and execute the workflow.
When the workflow runs, it executes the command task, which creates the "guru99" folder in the defined directory.
As described earlier, a session task is required to run a mapping. A session executes exactly one mapping, and it must be added to a workflow before it can run.
In this exercise you will create a session task for the mapping "m_emp_emp_target", which you created in the previous article.
Step 1 – Open the Workflow Manager and open the Task Developer
Step 2 – Once the Task Developer opens, go to the main menu of the Workflow Manager
- Click on task menu
- Select create option
This will open a new window "Create Task"
Step 3 – In the create task window
- Select session task as type of task.
- Enter name of task.
- Click create button
Step 4 – A window for selecting the mapping will appear. Select the mapping which you want to associate with this session, for this example select "m_emp_emp_target" mapping and click OK Button.
Step 5 – After that, click on "Done" button
The session object will appear in the Task Developer.
Step 6 – In this step you will create a workflow for the session task. Click on the Workflow Designer icon.
Step 7 – In the workflow designer tool
- Click on workflow menu
- Select create option
Step 8 – In the create workflow window
- Enter workflow name
- Select OK (leave the other properties as default)
A start task will appear in the Workflow Manager. It is the starting point of execution of the workflow.
Step 9 – In the Workflow Manager
- Expand the sessions folder in the navigation tree.
- Drag and drop the session you created onto the Workflow Manager workspace.
Step 10 – Click on the link task option in the toolbox.
Step 11 – Link the start task and the session task using the link.
Step 12 – Double click on the session object in the Workflow Manager. This will open a task window in which you can modify the task properties.
Step 13 – In the edit task window
- Select mapping tab
- Select connection property
- Assign the connection that we created in the earlier steps to the source and the target.
- Select the OK button
The configuration of the workflow is now complete, and you can execute the workflow.
The start task is the starting point for the execution of a workflow. There are two ways of linking multiple tasks to a start task: in parallel or in serial.
In parallel linking, the tasks are linked directly to the start task and all tasks start executing at the same time.
Step 1 – In the workflow manager, open the workflow "wkf_run_command"
Step 2 – In the workflow, add the session task "s_m_emp_emp_target" (select the session, then drag and drop it).
Step 3 – Select the link task option from the toolbox
Step 4 – Link the session task to the start task (click on the start task, hold the click, and drag to the session task)
After linking, both the command task and the session task are connected directly to the start task.
Step 5 – Start the workflow and monitor it in the Workflow Monitor.
In serial linking, tasks are linked one after another so that they execute in sequence. Before we add tasks in serial mode, we have to delete the link that we added to demonstrate parallel execution. To do that:
Step 1 – Open the workflow "wkf_run_command"
- Select the link to the session task.
- Select edit option in the menu
- Select delete option
Step 2 – A confirmation dialog box will appear. Select the Yes option.
The link between the start task and the session task will be removed.
Step 3 – Go to the top menu again and select the link task option from the toolbox
Step 4 – Link the command task to the session task (drag from the command task to the session task)
After linking, the workflow runs from the start task to the command task and then to the session task.
Step 5 – To make the visual appearance of the workflow clearer
- Right click on the workspace of the workflow
- Select the Arrange menu
- Select the Horizontal option
If you start the workflow, the command task will execute first, and the session task will start after it finishes.
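The difference between parallel and serial linking can be modelled outside Informatica as a small dependency graph. The sketch below is a conceptual illustration, not how the Integration Service is implemented; the function name `execution_waves` and the "Start" label are assumptions. Tasks that land in the same wave have no link between them and may run at the same time.

```python
from typing import Dict, List


def execution_waves(links: Dict[str, List[str]]) -> List[List[str]]:
    """Group tasks into waves in which every predecessor has already run.

    `links` maps each task name to the tasks it is linked from.
    "Start" is treated as already completed.
    """
    remaining = dict(links)
    done = {"Start"}
    waves: List[List[str]] = []
    while remaining:
        ready = sorted(
            task for task, preds in remaining.items()
            if all(p in done for p in preds)
        )
        if not ready:
            raise ValueError("cycle, or a link from an unknown task")
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


# Parallel linking: both tasks are linked directly to the start task.
parallel = {"cmd_create_folder": ["Start"], "s_m_emp_emp_target": ["Start"]}
# Serial linking: the session task is linked from the command task.
serial = {"cmd_create_folder": ["Start"], "s_m_emp_emp_target": ["cmd_create_folder"]}

print(execution_waves(parallel))  # [['cmd_create_folder', 's_m_emp_emp_target']]
print(execution_waves(serial))    # [['cmd_create_folder'], ['s_m_emp_emp_target']]
```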
Workflow variables allow the tasks in a workflow to exchange information with each other, and they also allow a task to access certain properties of other tasks in the workflow. For example, to get the current date you can use the built-in variable "sysdate".
The most common scenario is a workflow with multiple tasks in which one task accesses a variable of another task. For example, suppose you have two tasks in a workflow and the second task must run only if the first task succeeds. You can implement this scenario using a predefined variable in the workflow.
We have a workflow "wkf_run_command" with tasks added in serial mode. We will now add a condition to the link between the command task and the session task, so that the session task is executed only after the command task succeeds.
Step 1 - Open the workflow "wkf_run_command"
Step 2 - Double click on the link between session and command task
An expression window will appear.
Step 3 – Double click the status variable under the "cmd_create_folder" menu. The variable "$cmd_create_folder.status" will appear in the editor window on the right side.
Step 4 – Now we will set a condition on the variable "$cmd_create_folder.status" that requires the succeeded status. This means the session task is executed only when the previous task has run and completed successfully.
- Change the expression to "$cmd_create_folder.status=SUCCEEDED".
- Click the OK button
The link between the two tasks now carries this condition.
When you execute this workflow, the command task executes first, and the session task is executed only if the command task succeeds.
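The behaviour of a link condition such as "$cmd_create_folder.status=SUCCEEDED" can be sketched in a few lines of Python. This is a conceptual model under stated assumptions, not Informatica code: the function `run_serial` and the "NOT_RUN" label are hypothetical, and a task is represented by a Python callable that raises an exception when it fails.

```python
from typing import Callable, Dict, List, Tuple

SUCCEEDED, FAILED, NOT_RUN = "SUCCEEDED", "FAILED", "NOT_RUN"


def run_serial(tasks: List[Tuple[str, Callable[[], None]]]) -> Dict[str, str]:
    """Run tasks in order; a task runs only if the previous one succeeded."""
    statuses: Dict[str, str] = {}
    previous = None
    for name, action in tasks:
        # Link condition: previous task status must equal SUCCEEDED.
        if previous is not None and statuses[previous] != SUCCEEDED:
            statuses[name] = NOT_RUN
        else:
            try:
                action()
                statuses[name] = SUCCEEDED
            except Exception:
                statuses[name] = FAILED
        previous = name
    return statuses


def ok() -> None:
    pass


def boom() -> None:
    raise RuntimeError("command failed")


print(run_serial([("cmd_create_folder", ok), ("s_m_emp_emp_target", ok)]))
print(run_serial([("cmd_create_folder", boom), ("s_m_emp_emp_target", ok)]))
```

In the second call the command task fails, so the condition on the link is false and the session task is never started.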
Workflow parameters are values that remain constant throughout a run. Once a parameter's value is assigned, it stays the same. Parameters can be used in workflow properties, and their values can be defined in parameter files. For example, instead of using a hard-coded connection value, you can use a parameter or variable as the connection name and define its value in the parameter file.
Parameter files are the files in which we define the values of mapping or workflow variables and parameters. In this tutorial these files use the extension ".par". As a general standard, one parameter file is created per workflow.
Advantages of Parameter file
- Helps in migration of code from one environment to other
- Allows easy debugging and testing
- Values can be modified with ease without change in code
The structure of a parameter file
A parameter file starts with a header of the form [folder_name.WF:workflow_name], followed by one parameter_name=value entry per line. Here folder_name is the name of the repository folder, and workflow_name is the name of the workflow for which you are creating the parameter file.
We will create a parameter file for the database connection "guru99", which we assigned to the sources and targets in our earlier sessions.
Step 1 – Create a new empty file (for example, in Notepad)
Step 2 – In the file, enter the header for the workflow "wkf_run_command", followed by a line that sets the parameter "$DBConnection_SRC" to the connection name "guru99"
Step 3 – Save the file in the folder "C:\guru99" as "wkf_run_command.par"
In the file we have created a parameter "$DBConnection_SRC". We will assign it to a connection in our workflow.
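The contents of such a file can also be generated with a short script. The sketch below assumes the header convention [folder_name.WF:workflow_name]; the function name `build_parameter_file` is hypothetical, and "MyFolder" is a placeholder because the tutorial does not name the repository folder.

```python
from typing import Dict


def build_parameter_file(folder: str, workflow: str, params: Dict[str, str]) -> str:
    """Return parameter file text: one workflow header, then name=value lines."""
    lines = [f"[{folder}.WF:{workflow}]"]
    lines += [f"{name}={value}" for name, value in params.items()]
    return "\n".join(lines) + "\n"


# "MyFolder" is a placeholder; replace it with your repository folder name.
text = build_parameter_file(
    "MyFolder", "wkf_run_command", {"$DBConnection_SRC": "guru99"}
)
print(text)
```

The returned text can then be saved as "wkf_run_command.par", for example with `pathlib.Path(...).write_text(text)`.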
Step 4- Open the workflow "wkf_run_command"
- Select workflows menu
- Select edit option
Step 5 – This will open the Edit Workflow window. In this window
- Go to the Properties tab
- Enter the parameter file name as "c:\guru99\wkf_run_command.par"
- Select the OK button
We have now defined the parameter file content and pointed the workflow to the file.
The next step is to use the parameter in the session.
Step 6 – In the workflow, double click on the session "s_m_emp_emp_target", then
- Select the Mapping tab
- Select the connection property in the left panel
- Click on the target connection, which is currently hard-coded as "guru99"
Step 7 – A connection browser window will appear. In that window
- Select the option to use a connection variable
- Enter the connection variable name as "$DBConnection_SRC"
- Select the OK button
Step 8 – In the Edit Tasks window, the connection variable will appear for the target. Select the OK button in the Edit Tasks window.
We have now created a parameter for a connection and assigned its value in the parameter file.
When we execute the workflow, it reads the parameter file, looks up the values of its parameters and variables in that file, and uses those values.
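The lookup described here can be illustrated with a minimal parser. This is a sketch of the idea only, not the Integration Service's actual logic: the function names `parse_parameter_file` and `resolve` are hypothetical, "MyFolder" is a placeholder folder name, and the header format [folder_name.WF:workflow_name] is assumed.

```python
from typing import Dict


def parse_parameter_file(text: str) -> Dict[str, Dict[str, str]]:
    """Parse parameter file text into {section header: {name: value}}."""
    sections: Dict[str, Dict[str, str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], {})
        elif current is not None and "=" in line:
            name, value = line.split("=", 1)
            current[name.strip()] = value.strip()
    return sections


def resolve(setting: str, params: Dict[str, str]) -> str:
    """Return the parameter's value if defined, else the setting unchanged."""
    return params.get(setting, setting)


sample = "[MyFolder.WF:wkf_run_command]\n$DBConnection_SRC=guru99\n"
params = parse_parameter_file(sample)["MyFolder.WF:wkf_run_command"]
print(resolve("$DBConnection_SRC", params))  # guru99
```

A setting with no entry in the file, such as a hard-coded connection name, is returned unchanged by `resolve`.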