All About Informatica Transformations
In
Informatica, Transformations help us to transform the source data according to
the requirements of target system and thereby ensuring the quality of the data
being loaded into target.
Transformations
mainly are of two types: Active and Passive.
Active
Transformation
An
active transformation can change the number of rows that pass through it from
source to target i.e it eliminates rows that do not meet the condition in
transformation.
Passive
Transformation
A
passive transformation does not change the number of rows that pass through it
i.e it passes all rows through the transformation.
Transformations
can be Connected or UnConnected.
Connected
Transformation
Connected
transformation is connected to other transformations or directly to target
table in the mapping.
UnConnected
Transformation
An
unconnected transformation is not connected to other transformations in the
mapping. It is called within another transformation, and returns a value to
that transformation.
List
of Transformations
Below
are the list of Transformations available in
Informatica
PowerCenter:
- Aggregator Transformation
- Expression Transformation
- Filter Transformation
- Joiner Transformation
- Lookup Transformation
- Normalizer Transformation
- Rank Transformation
- Router Transformation
- Sequence Generator Transformation
- Stored Procedure Transformation
- Sorter Transformation
- Update Strategy Transformation
- XML Source Qualifier Transformation
- Advanced External Procedure Transformation
- External Transformation
- Union Transformation
Now
we will discuss in detail, about the Informatica Transformations and their significances
in the ETL process.
Aggregator Transformation
Aggregator transformation is an Active and Connected
transformation. This transformation is useful to perform calculations such as
averages and sums (mainly to perform calculations on multiple rows or groups).
For example, to calculate total of daily sales or to calculate average of
monthly or yearly sales. Aggregate functions such as AVG, FIRST, COUNT,
PERCENTILE, MAX, SUM etc. can be used in aggregate transformation.
Expression
Transformation
Expression transformation is a Passive and Connected
transformation. This can be used to calculate values in a single row before
writing to the target. For example, to calculate discount of each product or to
concatenate first and last names or to convert date to a string field.
Filter
Transformation
Filter transformation is an Active and Connected
transformation. This can be used to filter rows in a mapping that do not meet
the condition. For example, to know all the employees who are working in
Department 10 or to find out the products that falls between the rate category
$500 and $1000.
Joiner
Transformation
Joiner Transformation is an Active and Connected
transformation. This can be used to join two sources coming from two different
locations or from same location. For example, to join a flat file and a
relational source or to join two flat files or to join a relational source and
a XML source. In order to join two sources, there must be at least one matching
port. While joining two sources it is a must to specify one source as master
and the other as detail. The Joiner transformation supports the following types
of joins:
- Normal
- Master Outer
- Detail Outer
- Full Outer
Normal join discards all the rows of data from the master
and detail source that do not match, based on the condition.
Master outer join discards all the unmatched rows from the
master source and keeps all the rows from the detail source and the matching
rows from the master source.
Detail outer join keeps all rows of data from the master
source and the matching rows from the detail source. It discards the unmatched
rows from the detail source.
Full outer join keeps all rows of data from both the master
and detail sources.
Lookup Transformation
Lookup transformation is Passive and it can be both
Connected and UnConnected as well. It is used to look up data in a relational
table, view, or synonym. Lookup definition can be imported either from source
or from target tables.
For example, if we want to retrieve all the sales of a
product with an ID 10 and assume that the sales data resides in another table.
Here instead of using the sales table as one more source, use Lookup
transformation to lookup the data for the product, with ID 10 in sales table.
Difference
between Connected and UnConnected Lookup Transformation:
- Connected lookup receives input values directly from mapping pipeline whereas UnConnected lookup receives values from:LKP expression from another transformation.
- Connected lookup returns multiple columns from the same row whereas UnConnected lookup has one return port and returns one column from each row.
- Connected lookup supports user-defined default values whereas UnConnected lookup does not support user defined values.
Normalizer Transformation
Normalizer
Transformation is an Active and Connected transformation. It is used mainly
with COBOL sources where most of the time data is stored in de-normalized
format. Also, Normalizer transformation can be used to create multiple rows
from a single row of data.
Rank Transformation
Rank transformation is an Active and Connected
transformation. It is used to select the top or bottom rank of data. For
example, to select top 10 Regions where the sales volume was very high or to
select 10 lowest priced products.
Router
Transformation
Router is an Active and Connected transformation. It is
similar to filter transformation. The only difference is, filter transformation
drops the data that do not meet the condition whereas router has an option to
capture the data that do not meet the condition. It is useful to test multiple
conditions. It has input, output and default groups. For example, if we want to
filter data like where State=Michigan, State=California, State=New York and all
other States. It’s easy to route data to different tables.
Sequence
Generator Transformation
Sequence Generator transformation is a Passive and Connected
transformation. It is used to create unique primary key values or cycle through
a sequential range of numbers or to replace missing keys.
It has two output ports to connect transformations. By
default it has two fields CURRVAL and NEXTVAL(You cannot add ports to this
transformation). NEXTVAL port generates a sequence of numbers by connecting it
to a transformation or target. CURRVAL is the NEXTVAL value plus one or NEXTVAL
plus the Increment By value.
Stored
Procedure Transformation
Stored Procedure transformation is a Passive and Connected
& UnConnected transformation. It is useful to automate time-consuming tasks
and it is also used in error handling, to drop and recreate indexes and to
determine the space in database, a specialized calculation etc.
The stored procedure must exist in the database before
creating a Stored Procedure transformation, and the stored procedure can exist
in a source, target, or any database with a valid connection to the Informatica
Server. Stored Procedure is an executable script with SQL statements and
control statements, user-defined variables and conditional statements. In case
of stored procedure transformation procedure will be compiled and executed in a
relational data source. You need data base connection to import the stored
procedure in to your mapping.
Sorter
Transformation
Sorter transformation is a Connected and an Active
transformation. It allows to sort data either in ascending or descending order
according to a specified field. Also used to configure for case-sensitive
sorting, and specify whether the output rows should be distinct.
Source
Qualifier Transformation
Source Qualifier transformation is an Active and Connected
transformation. When adding a relational or a flat file source definition to a
mapping, it is must to connect it to a Source Qualifier transformation. The
Source Qualifier performs the various tasks such as overriding default SQL
query, filtering records; join data from two or more tables etc.
Update
Strategy Transformation
Update strategy transformation is an Active and Connected
transformation. It is used to update data in target table, either to maintain
history of data or recent changes. You can specify how to treat source rows in
table, insert, update, delete or data driven.
XML
Source Qualifier Transformation
XML Source Qualifier is a Passive and Connected
transformation. XML Source Qualifier is used only with an XML source
definition. It represents the data elements that the Informatica Server reads
when it executes a session with XML sources.
Advanced
External Procedure Transformation
Advanced External Procedure transformation is an Active and
Connected transformation. It operates in conjunction with procedures, which are
created outside of the Designer interface to extend PowerCenter/PowerMart
functionality. It is useful in creating external transformation applications,
such as sorting and aggregation, which require all input rows to be processed
before emitting any output rows.
Union
Transformation
The union transformation is used to merge multiple datasets
from various streams or pipelines into one dataset. This transformation works
similar to the UNION ALL, it does not remove any duplicate rows. It is
recommended to use aggregator to remove duplicates are not expected at the
target.
External
Procedure Transformation
External Procedure transformation is an Active & Connected/UnConnected
transformations. Sometimes, the standard transformations such as Expression
transformation may not provide the functionality that you want.In such cases
External procedure is useful to develop complex functions within a dynamic link
library (DLL) or UNIX shared library, instead of creating the necessary
Expression transformations in a mapping.
Differences
between Advanced External Procedure and External Procedure Transformations:
- External Procedure returns single value, where as Advanced External Procedure returns multiple values.
- External Procedure supports COM and Informatica procedures where as AEP supports only Informatica Procedures.
No comments:
Post a Comment