Friday, February 15, 2008

Informatica: remove duplicate values in a source (relational table or flat file)

How to remove duplicate values in a source using Informatica?
Which transformations to use?
What is the query?

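There are several ways to do this. The first two below are both efficient.
Method 1: Sorter with the Distinct option.
Send all the data to a Sorter transformation and sort by all the fields that you want to remove duplicates from. In the Properties tab, select the Distinct option.
The Sorter then passes forward only unique rows. Note that with Distinct selected, the Sorter treats all of its ports as sort keys, so connect only the fields that should define uniqueness.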
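
To make the idea concrete, here is a small Python sketch of what a sort followed by a distinct step does. It is only an illustration of the logic, not Informatica code; the sample rows and the function name sort_distinct are made up for this example.

```python
# Hypothetical sample rows: (customer_id, name, city).
rows = [
    (2, "Bob", "Pune"),
    (1, "Alice", "Delhi"),
    (2, "Bob", "Pune"),      # exact duplicate of the first row
    (1, "Alice", "Mumbai"),  # same id and name, different city
]

def sort_distinct(records):
    """Sort on every field, then drop any row equal to the one before it."""
    result = []
    for record in sorted(records):
        # After sorting, identical rows sit next to each other,
        # so comparing with the previous kept row is enough.
        if not result or result[-1] != record:
            result.append(record)
    return result

unique_rows = sort_distinct(rows)
print(unique_rows)
```

Only the exact duplicate is dropped here. The two Alice rows differ in city, so both survive, which is why the choice of fields matters.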

Method 2: Use an Aggregator
Use an Aggregator transformation and group by the keys/fields that you want to remove duplicates from.
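
The group-by idea can be sketched in Python as below. This is an illustration only: the sample rows and the function dedupe_by_key are hypothetical, and keeping the last row of each group is an assumption about how an Aggregator with no aggregate expressions behaves, so verify it against your own mapping.

```python
# Hypothetical sample rows keyed by customer_id.
rows = [
    {"customer_id": 1, "name": "Alice", "city": "Delhi"},
    {"customer_id": 2, "name": "Bob", "city": "Pune"},
    {"customer_id": 1, "name": "Alice", "city": "Mumbai"},
]

def dedupe_by_key(records, key_fields):
    """Keep one row per group-by key (assumption: the last row seen wins)."""
    groups = {}
    for record in records:
        key = tuple(record[field] for field in key_fields)
        groups[key] = record  # a later row with the same key overwrites the earlier one
    return list(groups.values())

deduped = dedupe_by_key(rows, ["customer_id"])
print(deduped)
```

Unlike the sort-and-distinct approach, this keeps one row per key even when the non-key fields differ, so the result depends on which fields are chosen as group-by ports.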

Method 3:
1. Remove the duplicates in the staging area with SQL, before the data enters Informatica.
2. Use an Aggregator transformation and remove the duplicates with group-by ports.
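
As for the query itself, the sketch below shows two common SQL forms, SELECT DISTINCT and GROUP BY, run against an in-memory SQLite database through Python's sqlite3 module. The table name stg_customer, its columns, and the sample data are hypothetical; the same statements should carry over to most relational databases, but check the syntax for yours.

```python
import sqlite3

# In-memory database standing in for a staging area (hypothetical table and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_customer (customer_id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO stg_customer VALUES (?, ?, ?)",
    [(1, "Alice", "Delhi"), (2, "Bob", "Pune"), (2, "Bob", "Pune"), (1, "Alice", "Delhi")],
)

# Option A: SELECT DISTINCT over the columns that define a duplicate.
distinct_rows = conn.execute(
    "SELECT DISTINCT customer_id, name, city FROM stg_customer "
    "ORDER BY customer_id, name, city"
).fetchall()

# Option B: GROUP BY the same columns, which yields one row per group.
grouped_rows = conn.execute(
    "SELECT customer_id, name, city FROM stg_customer "
    "GROUP BY customer_id, name, city "
    "ORDER BY customer_id, name, city"
).fetchall()

conn.close()
print(distinct_rows)
print(grouped_rows)
```

Both queries return the same two rows for this data. The ORDER BY is only there to make the output order predictable.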

Method 4:
If the source is relational, we can use the Source Qualifier transformation (its Select Distinct option).
If the source is a flat file, we can use a Sorter (with its distinct option), an Aggregator (with group-by ports), or a Lookup transformation.
