How do we eliminate duplicate records in a flat file without using Sorter and Aggregator?


Answer / kiran

We can use a dynamic cache in a Lookup transformation to eliminate duplicates.



Answer / joe

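Option 1: Use a Unix command to remove the duplicates from the flat file before the session reads it.

Option 2: Use a checksum function (for example MD5) in an Expression transformation to generate a hexadecimal code for each record, and compare it with the code of the adjacent record. Identical records produce identical codes, so this catches duplicates that sit on neighbouring rows.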
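
To make Option 2 concrete, here is a rough Python sketch of the same idea outside Informatica. It assumes MD5 as the checksum and assumes duplicates are already on adjacent rows; the function names are made up for illustration.

```python
import hashlib


def record_checksum(record: str) -> str:
    """Return the hexadecimal MD5 digest of one record.

    Identical records always give the same code.
    """
    return hashlib.md5(record.encode("utf-8")).hexdigest()


def drop_adjacent_duplicates(records):
    """Keep a record only when its checksum differs from the previous row's."""
    kept = []
    prev_code = None  # checksum of the previous row; None before the first row
    for record in records:
        code = record_checksum(record)
        if code != prev_code:
            kept.append(record)
        prev_code = code
    return kept


rows = ["1,a", "1,a", "2,b", "2,b", "2,b", "3,c"]
print(drop_adjacent_duplicates(rows))  # ['1,a', '2,b', '3,c']
```

If the duplicates are not adjacent (for example "a", "b", "a"), this comparison will not catch them, so the data has to be ordered first.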



Answer / ankur saini

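Solution: Sequence Generator -> Rank -> Filter

Add a Sequence Generator to number the rows.

Example input:
1 a
1 b
2 a
2 b

After the Sequence Generator:
1 a 1
1 b 2
2 a 3
2 b 4

Then add a Rank transformation. Group by the ports that identify a duplicate (all file ports for exact duplicates; this example groups by the first column only) and rank on the sequence key, lowest first:
input seq rank
1 a 1 1
1 b 2 2
2 a 3 1
2 b 4 2

Add a Filter with the condition rank=1.

Enjoy!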
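
The three steps can be imitated in plain Python to check the logic. This is only a sketch, not Informatica code: `dedupe_by_rank` and `key_of` are hypothetical names, and the grouping key is passed in so you can group by one column (as in the example) or by the whole row.

```python
from collections import defaultdict


def dedupe_by_rank(rows, key_of):
    """Number the rows, rank them inside each group, keep rank 1."""
    # Step 1: "sequence generator" - attach a running number to every row.
    sequenced = [(seq, row) for seq, row in enumerate(rows, start=1)]

    # Step 2: "rank" - group by the chosen key, rank by sequence (lowest first).
    groups = defaultdict(list)
    for seq, row in sequenced:
        groups[key_of(row)].append((seq, row))

    ranked = []
    for members in groups.values():
        members.sort(key=lambda pair: pair[0])
        for rank, (seq, row) in enumerate(members, start=1):
            ranked.append((seq, row, rank))

    # Step 3: "filter" - keep rank 1 only, in the original input order.
    ranked.sort(key=lambda item: item[0])
    return [row for seq, row, rank in ranked if rank == 1]


rows = [("1", "a"), ("1", "b"), ("2", "a"), ("2", "b")]

# Group by the first column, as in the example above.
print(dedupe_by_rank(rows, key_of=lambda r: r[0]))  # [('1', 'a'), ('2', 'a')]

# Group by all columns: nothing is dropped, because no row repeats exactly.
print(dedupe_by_rank(rows, key_of=lambda r: r))
```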



Answer / harish konda

Write a SQL query in the Source Qualifier transformation to sort the source data.

Then connect it to an Expression transformation and add one more port (say, flag) that generates a number: when the previous row and the current row have the same values, increment the number; otherwise set it to 1.

Next, connect to a Filter transformation and give the condition flag=1.

Then route the data to the target.
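
Here is roughly what the flag port does, written as a Python sketch rather than Informatica expressions. It assumes the rows are already sorted so that equal rows are next to each other; `flag_rows` is a made-up helper name.

```python
_NO_ROW = object()  # sentinel meaning "there is no previous row yet"


def flag_rows(sorted_rows):
    """Pair every row with a counter that restarts at 1 on each new value."""
    flagged = []
    prev = _NO_ROW
    flag = 0
    for row in sorted_rows:
        # Same as previous row: increment. Otherwise: start again at 1.
        flag = flag + 1 if row == prev else 1
        flagged.append((row, flag))
        prev = row
    return flagged


rows = sorted(["b", "a", "b", "c", "a", "b"])  # stands in for the sort step
flagged = flag_rows(rows)
print(flagged)  # [('a', 1), ('a', 2), ('b', 1), ('b', 2), ('b', 3), ('c', 1)]

# The filter step keeps only the first row of each run.
unique_rows = [row for row, flag in flagged if flag == 1]
print(unique_rows)  # ['a', 'b', 'c']
```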



Answer / isha

Select all source rows.
The Dynamic Lookup transformation builds the caches from the target table.
When the lookup evaluates a row from the source that does not exist in the lookup cache, it inserts the row into the cache and assigns the NewLookupRow output port the value of 1. When the lookup evaluates a row from the source that exists in the lookup cache, it does not insert the row into cache and assigns the NewLookupRow output port the value of 0.
The filter in this mapping checks if the row is a duplicate or not by evaluating the NewLookupRow output port from the Lookup. If the value of the port is 0, the row is filtered out, as it is a duplicate row. If the value of the port is not equal to 0, then the row is passed out to the target table.
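
The behaviour described above can be mimicked with a dictionary in Python. This is a simplified sketch of the idea, not how the Lookup transformation is implemented: it only models the 0 and 1 values of NewLookupRow mentioned here, starts from an empty cache unless one is passed in, and `dynamic_lookup` and `key_of` are hypothetical names.

```python
def dynamic_lookup(rows, key_of, cache=None):
    """Yield (row, new_lookup_row) pairs.

    new_lookup_row is 1 when the row was not in the cache (and gets inserted),
    and 0 when the row was already there.
    """
    cache = {} if cache is None else cache
    for row in rows:
        key = key_of(row)
        if key in cache:
            new_lookup_row = 0  # already cached: duplicate
        else:
            cache[key] = row    # insert into the cache
            new_lookup_row = 1
        yield row, new_lookup_row


source = [("101", "Ann"), ("102", "Bob"), ("101", "Ann"), ("103", "Cy"), ("102", "Bob")]

# The filter passes only rows whose flag is not 0.
target = [row for row, flag in dynamic_lookup(source, key_of=lambda r: r) if flag != 0]
print(target)  # [('101', 'Ann'), ('102', 'Bob'), ('103', 'Cy')]
```

Unlike the previous-row comparisons in the other answers, this does not need the input to be sorted, because every row is checked against everything seen so far.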



Answer / chandu

We can achieve this by setting the "Lookup policy on multiple match" property of the Lookup transformation to Use First Value or Use Last Value.



Answer / priyank

There are several ways of achieving this. One is through an Expression transformation; another is a lookup on the target.

Expression transformation:

Create the ports in this order. Variable ports are evaluated top to bottom, so the check must come before the previous key is overwritten with the current one:

Var_CHK_DUPLICATE --> IIF(Key=Var_PREV_KEY,'DUP','NODUP')
Var_PREV_KEY --> Key
OUT_DUPLICATE --> Var_CHK_DUPLICATE

Note: This compares each row with the row before it, so duplicate keys must arrive on adjacent rows. I have taken a scenario where the target table contains only 1 key. In case of multiple keys, create one PREV variable port per key and combine the checks in the Var_CHK_DUPLICATE port with an 'AND' operator. E.g. for 2 keys:

Var_CHK_DUPLICATE --> IIF(Key1=Var_PREV_KEY1 AND Key2=Var_PREV_KEY2,'DUP','NODUP')
Var_PREV_KEY1 --> Key1
Var_PREV_KEY2 --> Key2
OUT_DUPLICATE --> Var_CHK_DUPLICATE
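
The previous-row check is easy to get wrong, so here is a small Python sketch of how it is meant to behave. It assumes the comparison is evaluated before the stored key is updated and that duplicate keys arrive on adjacent rows. The class is hypothetical; its attribute names just mirror the ports, and a tuple stands in for a multi-column key.

```python
class DuplicateCheck:
    """Remembers the previous row's key, like a variable port does."""

    def __init__(self):
        self.var_prev_key = None  # nothing seen before the first row

    def evaluate(self, key):
        # 1) Compare the current key with the key kept from the previous row.
        var_chk_duplicate = "DUP" if key == self.var_prev_key else "NODUP"
        # 2) Only now overwrite the stored key with the current one.
        self.var_prev_key = key
        # 3) Output the result of the check.
        return var_chk_duplicate


check = DuplicateCheck()
keys = [(1, "a"), (1, "a"), (1, "b"), (2, "b"), (2, "b")]  # (Key1, Key2) per row
print([check.evaluate(key) for key in keys])
# ['NODUP', 'DUP', 'NODUP', 'NODUP', 'DUP']
```

If steps 1 and 2 are swapped, the stored key always equals the current key and every row comes out as 'DUP'.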


If the Informatica server runs on Unix, you can give a Unix command in the pre-session command to remove the duplicates from the file, for example:

sort <file_name> | uniq > <file_name>.new
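
If the Unix tools are not available, much the same result can be had with a few lines of Python. This is only a sketch under assumptions: the function names are made up, the file is assumed to be UTF-8 text, and Python orders lines by code point, which may differ from the locale-dependent order that `sort` uses.

```python
def sort_uniq(lines):
    """Return the distinct lines in sorted order, similar to `sort | uniq`."""
    return sorted(set(lines))


def dedupe_file(src_path, dst_path):
    """Read src_path, write its distinct lines (sorted) to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        lines = src.read().splitlines()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(line + "\n" for line in sort_uniq(lines))


print(sort_uniq(["2,b", "1,a", "2,b", "1,a", "3,c"]))  # ['1,a', '2,b', '3,c']
```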

Hope it helps.

