Drop duplicate records
The source looks like this:
ID flag1 flag2
100 N Y
100 N N
100 Y N
101 Y Y
101 N Y
102 Y N
103 N N
104 Y Y
105 N N
106 N Y
102 N Y
105 Y Y
In the above file, if any ID has a record with both flags set to "N",
then all records for that ID should be dropped.
For the data above, the output should be:
ID flag1 flag2
101 Y Y
101 N Y
102 Y N
102 N Y
104 Y Y
106 N Y
Steps to do:
1) Identify the IDs that have a record with both flag
values set to "N".
2) Look up the existing records against these IDs and drop the matches.
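The two steps above can be sketched outside DataStage in a few lines of Python. This is only an illustration of the logic, not a DataStage job; the function name `drop_ids_with_both_n` and the tuple layout `(id, flag1, flag2)` are assumptions made for the example.

```python
# Sample rows from the question, as (id, flag1, flag2) tuples.
rows = [
    (100, "N", "Y"), (100, "N", "N"), (100, "Y", "N"),
    (101, "Y", "Y"), (101, "N", "Y"), (102, "Y", "N"),
    (103, "N", "N"), (104, "Y", "Y"), (105, "N", "N"),
    (106, "N", "Y"), (102, "N", "Y"), (105, "Y", "Y"),
]


def drop_ids_with_both_n(rows):
    """Hypothetical helper: drop every row whose id has a both-"N" row."""
    # Step 1: collect the ids that have at least one row with both flags "N".
    bad_ids = {rid for rid, f1, f2 in rows if f1 == "N" and f2 == "N"}
    # Step 2: look each row up against that set and keep the non-matching ids.
    kept = [r for r in rows if r[0] not in bad_ids]
    # Stable sort by id so the rows appear grouped as in the expected output.
    return sorted(kept, key=lambda r: r[0])


result = drop_ids_with_both_n(rows)
for rid, f1, f2 in result:
    print(rid, f1, f2)
```

On the sample data this keeps the rows for IDs 101, 102, 104 and 106, matching the expected output in the question.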
Answer / dipal
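Step 1
Filter the records on the condition
Flag1 = N AND Flag2 = N and send the matching rows to link1.
Also define a reject link on the Filter stage for the remaining rows.
Step 2
Feed the Filter's reject link into a Lookup stage as the primary input
and link1 as the reference link, with ID as the lookup key.
Set the lookup failure action to Reject and define a reject link on the Lookup stage.
The Lookup's reject link will then carry the required output: the rows whose ID never appears with both flags set to N.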
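To make the link flow in this answer easier to follow, here is a rough Python imitation of it. It assumes that the Filter's reject link is the primary input of the Lookup and that link1 is the reference link; `filter_stage` and `lookup_stage` are hypothetical helper names, not DataStage APIs.

```python
# Sample rows from the question, as (id, flag1, flag2) tuples.
rows = [
    (100, "N", "Y"), (100, "N", "N"), (100, "Y", "N"),
    (101, "Y", "Y"), (101, "N", "Y"), (102, "Y", "N"),
    (103, "N", "N"), (104, "Y", "Y"), (105, "N", "N"),
    (106, "N", "Y"), (102, "N", "Y"), (105, "Y", "Y"),
]


def filter_stage(rows):
    """Imitate a Filter stage: matching rows to link1, the rest to reject."""
    link1 = [r for r in rows if r[1] == "N" and r[2] == "N"]
    reject = [r for r in rows if not (r[1] == "N" and r[2] == "N")]
    return link1, reject


def lookup_stage(primary, reference):
    """Imitate a Lookup on id: unmatched primary rows go to the reject link."""
    ref_ids = {r[0] for r in reference}
    matched = [r for r in primary if r[0] in ref_ids]
    rejected = [r for r in primary if r[0] not in ref_ids]
    return matched, rejected


link1, filter_reject = filter_stage(rows)
matched, lookup_reject = lookup_stage(filter_reject, link1)

# lookup_reject holds the rows whose id never had both flags "N".
for rid, f1, f2 in lookup_reject:
    print(rid, f1, f2)
```

The rows come out in input order, so the second record for ID 102 appears last; a sort on ID would be needed to group them as in the expected output.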
Answer / vz
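Put a constraint in the Transformer stage as shown below:
flag1 = "Y" OR flag2 = "Y"
That is, a row passes if either flag field is "Y".
I think this will help you.
Note that a row-level constraint like this drops only the rows where both flags are "N"; the other rows for IDs 100 and 105 still pass through, so on its own it does not produce the output shown in the question.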
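As a quick check of what a row-level constraint of this kind does, the following Python sketch applies the same condition to the sample data. The function name `transformer_constraint` is made up for the example, and the flags are compared against uppercase "Y" because that is how they appear in the source data.

```python
# Sample rows from the question, as (id, flag1, flag2) tuples.
rows = [
    (100, "N", "Y"), (100, "N", "N"), (100, "Y", "N"),
    (101, "Y", "Y"), (101, "N", "Y"), (102, "Y", "N"),
    (103, "N", "N"), (104, "Y", "Y"), (105, "N", "N"),
    (106, "N", "Y"), (102, "N", "Y"), (105, "Y", "Y"),
]


def transformer_constraint(rows):
    """Keep a row when flag1 is "Y" or flag2 is "Y" (evaluated per row)."""
    return [r for r in rows if r[1] == "Y" or r[2] == "Y"]


passed = transformer_constraint(rows)
for rid, f1, f2 in passed:
    print(rid, f1, f2)

# Ids that still appear after the constraint.
remaining_ids = sorted({r[0] for r in passed})
print(remaining_ids)
```

Because the condition is evaluated one row at a time, the rows `100 N Y`, `100 Y N` and `105 Y Y` survive; only ID 103 disappears entirely.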