One file contains:
col1
100
200
300
400
500
100
300
600
300
From this I want to retrieve only the duplicate values, like this:
tr1
100
100
300
300
300
How is this possible in DataStage? Can anyone please explain clearly?

Answer Posted / pooja

Follow these steps:

1. Sequential File stage - read the input data from the sequential file (input1.txt).
2. Aggregator stage - group by col1 and count the number of rows in each group (say, CountRow).
3. Filter stage - keep only the groups where CountRow <> 1.
4. Join stage - inner join the output of step 3 with input1.txt on col1.
The join returns every input row whose value occurs more than once, which is the required result.
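
The same count, filter, and join-back logic can be sketched outside DataStage to check the expected output. This is only an illustration in Python, not DataStage code; the function name keep_duplicates and the in-memory list standing in for input1.txt are assumptions made for the example.

```python
from collections import Counter

def keep_duplicates(values):
    """Return every input row whose value occurs more than once."""
    # Step 2 (Aggregator): count the rows for each distinct value.
    counts = Counter(values)
    # Step 3 (Filter): keep only the keys where the count is not 1.
    duplicate_keys = {value for value, n in counts.items() if n != 1}
    # Step 4 (Join): inner join the surviving keys back to the original rows.
    return [value for value in values if value in duplicate_keys]

# Stand-in for the col1 values read from input1.txt in step 1.
rows = [100, 200, 300, 400, 500, 100, 300, 600, 300]
result = keep_duplicates(rows)

# Sorted here only to match the order shown in the question.
print(sorted(result))
```

The sketch keeps rows in input order; the order of rows coming out of the actual job depends on how the Join stage input is sorted and partitioned.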


