One file contains:
col1
100
200
300
400
500
100
300
600
300
From this I want to retrieve only the duplicates, like this:
tr1
100
100
300
300
300
How is this possible in DataStage? Can anyone please explain clearly?

Answer Posted / pooja

Follow these steps -

1. Sequential File stage - Read the input data from the sequential file input1.txt.
2. Aggregator stage - Count the number of rows (say CountRow) for each ID (group = ID; ID is col1 in the question).
3. Filter stage - Keep only the rows where CountRow <> 1, i.e. the IDs that occur more than once.
4. Join stage - Join the output of step 3 with input1.txt on ID.
You will get the result :)
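
The same count, filter, and join-back logic can be sketched outside DataStage to check the expected output. This is only an illustration in Python, not a DataStage job; the function name keep_duplicates is made up for the example, and the list stands in for input1.txt.

```python
from collections import Counter

def keep_duplicates(values):
    """Return every row whose key occurs more than once (hypothetical helper)."""
    # Step 2 (Aggregator): count rows per key
    counts = Counter(values)
    # Step 3 (Filter): keep keys where the count is not 1
    dup_keys = {key for key, count_row in counts.items() if count_row > 1}
    # Step 4 (Join): join the duplicate keys back to the original input
    return [v for v in values if v in dup_keys]

# Sample data from the question (col1)
col1 = [100, 200, 300, 400, 500, 100, 300, 600, 300]
print(sorted(keep_duplicates(col1)))  # [100, 100, 300, 300, 300]
```

The join back to the input is what restores all occurrences of each duplicated value; the aggregated output alone holds only one row per key.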


