One file contains:
col1
100
200
300
400
500
100
300
600
300
From this I want to retrieve only the duplicates, like this:
tr1
100
100
300
300
300
How is this possible in DataStage? Can anyone please explain clearly?


Answer / vinod upputuri

To collect the duplicate values:

First, compute the count in an Aggregator stage:
group by the column.
aggregation type: count rows.
count output column..

Next, use a Filter stage to separate the values that occur more than once.

Finally, use a Join stage or Lookup stage to map the two
tables, with join type INNER.

Then you get the desired output.
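
The same aggregate, filter, inner-join flow can be sketched outside DataStage. This is only an illustration in Python, not DataStage code; the function name `duplicates_via_inner_join` is hypothetical, and a `Counter` stands in for the Aggregator stage.

```python
from collections import Counter


def duplicates_via_inner_join(rows):
    """Return every source row whose value occurs more than once."""
    # Aggregator stage stand-in: group by the value and count rows per group.
    counts = Counter(rows)
    # Filter stage stand-in: keep only the values with more than one occurrence.
    repeated = {value for value, n in counts.items() if n > 1}
    # Inner join back to the source: a source row survives only if its value
    # is present in the filtered set. Source order is preserved.
    return [value for value in rows if value in repeated]


col1 = [100, 200, 300, 400, 500, 100, 300, 600, 300]
print(sorted(duplicates_via_inner_join(col1)))  # [100, 100, 300, 300, 300]
```

The join back to the source is what restores the repeated rows; the filtered aggregate on its own has one row per value.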


Answer / chandu

Use an Aggregator stage to calculate the count of the source column. After
that, use a Filter or Transformer stage with the condition count > 1.
It passes only the values that occur more than once.


Thanks
Chandu
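
As a rough illustration of what the count and filter steps produce on their own, here is a small Python sketch (not DataStage code; `repeated_values` is a hypothetical name). A grouped count has one row per value, so the filter yields each repeated value once, together with its count.

```python
from collections import Counter


def repeated_values(rows):
    """Return (value, count) pairs for values that occur more than once."""
    # Grouped count: one entry per distinct value.
    counts = Counter(rows)
    # Condition count > 1: keep only the repeated values.
    return [(value, n) for value, n in counts.items() if n > 1]


col1 = [100, 200, 300, 400, 500, 100, 300, 600, 300]
print(repeated_values(col1))  # [(100, 2), (300, 3)]
```

Getting every original row back (100, 100, 300, 300, 300) needs a further join or lookup against the source, as the other answers describe.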


Answer / prasad

        |--->Agg--->Filter1--->|
        |                      |
        |                      |
file-->cp-------------------->Join---->Filter2---->target1
                                          |
                                          |
                                       Target2
Agg: use an Aggregator stage, select aggregation type = count rows, and set the count output column = Count (user defined).

The Agg stage will then generate:

col1  Count
------------
100   2
200   1
300   3
400   1
500   1
600   1

Filter1: give the condition Count=1 (you will get the unique records from Filter1)

Join stage: use a left outer join

Filter2:
where = column_name='' (null) (you will get the duplicate records)

Target1 o/p:
100
100
300
300
300

where = column_name<>'' (you will get the unique records)

Target2 o/p:

200
400
500
600

Please correct me if I am wrong :)
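
The left-outer-join variant can be sketched in Python as well. This is an illustration under assumptions, not DataStage code: `split_by_left_outer_join` is a hypothetical name, and `None` stands in for the null that an unmatched left row receives from the join.

```python
from collections import Counter


def split_by_left_outer_join(rows):
    """Split source rows into (duplicates, uniques) via a left outer join."""
    # Agg stand-in: count rows per value.
    counts = Counter(rows)
    # Filter1: keep only the values with Count = 1 as the reference link.
    unique_ref = {value: n for value, n in counts.items() if n == 1}
    # Left outer join with the source on the left: rows without a match
    # get None in the joined count column.
    joined = [(value, unique_ref.get(value)) for value in rows]
    # Filter2: null on the right side means the value was not unique.
    duplicates = [value for value, count in joined if count is None]
    uniques = [value for value, count in joined if count is not None]
    return duplicates, uniques


col1 = [100, 200, 300, 400, 500, 100, 300, 600, 300]
dups, uniq = split_by_left_outer_join(col1)
print(sorted(dups))  # [100, 100, 300, 300, 300]
print(uniq)          # [200, 400, 500, 600]
```

One pass over the joined rows feeds both targets, which is why this design can write duplicates and unique records at the same time.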


Answer / sudheer

                 - aggregator -
seq. file - copy ---------------- join - filter - seq.op


in aggregator - count rows
in join - left outer join - left link is the seq. file data
in filter - where condition - cnt>1


Answer / reddyvaraprasad

Job Design:

        |--->Agg--->Filter1--->|
        |                      |
        |                      |
file-->cp-------------------->Join---->Filter2---->target

Agg: use an Aggregator stage, select aggregation type = count rows, and set the count output column = Count (user defined).

Filter1: give the condition Count<>1

Join: select left outer join

Filter2: give the condition Count<>0

You will get the right output: all the duplicate records.

If you want the unique records, give the condition Count=0
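
A minimal Python sketch of this variant follows. It is not DataStage code, and it rests on one loudly stated assumption: that a source row with no match in the left outer join ends up with Count = 0 (for example because the count column is not nullable and takes a default). The name `duplicates_by_nonzero_count` is hypothetical.

```python
from collections import Counter


def duplicates_by_nonzero_count(rows):
    """Split source rows into (duplicates, uniques) using Count <> 0."""
    # Agg stand-in: count rows per value.
    counts = Counter(rows)
    # Filter1: keep only the values with Count <> 1 as the reference link.
    repeated_ref = {value: n for value, n in counts.items() if n != 1}
    # Left outer join; ASSUMPTION: unmatched rows receive Count = 0.
    joined = [(value, repeated_ref.get(value, 0)) for value in rows]
    # Filter2: Count <> 0 gives the duplicates, Count = 0 the unique records.
    duplicates = [value for value, count in joined if count != 0]
    uniques = [value for value, count in joined if count == 0]
    return duplicates, uniques


col1 = [100, 200, 300, 400, 500, 100, 300, 600, 300]
dups, uniq = duplicates_by_nonzero_count(col1)
print(sorted(dups))  # [100, 100, 300, 300, 300]
print(uniq)          # [200, 400, 500, 600]
```

If the unmatched rows carry a null instead of 0, the Filter2 condition would have to test for null rather than compare with 0.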


Answer / reddymkl.dwh

Job Design:

        |--->Agg--->Filter1--->|
        |                      |              Unique
file-->cp-------------------->Join---->Filter2---->target1
                                          |
                                          |-->Duplicate
                                              Target2

Agg: use an Aggregator stage, select aggregation type = count rows, and set the count output column = Cnt (user defined).

Filter1: give the condition Where = Cnt=1

You will get the unique values, like 200,400,500,600

Use a Join (or Lookup) stage: select left outer join

Filter2:

Where = Column_name='' (duplicate values like 100,100,300,300,300)
Where = Column_name<>'' (unique values like 200,400,500,600)


You will get the right output: all the duplicate records.

Please correct me if I am wrong.


Answer / pooja

Follow these steps:

1. Sequential File stage - read the input data from the sequential file input1.txt
2. Aggregator stage - count the number of rows (say CountRow) for each ID (group = ID)
3. Filter stage - keep the rows where CountRow<>1
4. Join the output of step 3 with input1.txt.
You will get the result :)


Answer / me

seq ----> copy

From the Copy stage, send one link to an Aggregator with the count rows option ---> Filter (on count rows output 1), and send that as the reference to the Lookup below.
From the Copy stage, send the second link to the Lookup.

Apply a filter.
