

I have two files: the first contains only duplicate records, and the second contains only unique records. Example:
File1:
1 subhash 10000
1 subhash 10000
2 raju 20000
2 raju 20000
3 chandra 30000
3 chandra 30000
File2:
1 subhash 10000
5 pawan 15000
7 reddy 25000
3 chandra 30000
Output file: capture every record that is duplicated across both files combined, along with its count.
1 subhash 10000 3
1 subhash 10000 3
1 subhash 10000 3
2 raju 20000 2
2 raju 20000 2
3 chandra 30000 3
3 chandra 30000 3
3 chandra 30000 3


Answer / subbuchamala

File1, File2 ==> Funnel --> Copy ==> (1st link: AGG, 2nd link: JOIN) --> Filter --> OutputFile
1. Pass the two files to a Funnel stage and then to a Copy stage.
2. From the Copy stage, send the 1st link to the AGG stage and the 2nd link to the JOIN stage.
3. In the AGG stage, group by the key columns (say ID, NAME) and take the row count. Then, in the JOIN stage, join that count back to the 2nd link on the same key columns.
4. Filter on COUNT>1 and send the result to OutputFile.
This gives the desired output.
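
For readers who want to check the logic outside DataStage, the steps above can be sketched in plain Python. This is only an illustration of the Funnel, Aggregator, Join, and Filter flow, not DataStage code: the function name duplicates_with_count and the in-memory tuples are assumptions made for the example, and the key is taken to be (ID, NAME) as in step 3.

```python
from collections import Counter

# Sample data from the question: (id, name, salary).
file1 = [
    (1, "subhash", 10000), (1, "subhash", 10000),
    (2, "raju", 20000), (2, "raju", 20000),
    (3, "chandra", 30000), (3, "chandra", 30000),
]
file2 = [
    (1, "subhash", 10000), (5, "pawan", 15000),
    (7, "reddy", 25000), (3, "chandra", 30000),
]


def duplicates_with_count(*inputs):
    """Hypothetical helper mirroring Funnel -> Aggregator -> Join -> Filter."""
    # Funnel: append all inputs into a single stream.
    funnelled = [row for rows in inputs for row in rows]
    # Aggregator: count rows per key (ID, NAME).
    counts = Counter((row[0], row[1]) for row in funnelled)
    # Join: attach the count to every funnelled row via the same key.
    joined = [row + (counts[(row[0], row[1])],) for row in funnelled]
    # Filter: keep only rows whose key occurs more than once.
    return sorted(row for row in joined if row[-1] > 1)


for record in duplicates_with_count(file1, file2):
    print(*record)
```

The sort at the end is only there to make the printed rows easy to compare with the expected output in the question; a DataStage job would need its own Sort stage if ordering matters.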


Answer / ankit gosain

Hi,

This problem can be solved by creating a job with the following
stages:

          File2                  File2
            |                      |
            |                      |
            |                      |
File1-----Funnel----Aggregator----Join----Filter---Tgt_File
                                   |
                                   |
                                   |
                                 File1

1. Funnel both files (you now have the unique and duplicate
records in one stream).
2. Aggregate on the key column and set the calculation
type = Count Rows (say the output column is row_count).
3. Join the aggregated output with the input File1 and File2 on
the key, and set the join type = Inner Join.
4. In the Filter stage, set the where clause to row_count>1.

If you have further doubt or query, catch me on
ankitgosian@gmail.com

Cheers,
Ankit :)
