
I have 2 files: the 1st contains only duplicate records, the 2nd contains unique records. Example:
File1:
1 subhash 10000
1 subhash 10000
2 raju 20000
2 raju 20000
3 chandra 30000
3 chandra 30000
File2:
1 subhash 10000
5 pawan 15000
7 reddy 25000
3 chandra 30000
Output file: capture all the duplicates across both files, with a count.
1 subhash 10000 3
1 subhash 10000 3
1 subhash 10000 3
2 raju 20000 2
2 raju 20000 2
3 chandra 30000 3
3 chandra 30000 3
3 chandra 30000 3

Answer Posted / ankit gosain

Hi,

This problem can be solved by creating a job with the following stages:

         File2                   File2
           |                       |
           |                       |
           |                       |
File1-----Funnel----Aggregator----Join----Filter---Tgt_File
                                   |
                                   |
                                   |
                                 File1

1. Funnel both files (now you have the unique and the duplicate
records in one stream).
2. Aggregate on any input column and set the calculation
type = Count Rows (say the output column is row_count).
3. Join the aggregated output with the input records of File1 and
File2 on the key, and set the join type = Inner Join.
4. In the Filter stage, set the where clause to row_count>1.
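
To check the logic of the four steps outside DataStage, here is a minimal Python sketch. It is not DataStage code and only mimics what the stages do; the function name duplicates_with_count is hypothetical, and it assumes each record is one whole line of text and that the count is joined back onto the funnelled rows. The final sort is an extra step added only so the result appears in the same order as the expected output.

```python
from collections import Counter


def duplicates_with_count(file1_rows, file2_rows):
    """Return every record that occurs more than once across both inputs,
    with its total count appended (hypothetical helper, for illustration)."""
    # Step 1 (Funnel): combine both inputs into one stream.
    funnelled = list(file1_rows) + list(file2_rows)

    # Step 2 (Aggregator): count rows per record -> row_count.
    row_count = Counter(funnelled)

    # Step 3 (Join) and step 4 (Filter): attach row_count to each
    # funnelled row and keep only rows where row_count > 1.
    kept = [(row, row_count[row]) for row in funnelled if row_count[row] > 1]

    # Assumption: sort by record text so the output is grouped by key.
    kept.sort(key=lambda pair: pair[0])
    return [f"{row} {count}" for row, count in kept]


file1 = [
    "1 subhash 10000", "1 subhash 10000",
    "2 raju 20000", "2 raju 20000",
    "3 chandra 30000", "3 chandra 30000",
]
file2 = [
    "1 subhash 10000", "5 pawan 15000",
    "7 reddy 25000", "3 chandra 30000",
]

for line in duplicates_with_count(file1, file2):
    print(line)
```

With the sample files from the question, this prints the eight lines shown in the expected output file; pawan and reddy are dropped because their row_count is 1.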

If you have any further doubts or queries, catch me at
ankitgosian@gmail.com

Cheers,
Ankit :)
