I have 2 files 1st contains duplicate records only, 2nd file contains Unique records.EX:
File1:
1 subhash 10000
1 subhash 10000
2 raju 20000
2 raju 20000
3 chandra 30000
3 chandra 30000
File2:
1 subhash 10000
5 pawan 15000
7 reddy 25000
3 chandra 30000
Output file:-- capture all the duplicates in both file with count.
1 subhash 10000 3
1 subhash 10000 3
1 subhash 10000 3
2 raju 20000 2
2 raju 20000 2
3 chandra 30000 3
3 chandra 30000 3
3 chandra 30000 3
Answer Posted / ankit gosain
Hi,
This problem can be solved by creating a job with following
stages:
File2 File2
| |
| |
| |
File1-----Funnel----Aggregator----Join----Filter---Tgt_File
|
|
|
File1
1. Funnel both the files (Now you have Unique & Duplicates
records).
2. Aggregate on the basis of any i/p column and mention the
calculation type = Count Rows (say o/p column row_count).
3. Join the aggregated o/p with the i/p file1,2 one the
basis of key & mention the join type = Inner Join.
4. In filter stage, mention the where clause as row_count>1.
If you have further doubt or query, catch me on
ankitgosian@gmail.com
Cheers,
Ankit :)
| Is This Answer Correct ? | 1 Yes | 0 No |
Post New Answer View All Answers
How many types of sorting methods are available in datastage?
What is the command line function to import and export the ds jobs?
Describe routines in datastage? Enlist various types of routines.
Hi All , in PX Job I have passed 4 Parameters and when i run the same job in sequence i dont want to use those parameters , is this possible if yes then how
Differentiate between hash file and sequential file?
Explain the datastage parallel extender (px) or enterprise edition (ee)?
im new to this tool im now at project plz tell me step by step process how to design plz help me i wnt to go with exp for job plz give me d proper design and explination
Differentiate between operational datastage (ods) and data warehouse?
Define orabulk and bcp stages?
Explain Quality stage?
What are the functionalities of link partitioner?
What are the differences between datastage and informatica?
Differentiate between odbc and drs stage?
Difference between sequential file and data set?
What are the various kinds of the hash file?