Hi guys,
Please design a job for this.
My input is:
COMPANY,LOCATION
IBM,CHENNAI
IBM,HYDRABAD
IBM,PUNE
IBM,BANGLOORE
TCS,CHENNAI
TCS,MUMBAI
TCS,BANGLOORE
WIPRO,HYDRABAD
WIPRO,CHENNAI
HSBC,PUNE
My output should be:
COMPANY,LOCATION,COUNT
IBM,chennai,hydrabad,pune,bangloore,4
TCS,chennai,mumbai,bangloore,3
WIPRO,hydrabad,chennai,2
HSBC,pune,1
Thanks
Answer Posted / ankit gosain
Hi All,
Create a job design like below:
SeqFile--->SortStage--->Transformer--->RemoveDup--->SeqFile
Steps:
-----
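1. At the Sort stage, set sort key = Company and sort key mode
= Don't Sort (Previously Grouped), and set Create Cluster Key
Change Column = True. This relies on the input already arriving
grouped by Company, as it does in the sample; otherwise the rows
must be sorted on Company first.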
2. At the Transformer stage, create two stage variables:
temp, of integer type, with 0 as the initial value,
temp1, of varchar type.
Now write their derivations:
if clusterKeyChange=1 then 1 else temp+1 ---> temp
if clusterKeyChange=1 then Location else temp1:',':Location ---> temp1
Create one extra o/p column (say Count).
Now map the derivations to the o/p columns as:
Company ---> Company
temp1 ---> Location
temp ---> Count
3. At the Remove Duplicates stage, set key = Company and
Duplicate To Retain = Last.
Now just drag and drop the i/p columns to the o/p
& you will get the desired result, because the last row of each
Company carries the full location list and the final count.
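
To see why the three stages produce the desired rows, the logic can be sketched outside DataStage. The Python below is only an illustration, not DataStage code: the function names (add_cluster_key_change, transformer, retain_last, run_job) and SAMPLE_ROWS are made up for this sketch, and it assumes the rows arrive already grouped by Company and are processed in order, as a single stream.

```python
from typing import Iterable, Iterator

# Sample input from the question, already grouped by company.
SAMPLE_ROWS = [
    ("IBM", "CHENNAI"), ("IBM", "HYDRABAD"), ("IBM", "PUNE"), ("IBM", "BANGLOORE"),
    ("TCS", "CHENNAI"), ("TCS", "MUMBAI"), ("TCS", "BANGLOORE"),
    ("WIPRO", "HYDRABAD"), ("WIPRO", "CHENNAI"),
    ("HSBC", "PUNE"),
]


def add_cluster_key_change(rows: Iterable[tuple[str, str]]) -> Iterator[tuple[str, str, int]]:
    """Step 1: flag the first row of each company group with 1, all others with 0."""
    previous = None
    for company, location in rows:
        yield company, location, 1 if company != previous else 0
        previous = company


def transformer(rows: Iterable[tuple[str, str, int]]) -> Iterator[tuple[str, str, int]]:
    """Step 2: keep a running count (temp) and a running location list (temp1)."""
    temp = 0    # integer stage variable, initial value 0
    temp1 = ""  # varchar stage variable
    for company, location, key_change in rows:
        temp = 1 if key_change == 1 else temp + 1
        temp1 = location if key_change == 1 else temp1 + "," + location
        yield company, temp1, temp


def retain_last(rows: Iterable[tuple[str, str, int]]) -> Iterator[tuple[str, str, int]]:
    """Step 3: for each company, keep only the last row of its group."""
    pending = None
    for row in rows:
        if pending is not None and pending[0] != row[0]:
            yield pending
        pending = row
    if pending is not None:
        yield pending


def run_job(rows: Iterable[tuple[str, str]]) -> list[tuple[str, str, int]]:
    """Chain the three steps in the same order as the job design."""
    return list(retain_last(transformer(add_cluster_key_change(rows))))


if __name__ == "__main__":
    for company, locations, count in run_job(SAMPLE_ROWS):
        print(f"{company},{locations},{count}")
```

Note that the sketch keeps the locations in upper case, exactly as they appear in the input; the steps above do not change the case either.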
For further queries, mail me on ankitgosain@gmail.com
Cheers,
Ankit :)