Input Data is:
Emp_Id, EmpInd
100, 0
100, 0
100, 0
101, 1
101, 1
102, 0
102, 0
102, 1
103, 1
103, 1
I want Output
100, 0
100, 0
100, 0
101, 1
101, 1
Means Indicator should either all ZEROs or all ONEs per
EmpId.
Impliment this using SQL and DataStage both.
Answers were Sorted based on User's Feedback
In DataStage:
SRC--->CPY---->JOIN----TFM---TGT
--------|---- /
--------|--- /
--------|-- /
--------|- /
--------AGG
In AGG, GROUP BY EmpId, calculate MIN and MAX for each
EmpId.
JOIN both one copy from CPY and 2nd Aggrigated copy from
AGG.
In TFM, put constraint: IF MIN=MAX, then populate to TGT
then u will get required output.
Is This Answer Correct ? | 11 Yes | 0 No |
Answer / subhash
1. SQL:
SELECT * FROM
( SELECT EmpId, COUNT(*) AS CNT1 FROM EMP GROUP BY
EmpId) E1,
( SELECT EmpId, COUNT(*) AS CNT2 FROM EMP GROUP BY
EmpId, EmpInd) E2,
WHERE E1.EmpID = E2.EmpId AND
E1.CNT1 = E2.CNT2;
2.DataStage:
SRC--->CPY---->JOIN----TFM---TGT
| /
| /
| /
| /
AGG
In AGG, GROUP BY EmpId, calculate CNT and SUM.
JOIN both one copy from CPY and 2nd Aggrigated copy from
AGG.
In TFM, put constraint: IF CNT=SUM, then populate to TGT
then u will get required output.
Is This Answer Correct ? | 5 Yes | 2 No |
Answer / akila ramu
DB--->Transformer--->Output File
Sample data propagation through these stages:
In table->DB stage--->Tfm----->outputfile
101 0---->100 0 2 2-->100 0
100 0---->101 0 2 1-->100 0
101 1---->101 1 2 1
100 0
DB: Use the bvelow query in this stage
select emp_id, ind, count(emp_id) c1, count(emp_id ind) c2
from table_name
group by emp_id, ind
order by emp_id, ind
So similar empid-ind are grouped and the count of each
empid-ind pair is also sent in a seperate column c2. The
count of each emp_id is sent in c1.
Tfm: Output link Contraint:c1=c2
Looping contraint: @ITERATION<=c2
Looping variables: l_empid=emp_id
l_ind=ind
Pass these two looping variables as the emp_id and the ind
to the output file.
Is This Answer Correct ? | 3 Yes | 1 No |
Answer / lb14447
The sql query would be
SELECT * FROM EMPTEST WHERE EMP_ID IN (SELECT EMP_ID FROM EMPTEST GROUP BY EMP_ID HAVING SUM(EMP_IND)/COUNT(EMP_IND) = 0
OR SUM(EMP_IND)/COUNT(EMP_IND) = 1);
Datastage implementation:
SRC --> CPY ----> AGG---> FILTER
- |
- |
- |
- |
- |
--------> Look up ----> TGT
In the Aggregator stage calculate the Sum and Count fields.In the filter stage bypass the unwanted records using Sum and Count calculated in Aggr stage.
Is This Answer Correct ? | 1 Yes | 0 No |
Answer / mahalakshmi
SRC--SRT---FLT---|
| LKP
CPY---| |
| |
|__FNL_|
|
TGT
1.In Sort Stage (Keys on Empid & Emp Ind), Create Key
changeColumn = True
2. In Filter WhereClause-(Keychange=0)OutputLink=0
WhereClause-(Keychange=1)OutputLink=1
3.In Lookup InnerJoin on EmpId
4.In Funnel, Funnel Type=Sorted Funnel
Is This Answer Correct ? | 1 Yes | 0 No |
Ok
1) sql:
select empno, rank() over(order by empno) Ind from e2;
then we get:
100 1
100 1
100 1
101 2
101 2
103 3
2) Datastage:
sequl file-->sort-->tx--> Target.
@sort:
create keychange column then we get
100 0
100 1
100 1
101 0
101 1
101 1
102 0
102 1
103 0
@ Tx use two stage variables:
sv1 integer = 0 value
sv2 integer = 0 value
and at derivations :
assign
1) keychange ---> sv1
2) if sv1=0 then sv2+1 else sv2 ----> sv2
map the columns empno and sv2
and we get the results.
Thats it.
shar
Is This Answer Correct ? | 1 Yes | 1 No |
what is the diff b/w switch and filter stage in datastage
Drop duplicate records ... SOURCE LIKE .......... ID flag1 flag2 100 N Y 100 N N 100 Y N 101 Y Y 101 N Y 102 Y N 103 N N 104 Y Y 105 N N 106 N Y 102 N Y 105 Y Y in above file if any id having both the flags as "N" then that corresponding id records should be dropped, in above case o/p should be as ID flag1 flag2 101 Y Y 101 N Y 102 Y N 102 N Y 104 Y Y 106 N Y Steps to do : 1) Identified the id’s that got duplicated (both the flag values having vales “N”) 2) Look up with these id’s to existing id’s to drop .
how many dimentions and fact tables used in your project and what are names of it?
What are the differences between datastage and informatica?
explain about completely flow of sequencers technicaly,without using example??explain about lookup,nullhandling?
hi All, i have one scenario like if source--->transformer-->2 target sequential files the 1 st target sequential file is loads the data from source and 2nd target sequntial file contain the 1st target total record count,and file name of 1 st target seq file and timestamp seperated by delimeter for example if source have 10 record the 1st target seq file hav 10 records and 2nd target seq file example 10|xyz.txt|20101110 00:00:00 could you please help me out how can i implement in datastage job.
source has 2 fields like COMPANY LOCATION IBM HYD TCS BAN IBM CHE HCL HYD TCS CHE IBM BAN HCL BAN HCL CHE LIKE THIS....... AND I WILL GET THE OUTPUT LIKE THIS.... Company loc count TCS HYD 3 BAN CHE IBM HYD 3 BAN CHE HCL HYD 3 BAN CHE PLZ SEND ME ANSWER FOR THIS QUESTION..........
What is ds designer?
col1 123 abc 234 def jkl 768 opq 567 789 but i want two targetss target1 contains only numeric values and target2 contains only alphabet values like trg1 123 234 768 567 789 trg2 abc def jkl opq
Explain the ChangeApply stage?
1) In a dataset how to delete a single row? 2) i have 50 rows , i want to display 5-7 records only? How to write the sql query? 3)i have 40 rows,i want to display last row? write sql query?
tell me abt Datastage trigger?