INPUT file 'A' contains:
1
2
3
4
5
6
7
8
9
10
input file 'B' contains:
6
7
8
9
10
11
12
13
14
15
Output file 'X' contains:
1
2
3
4
5
Output file 'Y' contains:
6
7
8
9
10
Output file 'Z' contains:
11
12
13
14
15
How can we implement this in a single ds job?
Answers were Sorted based on User's Feedback
Yes Union of Two Files is a Good idea
but No need of Transformer Stage we can do it in this way:
src1 ,src2 -->Funnel--->Sort--->Filter---->trg1,trg2,trg3
at sort just put allow duplicates=false
&
at filter just give conditions
1. col1>=1 and col1<6 trg1
2. col1>=6 and col1<11 trg2
3. col1>=11 trg3
| Is This Answer Correct ? | 9 Yes | 3 No |
Answer / noothan
This is another option for implementing the same.
Implement a full outer join with File A as Left and File B
as right. Let the output contain Col_File_A from file A and
Col_File_B from file B. Pass this output through a filter
stage with the filter condition as below.
Col_File_B is null= file x
Col_File_B is not null and Col_File_A is not null = file Y
Col_File_A is null= file Z.
| Is This Answer Correct ? | 4 Yes | 0 No |
Answer / subbuchamala
1)
We can solve this by using Change capture stage. First,we use source as 'A' and refrerence as 'B' both of them are connected to Change capture stage. From, change capture stage it connected to filter stage and then targets X,Y and Z.
(Change Code '0'--->exist in both, copy record
Change Code '1'--->Exists in SRC 'A', INSERT record
Change Code '2'--->Exists in SRC 'B', DELETE record
Change Code '3'--->Exists in 'A' & 'B' with changes, UPDATE record)
In the filter stage:
Change_code column=1 it goes to 'X' [1,2,3,4,5]
Change_code column=0 it goes to 'Y' [6,7,8,9,10]
Change_code column=2 it goes to 'Z' [11,12,13,14,15]
2)
Add an extra column 'colA' and 'colB' to the files 'A' and 'B' respectively. Let the value for colsA be 'a' for all the rows in file 'A' and the value for colB be 'b' in file 'B'(using the column generator stage).Now join both the files using join stage using 'ID' column. Perform full outer join. Map the ID col, colA and colB to output. Next pass it through a transformer.
Transformer constraint:
1) file X - colA=a and colB<>b ----->[1,2,3,4,5]
2) file Y - colA=a and colB=b------->[6,7,8,9,10]
3) file Z - colA<>a and colB=b------>[11,12,13,14,15]
| Is This Answer Correct ? | 5 Yes | 3 No |
Answer / shar
alternative ans lot similar to subu's.
to both files add a dummy column let say
File A
col1,dummy1
1,1
2,1
3,1
4,1
5,1
6,1
7,1
8,1
9,1
10,1
And for File B
col1,dummy2
6,2
7,2
8,2
9,2
10,2
11,2
12,2
13,2
14,2
15,2
Using col generator . Now join the two files using join
stage with full outer join. With key col1.
Then the output will be:
Left_col1, dum1, right_col1, dum2
1 1 0 0
2 1 0 0
3 1 0 0
4 1 0 0
5 1 0 0
6 1 6 2
7 1 7 2
8 1 8 2
9 1 9 2
10 1 10 2
0 0 11 2
0 0 12 2
0 0 13 2
0 0 14 2
0 0 15 2
Now assign the output to Transformer stage where at
constraints we mention:
DSLink6.dummy1 = 1 And DSLink6.dummy2 <> 2
DSLink6.dummy1 = 1 And DSLink6.dummy2 = 2
DSLink6.dummy1 <> 1 And DSLink6.dummy2 = 2
Then at three targets we map
Target 1 : left_col1
Target2 : left_col1
Target3: Right_col1
Hence the o/p.
| Is This Answer Correct ? | 2 Yes | 1 No |
Shar,
You explained the same as my 2nd method with examples.
Thanks
Subhash
| Is This Answer Correct ? | 1 Yes | 0 No |
Use Full outer join
Right ------------->X table
| |
| |
Left--> Join----->Filter---------->Y table
|
|
Z table
Left.colname=Right.colname
Right.colname='' (A-B)
Left.colname<>'' and Right.colname<>'' (A intersection B)
Left.colname<>'' (B-A)
Will get desired output
Please correct if am wrong....
| Is This Answer Correct ? | 1 Yes | 0 No |
Answer / shar
ok i shouldn't have hard coded with values.
Can u plz be more clear with the solution u gave with stages
and flows properly.
Thanks in advance.
| Is This Answer Correct ? | 0 Yes | 0 No |
Answer / sreenivas ramanathan
We can achieve this by using Change data capture stage as comparing the 2 inputs.
| Is This Answer Correct ? | 0 Yes | 0 No |
Answer / huma
Why're we complicating the solution. Simple use merge stage. Output Y will be result of merge (inner join) while rest all will be the rejected outputs from update link.
| Is This Answer Correct ? | 0 Yes | 0 No |
Answer / vaibhav
Hi Subbuchamala..
I marked this ans as no bcoz CDC stage works row by row i.e. on key column basis.. here for all the 10 records change_code will be 3 so it cannot be differentiated into 3 target files...
Different approach will be.. take union of 2 input tables and use transformer stage and filter on the basis of rownumber..
Let me know if I am wrong..
| Is This Answer Correct ? | 1 Yes | 2 No |
how many datamarts we will use in real time project and when will use the datamart?pls send the replay early
what are fact tables and dimension tables? give example assuming one table.
What are the primary usages of datastage tool?
Can we use sequential file as source to hash file? Have you do it ?if what error it will give?
why we use hash file for lookup?
what is the Difference Between Datastage Server Edition and Parallel Edition?
I have a file it contain 2 records like empname,company as Ram, Tcs and Ram, IBM. But i want empname, company1,company2 as Ram, TCS,IBM in the target. How?
there are indexes on a table as index1 with col1, col2 index2 with col2 index3 with col1,col2,col3. if i run a query with col1='100' which index will be used and why
difference between server shared container and parallel shared container
How many areas for files does datastage have?
Can anyone tell me a difficult situation who have handled while creating Datastage jobs?
what is the difference between lookup stage reject link and merge stage reject link in datastage Parallel jobs? interm of output in Merge Reject link and Look Up Reject link ?