ALLInterview.com :: Home Page            
Categories >> Software >> Data-Warehouse >> Data-Stage
Question
I have a source table with a column named CITY containing 100 records.
I want the target table to receive only the rows whose CITY value starts with 'A' or 'B', and the remaining rows to go to a reject output.
How can I achieve this in DataStage? Please help.
Rank Answer Posted By  
 Question Submitted By :: Vikram
I also faced this Question!!   © ALL Interview .com
Answer
Job design will be:
seq --- Tx ---- target.txt
          |_____ reject.txt

In the Transformer, use the constraint below for the target.txt link:

Left(city,1)='A' or Left(city,1)='B'

Check the Otherwise option on the second link to send the remaining rows to the reject file.
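
The routing that this constraint performs can be sketched outside DataStage. The Python below is only an illustration, not DataStage code; the function name route_by_first_letter and the sample city names are assumptions made for the example.

```python
def route_by_first_letter(cities, accepted=("A", "B")):
    """Split rows the way the constraint link and the Otherwise link would."""
    target, reject = [], []
    for city in cities:
        # Same test as Left(city,1)='A' or Left(city,1)='B'
        if city[:1] in accepted:
            target.append(city)
        else:
            # Everything that fails the constraint goes to the reject output
            reject.append(city)
    return target, reject


# Hypothetical sample data; an empty value is rejected as well
target, reject = route_by_first_letter(["Austin", "Boston", "Chicago", "Atlanta", ""])
print(target)
print(reject)
```

Note that the comparison is case-sensitive, as it is in the constraint above, so a value such as 'austin' would go to the reject output.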
 
4
Bg
 
 
Answer
Job design will be:
seq --- Tx ---- target.txt
          ------reject.txt
In the Transformer, use the constraint below for the target.txt link:

Left(city,1)='A' or Left(city,1)='B'
 
0
Satish
 
 
Question
Hi friends, I have two input files and want to validate records only if the reference data has '0'; otherwise no validation should be done. How can I do this?
Rank Answer Posted By  
 Question Submitted By :: Sameermjcet
This Interview Question Asked @   IBM
Answer
Perform a lookup operation.
 
0
Anayka
 
 
 
Question
Hi Friends,
I have input data like this:
class_id  Marks
101        50  
101        60
101        40
102        90
102        35

I want my output data to look like this:
class_id  Marks  Rank
101        50     2
101        60     1
101        40     3
102        90     1
102        35     2

How can I do this in DataStage?
Rank Answer Posted By  
 Question Submitted By :: Rajeev
This Interview Question Asked @   Cognizant
Answer
Input->Sort1->Sort2->Transformer->Output

Sort1--> Declare class_id and Marks as key columns and sort in descending order.
Sort2--> Declare class_id and Marks as key columns (set Sort Mode to "Don't Sort (Previously Sorted)" for both) and set Create Cluster Key Change Column to True.

Output of Sort2 will be (the Rank column here holds the cluster key change flag):

class_id Marks Rank
102      90    1
102      35    0
101      60    1
101      50    0
101      40    0

In the Transformer, declare a stage variable temp and initialize it to 0.

Derive temp--> If Rank=1 then Rank else temp+1

Derive the output columns as --->

class_id ---> class_id
Marks ----> Marks
Rank----> temp
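
The stage-variable logic above (reset the counter when the key changes, otherwise add 1) can be sketched in Python. This is only an illustration of the idea, not DataStage code; the function name rank_within_class is an assumption, and the sample rows are the ones from the question.

```python
def rank_within_class(rows):
    """rows: list of (class_id, marks). Returns (class_id, marks, rank) tuples."""
    # Sort by class_id, then by marks descending, as the first Sort stage would
    ordered = sorted(rows, key=lambda r: (r[0], -r[1]))
    ranked = []
    previous_class = None
    counter = 0  # plays the role of the stage variable
    for class_id, marks in ordered:
        key_change = class_id != previous_class  # the key change flag
        counter = 1 if key_change else counter + 1
        ranked.append((class_id, marks, counter))
        previous_class = class_id
    return ranked


rows = [(101, 50), (101, 60), (101, 40), (102, 90), (102, 35)]
ranked = rank_within_class(rows)
for row in ranked:
    print(row)
```

A plain running counter like this gives consecutive numbers, so rows with equal marks receive different ranks. Whether that is acceptable depends on how ties should be handled.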
 
0
Indu
 
 
Answer
If you think your solution is correct, try it with the input below and recheck it.
class_id,Marks
101,50  
101,60
101,60
101,50
101,40
102,90
102,90
102,35
102,35
 
0
Shar
 
 
Answer
Src >> Sort1 >> Sort2 >> Transformer >> Trgt

Sort1 --> Sort on class_id and Marks.
Sort2 --> Declare class_id as the key, select Don't Sort (Previously Sorted), and set Create Cluster Key Change Column to True.

Transformer --> Set two stage variables:
              StageVar1 = If Clusterkeychange=1 then Clusterkeychange else StageVar+1

              StageVar = StageVar1
Create a new column named rank in the Transformer output and map StageVar1 to it.
 
0
Anki_sri
 
 
Answer
You can use a Transformer:
if (Marks > 50 or 40 < Marks) then rank=2 else
if Marks > 60 then rank=1 else rank=4
 
0
Amulya Kumar Panda
 
 
Question
file1
1
2
3
4
file2
3
4
5
6

output should be in three targets
T1 T2 T3
1  3  5
2  4  6

How can I do this? Can anyone help?

Thanks
Rank Answer Posted By  
 Question Submitted By :: Sameer
This Interview Question Asked @   Cap-Gemini
Answer
OK, follow this:

seqStg1,seqStg2-->Funnel-->Filter-->seq1,seq2,seq3.

Since there is little data to sort, use hash partitioning with the unique option on the input link to remove duplicates.

That's it.

I wonder why Capgemini asked such a goofy question.
 
0
Shar
 
 
Answer
This can be done using the Change Capture stage:

Seq1,Seq2 ----> Change Capture (keep "Drop Output for Copy" set to False) ---> Filter ----> seq1,seq2,seq3

from this above output will get
 
0
Yuvraj
 
 
Answer
Using the Change Capture stage:

File1(Master),File2----> Change Capture ---> Filter ----> T1, T2, T3
In the Filter:
Change_code=1: send them to T1 (Insert records)
Change_code=0: send them to T2 (Copy records)
Change_code=2: send them to T3 (Delete records)
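
The classification into copy, insert, and delete records can be sketched in Python. This is an illustration only, not the stage's implementation. The code values 0, 1, and 2 are taken from the answer above; the function name change_capture is an assumption. The sketch also assumes file2 is treated as the before data set and file1 as the after data set, because that is the arrangement under which insert records land in T1. Which file is linked as before decides which side produces inserts and which produces deletes.

```python
COPY, INSERT, DELETE = 0, 1, 2  # change codes as used in the answer above


def change_capture(before, after):
    """Classify each key as copy, insert, or delete relative to 'before'."""
    before_set, after_set = set(before), set(after)
    result = {COPY: [], INSERT: [], DELETE: []}
    for value in sorted(before_set | after_set):
        if value in before_set and value in after_set:
            result[COPY].append(value)    # present in both inputs
        elif value in after_set:
            result[INSERT].append(value)  # only in the after data set
        else:
            result[DELETE].append(value)  # only in the before data set
    return result


file1 = [1, 2, 3, 4]
file2 = [3, 4, 5, 6]
# Assumption: file2 is the before data set, file1 the after data set
codes = change_capture(before=file2, after=file1)
t1, t2, t3 = codes[INSERT], codes[COPY], codes[DELETE]
print(t1, t2, t3)
```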
 
5
Subhash
 
 
Question
My input has a unique column, id, with the values 10, 20, 30, ... How can I get the first record in one output file, the last record in another output file, and the rest of the records in a third output file?
Rank Answer Posted By  
 Question Submitted By :: Premox5
This Interview Question Asked @   Wipro
Answer
Since you have a single text file as the source, use the following approach to get the desired output.

                        Head1 ------------> Target1
Seq. File ---> Copy --> Tail2 ------------> Target2
                        Head3 ---> Tail --> Target3

Steps:
1.> Read your source file using a Sequential File stage.
2.> Pass the records to a Copy stage and take 3 output links.
3.> Send the 1st link to Head stage Head1, the 2nd to Tail stage Tail2, and the 3rd to Head stage Head3.
4.> In Head1, specify 1 in the properties; it will pick up the 1st record and pass it to Target1.

5.> Similarly, to capture the last record in Target2, specify 1 in the Tail2 properties. It will take the last record and pass it to Target2.

6.> To load the remaining records, first capture the top records with Head3: if you have 10 records in the source, pick the top 9. Then follow Head3 with a Tail stage and specify 8; it will pick all of those records except the 1st one. You can then load these into Target3.
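
The head-then-tail arithmetic in the steps above can be checked with a small Python sketch. It is an illustration only; the helper names head and tail are assumptions that stand in for the Head and Tail stages. It also assumes the record count is known in advance and that all rows are processed in their original order.

```python
def head(records, n):
    """First n records, like a Head stage set to n."""
    return records[:n]


def tail(records, n):
    """Last n records, like a Tail stage set to n."""
    return records[-n:] if n > 0 else []


records = list(range(10, 110, 10))  # 10, 20, ..., 100 (10 records)
count = len(records)

target1 = head(records, 1)                           # first record
target2 = tail(records, 1)                           # last record
target3 = tail(head(records, count - 1), count - 2)  # top 9, then last 8 of those
print(target1, target2, target3)
```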

If you get confused, ask me.

Thanks 
Kumar
 
3
Kumar
 
 
Answer
We can also do it this way:

In the same job,
SeqFile--->Target1
SeqFile--->Target2
SeqFile--->Target3
For Target1,
add the following Filter command in the SeqFile:
head -1

For Target2,
add the following Filter command in the SeqFile:
tail -1

For Target3,
add the following Filter command in the SeqFile:
sed '1d;$d'
 
4
Subhash
 
 
Question
What is the difference between the Change Capture stage and the Difference stage? What is the significance of each?
Rank Answer Posted By  
 Question Submitted By :: Vemuri.sriharsha
Answer
There is not much difference between the CDC (Change Capture) stage and the Difference stage; the two stages are similar.

The differences between them are as follows:

1. In the CDC stage, both input links must have the same number of columns and the same column names. In the Difference stage, however, the links can have a different number of columns and different column names.
2. The CDC stage takes two input data sets, denoted before and after, and outputs a single data set whose records represent the changes made to the before data set to obtain the after data set. The Difference stage, however, performs a record-by-record comparison of two input data sets, which are different versions of the same data set, designated the before and after data sets. It outputs a single data set whose records represent the difference between them.
 
3
Rakesh Gupta
 
 
Question
How can we read the latest records in a text file named file1.txt using only the Sequential File stage?

file1 has 100 records, of which 5 are the latest records. How can we read those latest records?
Rank Answer Posted By  
 Question Submitted By :: Sai
This Interview Question Asked @   Caterpillar
Answer
Use the Change Capture stage.
 
0
Sriharsha Vemuri
 
 
Answer
How can we do it using only the Sequential File stage?
 
0
Sai
 
 
Answer
The question is not clear; please be more specific.
My assumption is that the latest records are the last 5 records in the sequential file. If so:

In the Sequential File stage,
in the Filter option, give the command "tail -5 File_name".

It will give the last 5 records.
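
What the tail command does here, keeping only the last 5 records of a stream, can be sketched in Python with a bounded queue. This is an illustration only; the function name last_n and the generated sample records are assumptions.

```python
from collections import deque


def last_n(lines, n=5):
    """Keep only the last n items of an iterable, like `tail -n`."""
    # A deque with maxlen silently discards the oldest items as new ones arrive
    return list(deque(lines, maxlen=n))


# Hypothetical stand-in for the 100 records in file1.txt
records = [f"record_{i}" for i in range(1, 101)]
latest = last_n(records)
print(latest)
```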

Correct me if I am wrong.
 
0
Dileep J
 
 
Question
Hi guys,

Please design a job with the derivation (solution) and write the exact conditions.

My requirement:
Source table
emp_no		qualification
1		a
1		c
2		a
3		c
3		b

Target table
emp_no		qualification
1		b
2		b
2		c
3		a

Every employee should have three qualifications, i.e. a, b and c.
Whatever qualifications are missing for an employee in the source table should be moved to the target table, as shown above.

Hope you get the point.

Thanks.
Rank Answer Posted By  
 Question Submitted By :: Suneelbabu.etl
This Interview Question Asked @   UHG
Answer
As far as I know:
Seq--->Tx---->DS

In the Tx, we can do this by using stage variables.

Thanks.
 
0
Suneelbabu.etl
 
 
Answer
Take the source as emp_no,qualification: 1,b 2,b 2,c 3,a and the
reference data as emp_no,qualification: 1,a 1,b 1,c 2,a
2,b 2,c 3,a 3,c 3,b. Now do a lookup on both the
emp_no and qualification columns and reject the data. This worked.
 
0
Sriharsha Vemuri
 
 
Answer
Since each employee should have 3 qualifications, the primary file should look like the one below, because the primary is always static.
Primary file:
empno,qua
1,a
1,b
1,c
2,a
2,b
2,c
3,a
3,b
3,c

And this is the reference data we have.

RefFile:
empno,qua
1,a
1,c
2,a
3,c
3,b

Primary,ref-->Lookup-->Output & Reject.
Match on empno and qua, and set Lookup Failure = Reject so that unmatched rows go to the reject file.
You will get the desired output.
That's it.
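
The lookup-with-reject idea above can be sketched in Python. It is an illustration only, not DataStage code; the function name lookup_with_reject is an assumption, and the data is the primary and reference data listed in the answer.

```python
def lookup_with_reject(primary_rows, reference_keys):
    """Rows found in the reference go to output; the rest go to reject."""
    matched, rejected = [], []
    for row in primary_rows:
        if row in reference_keys:
            matched.append(row)
        else:
            rejected.append(row)  # lookup failure: send the row to reject
    return matched, rejected


# Primary file: every employee paired with every qualification
primary = [(emp, qua) for emp in (1, 2, 3) for qua in "abc"]
# Reference data: the qualifications each employee already has
reference = {(1, "a"), (1, "c"), (2, "a"), (3, "c"), (3, "b")}

matched, rejected = lookup_with_reject(primary, reference)
print(rejected)
```

The rejected rows are the missing qualifications that the question asks for in the target table.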
 
0
Shar
 
 
Answer
A) Read the distinct emp_no values (we get 1, 2, 3), add a new column, and populate it with 1.
B) Read the distinct qualification values (we get a, b, c), add a new column, and populate it with 1.
Inner join A and B on the new column.
We get 1-a, 1-b, 1-c, 2-a, 2-b, 2-c and 3-a, 3-b, 3-c.
Use Change Capture against the input file and drop all records with no change.
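
The dummy-key join in this answer can be sketched in Python. It is an illustration only, not DataStage code; the variable names are assumptions, and the final comparison uses a plain set difference in place of the Change Capture stage.

```python
source = [(1, "a"), (1, "c"), (2, "a"), (3, "c"), (3, "b")]

# A) distinct emp_no values, each with a constant join key of 1
employees = [(emp, 1) for emp in sorted({emp for emp, _ in source})]
# B) distinct qualification values, each with the same constant join key
qualifications = [(qua, 1) for qua in sorted({qua for _, qua in source})]

# Inner join on the constant key, which pairs every emp_no with every qualification
all_pairs = [
    (emp, qua)
    for emp, emp_key in employees
    for qua, qua_key in qualifications
    if emp_key == qua_key
]

# Keep only the combinations that do not already exist in the source
existing = set(source)
missing = [pair for pair in all_pairs if pair not in existing]
print(missing)
```

This relies on every qualification appearing at least once somewhere in the source; a qualification that no employee holds would never enter the join.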
 
0
Sher
 
 
Question
How do server jobs work?
Rank Answer Posted By  
 Question Submitted By :: Mca0037
Answer
Server jobs are a good choice when there is a lot of functionality to implement for a lower data volume, hardware is limited, and ease of implementation matters.
We can improve the performance of a server job by enabling inter-process row buffering. This helps stages exchange data as soon as it is available on the link. The IPC stage also lets a passive stage read data from another stage as soon as the data is available.
The choice between server and parallel jobs depends on time to implement, functionality, and cost.
Server jobs run on a single node; they do not run on multiple nodes.
Server jobs usually run on the Windows platform.
Server jobs process stages in sequence, one after the other.
Server jobs do not support partition parallelism.
In server jobs, the Transformer is compiled in the BASIC language.
 
0
Archana
 
 
Question
What is the difference between DataStage 8.1, 8.5 and 8.7?
Rank Answer Posted By  
 Question Submitted By :: Mca0037
This Interview Question Asked @   IBM
Answer
1. Design & Runtime Performance Changes:
DS8.5: Implemented through internal code changes. Design and runtime performance is better than in 8.1, with a 40% performance improvement in job open, save, compile, etc.
DS8.7: Improvements in Xmeta. Significant performance improvement in job open, save, compile, etc.
2. PX Engine Performance Changes:
DS8.5: Not available
DS8.7: Improved partition/sort insertion algorithm. XML parsing performance is improved by 3x or more for large XML files.
3. View Job Log in the Designer Client:
DS8.5: Not available
DS8.7: New feature (Menu -> View -> Job Log). The job log can now be viewed in the Designer client.
5. Interactive Parallel Job Debugging:
DS8.5: Not available
DS8.7: Breakpoints with conditional logic per link and node (Link -> right-click -> Toggle Breakpoint).
          A running job can be continued or aborted by using multiple breakpoints with conditional logic per link and node. Row data or job parameter values can be examined by the breakpoint's conditional logic.
6. Vertical Pivot (Pivot Enterprise Stage):
DS8.5: Extends the existing horizontal parallel pivot. The Pivot stage is enhanced to support vertical pivoting (mapping multiple input rows with a common key to a single output row containing multiple columns).
DS8.7: No change
7. Balanced Optimization:
DS8.5: Balanced Optimization automatically redesigns a job to maximize performance by minimizing the amount of input and output performed and by balancing the processing across the source, intermediate, and target environments. It lets you take advantage of the power of the databases without becoming an expert in native SQL.
DS8.7: No change
8. Transformer Enhancements:
DS8.5: Looping in the Transformer, which allows multiple output rows to be produced from a single input row.
         1. New input cache: SaveInputRecord(), GetSavedInputRecord().
         2. New system variables: @ITERATION, @Loop Count, @EOD (end-of-data flag for the last row).
         3. Functions: LastRowInGroup(InputColumn).
         4. More null-handling options.
DS8.7: No change
9. Big Data File Stage:
DS8.5: Not available
DS8.7: Big Data File stage for Big Data sources (Hadoop Distributed File System, HDFS).
11. Added Encryption Techniques:
DS8.5: Not available
DS8.7: Encryption added for security reasons.
1. Strongly encrypted credential files for command-line utilities.
2. Strongly encrypted job parameter files for the dsjob command.
3. Encryption algorithm and customization.
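
The vertical pivot mentioned in item 6 (mapping multiple input rows with a common key to a single output row containing multiple columns) can be illustrated with a small Python sketch. It shows the concept only, not the Pivot Enterprise stage itself; the function name vertical_pivot, the max_columns parameter, and the sample rows are assumptions.

```python
from collections import defaultdict


def vertical_pivot(rows, max_columns):
    """Collapse (key, value) rows into one row per key with up to max_columns values."""
    grouped = defaultdict(list)
    for key, value in rows:
        grouped[key].append(value)

    pivoted = []
    for key, values in grouped.items():
        # Truncate extra values and pad short groups so every row has the same width
        padded = values[:max_columns] + [None] * (max_columns - len(values))
        pivoted.append((key, *padded))
    return pivoted


# Hypothetical input: several marks per class_id
rows = [(101, 50), (101, 60), (101, 40), (102, 90), (102, 35)]
pivoted = vertical_pivot(rows, max_columns=3)
print(pivoted)
```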
 
0
Archana
 
 
 
Copyright © 2013  ALLInterview.com.  All Rights Reserved.