If we join two tables (such as emp and dept) using the Join stage,
how can we improve the performance?
Answer / kiran
Whenever you join two tables on key columns: if the key
column is numeric, set the modulus partitioning technique; if
the key column is non-numeric, set the hash partitioning
technique. Compared to Lookup, Join gives better performance
because Join has a sort operation by default.
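The idea behind the two partitioning techniques above can be sketched in Python. This is only an illustration, not DataStage's internal implementation: the function names, the sample rows, the 3-partition setup and the use of CRC32 as the hash are all assumptions made for the example.

```python
import zlib

def modulus_partition(key: int, num_partitions: int) -> int:
    """Partition number for an integer key: key mod number of partitions."""
    return key % num_partitions

def hash_partition(key: str, num_partitions: int) -> int:
    """Partition number for a non-numeric key (CRC32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def partition_rows(rows, key_index, num_partitions, partitioner):
    """Distribute rows into partitions according to the key column."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        partitions[partitioner(row[key_index], num_partitions)].append(row)
    return partitions

# Hypothetical sample data: emp is (empno, deptno), dept is (deptno, dname).
emp = [(1, 10), (2, 10), (3, 20), (4, 20), (5, 30), (6, 40)]
dept = [(10, "ACCOUNTS"), (20, "SALES"), (30, "HR"), (40, "IT")]

# Partition both inputs on deptno with the same technique and partition count.
emp_parts = partition_rows(emp, 1, 3, modulus_partition)
dept_parts = partition_rows(dept, 0, 3, modulus_partition)

for n, (e, d) in enumerate(zip(emp_parts, dept_parts)):
    print(f"partition {n}: emp={e} dept={d}")
```

Because both inputs are partitioned on the same key in the same way, every emp row ends up in the same partition as its matching dept row, so each partition can be joined independently.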
Answer / ashok
The above answer has one mistake:
the Join stage does not have a sort operation by default; we
have to specify it explicitly.
Hi, this is Poorna.
We can improve the performance of the Join stage by
pre-sorting both the left and right inputs on the
join key.
Please correct me if there is any mistake in my thinking.
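To see why pre-sorted inputs help, here is a minimal Python sketch of a merge-style inner join over two inputs that are already sorted on the join key. It is an illustration under assumptions, not the Join stage's actual code: the function name merge_join and the sample rows are made up for the example.

```python
def merge_join(left, right, left_key, right_key):
    """Inner join of two lists of tuples already sorted on their join keys."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][left_key], right[j][right_key]
        if lk < rk:
            i += 1          # left key has no match yet, advance left
        elif lk > rk:
            j += 1          # right key has no match yet, advance right
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][right_key] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][left_key] == lk:
                for r in right[j:j_end]:
                    result.append(left[i] + r)
                i += 1
            j = j_end
    return result

# Hypothetical sample data, sorted on deptno before the join.
emp = sorted([(3, 20), (1, 10), (5, 30), (2, 10), (7, 50)], key=lambda r: r[1])
dept = sorted([(20, "SALES"), (10, "ACCOUNTS"), (30, "HR"), (40, "IT")],
              key=lambda r: r[0])

joined = merge_join(emp, dept, 1, 0)
for row in joined:
    print(row)
```

Each input is read once from top to bottom, so the join itself needs no extra sorting or large in-memory lookup when the data arrives in key order.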
Answer / rajeshchunduri
In the emp and dept tables the key column is deptno, so the
join is key-based, and the data type of the key column is int.
In this case we change the partitioning technique from hash to modulus.
chunduri
Answer / professional
Hi,
To improve the performance of the emp and dept join on the key columns, note that DataStage sorts the data by default before a join. If your data is already sorted, set the environment variable APT_NO_SORT_INSERTION so that the data is not sorted again; performance then improves automatically.
How to identify whether it is an inner join or a left outer join in Lookup?
A job has only 2 stages: an input dataset and a target table. The job takes a very long time to load 50 million records. How can the performance of this job be improved?
What is the use of the Row Generator stage?
What is the main difference between Lookup Failure and Lookup Not Met? Please explain with an example.
Explain usage analysis in datastage?
How to convert RGB Value to Hexadecimal values in datastage?
DataStage Scenario based Interview Questions
What are the main differences you have observed between the 7.x and 8.x versions of DataStage?
The source contains 2 columns but the target has 4 columns; how?
What is the difference between DataStage Server Edition and Parallel Edition?
How does hash partitioning work? Thank you in advance. I have doubts about the hash partitioning technique. Could you please give me a clear, understandable example?
e_id,dept_no
1,10
2,10
3,20
4,20
5,30
6,40
I have two nodes/three nodes. My questions are:
1) If I select e_id as the hash key, how will hash partitioning distribute the data across two nodes/three nodes?
2) If I select dept_no as the hash key, how will hash partitioning distribute the data across two nodes/three nodes?
sivakumar.katta7@gmail.com
I have loaded a Dataset in UAT with a 2-node configuration, imported the job into the PROD environment, which has a 4-node configuration, and am using this Dataset as the source for another job. Will the job run fine or give any errors? If the job runs fine, on how many nodes: 2 or 4?