How do you do performance tuning in DataStage?
Answers were Sorted based on User's Feedback
1. Staged the data coming from ODBC/OCI/DB2UDB stages (or any database stage) on the server using Hashed File or Sequential File stages, both for optimum performance and for data recovery in case the job aborts.
2. Tuned the OCI stage's 'Array Size' and 'Rows per Transaction' values for faster inserts, updates and selects.
3. Tuned the 'Project Tunables' in the Administrator for better performance.
4. Used sorted input data for the Aggregator stage.
5. Sorted the data as much as possible in the database and reduced the use of the DS Sort stage for better job performance.
6. Removed unused data from the source as early as possible in the job.
7. Worked with the DBA to create appropriate indexes on tables for better performance of DS queries.
8. Converted some of the complex joins/business logic in DS to stored procedures on the database for faster job execution.
9. If an input file has an excessive number of rows and can be split up, use standard logic to run jobs in parallel.
10. Before writing a routine or a transform, make sure the required functionality is not already provided by one of the standard routines supplied in the SDK or DS Utilities categories. Constraints are generally CPU intensive and take a significant amount of time to process; this is especially the case if the constraint calls routines or external macros, but if it is inline code the overhead is minimal.
11. Try to put the constraints in the 'Selection' criteria of the jobs themselves. This eliminates unnecessary records before joins are made.
12. Tuning should occur on a job-by-job basis.
13. Use the power of the DBMS.
14. Try not to use a Sort stage when you can use an ORDER BY clause in the database.
15. Using a constraint to filter a record set is much slower than performing a SELECT … WHERE….
16. Make every attempt to use the bulk loader for your particular database; bulk loaders are generally faster than ODBC or OLE.
Answer / raji
1. Avoid using a Transformer stage just to rename columns, because it slows down job performance; use a Copy stage for renaming columns instead.
2. Take only the required columns in table-level lookups; remove all the unnecessary columns.
3. Use an appropriate partitioning technique, depending on the requirement; it will increase performance as well.
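Point 3 refers to choosing a partitioning method so that related rows land in the same partition (as joins and aggregations require). A minimal sketch of hash partitioning, the most common choice; the function and names here are illustrative, not DataStage APIs:

```python
from collections import defaultdict

def hash_partition(rows, key_index, n_parts):
    """Assign each row to a partition based on a hash of its key column,
    so all rows sharing a key value end up in the same partition."""
    parts = defaultdict(list)
    for row in rows:
        parts[hash(row[key_index]) % n_parts].append(row)
    return parts

rows = [("EU", 10), ("US", 20), ("EU", 30), ("APAC", 5)]
parts = hash_partition(rows, key_index=0, n_parts=2)

# Every "EU" row lands in exactly one partition, so a per-region
# aggregation can run on each partition independently.
eu_parts = {p for p, rs in parts.items() for r in rs if r[0] == "EU"}
assert len(eu_parts) == 1
```

Other methods (round robin, range, entire) trade this key-locality guarantee for more even data distribution, which is why the right choice depends on the requirement.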
Answer / venugopal [patni]
1. By using the Hashed File stage we can improve performance. In the Hashed File stage we can define the read cache size and write cache size; the default size is 128 MB.
2. We can also improve performance on active-to-active links by enabling the row buffer; the default row buffer size is 128 KB.
3. By removing unwanted columns.
4. By selecting appropriate update actions.
5. In parallel jobs we can improve performance by replacing Transformer stages with Copy or Filter stages; using many Transformers (more than five) in a job degrades performance, so use a Copy or Filter stage wherever the Transformer is avoidable.
6. In server jobs we can also improve performance by using the Link Partitioner, Link Collector and IPC stages.
Answer / veera
Hi,
1. Sort the data as much as possible in the source database.
2. Remove the unwanted columns from the source DB.
3. Drop the indexes before loading the data and recreate them after loading.
4. Do not use more than 20 stages in a job.
5. Reduce the number of Transformer stages.
6. Use a Sort stage before an Aggregator stage (with the Aggregator in sort mode).
7. Tune the Project Tunables in the Administrator for better performance.
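Point 3 (and the 'Rows per Transaction' tuning from the first answer) can be sketched outside DataStage with Python's built-in sqlite3 module; the table, index name and batch size below are illustrative assumptions, not DataStage settings:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, val TEXT)")
conn.execute("CREATE INDEX idx_target_id ON target (id)")

rows = [(i, f"row-{i}") for i in range(10_000)]
BATCH = 1_000  # analogous to tuning 'Rows per Transaction'

# Drop the index so the load doesn't maintain it row by row...
conn.execute("DROP INDEX idx_target_id")

# ...load in batches, committing once per batch rather than per row...
for start in range(0, len(rows), BATCH):
    conn.executemany("INSERT INTO target VALUES (?, ?)",
                     rows[start:start + BATCH])
    conn.commit()

# ...then rebuild the index once, after all the data is in place.
conn.execute("CREATE INDEX idx_target_id ON target (id)")

count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

Building the index once over the full table is generally much cheaper than updating it incrementally for every inserted row, which is the same reason bulk loaders ask you to disable indexes and constraints before a large load.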