How to do performance tuning in DataStage?
Answer Posted / prams
1. Stage the data coming from ODBC/OCI/DB2 UDB stages, or any other database on the server, using Hashed/Sequential files. This gives optimum performance and allows data recovery in case the job aborts.
2. Tune the OCI stage's 'Array Size' and 'Rows per Transaction' values for faster inserts, updates and selects (a sketch of the batching idea behind these settings follows the list).
3. Tune the 'Project Tunables' in the Administrator client for better performance.
4. Use sorted data for the Aggregator stage.
5. Sort the data as far as possible in the database and reduce the use of the DataStage Sort stage for better job performance.
6. Remove unused data from the source as early as possible in the job.
7. Work with the DBA to create appropriate indexes on tables for better performance of DataStage queries.
8. Convert some of the complex joins/business logic in DataStage to stored procedures on the database side for faster execution of the jobs.
9. If an input file has an excessive number of rows and can be split up, use standard logic to run jobs in parallel (see the file-splitting sketch after this list).
10. Before writing a routine or a transform, make sure that the functionality you need is not already available in one of the standard routines supplied in the SDK or DS Utilities categories. Constraints are generally CPU intensive and take a significant amount of time to process; this is particularly true if the constraint calls routines or external macros, whereas for inline code the overhead is minimal.
11. Try to put the constraints in the 'Selection' criteria of the job itself. This eliminates unnecessary records before joins are made.
12. Tuning should occur on a job-by-job basis.
13. Use the power of the DBMS.
14. Try not to use a sort stage when you can use an
ORDER BY clause in the database.
15. Using a constraint to filter a record set is much slower than performing a SELECT … WHERE … at the source (see the pushdown sketch after this list).
16. Make every attempt to use the bulk loader for your
particular database. Bulk loaders are generally faster than
using ODBC or OLE.
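A rough illustration of point 2. The real tuning is done in the OCI stage properties, but the sketch below (Python, using the standard-library sqlite3 module as a stand-in database; the table and column names are made up) shows the idea behind 'Array Size' and 'Rows per Transaction': send rows to the database in arrays rather than one at a time, and commit once every N rows rather than per row.

import sqlite3

ARRAY_SIZE = 500              # rows sent per executemany() call  ("Array Size")
ROWS_PER_TRANSACTION = 5000   # rows written between commits      ("Rows per Transaction")

def load_rows(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER, name TEXT)")
    batch, pending = [], 0
    for row in rows:
        batch.append(row)
        if len(batch) == ARRAY_SIZE:
            conn.executemany("INSERT INTO target VALUES (?, ?)", batch)
            pending += len(batch)
            batch = []
            if pending >= ROWS_PER_TRANSACTION:
                conn.commit()          # one commit per ROWS_PER_TRANSACTION rows, not per row
                pending = 0
    if batch:                          # flush the final partial array
        conn.executemany("INSERT INTO target VALUES (?, ?)", batch)
    conn.commit()
    return conn

conn = load_rows((i, "name_%d" % i) for i in range(20000))
print(conn.execute("SELECT COUNT(*) FROM target").fetchone())   # -> (20000,)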
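A rough illustration of points 5, 7, 11, 14 and 15 (again Python with sqlite3 as a stand-in; the orders table and its data are invented for the example). Pushing the WHERE filter and the ORDER BY into the source SQL, backed by an index on the filtered/sorted columns, returns only the rows the job actually needs, instead of pulling every row across and discarding most of them in a constraint and a Sort stage.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, "APAC" if i % 3 == 0 else "EMEA", i * 1.5) for i in range(10000)],
)
# Point 7: an index on the filter/sort columns lets the database satisfy the
# WHERE and ORDER BY cheaply (on real tables this is agreed with the DBA).
conn.execute("CREATE INDEX idx_orders_region_amount ON orders (region, amount)")

def filter_in_job():
    # Plain SELECT * source plus a constraint and a Sort stage inside the job:
    # every row is pulled across, then most are thrown away.
    rows = conn.execute("SELECT id, region, amount FROM orders").fetchall()
    wanted = [r for r in rows if r[1] == "APAC" and r[2] > 100]
    return sorted(wanted, key=lambda r: r[2])

def filter_in_database():
    # Constraint in the source SQL, ORDER BY done by the database:
    # only the needed rows ever reach the job.
    return conn.execute(
        "SELECT id, region, amount FROM orders "
        "WHERE region = ? AND amount > ? ORDER BY amount",
        ("APAC", 100),
    ).fetchall()

assert filter_in_job() == filter_in_database()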
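A rough sketch of point 9: splitting a large input file into N roughly equal chunks so that N instances of the same job can each work on one chunk in parallel. The file name, the chunk count and the way the chunks are handed to the job instances (for example as a job parameter) are assumptions for the example, not a fixed DataStage mechanism.

from pathlib import Path

def split_file(source, parts):
    """Write <source> out as <source>.part0 .. <source>.part(N-1), round-robin by line."""
    outputs = [Path("%s.part%d" % (source, i)) for i in range(parts)]
    handles = [p.open("w", encoding="utf-8") for p in outputs]
    try:
        with open(source, "r", encoding="utf-8") as fin:
            for lineno, line in enumerate(fin):
                handles[lineno % parts].write(line)   # round-robin keeps the chunks balanced
    finally:
        for h in handles:
            h.close()
    return outputs

# Usage (file name is hypothetical):
# chunks = split_file("/data/landing/big_input.txt", 4)
# ...then start 4 instances of the job, each taking one chunk path as its input-file parameter.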
What are the steps needed to create a simple basic datastage job?
How do you catch bad rows from the OCI stage? And what does CLI stand for?
What are data elements?
Project steps, hits, project-level hard things, solved methods?
1) What is the size of the fact table and dimension table?
2) How do you find the size of the fact table and dimension table?
3) How do you implement a surrogate key in the Transformer stage?
4) Write the configuration file path.
5) How many types of datasets are there? Explain.
6) Difference between development projects and migration projects?
7) How do you delete the header and footer of a sequential file?
8) How can you call the parameters in DataStage in a Unix environment?
9) How much data are you getting daily?
10)
How do you export or import jobs using an .ISX file?
How is a source file populated?
If I have two tables:
table1: (1, a), (1, b), (1, c), (1, d), (2, a), (2, b), (2, c), (2, d), (2, e)
table2: (1, "a,b,c,d"), (2, "a,b,c,d,e")
how can I get the data in the same form as in the tables? How can I implement SCD Type 1 and Type 2 in both server and parallel jobs?
Given field1, field2, field3 with rows
suresh, 10,324, 355, 1234
ram, 23,456, 450, 456
balu, 40,346,23, 275, 5678
how do I remove the duplicate rows in the fields?
What is a merge?
State the difference between an operational database and a data warehouse.
What is a 'reconsideration error'? How can I respond to this error, and how do I debug it?
What is the purpose of the inter-process (IPC) stage in server jobs?
How can source columns or rows be loaded into two different tables?
What environment variable needs to be set to TRIM at the project level? (In the Transformer we use the TRIM function, but I need to implement this at the project level using an environment variable.)
What table actions are available in the Oracle Connector?