When the Sort stage already has a remove-duplicates option, why
do we also have a separate Remove Duplicates stage in PX, given
that it is recommended to sort the data before using the Remove
Duplicates stage anyway? I have been thinking about this for days.
Answer / prasu
The Remove Duplicates stage gives you more options than the
Sort stage for removing duplicates. If you have a small amount
of data, you can use the Sort stage to remove duplicates. If
you have a large amount of data, use the Remove Duplicates
stage.
Answer / phani kumar
The Sort stage is used to sort the data, and it has an option for
identifying duplicate records through the value of the key
change column. However, sorting and removing duplicates in the
same stage decreases performance, so that approach is preferable
only for small amounts of data.
The Remove Duplicates stage is used to get only unique records,
keeping either the first or the last occurrence. For large
amounts of data, it needs sorted input for better performance.
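To make the "first occurrence or last occurrence" behaviour concrete, here is a rough Python sketch of what removing duplicates from already sorted data amounts to. It is only an illustration, not DataStage code: the function name `remove_duplicates`, the `retain` parameter, and the sample rows are all made up for this example.

```python
from itertools import groupby
from operator import itemgetter


def remove_duplicates(sorted_rows, key, retain="first"):
    """Keep one row per key value from rows ALREADY sorted on `key`.

    Hypothetical helper for illustration only. Because the input is
    sorted, duplicates are adjacent and one pass is enough.
    """
    if retain not in ("first", "last"):
        raise ValueError("retain must be 'first' or 'last'")
    kept = []
    for _, group in groupby(sorted_rows, key=itemgetter(key)):
        group = list(group)
        kept.append(group[0] if retain == "first" else group[-1])
    return kept


# Sample data (invented), already sorted on "id".
rows = [
    {"id": 1, "val": "a"},
    {"id": 1, "val": "b"},
    {"id": 2, "val": "c"},
    {"id": 3, "val": "d"},
    {"id": 3, "val": "e"},
]

first_kept = remove_duplicates(rows, "id", retain="first")
last_kept = remove_duplicates(rows, "id", retain="last")
```

Note that the single pass only works because equal keys sit next to each other, which is why sorted input matters for this kind of stage.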
Correct me if I am wrong.
Thanks and regards,
Phani Kumar
Answer / data master
When the Sort stage both sorts the data and removes duplicate
records, it slows down the job (hence it is better to sort the
data at the database level).
If the data is already sorted, then use the Remove Duplicates
stage to remove the duplicate records, which will give better
job performance than the situation above.
Answer / swati
With the Remove Duplicates stage you get only the unique records.
With the Sort stage you can get both the unique and the duplicate records, identified by the key change column.
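As a rough illustration of the key change idea, the Python sketch below flags the first row of each key group with 1 and the rest with 0, so unique and duplicate rows can be split afterwards. This is not DataStage code; the function name `add_key_change` and the sample rows are invented for the example, and the 1/0 convention is an assumption here based on how the key change column is usually described.

```python
def add_key_change(sorted_rows, key):
    """Add a keyChange flag to rows ALREADY sorted on `key`.

    Hypothetical helper for illustration: 1 marks the first row of
    each key group, 0 marks every following row with the same key.
    """
    flagged = []
    previous = object()  # sentinel that never equals a real key value
    for row in sorted_rows:
        changed = row[key] != previous
        flagged.append({**row, "keyChange": 1 if changed else 0})
        previous = row[key]
    return flagged


# Sample data (invented), already sorted on "id".
rows = [
    {"id": 1, "val": "a"},
    {"id": 1, "val": "b"},
    {"id": 2, "val": "c"},
    {"id": 3, "val": "d"},
    {"id": 3, "val": "e"},
]

flagged = add_key_change(rows, "id")

# Downstream, the flag lets you route both kinds of rows.
unique_rows = [r for r in flagged if r["keyChange"] == 1]
duplicate_rows = [r for r in flagged if r["keyChange"] == 0]
```

So nothing is dropped at the flagging step; the duplicates are still available if you want to send them to a reject or audit output.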