When the Sort stage already has a remove-duplicates option, why
is there a separate Remove Duplicates stage in PX, even though it is
recommended to sort the data before using the Remove Duplicates
stage? I have been thinking about this for days.
Answer / prasu
The Remove Duplicates stage has more options than the Sort
stage for removing duplicates. If you have a small amount
of data, you can use the Sort stage to remove duplicates. If
you have a large amount of data, go for the Remove Duplicates
stage.
| Is This Answer Correct ? | 8 Yes | 0 No |
Answer / phani kumar
The Sort stage is used to sort the data, and it has an option for
identifying duplicate records through the value of the key
change column. But sorting and removing duplicates in the same
stage decreases performance, so that approach is preferable only
for small amounts of data.
The Remove Duplicates stage is used to get only unique records,
keeping either the first or the last occurrence. For large
amounts of data, its input should already be sorted for better performance.
Correct me if I am wrong.
Thanks and regards....
Phani kumar
| Is This Answer Correct ? | 8 Yes | 0 No |
Answer / data master
The Sort stage both sorts the data and removes duplicate
records, which slows down the job
(hence it is better to sort the data at the database level).
If the data is already sorted, then use the Remove Duplicates
stage to remove the duplicate records, which gives better
job performance than the situation above.
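
To illustrate why already-sorted input makes duplicate removal cheap, here is a rough Python sketch (not DataStage code). The function name, the `retain` parameter, and the sample rows are made up for illustration; `retain` only loosely mirrors the stage's first/last retention choice. Because equal keys sit next to each other in sorted data, one pass over the rows is enough.

```python
from itertools import groupby
from operator import itemgetter


def remove_duplicates(sorted_rows, key, retain="first"):
    """Yield one row per key value from rows that are ALREADY sorted on key.

    Hypothetical helper: retain="first" keeps the first row of each
    group of equal keys, anything else keeps the last row.
    """
    for _, group in groupby(sorted_rows, key=itemgetter(key)):
        if retain == "first":
            yield next(group)
        else:
            last = None
            for last in group:  # walk to the end of the group
                pass
            yield last


# Sample input, assumed to be sorted on "cid" beforehand.
rows = [
    {"cid": 1, "cname": "a"},
    {"cid": 1, "cname": "b"},
    {"cid": 2, "cname": "c"},
]

first_kept = list(remove_duplicates(rows, "cid"))
last_kept = list(remove_duplicates(rows, "cid", retain="last"))
```

If the input were not sorted, equal keys could be far apart and this single-pass comparison would miss them, which is why the sort has to happen somewhere upstream.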
| Is This Answer Correct ? | 3 Yes | 2 No |
Answer / swati
With the Remove Duplicates stage you get only the unique records.
With the Sort stage you get both the unique and the duplicate records, and you can tell them apart by the key change column.
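
As a rough illustration of the key change column idea, here is a small Python sketch (again, not DataStage code). The function name, the `keyChange` output field name, and the sample rows are assumptions for the example: the first row of each group of equal keys is flagged 1 and the rest are flagged 0, so all rows are kept and a later step can decide what to do with the duplicates.

```python
def add_key_change(sorted_rows, key):
    """Return copies of the rows with a keyChange flag added.

    Hypothetical helper: expects rows already sorted on key.
    keyChange is 1 for the first row of each key group, else 0.
    """
    previous = object()  # sentinel that never equals a real key value
    flagged = []
    for row in sorted_rows:
        flag = 1 if row[key] != previous else 0
        flagged.append({**row, "keyChange": flag})
        previous = row[key]
    return flagged


# Sample input, sorted on "cid" before flagging.
rows = [
    {"cid": 2, "cname": "c"},
    {"cid": 1, "cname": "a"},
    {"cid": 1, "cname": "b"},
]
flagged = add_key_change(sorted(rows, key=lambda r: r["cid"]), "cid")

# Filtering on the flag gives the unique rows; the flag == 0 rows are the duplicates.
unique_rows = [r for r in flagged if r["keyChange"] == 1]
duplicate_rows = [r for r in flagged if r["keyChange"] == 0]
```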
| Is This Answer Correct ? | 1 Yes | 0 No |