When there is a remove duplicates option in the Sort stage, why is there a Remove Duplicates stage in PX?

When the Sort stage already has an option to remove duplicates, why
does PX also have a separate Remove Duplicates stage, even though it is
recommended to sort the data before using the Remove Duplicates
stage? I have been thinking about this for days.

Question Posted / balu1138

4 Answers
23617 Views
Target

Answers were Sorted based on User's Feedback

Answer / prasu

The Remove Duplicates stage offers more options than the Sort
stage for removing duplicates. If you have a small amount of
data, you can use the Sort stage to remove duplicates. If
you have a large amount of data, use the Remove Duplicates
stage.

Is This Answer Correct? 8 Yes, 0 No

Answer / phani kumar

The Sort stage sorts the data and also has an option to
identify duplicate records through the value of the key
change column. However, sorting and removing duplicates in the
same stage decreases performance, so this approach is
preferable only for small amounts of data.

The Remove Duplicates stage returns only unique records,
keeping either the first or the last occurrence. For large
amounts of data, it needs sorted input for better performance.
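
To make the first/last choice concrete, here is a rough Python sketch of the idea. It is not DataStage code: the function name `remove_duplicates`, the `retain` parameter, and the sample rows are all made up for illustration. It assumes the input is already sorted on the key, which is why one pass over the data is enough.

```python
from itertools import groupby
from operator import itemgetter


def remove_duplicates(sorted_rows, key, retain="first"):
    """Keep one row per key value from input that is ALREADY sorted on key.

    Hypothetical helper for illustration only, not a DataStage API.
    """
    if retain not in ("first", "last"):
        raise ValueError("retain must be 'first' or 'last'")
    result = []
    # groupby only merges adjacent equal keys, so it relies on sorted input
    for _, group in groupby(sorted_rows, key=itemgetter(key)):
        rows = list(group)
        result.append(rows[0] if retain == "first" else rows[-1])
    return result


# Sample data, already sorted on "id" (assumed for this sketch)
rows = [
    {"id": 1, "value": "a"},
    {"id": 1, "value": "b"},
    {"id": 2, "value": "c"},
    {"id": 3, "value": "d"},
    {"id": 3, "value": "e"},
]

first_kept = remove_duplicates(rows, "id", retain="first")
last_kept = remove_duplicates(rows, "id", retain="last")
```

If the input were not sorted on the key, equal keys could end up in separate groups and duplicates would survive, which is the reason sorted input matters here.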

Correct me if I am wrong.

Thanks and regards,
Phani Kumar

Is This Answer Correct? 8 Yes, 0 No

Answer / data master

The Sort stage both sorts the data and removes duplicate
records, which slows down the job (hence it is better to
sort the data at the database level).

If the data is already sorted, then use the Remove Duplicates
stage to remove duplicate records, which gives better job
performance than the situation above.

Is This Answer Correct? 3 Yes, 2 No

Answer / swati

In the Remove Duplicates stage you get only unique records.

In the Sort stage you get both unique and duplicate records, and the key change column tells you which is which.
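
A small Python sketch of what a key change column does, as I understand it: the first row of each key group is flagged 1 and the remaining rows of that group are flagged 0. This only imitates the behaviour and is not DataStage code; the function name `sort_with_key_change`, the column name `keyChange`, and the sample rows are assumptions for illustration.

```python
def sort_with_key_change(rows, key):
    """Sort rows on key and add a keyChange flag: 1 for the first row
    of each key group, 0 for the rest.

    Hypothetical helper that imitates the idea; not a DataStage API.
    """
    ordered = sorted(rows, key=lambda r: r[key])  # Python's sort is stable
    flagged = []
    previous = object()  # sentinel that is never equal to a real key value
    for row in ordered:
        flagged.append({**row, "keyChange": 1 if row[key] != previous else 0})
        previous = row[key]
    return flagged


# Unsorted sample data (made up for this sketch)
rows = [
    {"id": 2, "value": "c"},
    {"id": 1, "value": "a"},
    {"id": 1, "value": "b"},
]

flagged = sort_with_key_change(rows, "id")

# All rows are still present; a later filter decides what to keep
unique_rows = [r for r in flagged if r["keyChange"] == 1]
duplicate_rows = [r for r in flagged if r["keyChange"] == 0]
```

So the sort output carries every record plus the flag, and a downstream filter can route the flagged-1 rows and the flagged-0 rows separately, whereas a remove-duplicates step simply drops the extras.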

Is This Answer Correct? 1 Yes, 0 No
