How would you delete duplicate observations?
Answer / poornima
We can delete duplicate observations by using the NODUP or
NODUPKEY option in PROC SORT.
Example:
proc sort data=datasetname nodup;
by _all_;
run;
| Is This Answer Correct ? | 33 Yes | 12 No |
Answer / padmasri
There are three ways to delete duplicate records:
1. PROC SORT
PROC SORT has two options for deleting duplicate
observations:
* nodupkey
* noduprecs
2. FIRST. and LAST. variables in a DATA step
3. PROC SQL
These are the three ways to delete duplicate records in SAS.
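The second and third approaches might look roughly like the sketch below. The dataset emp and its variables empid and salary are hypothetical, made up only for illustration; adapt the names to your own data.

```sas
/* Hypothetical input: dataset emp with variables empid and salary */
data emp;
  input empid salary;
  datalines;
101 5000
102 6000
101 5000
103 7000
102 6500
;
run;

/* Approach 2: FIRST. variable in a DATA step.
   The data must be sorted by the BY variable first. */
proc sort data=emp out=emp_sorted;
  by empid;
run;

data emp_first;
  set emp_sorted;
  by empid;
  if first.empid;   /* keep only the first record of each empid group */
run;

/* Approach 3: PROC SQL with DISTINCT removes fully identical rows */
proc sql;
  create table emp_distinct as
  select distinct *
  from emp;
quit;
```

In this sketch emp_first ends up with one row per empid, while emp_distinct keeps both rows for empid 102 because their salary values differ.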
| Is This Answer Correct ? | 16 Yes | 0 No |
Answer / srinivas
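There are three PROC SORT options for deleting duplicate observations:
1. nodup
2. nodupkey
3. noduprecs (nodup is an alias for noduprecs)
If an entire record is duplicated, use nodup or
noduprecs in PROC SORT. Only adjacent records are compared, so identical
records must end up next to each other after sorting.
proc sort data=dsn noduprecs;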
by var;
run;
If only the value of a variable is repeated, not the entire record,
use nodupkey:
proc sort data=dsn nodupkey;
by var;
run;
Example: if empid is repeated in a dataset, use this syntax with
empid as the variable in the BY statement.
| Is This Answer Correct ? | 19 Yes | 5 No |
Answer / vijay
There are several ways to do this. However, the easiest
code-wise is to use PROC SORT. For example:
PROC SORT DATA=mydata NODUPKEY;
BY variable;
RUN;
| Is This Answer Correct ? | 17 Yes | 7 No |
There are two ways of deleting records from a dataset
with the help of PROC SORT:
1. Using NODUP/NODUPRECS
2. Using NODUPKEY
The first option deletes a record only if the values of all
variables repeat those of the preceding record.
The second option deletes a record only if the values of
the BY variables given in the BY statement repeat those of the
preceding record, so the first record of each BY group is kept.
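A small sketch can make the difference concrete. The dataset emp and the variables empid and salary are assumptions introduced only for this example.

```sas
/* Hypothetical input: empid 101 appears as an exact duplicate,
   empid 102 appears twice with different salaries */
data emp;
  input empid salary;
  datalines;
101 5000
102 6000
101 5000
102 6500
;
run;

/* NODUPRECS: drops a record only when every variable matches the
   previous record, so sort BY _ALL_ to bring identical records together */
proc sort data=emp out=emp_norecs noduprecs;
  by _all_;
run;
/* emp_norecs: 101 5000, 102 6000, 102 6500 */

/* NODUPKEY: keeps the first record for each BY value */
proc sort data=emp out=emp_nokey nodupkey;
  by empid;
run;
/* emp_nokey: 101 5000, 102 6000 */
```

Writing to a separate dataset with OUT= leaves the original data untouched, which is usually safer while checking which records get dropped.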
| Is This Answer Correct ? | 4 Yes | 1 No |