Sunday, October 18, 2009
Saturday, October 17, 2009
SAS Tip_less code: Assigning 1 or 0 to flag variable
*Creating a flag variable when a test variable meets certain criteria is a very common task for SAS programmers.
Many SAS programmers use the code below to assign a flag of 1 or 0 depending on whether the test variable meets the criteria or not.;
*Ex:;
*Create a test dataset;
data test;
input id age sex $;
cards;
1 25 Male
2 35 Female
3 29 Female
4 37 Male
5 32 Male
;
run;
*Most programmers use the following code to assign a value of 1 or 0 to the flag variable;
data test1;
set test;
if sex='Male' then flag=1;
else flag=0;
run;
*Some programmers use the following code to do the same task;
data test;
set test;
flag=ifn(sex='Male',1,0);
run;
*You can write even simpler code than the two data step methods above.;
data test2;
set test;
flag='Male'=sex;
run;
*Or;
data test3;
set test;
flag=sex='Male';
run;
*Note: The above code does the same thing as the first and second methods;
*Caveat: This code works only when you are assigning a value of 1 or 0 to the flag variable;
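The caveat can be made concrete with a short sketch. A comparison in SAS evaluates to 1 when true and 0 when false, so a compound condition still produces a 0/1 flag, but any other pair of values needs IFN or IF/ELSE. The names test4, flag_m30 and amount, and the values 10 and 20, are made up for illustration and are not part of the original tip.

```sas
data test4;
  set test;
  /* A compound comparison still evaluates to 1 (true) or 0 (false) */
  flag_m30 = (sex='Male' and age>30);
  /* Values other than 1 and 0 (10 and 20 are arbitrary here) cannot come
     from a bare comparison, so IFN is used instead */
  amount = ifn(sex='Male', 10, 20);
run;
```

The parentheses around the comparison are optional, but they make it easier to see that the right-hand side is a logical expression and not a second assignment.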
Thursday, October 15, 2009
Dummy Dataset or SAS Options: Which is better to insert a Zero Row?
Programmers often need to summarize demographic data and present it in a table, and they typically use Proc Freq to do so. Even though Proc Freq calculates frequencies correctly, it may not be the right procedure in all cases, especially when some values do not exist in the data.
Sometimes the statistician wants to see all the data values on the CRF in the final table, even though no such combination exists in the dataset. In this case we have to insert observations with 0 values.
Here I will present the different methods to insert a zero row.
1) Creating a dummy dataset and concatenating it with the input dataset.
2) Proc Freq SPARSE option
3) Proc Means COMPLETETYPES Option
4) Proc Means COMPLETETYPES Option with PRELOADFMT option.
Dummy Dataset:
Adv: Simple and doesn’t need any formats
Caveat: Programmer has to know all the possible combinations
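No code is given for the dummy dataset method, so here is a minimal sketch of one way it could look. It assumes a demo dataset with character variables siteid and race (each of length 8), and it assumes the site and race values listed in the DO loops are the complete set expected on the CRF. The dataset names dummy, freq, both and final are hypothetical.

```sas
/* Shell dataset: one zero-count row per expected site/race combination */
data dummy;
  length siteid $8 race $8;
  do siteid='SITE1','SITE2';
    do race='Asian','Black','White';
      count=0;
      output;
    end;
  end;
run;

/* Actual counts; combinations absent from the data are not output */
proc freq data=demo noprint;
  table siteid*race / out=freq(drop=percent);
run;

/* Concatenate the zero rows with the actual counts */
data both;
  set dummy freq;
run;

/* Summing within each combination leaves 0 where no data exist */
proc summary data=both nway;
  class siteid race;
  var count;
  output out=final(drop=_type_ _freq_) sum=count;
run;
```

Merging dummy with freq by siteid and race would be an alternative to concatenating and summing. Either way, the programmer still has to list every combination by hand, which is the caveat noted above.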
Sparse Option:
Lists all possible combinations of variable levels even when a combination does not occur.
Syntax:
proc freq data=demo noprint;
table sitec*race /sparse out=freq (drop=percent);
run;
Using the SPARSE option in Proc Freq, SAS outputs one record for each possible combination of the variables mentioned in the TABLES statement.
Adv: Convenient and simple.
Disadv: Sometimes the CRF has more categories than actually appear in the dataset. If the statistician wants one record for each category mentioned in the CRF, the SPARSE option in Proc Freq doesn't work as expected, because SAS only knows about the values that occur in the dataset.
Caveat: There must be at least one occurrence of a value in the data for SPARSE to include it in the summary.
Proc Means using Complete Types Option:
Syntax:
proc means data=demo completetypes noprint nway;
class sitec race;
output out =race(rename=(_freq_=count) drop=_type_);
run;
Adv: Simple and easy to write. Proc Means with the COMPLETETYPES option works much like Proc Freq with the SPARSE option.
Caveat: There must be at least one occurrence of a value in the data for the COMPLETETYPES option to include it in the summary.
Proc Means using COMPLETETYPES and the PRELOADFMT option:
The PRELOADFMT option tells SAS to preload all the values defined in the format (created with Proc Format for a particular variable) into memory before it processes the Proc Means CLASS statement.
One important thing to know is how to use this option.
If you want to use the PRELOADFMT option in the CLASS statement, you must also use one of the COMPLETETYPES, EXCLUSIVE or ORDER=DATA options.
When you use the PRELOADFMT option in combination with the COMPLETETYPES option, SAS creates the output with all possible combinations, even if a combination is not seen in the input dataset.
Syntax:
proc format;
VALUE $RACEF
'Asian'=3
'Black'=2
'White'=1
'American Indian or Alaska Native'=4
'Native Hawaiian or Other Pacific Islander'=5;
run;
With PRELOADFMT in the CLASS statement and the COMPLETETYPES option in the PROC MEANS statement, SAS will include all the possible combinations of the classification variables in the output, including zero rows (0 observations).
data demo;
set demo;
format race $racef.;
run;
proc means data=demo completetypes noprint nway;
class sitec race/preloadfmt;
output out =race(rename=(_freq_=count) drop=_type_);
run;
Adv: Simple to use.
There is no requirement to have at least one occurrence of a value in the data.
Caveat: This method works only if formats are assigned to the data. We don't necessarily need to know what the format values are, but we have to make sure formats are assigned to all the variables we are trying to summarize.
You can use the PRELOADFMT option in Proc Means, Proc Summary and Proc Tabulate.
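Since only Proc Means is demonstrated here, the following is an untested sketch of how the same idea might look in Proc Tabulate. It assumes a demo dataset with siteid and race variables and a $racef format that lists every race category to be shown. In Proc Tabulate, PRELOADFMT needs PRINTMISS in the TABLE statement (or EXCLUSIVE or ORDER=DATA in the CLASS statement) to have any effect. The MISSTEXT='0' option is an addition of mine so that empty cells display 0.

```sas
proc tabulate data=demo;
  class siteid;
  /* Preload every value defined in the $racef format */
  class race / preloadfmt;
  /* PRINTMISS keeps the rows for combinations that have no observations */
  table siteid*race, n / printmiss misstext='0';
  format race $racef.;
run;
```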
Example:
data demo ;
input siteid $ sex $ race $ age ;
cards;
SITE1 M White 23
SITE1 F White 43
SITE1 M White 34
SITE2 M Black 21
SITE2 M White 56
SITE2 F Black 33
;
run;
proc sort data=demo;
by siteid;
run;
*Without any options in proc freq;
proc freq data=demo noprint;
table siteid*race /out=nooptions (drop=percent);
run;
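*With Sparse option in proc freq (the output dataset name sparse is an assumption);
proc freq data=demo noprint;
table siteid*race /sparse out=sparse (drop=percent);
run;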
*With Completetypes option in proc means;
proc means data=demo completetypes noprint nway;
class siteid race;
output out =comptyp(where=(_stat_='N')rename=(_freq_=count) keep=siteid race _freq_ _stat_);
run;
*With Completetypes and preloadfmt options in proc means;
proc format;
VALUE $RACEF
'Asian'='Asian'
'Black'='Black'
'White'='White';
run;
data demo;
set demo;
format race $racef.;
run;
proc means data=demo completetypes noprint nway;
class siteid race/preloadfmt;
output out =race(where=(_stat_='N')rename=(_freq_=count) keep=siteid race _freq_ _stat_);
run;
Output:
Wednesday, October 7, 2009
Saturday, October 3, 2009