Saturday, October 17, 2009

SAS Tip_less code: Assigning 1 or 0 to flag variable

*Creating a flag variable when a test variable meets certain criteria is a very common task for SAS programmers.

Many SAS programmers use the code below to assign a flag of 1 or 0 depending on whether the test variable meets the criteria or not.;

*Ex:;

*Create a test dataset;
data test;
input id age sex $;
cards;
1 25 Male
2 35 Female
3 29 Female
4 37 Male
5 32 Male

;
run;


*Most programmers use the following code to assign a value of 1 or 0 to the flag variable;

data test1;
set test;
if sex='Male' then flag=1;
else flag=0;
run;


*Some programmers use the following code to do the same task;
data test1b;
set test;
flag=ifn(sex='Male',1,0);
run;

*You can write even simpler code than the two data step methods above.;

data test2;
set test;
flag='Male'=sex;
run;

*Or;

data test3;
set test;
flag=sex='Male';
run;

*Note: The above code does the same thing as the 1st and 2nd methods;
*Caveat: This shortcut works only when the values you want to assign to the flag variable are 1 and 0;
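
*The same idea extends to any comparison, and the IFC function covers flags that are not 1 or 0. The step below is only a sketch under assumptions: the names test4, over30 and flagc are made up for illustration and are not part of the original example;

```sas
data test4;
set test;
*Declare the length up front, because IFC would otherwise return a 200-character value;
length flagc $1;
*A comparison evaluates to 1 when true and 0 when false;
over30=(age>30);
*For flag values other than 1 and 0, IFN or IFC is still needed;
flagc=ifc(sex='Male','Y','N');
run;
```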


Thursday, October 15, 2009

Dummy Dataset or SAS Options: Which is better to insert a Zero Row?

Programmers often need to summarize demographics data in a table, and Proc Freq is the usual tool for the job. Although Proc Freq calculates frequencies correctly, it may not be the right procedure in every case, especially when some values do not occur in the data.

Sometimes the statistician wants to see every value listed on the CRF in the final table, even when a given combination does not exist in the dataset. In that case we have to insert observations with a count of 0.

Here are the different methods to insert a zero row.

1) Creating a dummy dataset and concatenating it with the input dataset.
2) Proc Freq SPARSE option
3) Proc Means COMPLETETYPES option
4) Proc Means COMPLETETYPES option with the PRELOADFMT option.

Dummy Dataset:
Adv: Simple and doesn’t need any formats.
Caveat: The programmer has to know all the possible combinations.
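
The dummy dataset method has no syntax of its own, so here is a minimal sketch of one common variant: build a shell dataset with a zero count for every expected combination, then merge the actual counts over it. It assumes a demo dataset with character variables siteid and race of length 8, like the one built in the example later in this post. The dataset names dummy, freq and final and the listed site and race values are assumptions chosen for illustration.

```sas
*Hypothetical shell dataset: one row per expected siteid and race combination;
data dummy;
length siteid $8 race $8;
do siteid='SITE1','SITE2';
do race='Asian','Black','White';
count=0;
output;
end;
end;
run;

*Actual counts from the data;
proc freq data=demo noprint;
table siteid*race /out=freq (drop=percent);
run;

proc sort data=dummy;
by siteid race;
run;

proc sort data=freq;
by siteid race;
run;

*Counts from FREQ replace the zeros and combinations found only in DUMMY keep count=0;
data final;
merge dummy freq;
by siteid race;
run;
```

Because freq is listed second in the MERGE statement, its count value wins whenever a combination exists in both datasets.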

Sparse Option:
Lists all possible combinations of variable levels even when a combination does not occur.

Syntax:

proc freq data=demo noprint;
table sitec*race /sparse out=freq (drop=percent);
run;

With the SPARSE option in Proc Freq, SAS outputs one record for each possible combination of the variables named in the TABLES statement.


Adv: Convenient and simple.
Disadv: Sometimes the CRF lists more categories than actually appear in the dataset. If the statistician wants one record for each category on the CRF, the SPARSE option does not work as expected, because SAS cannot know about values that never occur in the dataset.

Caveat: Each value must occur at least once in the data for SPARSE to produce a row for it.

Proc Means using Complete Types Option:
Syntax:

proc means data=demo completetypes noprint nway;
class sitec race;
output out =race(rename=(_freq_=count) drop=_type_);
run;

Adv: Simple and easy to write. Proc Means with the COMPLETETYPES option works similarly to the Proc Freq SPARSE option.

Caveat: Each value must occur at least once in the data for COMPLETETYPES to produce a row for it.

Proc Means using COMPLETETYPES and the PRELOADFMT option:
The PRELOADFMT option tells SAS to preload all the values defined in the format assigned to a class variable (created with Proc Format) before Proc Means processes the CLASS statement.

One important point is how to use this option.
PRELOADFMT in the CLASS statement has no effect unless you also specify at least one of the COMPLETETYPES, EXCLUSIVE, or ORDER=DATA options.

When you use the PRELOADFMT option in combination with the COMPLETETYPES option, SAS creates output with all the possible combinations, even when a combination does not appear in the input dataset.


Syntax:

proc format;
VALUE $RACEF
'Asian'=3
'Black'=2
'White'=1
'American Indian or Alaska Native'=4
'Native Hawaiian or Other Pacific Islander'=5;
run;

data demo;
set demo;
format race $racef.;
run;

proc means data=demo completetypes noprint nway;
class sitec race/preloadfmt;
output out =race(rename=(_freq_=count) drop=_type_);
run;

Adv: Simple to use. There is no requirement to have at least one occurrence of a value in the data.

Caveat: This method works only if formats are used with the data. You don’t necessarily need to know what the format values are, but you have to make sure formats are assigned to all the variables you are trying to summarize.

You can use the PRELOADFMT option in Proc Means, Proc Summary, and Proc Tabulate.
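
For Proc Tabulate, a minimal sketch might look like the step below. It assumes the demo dataset and the $RACEF format from the example in this post. As far as I know, PRELOADFMT in Proc Tabulate takes effect only when it is paired with PRINTMISS in the TABLE statement (or with EXCLUSIVE or ORDER=DATA in the CLASS statement), so treat the option choice here as an assumption to check against the documentation.

```sas
*Sketch only: assumes DEMO and the $RACEF format from the example in this post;
proc tabulate data=demo;
class siteid;
*PRELOADFMT is requested only for the variable that has a format;
class race /preloadfmt;
format race $racef.;
*PRINTMISS keeps the empty combinations and MISSTEXT controls how an empty cell is shown;
table siteid*race, n /printmiss misstext='0';
run;
```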


Example:

data demo ;
input siteid $ sex $ race $ age ;
cards;
SITE1 M White 23
SITE1 F White 43
SITE1 M White 34
SITE2 M Black 21
SITE2 M White 56
SITE2 F Black 33
;

run;

proc sort data=demo;
by siteid;

run;


*Without any options in proc freq;
proc freq data=demo noprint;
table siteid*race /out=nooptions (drop=percent);
run;

*With Sparse option in proc freq;

proc freq data=demo noprint;
table siteid*race /sparse out=_sparse (drop=percent);
run;

*With Completetypes option in proc means;
proc means data=demo completetypes noprint nway;
class siteid race;
output out =comptyp(where=(_stat_='N')rename=(_freq_=count) keep=siteid race _freq_ _stat_);
run;

*With Completetypes and preloadfmt options in proc means;
proc format;
VALUE $RACEF
'Asian'='Asian'
'Black'='Black'
'White'='White';

run;


data demo;
set demo;
format race $racef.;
run;

proc means data=demo completetypes noprint nway;
class siteid race/preloadfmt;
output out =race(where=(_stat_='N')rename=(_freq_=count) keep=siteid race _freq_ _stat_);
run;

Output:

With PRELOADFMT in the CLASS statement and COMPLETETYPES option in the PROC MEANS statement, SAS will include all the possible combinations of classification variables in the output as well as zero rows (0 observations).

Wednesday, October 7, 2009

Learn how to view SAS dataset labels without opening the dataset directly in a SAS session. Easy methods and examples included!

Quick Tip: See SAS Dataset Labels Without Opening the Data