Friday, November 20, 2009

Proc SQL for SAS Programmers

Yesterday I came across the website BLINK7. It is a great site to browse for help improving your knowledge of Proc SQL and Base SAS. The site offers a lot of information in the form of sample code and tutorials on different SAS topics.

SQL for SAS Programmers - Introduction

What is SQL?

SQL stands for Structured Query Language and was designed for development and maintenance within a Database Management System (DBMS). A DBMS consists of one or more tables of data, typically related to one another through shared key columns, and a series of programs for organizing the data.
Typical tasks performed with SQL code include the following:

  • Retrieve (or query) data from one or more data tables
  • Manipulate data within existing tables
  • Define new tables and create data within the new tables
  • Alter existing table definitions
  • Set permissions for different users to access existing tables
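
As a rough illustration of the first task in the list (querying a table), here is a minimal PROC SQL sketch. It is not taken from the BLINK7 tutorial; it assumes only the SASHELP.CLASS sample table that ships with SAS, and the age cutoff is an arbitrary choice for the example.

```sas
/* Minimal query sketch against the SASHELP.CLASS sample table */
proc sql;
  select name, sex, age
    from sashelp.class
    where age >= 14          /* arbitrary filter for illustration */
    order by age desc, name;
quit;
```

PROC SQL runs each statement as soon as it is submitted, so the step is ended with QUIT rather than RUN.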

The first part of this tutorial deals with using the PROC SQL statement to perform basic data extraction. Screenshots of the code and output are included. Readers who wish to follow along on their own systems or copy the code can download the files provided below (right click and select “save as” or “save link as”):

read more at:....

(download) SAS Code for Tutorial Part 1
(download) SAS Data: Transactions
(download) SAS Data: Payment Types
(download) SAS Data: Staff

SAS Tutorial: Creating Categories with PROC FORMAT

SAS Tutorial: Loading Tab-Delimited Files

source: www.blink7.com




Comparing SAS steps and PROC SQL: Coding and Performance


Basics of SAS PROC SQL

Tuesday, November 3, 2009

Using ODS to Create Customised Output

Using the SAS Output Delivery System (ODS), you can create, customise, and manage HTML output in any operating environment by submitting programming statements. After creating HTML files, you can view them using Internet Explorer, Netscape Navigator, or any Web browser that fully supports HTML 3.2.

ODS gives you new formatting options and makes procedure output much more flexible. With ODS, you can easily create HTML, RTF, PCL, PS, XML, LaTeX and PDF output, an output data set of procedure results, and traditional SAS listing output. Also, ODS stores your output in its component parts (data and table definition) so that numerical data retains its full precision.

Procedure output is divided into components, or output objects. Depending on the procedure that you run, one or several output objects might be created. For example, proc print creates just one output object, but proc univariate produces multiple output objects. ODS stores a link to each output object in the results window. Using ODS programming statements, we can control which output objects we are interested in and which ODS destinations we want to send them to.
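
To make the idea of output objects concrete, here is a small sketch that uses ODS TRACE to list the objects proc univariate creates and ODS SELECT to keep only two of them. The choice of SASHELP.CLASS and of the HEIGHT variable is an assumption made for the example.

```sas
/* Write the name of each output object to the log */
ods trace on;
proc univariate data=sashelp.class;
  var height;
run;
ods trace off;

/* Send only two of the objects to the open destinations */
ods select BasicMeasures Quantiles;
proc univariate data=sashelp.class;
  var height;
run;
```

The ODS SELECT statement applies to the next procedure step only, so later procedures produce their full output again.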

To start creating HTML, RTF, PDF files and so on, you need a few ODS statements. By default, SAS output still goes to the output window. To send the output elsewhere, you need to open the appropriate destination. The example below turns off the listing destination (the output window) and opens the HTML destination so that it is ready to receive our output. When the HTML destination is closed, the class.html file is written, and the last statement reopens the listing destination:


Ods listing close;
Ods html body='c:\myreports\class.html';
Proc print data=sashelp.class;
run;
Ods html close;
Ods listing;
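
The same open, run, close pattern should work for the other destinations mentioned above. The sketch below swaps in the PDF destination; the file path is a made-up example and must point to a folder that exists on your system.

```sas
ods listing close;
/* Hypothetical path: change it to a writable folder */
ods pdf file='c:\myreports\class.pdf';
proc print data=sashelp.class;
run;
ods pdf close;
ods listing;
```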

Read more.....

Saturday, October 17, 2009

SAS Tip_less code: Assigning 1 or 0 to flag variable

*Creating a flag variable when a test variable meets certain criteria is a very common task for SAS programmers.

Many SAS programmers use the code below to assign a flag of 1 or 0 depending on whether the test variable meets the criteria.;

*Ex:;

*Create a test dataset;
data test;
input id age sex $;
cards;
1 25 Male
2 35 Female
3 29 Female
4 37 Male
5 32 Male
;
run;


*Most programmers use the following code to assign a value of 1 or 0 to the flag variable;

data test1;
set test;
if sex='Male' then flag=1;
else flag=0;
run;


*Some programmers use the following code to do the same task;
data test;
set test;
flag=ifn(sex='Male',1,0);
run;

*You can write even simpler code than the two data step methods above;

data test2;
set test;
flag='Male'=sex;
run;

*Or;

data test3;
set test;
flag=sex='Male';
run;

*Note: The above code does the same thing as the 1st and 2nd methods;
*Caveat: This code works only when you are trying to assign a value of 1 or 0 to the flag variable;
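
The shortcut works because SAS evaluates a comparison to the number 1 when it is true and 0 when it is false, so the result can be assigned directly. The sketch below applies the same idea to a compound condition. It uses the SASHELP.CLASS sample table rather than the test dataset above, and the variable names is_male and male_over_13 are made up for the example.

```sas
data flags;
  set sashelp.class;
  /* Parentheses are optional but make the intent easier to read */
  is_male      = (sex = 'M');
  male_over_13 = (sex = 'M' and age > 13);
run;

/* Quick check of how many rows got each flag value */
proc freq data=flags;
  tables is_male male_over_13;
run;
```

Note that a missing numeric value compares as smaller than any number, so a row with missing age gets a flag of 0, not a missing flag.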


Thursday, October 15, 2009

Dummy Dataset or SAS Options: Which is better to insert a Zero Row?

Programmers often need to summarize demographic data in a table, and they usually use Proc Freq to do so. Although Proc Freq calculates frequencies exactly, it may not be the right procedure in all cases, especially when some values do not occur in the data.

Sometimes the statistician wants to see every value listed on the CRF in the final table, even when a combination does not occur in the dataset. In that case we have to insert rows with a count of 0.

Here I will present the different methods to insert a zero row.

1) Creating a Dummy Dataset and concatenating it with the input dataset
2) Proc Freq SPARSE option
3) Proc Means COMPLETETYPES option
4) Proc Means COMPLETETYPES option with PRELOADFMT option

Dummy Dataset:
Adv: Simple and doesn’t need any formats
Caveat: Programmer has to know all the possible combinations
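
The post does not show code for the dummy dataset method, so the following is only a sketch of one common variant: build a shell with every required combination and a count of 0, then merge it with the Proc Freq output instead of concatenating it with the raw data. The dataset names demo_small, dummy, freq and final, and the list of sites and races, are assumptions made for the example.

```sas
/* Hypothetical input: two sites, no Asian subjects recorded */
data demo_small;
  input siteid $ race $;
  datalines;
SITE1 White
SITE1 White
SITE2 Black
SITE2 White
;
run;

/* Shell with every combination the table must show, count preset to 0 */
data dummy;
  length siteid race $8;
  do siteid = 'SITE1', 'SITE2';
    do race = 'Asian', 'Black', 'White';
      count = 0;
      output;
    end;
  end;
run;

/* Observed combinations only */
proc freq data=demo_small noprint;
  table siteid*race / out=freq (drop=percent);
run;

/* Both datasets are already in siteid-race order; observed counts
   replace the zero placeholders where a combination exists */
data final;
  merge dummy freq;
  by siteid race;
run;
```

Because freq is listed last in the MERGE statement, its COUNT value wins for matching rows, and unmatched shell rows keep their 0.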

Sparse Option:
Lists all possible combinations of variable levels even when a combination does not occur.

Syntax:

proc freq data=demo noprint;
table sitec*race /sparse out=freq (drop=percent);
run;

With the SPARSE option in Proc Freq, SAS outputs one record for each possible combination of the variables named in the TABLES statement.


Adv: Convenient and simple.
Disadv: Sometimes the CRF lists more categories than actually appear in the dataset. If the statistician wants one record for each category on the CRF, the SPARSE option in Proc Freq does not work as expected, because SAS cannot know about values that never occur in the dataset.

Caveat: There must be at least one occurrence of a value for SPARSE to summarize appropriately.

Proc Means using Complete Types Option:
Syntax:

proc means data=demo completetypes noprint nway;
class sitec race;
output out =race(rename=(_freq_=count) drop=_type_);
run;

Adv: Simple and easy to write. Proc Means with the COMPLETETYPES option works much like the Proc Freq SPARSE option.

Caveat: There must be at least one occurrence of a value for COMPLETETYPES option to summarize appropriately.

Proc Means using COMPLETETYPES and the PRELOADFMT option:
The PRELOADFMT option tells SAS to load all the values of the format (defined with Proc Format for that variable) into memory before it processes the Proc Means CLASS statement.

One important point is how to use this option.
If you specify PRELOADFMT in the CLASS statement, you must also specify at least one of the COMPLETETYPES, EXCLUSIVE or ORDER=DATA options.

When you use PRELOADFMT together with COMPLETETYPES, SAS creates output with all possible combinations, even those that do not appear in the input dataset.


Syntax:

proc format;
* Format labels are quoted so they are stored as character strings;
VALUE $RACEF
'Asian'='3'
'Black'='2'
'White'='1'
'American Indian or Alaska Native'='4'
'Native Hawaiian or Other Pacific Islander'='5';
run;

data demo;
set demo;
format race $racef.;
run;

proc means data=demo completetypes noprint nway;
class sitec race/preloadfmt;
output out =race(rename=(_freq_=count) drop=_type_);
run;

Adv: Simple to use. There is no requirement to have at least one occurrence of a value in the data.

Caveat: This method only works if formats are used with the data. You don’t necessarily need to know what the format values are, but you must make sure formats are assigned to all the variables you are trying to summarize.

You can use the PRELOADFMT option in Proc Means, Proc Summary and Proc Tabulate.
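
For Proc Tabulate, a sketch might look like the following. In Proc Tabulate, PRELOADFMT needs PRINTMISS on the TABLE statement (or ORDER=DATA or EXCLUSIVE) to take effect. The format name $racetab and the dataset demo_tab are made up for this illustration and are not part of the original example.

```sas
proc format;
  value $racetab            /* hypothetical format name */
    'Asian' = 'Asian'
    'Black' = 'Black'
    'White' = 'White';
run;

/* Hypothetical input with no Asian subjects */
data demo_tab;
  input siteid $ race $;
  datalines;
SITE1 White
SITE2 Black
SITE2 White
;
run;

proc tabulate data=demo_tab;
  class siteid;
  class race / preloadfmt;       /* preload every value of the format */
  format race $racetab.;
  /* PRINTMISS keeps the columns for values that never occur */
  table siteid, race*n / printmiss misstext='0';
run;
```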


Example:

data demo ;
input siteid $ sex $ race $ age ;
cards;
SITE1 M White 23
SITE1 F White 43
SITE1 M White 34
SITE2 M Black 21
SITE2 M White 56
SITE2 F Black 33
;
run;

proc sort data=demo;
by siteid;

run;


*Without any options in proc freq;
proc freq data=demo noprint;
table siteid*race /out=nooptions (drop=percent);
run;

*With Sparse option in proc freq;

proc freq data=demo noprint;
table siteid*race /sparse out=_sparse (drop=percent);
run;

*With Completetypes option in proc means;
proc means data=demo completetypes noprint nway;
class siteid race;
output out =comptyp(where=(_stat_='N')rename=(_freq_=count) keep=siteid race _freq_ _stat_);
run;

*With Completetypes and preloadfmt options in proc means;
proc format;
VALUE $RACEF
'Asian'='Asian'
'Black'='Black'
'White'='White';

run;


data demo;
set demo;
format race $racef.;
run;

proc means data=demo completetypes noprint nway;
class siteid race/preloadfmt;
output out =race(where=(_stat_='N')rename=(_freq_=count) keep=siteid race _freq_ _stat_);
run;

Output:

With PRELOADFMT in the CLASS statement and the COMPLETETYPES option in the PROC MEANS statement, SAS includes all possible combinations of the classification variables in the output, including zero rows (combinations with 0 observations).
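
As a possible simplification, the same counts can be produced with Proc Summary. When the VAR statement is omitted, Proc Summary only counts observations, so no _STAT_ filtering is needed. This sketch assumes the demo dataset and the $racef format from the example above already exist; the output dataset name race_counts is made up.

```sas
proc summary data=demo completetypes nway;
  class siteid race / preloadfmt;
  format race $racef.;
  /* _FREQ_ holds the number of observations in each combination */
  output out=race_counts (drop=_type_ rename=(_freq_=count));
run;

proc print data=race_counts noobs;
run;
```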

Wednesday, October 7, 2009