Friday, January 2, 2009

SAS sample programs

Reading/Writing Files
Making a fixed format file
Making a SAS Cport file
Reading a SAS Cport file
Reading multiple raw data files, Version 8
Reading multiple raw data files (version 6.x)
Using a SAS macro to "set" multiple files

Other
Imputing the median
Checking for duplicate Ids
Macro to compute a rolling standard deviation
Changing the length of a character variable
Replacing strings
Concatenating string variables using CAT functions
Simple macro to do repeated procs
Eliminate useless variables
Matching husbands and wives
Creating a wide table from a long dataset using PROC TABULATE
How can I "fill down" a variable?
Creating a long list of variable names based on an abbreviated one
Filling in Missing Values Over Time
Dummy Coding a Categorical Variable Using a Macro Program
A few SAS macro programs for renaming variables dynamically

source: http://oregonstate.edu/dept/statistics/sasclass/examples.htm

Creating SAS datasets
Read a SAS dataset
Create a SAS dataset from raw data
List input
Column input
Formated input
Mixed input
Include data in program
Create permanent SAS dataset Working with SAS datasets,
Assignment statements
Functions
IF-THEN-ELSE, SELECT and LABEL statements
Subsetting data
DO-WHILE loop
DO-UNTIL loop Procedures
PROC PRINT and PROC SORT
PROC MEANS, PROC SORT and PROC PRINT
PROC FREQ
PROC UNIVARIATE
PROC REG
PROC GLM
PROC GPLOT


source: http://www.stattutorials.com/SASDATA/

o SAS Program Files
o PROCMEANS1.SAS
o PROCMEANS2.SAS
o PROCMEANS3.SAS
o PROCMEANS4.SAS
o PROCMEANS5.SAS (output)
o PROCUNI.SAS (univariate)
o PROCUNI2.SAS (advanced univariate)
o PROCUNI3.SAS (advanced univariate)
o PROCFREQ1.SAS (Frequency table)
o PROCFREQ2.SAS (Data from summarize counts)
o PROCFREQ3.SAS (Goodness of fit)
o PROCFREQ4.SAS (two-way table)
o PROCFREQ5.SAS (2x2 from summary data)
o PROCCORR1.SAS (correlation)
o PROCCORR2.SAS (matrix of scatterplots)
o PROCTTEST1.SAS (two-group t-test)
o PROCTTEST2.SAS (paired t-test)
o PROCANOVA1.SAS (One Way ANOVA, also PROC GLM)
o PROCGLM2.SAS (Repeated Measures ANOVA)
o PROCGLM2a.SAS (Repeated Measures ANOVA)
o PROC-LIFE-1.SAS (Survival Analysis PROC LIFETEST)
o BLAND-ALTMAN.SAS (Bland-Altman Analysis)

o ODS Examples
o ODS1.SAS (Example output without ODS)
o ODS2.SAS (Simple ODS invocation)
o ODS3.SAS (ODS using Science Style (RTF))
o ODS3A.SAS (Same ODS to HTML)
o ODS3B.SAS (Same to PDF)
o ODS4.SAS (t-Test ODS output)
o ODS5.SAS (Simple GCHART/ No ODS)
o ODS5A.SAS (Drill-down bar chart example)
o ODS6.SAS (2x2 Crosstab)
o ODS6A.SAS (2x2 Crosstab with TRACE)
o ODS6B.SAS (2x2 Crosstab/Selected tables)
o ODS7.SAS (Scatterlot matrix/Correlations)
o ODS8.SAS (Regression with ODS graphics output)
o ODS9.SAS (GLM with graphics output/boxplots)
o ODS10.SAS (Discover output names of components)
o ODS10A.SAS (Output ODS data to file)
o ODS10B.SAS (Merge and use ODS output data)
o SASLibrary.pdf (How to create a SAS Library -- required for some of the examples)


o SAS Data Files
o SOMEDATA.SAS7BDAT
o SBPDATA.SAS7BDAT
o LIFE.SAS7BDAT (for LIFETEST)

Here is the lsit  and brief description of available projects. Everyone should do the first 4 projects.


Project 1 An introduction to the SAS operating environment.

Project 2 The basic SAS data step with input of data directly through the cards statement; use of labels, the sort procedure and print procedure; the means procedure.

Project 3 Reading data from ASCII files; computing new variables in the data step; the means procedure.

Project 4 Modifying existing SAS data sets using set; using loops in the data step; the ttest procedure.

Project 5 Column-wise input; analysis of categorical data using chi-square tests.

Project 6 Updating existing SAS data sets with new data.

Project 7 Basics of presentation quality graphics with proc gplot and proc g3d.

Project 8 Basic one factor analysis of variance using proc GLM.

Project 9 Advanced analysis of variance, custom hypothesis tests, and other features of proc GLM.

Project 10 Multivariate analysis of variance using proc GLM.

Project 11 Basic Box-Jenkins modeling of univariate time series analysis using proc arima (time domain).

Project 12 Some aspects of frequency domain analysis of time series using proc spectra.

Project 13 Discriminant analysis with proc discrim.

Project 14 Reading data from dBase and DIF files; using dBase and DIF files instead of actual SAS datasets.

Project 15 Using arrays, first and last, and processing dates. Repeated measures analysis.

Source: http://javeeh.net

macro for sorting the the datasets

MACRO FOR SORTING:

Rather than using the Proc Sort procedure all the time..... you can just use the following macro....

and call it when you req... to sort any SAS dataset.....

EXAMPLE1:
%macro srt(dtn,keyvar);
proc sort data=&dtn;
by &keyvar;
run;
%mend srt;


%srt(ie,PT IEORRES);

*the above step will tell SAS to sort the IE dataset with the by variables PT and IEORRES respectively.


EXAMPLE2:
*Sometimes we need to create an output dataset during the sorting process i.e in the Proc sort step in such a case use the below macro to do the same;

%MACRO SORT(IN=,VAR=,OUT=);
PROC SORT DATA=&IN OUT=&OUT;
by &VAR;
RUN;
%MEND SORT;

%SORT(IN=CEC1,VAR=PT,OUT=CEC2);
%SORT(IN=DERIVE.HEADER,VAR=PT,OUT=HEADER1);


EXAMPLE3:

*Sometimes we need to use the NODUPKEY option to delete the duplicate observations in the dataset in such a case use the below macro to do the same;

%MACRO SORT(IN,VAR,OUT,OPTN);

PROC SORT DATA=&IN OUT=&OUT &OPTN ;

by &VAR;

RUN;

%MEND SORT;

%SORT(IN=AE, AE1,USIBJID AEBODYSYS AESEV, NODUPKEY);







How to convert numeric date values into character and from character date values into numeric using DATASTEP, PROC SQL and ARRAYS

1) Converting character date values into numeric:

/*I) Using the DATASTEP:*/

1)
Data dates; input cdate $9.; cards;
16-apr-07
01-01-07
02-jun-07
13-sep-07
;
run;

Data Convert; set dates;
Date = input( cdate , ANYDTDTE9.); format date date7.;
run;


2)
Data dates;
input cdate $9.; cards;
16-apr-2007
01-01-2007
02-jun-2007
13-sep-2007
;
run;

Data Convert;
set dates;
Date = input( cdate , ANYDTDTE11.);
format date date9.;
run;

*II) Using Proc SQL;
*Numeric date variable can be converted to character date variable by using the PUT function within PROC SQL.;

proc sql;
create table date_char as
select PUT(date,date9.)as ndate
from date_num; quit;

*Character date variable can be converted to numeric date variable by using the INPUT function within PROC SQL.;
Proc sql;
create table date_num as
select INPUT (date,mmddyy10.)as ndate
format=mmddyy10.
from date_char;
quit;
Or

Proc sql;
create table date_num as
select INPUT (date,date9.)as ndate format=date9.
from date_char;
quit;
III) Using Arrays;

*using arrays to convert character date variables aestdtc and aeendtc into numeric variables:

data c_date;
set date;
array cha_date { 2 } $ 10 aeendtc aestdtc; array num_date { 2 } aeendt aestdt;
do i = 1 to 2;
num_date{ I } = input(cha_date{ i }, anydtdte10.);
end;
drop
aeendtc aestdtc;
run;


*Using Arrays converting variables from numeric to character;
data new; set old (rename=(b1=a1 b2=a2 b3=a3 b4=a4 b5=a5));
array aa $ b1 b2 b3 b4 b5;
array bb a1 a2 a3 a4 a5; do I=1 to dim(aa);
aa (i)= put(bb(i),8.0);
end;
drop I a1 a2 a3 a4 a5 a6; run;

*Using Arrays converting variables from character to numeric;
*By using the INPUT instead of PUT we can convert character variables into numeric;

data new ;
set old (rename=(b1=a1 b2=a2 b3=a3 b4=a4 b5=a5));
array aa b1 b2 b3 b4 b5;
array bb $ a1 a2 a3 a4 a5;
do I=1 to dim (aa);
aa (i)=input(bb(i),8.0);
end;
drop I a1 a2 a3 a4 a5 a6;
run;


Note: Dont forget to use $ sign inside the array statement when converting the vars from Num to char or Char to num.

Monday, December 22, 2008

12 Ways to save SAS data

12 Ways to save SAS data
source/direct link: http://wiki.binghamton.edu/index.php/12_Ways_to_save_SAS_data

1 Using CARDS; File Save (with editor window active)
1.1 To get it back

2 Cut and Paste (ex. Fixed format data)
2.1 To Get Them Back (space-delimited or fixed-formatted text)
2.2 Links

3 LIBNAME
3.1 To get it (them) back

4 Explorer with one LIBNAME statement?, menu/New Library button, Library Icon
4.1 To get it back

5 DATA ' '; direct access (including Unix ex.)
5.1 To get it back
5.2 Links:

6 In Excel Save As .XLS

7 In Excel Save As CSV (move to Unix, SSH)
7.1 To get it back(in SAS for Windows)
7.2 To Move (Upload) the CSV File from Windows to Unix
7.3 To get the data back (on Unix)
7.4 Links

8 In SPSS Save as
8.1 To get it (our data) back

9 In SAS Save as XPT file

10 XML

11 LIBNAME sasengine

12 DBF (to be written)

12.1 Saving a DBF (from Access?)
12.2 To get it back

13 13 ?PROC DATASOURCE?, ?Stata, ?S-Plus (to be written)

HOW TO CREATE A TRANSPORT FILE

CREATING A TRANSPORT FILE:




In SAS, how do I create a transport data set file?

In SAS, how do I create a transport data set file?

Source/direct link:http://kb.iu.edu/data/aevb.html

A SAS transport data set file is a machine-independent file that allows you to move a SAS data set from one operating system to another. A SAS transport data set file can also be read directly by several statistical software packages (e.g., SPSS, BMDP).

Following is an example of SAS code to copy the SAS data set file job1.sas7bdat to a SAS transport data set file portable.xpt in the outdata directory:

LIBNAME misc '~/work';
LIBNAME sasxpt XPORT '~/outdata/portable.xpt';


PROC COPY IN=misc OUT=sasxpt;

SELECT job1;
RUN;

In the example above:

The first LIBNAME statement aliases the library reference (libref) misc to the work directory.
The second LIBNAME statement aliases the libref sasxpt with the physical name of a SAS transport format file (in this case, portable.xpt in the outdata directory).

The COPY procedure copies one or more SAS data sets in the IN= libref (in this case, misc) to the OUT= libref (in this case, sasxpt).

The SELECT statement specifies that only the file job1.sas7bdat should be included in the transport file portable.xpt .

The file and pathnames in the above example follow Unix conventions. If you are using SAS for Windows, you should follow the appropriate filename and pathname conventions.

For example, in SAS for Windows, the two LIBNAME statements in the above example would instead be:

LIBNAME misc 'c:\work';
LIBNAME sasxpt XPORT 'c:\outdata\portable.xpt';




Getting the Data into SAS

GETTING DATA INTO SAS


1. Importing data from other sources
• Use DBMS Copy, STAT Transfer, or some other software that performs similar functions.
• Use SAS/ACCESS or the SAS Import feature in Windows.
• Use PROC IMPORTS to read certain types of data files.
• Create a raw data (ASCII) file then use INPUT and INFILE statements to read the raw data file in SAS programs.

2. Creating SAS data sets with raw (ASCII) data files
• Data are placed within the SAS program directly after the DATALINES or CARDS statement. Use INPUT statement to read the data.

*** Read data within SAS program as SAS data file ***;

DATA survey;
INPUT Q1Major Q2Degree Q3SASpast Q4ProjData Q5SASexprn Q6SASfutur;
DATALINES;
2 1 1 1 2 8
2 2 0 0 0 7
2 1 0 0 1 4
2 1 0 1 0 8
more raw data lines
;
RUN;

• In the above example,
– Spaces (or other types of delimiters) are required between data values
– Missing values must be specified (blank spaces are treated as separators)
– Character data values cannot have embedded spaces (e.g., names and addresses)
– Cannot skip unwanted data column when reading in data

Getting Data Into SAS
• Data are placed in an external ASCII file. Use INFILE statement to identify the location of the external file and Use INPUT statement to read the data.

*** Read text (ASCII) data file as SAS data file ***;
DATA TXT_Surv;
INFILE "C:\Class\M403B\M403B2005Lab1Surv.TXT" missover;
INPUT @1 Q1Major 3.
@6 Q1Spec $char10.
@17 Q2Degree 3.
@22 Q2Spec $char10.
@33 Q3SASpast 2.
@36 Q4ProjData 2.
@39 Q5SASexprn 3.
@45 Q6SASfutur 3.
;
RUN;
Or,

*** Read text (ASCII) data file as SAS data file ***;
DATA TXT_Surv;
INFILE "C:\Class\M403B\M403B2005Lab1Surv.TXT" missover;
INPUT Q1Major 1-3
Q1Spec $ 6-15
Q2Degree 17-20
Q2Spec $ 22-31
Q3SASpast 33-34
Q4ProjData 36-37
Q5SASexprn 39-41
Q6SASfutur 45-47
;
RUN;

• In the above two examples,
– Spaces (or other types of delimiters) are not required between data values
– Missing values can be left blank
– Character data values can have embedded spaces (e.g., names and addresses)
– Can skip unwanted data columns when reading data

Getting Data Into SAS
• Check the SAS log to make sure the data are read correctly into SAS (e.g., number of observations, length of each record, and number of variables, etc.).

• If the length of records is very long (> 256 characters), use option LRECL=XXX in the INFILE statement, where XXX is the length of the longest record in the ASCII data file.

• If one or more records have unassigned (blank) missing values at the end of the record, use option MISSOVER in the INFILE statement to prevent SAS from going to the next record prematurely. MISSOVER tells SAS to assign missing values to these variables before going to the next record.

• If each record has multiple lines, in the INPUT statement, use a forward slash (/) after variable names in each line, or use #2 before variable names on the second line (and #3 for the third line, etc.). Note: the same number of lines per record is required, and use blank line(s) if some records do not have data in the corresponding data line(s).

*** Read multiple lines per record ***;
DATA twoliner;
INFILEC:\rawdata\twolines.dat’;
INPUT id 1-8
sex 9-16
race 25-32 /
occ1 1-8
occ2 9-16
income 17-24
;
RUN;
Or,

*** Read multiple lines per record ***;
DATA twoliner;
INFILEC:\rawdata\twolines.dat’;
INPUT #1 id 1-8
sex 17-24
race 25-32
#2 occ1 1-8
occ2 9-16
income 17-24
;
RUN;

Getting Data Into SAS
• If ASCII file contains variable names or other information at the beginning of the file, use option FIRSTOBS=X in the INFILE statement to skip the first X line(s).

• The default delimiter in SAS is a blank space (‘ ‘). If other type of delimiter, such as comma (‘,’) is used to separate data values, need to specify option DLM=’,’ (or DELIMITER=’,’) in the INFILE statement. Use option DSD if consecutive delimiters indicate missing values in between. Note: for tab delimited data, use option EXPANDTABS in the INFILE statement.

• If some columns of data values are not need, skip those columns when reading the raw data file. For example, the following program reads variables ID, OCC1, and INCOME, while other variables are not included in the output SAS data.

*** Read multiple lines per record and skip unwanted variables ***;
DATA twoliner;
INFILEC:\rawdata\twolines.dat’;
INPUT #1 id 1-8
#2 occ1 1-8
income 17-24
;
RUN;

Getting Data Into SAS :

*** Define LIBREF MySASlib as a SAS library to store permanent SAS data sets ***;
LIBNAME MySASlib "C:\Class\M403B";

*** Create a permanent SAS data set Survey in LIBREF MySASlib in data step ***;
DATA MySASlib.Survey;
INFILE "C:\Class\M403B\M403B2005Lab1Surv.TXT" missover;
INPUT @1 Q1Major 3.
@6 Q1Spec $char10.
More input variables
;
RUN;

– MySASlib is the LIBREF, which refers to a subdirectory "C:\Class\M403B", and Survey is the name of the SAS data set, which can be found in the above subdirectory. Think of LIBREFs as pointers, where LIBREFs point to the location where permanent SAS data sets are located.

– More than one LIBNAME statements can be specified in one SAS program, if two or more permanent SAS data sets are created or used in different subdirectories.
• Identifying permanent SAS data sets

– Identify the permanent SAS data set by using an appropriate LIBREF, defined with a LIBNAME statement previously.

*** Examine the contents of a permanent SAS data set Survey located in LIBREF MySASlib using PROC CONTENTS ***;

PROC CONTENTS DATA=MySASlib.Survey Position;
RUN;


– The PROC CONTENTS option POSITION produces output with variables listed in the order in which they appear in the data set (by position), as well as alphabetically (the default).

Getting Data Into SAS

4. Exporting data from SAS data sets
• Use DBMS Copy, STAT Transfer, or some other software that performs similar functions.

• Use SAS/ACCESS or the SAS Export feature in Windows.

• Use PROC EXPORTS to write certain types of data files.

• Create raw data (ASCII) files using PUT and FILE statements in SAS program.

*** Define LIBREF M403 as a SAS library to store permanent SAS data sets ***;
LIBNAME M403 "C:\Class\M403B";
*** Write a text (ASCII) data file from a SAS data set ***;
DATA _NULL_;
SET M403.Survey
FILE "C:\Class\M403B\M403B_SurveyData.TXT" ;
PUT @1 Q1Major 3.
@6 Q1Spec $char10.
@17 Q2Degree 3.
@22 Q2Spec $char10.
@33 Q3SASpast 2.
@36 Q4ProjData 2.
@39 Q5SASexprn 3.
@45 Q6SASfutur 3.
;
RUN;

– In the above example, an ASCII file M403B_SurveyData.TXT is created and stored in a subdirectory "C:\Class\M403B" using a permanent SAS data set Survey stored in a SAS library referred by M403, which points to the subdirectory "C:\Class\M403B".

– If column numbers had not been specified, SAS would place a space between each variable value in the ASCII file.

– The SAS special data set _NULL_ is used because a SAS data set is not being created in this data step (_NULL_ tells SAS to not go to the trouble of building a SAS data set).

– It is not necessary to identify character variables with the $ sign because SAS already knows which variables are character variables.

Getting Data Into SAS
5. Transporting SAS data sets from one type of host (e.g., Unix) to another (e.g., Windows)

• In SAS Version 6 and lower, if a SAS data set is created in one type of host (e.g. Unix) and is used in another (e.g. Windows), a SAS transport file needs to be created first. This is not necessary in SAS Version 8 or higher. Using PROC COPY, the following example creates a SAS transport file called SAS_Surv.XPT from the SAS data set Survey located in the C:\Class\M403B subdirectory:

*** Define LIBREF M403 as a SAS library to store permanent SAS data sets ***;
LIBNAME M403 "C:\Class\M403B";

*** Define LIBREF XPT as a SAS transport file to be used in different host systems ***;
LIBNAME XPT XPORT "C:\Class\M403B\SAS_Surv.XPT";

*** Define LIBREF XPT as a SAS transport file to be used in different host systems ***;
PROC COPY IN=M403 OUT=XPT;
SELECT Survey;
RUN;

– XPORT is a SAS keyword that tells SAS to create a transport file.

– M403 is the LIBREF pointing to the location of the SAS data set; and XPT is the LIBREF referring to the SAS transport file.

– IN, OUT, and SELECT are SAS keywords specifying the input data set location (SAS data set), output data file (SAS transport file), and the name of the SAS data set from which the transport data set will be made.

• After the SAS transport file is created, SAS data set(s) contained in the transport file can be accessed directly in the "new" host. The following example creates a SAS dataset Survey from a SAS transport file SAS_Surv.XPT, which is located in C:\Class\M403B.

PROC COPY IN=XPT OUT=M403;
RUN;
DATA M403.Survey;
SET XPT.Survey;
RUN;