
Converting SAS datasets to SPSS

To view a SAS dataset in SPSS, you can use the SPSS GET SAS command.

Here is the syntax:
get sas data='C:\data\class.sas7bdat'.

Before converting a SAS dataset to SPSS, check whether any formats are assigned to the variables in the dataset.
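
One possible way to make that check is to query the DICTIONARY.COLUMNS table. This is only a sketch: it assumes the dataset is CLASS in a library assigned to the libref SAS, using the same path as the examples in this post. Variables listed with a built-in format (for example DATE9.) do not need a formats catalog; only user-defined formats do.

```sas
/* Sketch only: the libref and path follow the examples in this post */
libname sas 'c:\sas\data';

proc sql;
  /* DICTIONARY.COLUMNS stores LIBNAME and MEMNAME values in upper case */
  select name, type, format
  from dictionary.columns
  where libname = 'SAS' and memname = 'CLASS' and format ne ' ';
quit;
```

If the query selects no rows, no variable in the dataset has a format assigned.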
If there are no formats, follow these steps to convert the SAS dataset to SPSS.

**STEP1: Creating .xpt file of a SAS dataset using Proc COPY.**

libname SAS 'c:\sas\data\';
libname SPSS xport 'c:\sas\data\class.xpt';

proc copy in=sas out=spss;
select class;
run;


**STEP2: Use the SPSS GET SAS command to convert the transport-format SAS file to SPSS.**


Use the following commands to convert the transport-format file to SPSS data.


get sas data='c:\sas\data\class.xpt'.
execute.

*******************************************************************************************;
If there are formats, convert the formats catalog to a SAS data set before converting the SAS dataset into a .XPT file. This has to be done inside SAS so that the SAS formats can be used as the value labels for the SPSS data.

**STEP1: Converting the formats catalog to a SAS dataset using Proc FORMAT with the CNTLOUT= option.**


libname formats 'c:\sas\catalogs';

proc format library=formats cntlout=fmts;
run;


**STEP2: Creating .xpt files of the formats dataset and the SAS dataset using Proc COPY.**

***Transport file of SAS formats;**


libname fmt2spss xport 'c:\sas\fmts.xpt';


proc copy in=work out=fmt2spss;
select fmts;
run;

***Transport file of SAS dataset;**


libname SAS 'c:\sas\data';
libname SPSS xport 'c:\sas\data\class.xpt';


proc copy in=sas out=spss;
select class;
run;


**STEP3: Use the SPSS GET SAS command to convert the transport-format SAS file and formats to SPSS.**


Use the following SPSS command to convert the transport-format files to SPSS data.


get sas data='c:\sas\data\class.xpt' /formats='c:\sas\fmts.xpt'.
execute.


Delete observations from a SAS data set when all or most of the variables have missing data

/* Sample data set */
data missing;
input n1 n2 n3 n4 n5 n6 n7 n8 c1 $ c2 $ c3 $ c4 $;
datalines;
1 . 1 . 1 . 1 4 a . c .
1 1 . . 2 . . 5 e . g h
1 . 1 . 3 . . 6 . . k i
1 . . . . . . . . . . .
1 . . . . . . . c . . .
. . . . . . . . . . . .
;
run;

*If you want to delete an observation when the data for every variable is missing, use the following code;

*Approach 1: Using the COALESCE and COALESCEC functions inside the data step;

data drop_misobs;

set missing;
*COALESCE handles numeric variables and COALESCEC handles character variables;
if missing(coalesce(of _numeric_)) and missing(coalescec(of _character_)) then delete;
run;

 
Pros:
*Simple code
Cons:
*As written, the _NUMERIC_ and _CHARACTER_ lists cover every variable, so this code does not work if we want to delete observations based on specific variables rather than all of them.

*Approach 2: Using the N/NMISS functions inside the data step;

data drop_missing;

set missing;
*Checks the number of non-missing values. The N function accepts numeric arguments only, so the character variables are checked with CMISS;

if n(of n1-n8)=0 and cmiss(of c1-c4)=4 then delete;
run;

data drop_missing;

set missing;
*Checks the number of missing values. The NMISS function accepts numeric arguments only, so the character variables are checked with CMISS;

if nmiss(of n1-n8)=8 and cmiss(of c1-c4)=4 then delete; *8 numeric + 4 character = 12, the total number of variables in the dataset;
run;
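
The checks above only cover the case where every variable is missing. For the "most of the variables" case, one option is to compare the count of missing values against a cutoff. The following is a minimal sketch, not a definitive rule: the 0.75 cutoff, the helper variable N_MISSING and the output dataset name DROP_MOSTLY_MISSING are assumptions made for illustration.

```sas
/* Sketch only: the 0.75 cutoff and the dataset name are illustrative assumptions */
data drop_mostly_missing;
  set missing;
  /* CMISS counts missing values across both numeric and character arguments */
  n_missing = cmiss(of n1-n8, of c1-c4);
  /* 12 is the total number of variables in the sample data set */
  if n_missing / 12 > 0.75 then delete;
  drop n_missing;
run;
```

The variable lists are spelled out instead of using `of _all_` so that the helper variable N_MISSING is not counted among the checked variables.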

*If you want to delete records based on only some of the variables and do not want to type all the variable names in the IF-THEN statement, use the following code;

*Task: Delete observations from the dataset if all variables in the dataset except N1 and C1 have missing data;

proc contents data=missing out=contents(keep=memname name);
run;


*Create a macro variable NAMES with the list of variable names in the dataset;
proc sql;
select distinct name into :names separated by ','
from contents(where=(upcase(name) not in ('N1','C1'))) where memname='MISSING'; *Excluding 2 variables in the dataset;
quit;


data remove_missing;
set missing;
*The N function accepts numeric arguments only, so CMISS is used to count missing values across numeric and character variables;
if cmiss(&names) = countw("&names", ',') then delete;
run;


Difference Between RUN and QUIT statements

Folks, here is the answer from Andrew Karp.

****************************************************************************************************;
Allow me to weigh in on this topic. It comes up a lot when I give SAS training classes. First, RUN and QUIT are both "explicit step boundaries" in the SAS Programming Language. PROC and DATA are "implied step boundaries."

Example 1: Two explicit step boundaries.

DATA NEW;
SET OLD;
C = A + B;
RUN;
PROC PRINT DATA=NEW;
RUN;

In this example, both the data and the proc steps are explicitly "ended" by their respective RUN statements.

Example 2: No explicit step boundaries.

DATA NEW;
SET OLD;
C = A + B;
PROC PRINT DATA=NEW;

In this example, the data step is implicitly terminated by the PROC statement. But, there is no step boundary for the PROC PRINT step/task, so it will not terminate unless/until the SAS supervisor "receives" a step boundary.

Some PROCs support what is called RUN-group processing. These include PROCs DATASETS and PLOT in the BASE module, PROCs REG and GLM in the STAT module, and PROC ARIMA in the ETS module.

Example 3: PROC DATASETS with RUN group processing.

PROC DATASETS LIBRARY = MYLIB;
MODIFY SET1;
FORMAT VAR1 DOLLAR12.2;
LABEL VAR1 = "Dollars Paid";
RUN;

DELETE SET2;
RUN;

CHANGE SET3 = SET2;
RUN;
QUIT;


In this example, three separate data management tasks are carried out in the order they were written/coded. First a label and format are added to the descriptor portion of SET1, then SET2 is deleted, and then SET3 is renamed to SET2. The RUN ends each command in the PROC DATASETS step (MODIFY, DELETE, CHANGE) and QUIT ends the step. If the explicit step boundary had been omitted, the step would have been implicitly terminated by a subsequent PROC or DATA statement. If there were no implied step boundary following the last RUN command, then the PROC DATASETS step would not terminate.

The same holds true with other RUN-group enabled PROCs in SAS Software. The ARIMA procedure in the ETS module, for example, implements what is called the Box-Jenkins methodology to analyze a time series and then generate future forecasted values from the existing series. There are three parts to this methodology, which are implemented in PROC ARIMA using (in this order) the IDENTIFY, ESTIMATE and FORECAST statements. The output from each statement is needed by PROC ARIMA to move to the next step in the process. An experienced forecaster can look at the output generated by the IDENTIFY statement and write the appropriate ESTIMATE statement syntax, and then do the same with the output generated by the ESTIMATE statement to write the proper FORECAST statement syntax. Once the analyst is satisfied with their model, they can terminate the PROC ARIMA step with a QUIT statement and move on to the next part of their project.

I hope this has been helpful.

Andrew Karp
Sierra Information Services

*******************************************************************************************************;
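
As a rough illustration of the PROC ARIMA workflow described in the message above (this sketch is not part of Andrew Karp's original reply), the three statements can be submitted as separate RUN groups inside one step. It assumes SAS/ETS is licensed and uses the SASHELP.AIR sample dataset. The differencing and model orders, and the output dataset name AIR_FORECAST, are illustrative assumptions, not a recommended model.

```sas
/* Sketch only: SASHELP.AIR and the model orders are illustrative assumptions */
proc arima data=sashelp.air;
  /* Run group 1: examine the differenced series */
  identify var=air(1,12);
run;
  /* Run group 2: fit a model chosen after reviewing the IDENTIFY output */
  estimate q=(1)(12) noint method=ml;
run;
  /* Run group 3: forecast 12 months ahead from the fitted model */
  forecast lead=12 id=date interval=month out=air_forecast;
run;
quit;
```

Each RUN executes the statements submitted so far while the procedure stays active. Only the final QUIT ends the PROC ARIMA step.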

Disclosure:

In the spirit of transparency and innovation, I want to share that some of the content on this blog is generated with the assistance of ChatGPT, an AI language model developed by OpenAI. While I use this tool to help brainstorm ideas and draft content, every post is carefully reviewed, edited, and personalized by me to ensure it aligns with my voice, values, and the needs of my readers. My goal is to provide you with accurate, valuable, and engaging content, and I believe that using AI as a creative aid helps achieve that. If you have any questions or feedback about this approach, feel free to reach out. Your trust and satisfaction are my top priorities.