Discover More Tips and Techniques on This Blog

Easy way to UPCASE variable names of SAS dataset

option VALIDVARNAME=UPCASE;

Use trhe above option statement to upcase the variable name of the SAS dataset irrespective of type of variable in the dataset (character or numeric).


The following example shows how the option sattement VALIDVARNAME=UPCASE works.

proc contents data=sashelp.class out=test;

run;




Note: Propcase variable names. 

*Upcasing the variables;

option validvarname=upcase;
proc sort data=sashelp.class out=test; run;



Because of the option statement. Ex:  'Age'  becomes 'AGE' and 'Height' becomes 'HEIGHT' etc.

See the SAS Language Reference dictionary to get more details.


Another way to do this is to use a macro and I call it as UPCASE macro.

proc sort data=sashelp.class out=test;

by name;
run;
 
%macro upcase (lib,dsn);


*Create a macro variable with the total number of variable count;
data _null_;
set sashelp.vtable(where=(libname="&LIB" and memname="&DSN"));
call symput('nvars',nvar);
run;

*Create a macro variable with variable names;data _null_;
set sashelp.vcolumn(where=(libname="&LIB" and memname="&DSN"));
call symput(cats("var",_n_),name);
run;

proc datasets library=&LIB;
modify &DSN;
%do i = 1 %to &nvars;
rename &&var&i=%upcase(&&var&i);
%end;
;
quit;
run;
%mend;
%upcase(WORK,TEST);

MDY Function

The MDY function converts MONTH, DAY, and YEAR values to a SAS date value. For example, MDY(10,19,1999) returns the SAS date value '19OCT99'D.

Syntax:    MDY(month,day,year)

Arguments

month : specifies a numeric expression that represents an integer from 1 through 12.

day    :specifies a numeric expression that represents an integer from 1 through 31.

year   :specifies a two-digit or four-digit integer that represents the year. The YEARCUTOFF= system    option defines the year value for two-digit dates.

If you know month, day, and year values, it’s very easy to derive date variable. You just need to use MDY function (of course, month, day, and year should be numeric). However, if the data is character then the conversion to numeric should occur first and then the conversion to the date value.

Example:

*When month, day, and year has numeric values;
data test;
year=1999;
month=12;
day=19;
newdate=mdy(month,day,year);
format newdate yymmdd10.;
run;


*When month, day, and year has character values;
data test;
year='1999';
month='12';
day='19';
newdate=mdy(input(month,2.),input(day,2.),input(year,4.);
format newdate yymmdd10.;
run;

You can use CATX and Input functions to get the same result.

newdate=input(catx('-',year,month,day),yymmdd10.);

Ref: SAS Help and Documentation.

Overview on CDISC Implementation

CDISC Advantages



CDISC has developed a set of data standards to enhance

data collection,
management,
analysis, and
reporting efficiencies,
improve safety monitoring, and
streamline the review and approval process for investigational treatments.

Under the ICH’s electronic Common Technical Document (eCTD) guidance, CDISC Study Data Tabulation Model (SDTM) is the preferred standard for content format and structure of clinical data for all clinical studies. Based on proposed federal regulations, the FDA will mandate that all clinical trial submissions be in electronic format and that the content comply with data standards guidance. Veristat helps our clients by not only implementing these standards on a project or program, but also by providing our clients with an understanding of the CDISC standards.

source:veristatinc.com




source: http://cro.businessdecision.com/


('DiggThis’)

How to convert the datetime character string to SAS datetime value? (ANYDTDTM and MDYAMPM formats)

When we have a string like this "9/01/2010 11:52:54 AM" and would like to translate the string to a numeric SAS date time variable, most of the times we use SCAN function to extract the information to get the DATETIME format. This is definitely a tedious job.

SAS formats (MDYAMPM, ANTDTDTM) comes to rescue us. Here is how it works.

data test;
length date $25;

date="9/01/2010 11:52:54 AM";
*Convert the character string to SAS datetime value;
datetimevar =input(date,mdyampm25.2);
datetimevar1 =input(date,anydtdtm20.);
*Apply format to the SAS date time value;
format datetimevar datetimevar1 datetime19.;
run;

Result: 01SEP2010:11:52:54


*ANYDTDTM and MDYAMPM informats work together when the datetime value has AM PM specified or day, month, and year components are not ambiguous.

The MDYAMPMw. format writes datetime values with separators in the form mm/dd/yy hh:mm AM PM, and requires a space between the date and the time.
The ANYDTDTM w. format writes datetime values with separators in the form dd/mm/yy hh:mm AM PM, and requires a space between the date and the time.


When a value is read with ANYDTDTMw. informat and the style of the value is dd/dd/dd(dd) tt:tt:tt AM
PM, the MDYAMPMw.d informat is called to read the value. If the AM PM component is not present, the MDYAMPMw.d informat is used as long as the month and day components aren't ambiguous. If they are ambiguous, the value of the DATESTYLE= system option is used to determine the order of month, day, and year components.

MDYAMPMw.d Format
_______________________________________

Writes datetime values in the form mm/dd/yy hh:mm AM PM. The year can be either two or four digits.

Details

The MDYAMPMw.d format writes SAS datetime values in the following form:

mm/dd/yy hh:mm

The following list explains the datetime variables:

mm                     is an integer from 1 through 12 that represents the month.

dd                       is an integer from 1 through 31 that represents the day of the month.

yy or yyyy         specifies a two-digit or four-digit integer that represents the year.

hh                       is the number of hours that range from 0 through 23.

mm                     is the number of minutes that range from 00 through 59.

AM PM             specifies either the time period 00:01-12:00 noon (AM) or the time period 12:01-12:00   midnight (PM). The default is AM.

date and time separator characters     is one of several special characters, such as the slash (/), colon (:), or a blank character that SAS uses to separate date and time components.

Source:  SAS(R) 9.2 Language Reference: Dictionary, Third Edition

Examples:

These examples illustrate how the ANYDTDTMw. informat reads values based upon an AM PM specification and the DATESTYLE= system option.
--------------------------------------------------------------------------------
/* Since AM is specified with the value, the ANYDTDTM informat  is called to read the datetime value. */

options datestyle=dmy;


data test1;
format xtext $22. xdate DATETIME18.;
xtext="07/01/2008 12:30:00 AM";
xdate=input(xtext,ANYDTDTM30.);

proc print;
run;

/* Since AM PM aren't specified and the month and day components  are ambiguous, the DATESTYLE= system option is used to determine their order. */

options datestyle=dmy;


data test2;
format xtext $22. xdate DATETIME18.;
xtext="07/01/2008 12:30:00";
xdate=input(xtext,ANYDTDTM30.);


proc print;
run;
Source: http://support.sas.com/kb/37/309.html


If you have a date "Saturday, November 01, 2008"  and would like to convert it to numeric with DDMMYY10. format here is the way to do it.

data fmtchnge;length worddate $40;
worddate = "Saturday, November 01, 2008";
dmyfmt= input(substr(worddate,findc(worddate,',')+1),anydtdte30.);
*dmyfmt=input(substr(worddate,index(worddate,',')+2),anydtdte32.);

format dmyfmt mmddyy10.;
run;



 
 ('DiggThis’)

SAS Keyboard Shortcuts

Here are the few shortcuts you need to know to speed up the code writing. These work in both EPG (Enterprise Guide) and SAS Enhanced Editor.
Shortcuts and their descriptions:
Remember that the keyboard shortcuts listed here are default.

Selection Operations:1) Comment the section with line comments (/): press CTL + /
2) Undo the comment: press CTL + SHIFT + /
3) Convert selected text to lowercase: press CTL + SHIFT + L
4) Convert selected text to uppercase: press CTL + SHIFT + U
Shortcuts (pre-defined) CTRL+Shift+L or +U (only for the enhanced editor), which convert all selected text into lowercase or uppercase respectively. These become very handy  when we insert the text by copy+paste.

5) Indent selected section: press TAB
6) Un-indent selected section: press SHIFT + TAB
7) To move curser to the matching DO/END statement: press
     ALT + [ or
     ALT + { or  
     ALT+] or
     ALT + }

 Miscellaneous:1) To see the desktop: press (Window’s Key + M) or (Window’s Key + D)

Navigate around Text:

1) Move to beginning of line: Press Home
2) Move to top: Press CTRL+Home
3) Move to end: Press CTRL+End

Mark (Highlight) the Text:

1) Mark to beginning of line: Press SHIFT+Home
2) Mark to end of line: Press SHIFT+End
3) Mark to top: Press SHIFT+CTRL+Home
4) Mark to end: Press SHIFT+CTRL+End

Window Control:

1) To cascade the windows: Press SHIFT+F5
2) To Tile the windows: Press SHIFT+F4
3) To go to the next window: Press CTRL+F6
4) To close the active window: Press CTRL+F4
5) To exit the SAS system: Press ALT+F4


('DiggThis’)

Mastering Global and Local Macro Variables in SAS: Essential Techniques and Best Practices for SDTM Programmers

Mastering Global and Local Macro Variables in SAS: A Detailed Guide for SDTM Programmers

Mastering Global and Local Macro Variables in SAS: A Detailed Guide for SDTM Programmers

In SAS programming, especially when dealing with complex tasks such as SDTM (Study Data Tabulation Model) dataset creation, macro variables play a critical role in automating processes and managing large amounts of data efficiently. Understanding the distinction between Global and Local macro variables, and how to use them appropriately, is essential for writing clean, maintainable, and robust code.

What are Macro Variables in SAS?

Macro variables in SAS are placeholders that store text strings, which can be reused throughout your SAS programs. They are part of the SAS Macro Language and are used to make your code more dynamic and flexible. By using macro variables, you can avoid hardcoding values and make your code easier to modify and scale.

There are two primary types of macro variables in SAS:

  • Global Macro Variables: These are accessible from anywhere in your SAS session after they are defined. They persist until the end of the session or until explicitly deleted.
  • Local Macro Variables: These are only accessible within the macro where they are created. They are automatically deleted when the macro finishes executing.

Creating and Using Global Macro Variables

Global macro variables are ideal for values that need to be consistent and accessible across multiple programs or within different parts of the same program. In SDTM programming, global macro variables can be used to define study-specific parameters, dataset paths, and other constants that are used throughout your code.

How to Create Global Macro Variables

To create a global macro variable, you can use either the %GLOBAL statement or the %LET statement at the top level of your SAS program or macro. The %GLOBAL statement explicitly declares the macro variable as global, while %LET can implicitly create a global variable if used outside of a macro.

* Using %GLOBAL to declare a global macro variable;
%global study_id;
%let study_id = ABC123;

* Using %LET at the top level (implicitly global);
%let sdtm_path = C:\SASProjects\SDTM\datasets;

* Accessing the global macro variable;
%put The study ID is &study_id;
%put The SDTM path is &sdtm_path;

In the example above, both study_id and sdtm_path are global macro variables that can be accessed anywhere in your SAS session. This is particularly useful when you need to refer to these values in multiple DATA steps, procedures, or macros.

Examples in SDTM Programming

Global macro variables are frequently used in SDTM programming to store key information such as dataset paths, study identifiers, and other parameters that need to be consistent across multiple datasets. Here are some practical examples:

Example 1: Managing Dataset Paths

In a typical SDTM project, you might have several datasets stored in a specific directory. Instead of hardcoding the directory path in every step, you can define a global macro variable to store the path:

%global sdtm_path;
%let sdtm_path = C:\SASProjects\SDTM\datasets;

data dm;
    set "&sdtm_path.\dm.sas7bdat";
run;

data ae;
    set "&sdtm_path.\ae.sas7bdat";
run;

data lb;
    set "&sdtm_path.\lb.sas7bdat";
run;

By using the sdtm_path global macro variable, you can easily change the dataset path in one place, and the change will be reflected wherever the macro variable is used. This approach reduces errors and makes your code more flexible.

Example 2: Study-Specific Information

In SDTM programming, certain information like the study ID, sponsor name, or protocol version may be required across multiple datasets. Defining these as global macro variables ensures consistency and makes your code easier to update.

%global study_id sponsor protocol_version;
%let study_id = ABC123;
%let sponsor = PharmaCorp;
%let protocol_version = 1.0;

data dm;
    set sdtm.dm;
    studyid = "&study_id";
    sponsor = "&sponsor";
    protocol = "&protocol_version";
run;

data ae;
    set sdtm.ae;
    studyid = "&study_id";
    sponsor = "&sponsor";
    protocol = "&protocol_version";
run;

With these global macro variables, you ensure that the study ID, sponsor, and protocol version are consistent across all datasets without the need to manually enter these values in each step.

Creating and Using Local Macro Variables

Local macro variables are designed for use within specific macros, where they are created, used, and then automatically deleted when the macro finishes execution. This makes them ideal for temporary calculations or for ensuring that variable names do not conflict with other variables in your program.

How to Create Local Macro Variables

Local macro variables are created using the %LOCAL statement within a macro. Any macro variables created with %LOCAL are only accessible within that macro and are not visible outside of it.

%macro calculate_age(birth_date, ref_date);
    %local age_years;
    
    age_years = intck('year', &birth_date, &ref_date);
    if month(&ref_date) lt month(&birth_date) or
       (month(&ref_date) = month(&birth_date) and
        day(&ref_date) lt day(&birth_date)) then
        age_years = age_years - 1;
    
    &age_years
%mend calculate_age;

In this example, age_years is a local macro variable that is used to store the calculated age. Once the macro finishes executing, age_years is deleted and is not accessible outside of the macro.

Examples in SDTM Programming

Local macro variables are particularly useful in SDTM programming when you need to perform temporary calculations, create intermediate variables, or ensure that variable names do not conflict with those in other parts of your program.

Example 1: Deriving Variables within a Macro

When deriving variables such as age, visit number, or treatment period within a macro, local macro variables can be used to store intermediate values.

%macro derive_variables(data_in, data_out);
    %local age visit_num trt_period;
    
    data &data_out;
        set &data_in;
        age = %calculate_age(dob, rfstdtc);
        visit_num = input(visit, best.);
        trt_period = trtan;
    run;
%mend derive_variables;

%derive_variables(sdtm.dm, derived.dm);

In this example, the local macro variables age, visit_num, and trt_period are used to store intermediate values during the derivation process. These variables are local to the derive_variables macro and do not interfere with other variables outside the macro.

Example 2: Avoiding Name Conflicts in Large Projects

In large SDTM projects, multiple programmers may use similar variable names across different macros. Using local macro variables helps prevent conflicts and unintended overwriting of values.

%macro create_dm_var();
    %local subj_id;
    subj_id = "0001";  /* Local variable used within this macro */
    call symputx('subj_id', subj_id);
%mend create_dm_var;

%create_dm_var;
%put Subject ID: &subj_id;

In this example, subj_id is a local macro variable within the create_dm_var macro. This ensures that the variable does not conflict with any global variables or other variables in different macros, preserving the integrity of the data.

Advanced Techniques: Using Global and Local Macro Variables Together

In complex SDTM programming tasks, you may find yourself using both global and local macro variables together to achieve more sophisticated data manipulations. For example, you might use a global macro variable to set a general condition or path, and local macro variables within macros to perform specific calculations or transformations.

Example: Creating a Reusable Macro for Multiple Studies

Suppose you need to create a macro that processes SDTM datasets for multiple studies. You can use global macro variables to store study-specific information and local macro variables for intermediate calculations.

%global study_id sdtm_path;
%let study_id = XYZ789;
%let sdtm_path = C:\SASProjects\SDTM\datasets;

%macro process_sdtm(data_set);
    %local age visit_num;
    
    data processed_&data_set;
        set "&sdtm_path.\&data_set..sas7bdat";
        age = %calculate_age(dob, rfstdtc);
        visit_num = input(visit, best.);
    run;
    
    proc print data=processed_&data_set;
    run;
%mend process_sdtm;

%process_sdtm(dm);
%process_sdtm(ae);

In this example, the global macro variables study_id and sdtm_path store information that is consistent across the study, while local macro variables age and visit_num are used for specific calculations within the macro. This approach makes the macro reusable across different datasets while ensuring that variable names do not conflict with others.

Best Practices for Using Macro Variables in SDTM Programming

To get the most out of macro variables in SDTM programming, consider the following best practices:

  • Use Descriptive Names: Choose clear and descriptive names for your macro variables to make your code easier to understand and maintain.
  • Scope Control: Be mindful of the scope of your macro variables. Use local macro variables for temporary calculations and global macro variables for values that need to be consistent across multiple programs.
  • Document Your Code: Include comments in your code to explain the purpose of each macro variable, especially if they are used in multiple places.
  • Reuse and Modularize: Create reusable macros that can be applied across different studies or datasets, using macro variables to customize behavior.
  • Test Thoroughly: Test your macros thoroughly to ensure that global and local variables are behaving as expected, especially in complex programs.

Conclusion

Mastering the use of global and local macro variables is essential for effective SDTM programming in SAS. Global macro variables provide a way to store and access values consistently across your entire SAS session, while local macro variables allow for precise control within specific macros, avoiding conflicts and unintended overwrites. By understanding and applying these concepts, you can create more flexible, maintainable, and robust SAS programs.

Whether you're managing study-specific information, performing complex data transformations, or developing reusable macros, the strategic use of macro variables will enhance your efficiency and effectiveness as an SDTM programmer.

How to get the details of formats from the format libraries:

If you are like me wanted to get the details of formats stored in the format library, here are two easy ways ….Proc catalog or Proc format

Proc Catalog: Proc catalog will give us the details about name and type (numeric or character) of formats

Syntax:
proc catalog catalog = frmtdir.formats;
contents;
run;

*FORMATS, is the name of the folder where all permanent formats are stored inside the library FRMTDIR.

*If you want to get the details of temporary formats use WORK in place of frmtdir.formats;

Proc Format:

Use either SELECT or EXCLUDE statements to choose the formats for which you want to get the details.

Syntax:

proc format library = frmtdir.formats;
select locfmt;
run;

This code will provide the details of LOCFMT in the FRMTDIR library.


Below code gives complte information about what's stored in your format catalogs.

libname frmtdir 'c:\saswork';


proc format library=frmtdir.formats cntlout = formats;
run;

Look at the output dataset 'formats' to get the details of the formats.

('DiggThis’)

Converting SAS datasets to SPSS

If you want to view SAS dataset in SPSS you can use GET SAS command of SPSS.

Here is the syntax;
get sas data='C:\data\class.sas7bdat'.

For conversion of SAS to SPSS we need to see if any formats assigned to variables in the dataset or not.
If there are no formats then we just follow following steps to convert SAS dataset to SPSS.

**STEP1: Creating .xpt file of a SAS dataset using Proc COPY.**

libname SAS 'c:\sas\data\';
libname SPSS xport 'c:\sas\data\class.xpt';

proc copy in=sas out=spss;
select class;
run;


**STEP2: Use SPSS command to convert the transport format SAS file to SPSS;**


You should use following commands to convert transport format file to SPSS data.


get sas data='c:\sas\data\class.xpt'.
execute.

*******************************************************************************************;
If there are formats then we need to convert the formats catalog to a SAS data set before converting the SAS dataset into a .XPT file. This has to be done inside SAS to use the SAS formats as the value labels for SPSS data.

**STEP1: Creating .xpt file of a SAS dataset using Proc COPY.**


libname formats 'c:\sas\catalogs';

proc format library=formats cntlout=fmts;
run;


***Transport file of SAS formats;**


libname fmt2spss xport 'c:\sas\fmts.xpt';


proc copy in=work out=fmt2spss;
select fmts;
run;

***Transport file of SAS dataset.**


libname SAS 'c:\sas\data';
libname SPSS xport 'c:\sas\data\class.xpt';


proc copy in=sas out=spss;
select class;
run;


**STEP3: Use SPSS command to convert the transport format SAS file and Formats to SPSS;**


*Use following SPSS command to convert transport format file to SPSS data;


get sas data='c:\sas\data\class.xpt' /formats='c:\sas\fmts.xpt'.
execute .

('DiggThis’)

Delete observations from a SAS data set when all or most of variables has missing data

/* Sample data set */
data missing;
input n1 n2 n3 n4 n5 n6 n7 n8 c1 $ c2 $ c3 $ c4 $;
datalines;
1 . 1 . 1 . 1 4 a . c .
1 1 . . 2 . . 5 e . g h
1 . 1 . 3 . . 6 . . k i
1 . . . . . . . . . . .
1 . . . . . . . c . . .
. . . . . . . . . . . .
;
run;

*If you want to delete observation  if the data for every variable is missing then use the following code;

*Approach 1: Using the coalesce option inside the datastep;

data drop_misobs;

set missing;
if missing(coalesce(of _numeric_)) and missing(coalesce(of _character_)) then delete;
run;

 
Pros:
*Simple code
Cons;
*This code doesn't work if we want to delete observation based on specific variables and not all of them.

*Approach 2:Using N/NMISS option inside the datastep;

data drop_missing;

set missing;
*Checks the Non missing values using ;

if n(n1, n2, n3, n4, n5, n6, n7, n8, c1, c2, c3, c4)=0 then delete;
run;

data drop_missing;

set missing;
*Checks the missing values using nmiss option;

if nmiss(n1, n2, n3, n4, n5, n6, n7, n8, c1, c2, c3, c4)=12 then delete; *12 is the total number of variables in the dataset missing.;
run;

*If you want to delete records based on few variables and don't want to type all the variable names in the IF-THEN clause use the following code;

*Task: Delete observations from the dataset if all variables in the dataset except (N1 and C1) has missing data;

proc contents data=missing out=contents(keep=memname name);
run;


*Create a macro variable names with list of variable names in the dataset;
proc sql;
select distinct name into:names separated by ','
from contents(where=(upcase(name) ^in ('N1','C1'))) where memname='MISSING'; *Excluding 2 variables in the dataset;
quit;


data remove_missing;
set missing;
if n(&names) lt 1 then delete;
run;

('DiggThis’)

Diffrence Between RUN and QUIT statements

Folkes,  Here is the answer from Andrew Karp..... Direct link

****************************************************************************************************;
Allow me to weigh in on this topic. It comes up alot when I give SAS training classes. First, RUN and QUIT are both "explicit step boundaries" in the SAS Programming Language. PROC and DATA are "implied step boundaries."

Example 1: Two explicit step boundaries.

DATA NEW;
SET OLD:
C = A + B;
RUN;
PROC PRINT DATA=NEW;
RUN;

In this example, both the data and the proc steps are explicitly "ended" by their respective RUN statements.

Example 2: No explicit step boundaries.

DATA NEW;
SET OLD;
C = A + B;
PROC PRINT DATA=NEW;

In this example, the data step is implicitly terminated by the PROC statement. But, there is no step boundary for the PROC PRINT step/task, so it will not terminate unless/until the SAS supervisor "receives" a step boundary.

Some PROCS support what is called RUN group processing. These include PROCs DATASETS and PLOT in the BASE Module, PROCs REG and GLM in the STAT module and ARIMA in the ETS module.

Example 3: PROC DATASETS with RUN group processing.

PROC DATASETS LIBRARY = MYLIB;
MODIFY SET1;
FORMAT VAR1 DOLLAR12.2;
LABEL VAR1 "Dollars Paid";
RUN;

DELETE SET2;
RUN;

CHANGE SET3 = SET2;
RUN;
QUIT;


In this example, three separate data mangement tasks are carried out in the order they were written/coded. First a label and format are added to the descriptor portion of SET1, then SET 2 is deleted, and then SET3 is renamed to SET2. The RUN ends each command in the PROC DATASETS step (MODIFY, DELETE, CHANGE) and QUIT ends the step. If the explicit step boundary had been omitted, the step would have been implicity terminated by a subsequent PROC or DATA statement. If there were no implied step boundary following the last RUN command then the PROC DATASETS step would not terminate.

The same holds true with other RUN-group enabled PROCs in SAS Software. The ARIMA procedure in the ETS module, for example, implements what is called the Box-Jenkins methodology to analyze a time series and then generate future forecasted values from the existing series. There are three parts to this methodology, which are implmented in PROC ARIMA using (in this order), the IDENTIFY, ESTIMATE and FORECAST statements. The output from each statement is needed by PROC ARIMA to move to the next step in the process, and an experienced forecaster can look at the output generated by the IDENTIFY statement and then write the appropriate ESTIMATE statement syntax, and then do the same thing with the output generated by the ESTIMATE statement to write the proper FORECAST statement syntax. Once the analyst is satisfied with their model, they can terminate the PROC ARIMA step with a QUIT statement and move on to the next part of their project.

I hope this has been helpful.

Andrew Karp
Sierra Information Services

*******************************************************************************************************;

Sending the LOG and OUTPUT from PC SAS to a seperate file

Here is how to direct the SAS LOG file and or SAS Output  to a seperate file.

Approach 1: Using Display Manager Statements;

filename log 'C:\temp\logfile.log';

filename out 'C:\temp\output.lst';

*Select only male students and age less than 16;
proc sql;
create table males as
select age, height, weight
from sashelp.class
where sex='M' and age lt 16 
order by age;
quit;


*Get the descriptive statistics for height variable by age;
proc means data=males ;
by age;
var height;
output out=htstats mean=mean n=n std=sd median=med min=min max=max;
run;


DM 'OUT;FILE OUT REP;';
DM 'LOG;FILE LOG REP;';

Information about Display Manager Commands:
DEXPORT and DIMPORT: DISPLAY MANAGER commands used to IMPORT and EXPORT the Tab delimited (Excel and .CSV) files;
SAS Display Manager Commands



Approach 2: Using Proc PRINTTO procedure;

Refer:  How to save the log file or what is PROC PRINTTO procedure 

('DiggThis’)

Random Sample Selection

Last week my manager asked me to randomly pick 10%observations from a large data set and then create a listing so that the Data management programmers can QC the data. I want to share some thoughts here … how easy and simple to do random sampling.
Approach 1:

Data step Approach: In this approach, the observations are shuffled using the RANUNI function which assigns a random number to each observation.

Step1: Generating the Random Vector (shuffling) using the RANUNI function;
The RANUNI function generates a random number from a continuous uniform distribution (the interval (0, 1).

Step2: After assigning a random number to each record, the records can then be sorted in ascending or descending order of the random numbers.;

data randsamp ;
input patno @@;
random=RANUNI(-1);
* RANUNI function to assign a random number to each record.;
* Here the seed is negative integer (-1) so the results are not replicable.;
cards;
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
;
run;

*Sort the records by increasing shuffling order;
proc sort data=randsamp; by random; run;

*CREATE MACRO VARIABLE FOR 9 (10% of 83 Subjects) RANDOM PATIENTS LIST;
proc sql;
select distinct(patno) into:random separated by "," from randsamp where monotonic() le 9;
quit;
%put &random;

Proc SQL Approach: 
In this approach also, the observations are shuffled using the RANUNI function which assigns a random number to each observation.

*RANDOMLY SELECTING 10% OF A LARGE DATASET using Proc SQl and RANUNI function.;
*The following Proc SQL code will create a table called rands consisting of approximately 10% of the records randomly selected from dataset randsamp;

DATA randsamp ;
input patno @@;
cards;
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
;
run;
proc sql;
create table rands(where= (monotonic() eq 9)) as
select *, RANUNI(-1) as random from randsamp order by random ;
quit;

*Proc Surveyselect Approach;

*The following Proc Survey select code will create a table called sample consisting of approximately 10% of the records randomly selected from dataset randsamp;

DATA randsamp;
input patno @@;
cards;
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
;
run;

PROC SURVEYSELECT DATA=randsamp2 OUT=sample SAMPSIZE=9 SEED=-1;
RUN;

Important Note:
The value generated by the RANUNI function depends on a seed. The seed should be a non-negative integer from 1 to 2,147,483,646 in order to replicate the results of the RANUNI function. That is, given the same seed, the function produces the same result. If no seed, zero, or negative integers are specified as the seed, the computer clock sets the seed and results are not replicable.
Source: SESUG-2000 (P-404).pdf on Random Sample Selection by Imelda C. Go, Richland County School District One, Columbia, SC

If you use a positive number as a seed then you can replicate the random sample records as long as you don’t change the seed number. If you use the negative number (as in the above programs) you can’t replicate the records. Every time you submit the program random samples generated will be different.
('DiggThis’)

WARNING: You may have unbalanced quotation marks.

SAS can allow the strings up to 32,767 characters long but some times SAS will write a Warning message WARNING: The quoted string currently being processed has become more than 262 characters long. You may have unbalanced quotation marks., when you try to keep a character string longer than 262 characters to a variable.  It is hard to look back at the SAS code to search for unbalanced quotes.
To make it more clearly I am going to show an example.

I want to add a 263 characters long name to a variable (longvar) and to do that I will simply use a data step… and when I do that I will see the WARNING message in Log.

data TEST;
x="(SEE DOCTOR'S LETTER)3RD ADMINISTRATION OF MTX WAS DELAYED BY 14 DAYS AND WAS REDUCED TO 1G/M2 INSTEAD OF 5G/M2, PROBLEMS, E.COLI SEPSIS WITH HEART INSUFFICIENCY WITH SINUS TACHYCARDY, PARALYTIC ILEUS, TACHYPNEA , PATIENT DIED ON 21.04.98 FROM MULTIORGAN FAILURE.";
y=length(x);
put x;
run;

LOG FILE:

There is a SAS option (NOQUOTELENMAX) which will take care of the WARNING message.

Options noQuoteLenMax; *Just before the Data step;

Don’t forget to change back to Options QuoteLenMax; after the end of the Data step.

Options noQuoteLenMax;

CALL EXECUTE: Easy way to print or sort multiple files.

When printing multiple files, or sorting multiple datasets, the traditional method is to write multiple steps as below.

Proc print data=libref.ae; var _all_; run;
Proc print data=libref.conmed; var _all_; run;
Proc print data=libref.demog; var _all_; run;
Proc print data=libref.lab; var _all_; run;

Proc print data=libref.medhist; var _all_; run;
If you are like me who likes to simplify the traditional SAS code here is the tip. CALL EXECUTE comes to rescue here.

*Using Disctionary Tables and Call Execute;
proc sql;
create table dsn as select distinct memname from dictionary.tables
where libname="LIBREF" and memtype="DATA";
quit;

*Sorts all the datasets using Call Execute;
data _null_;
set dsn;
call execute ("proc sort data=final.||'memname||';by usubjid; run;");
run;

*Prints all the datasets using Call Execute;
data _null_;
set dsn;
call execute ("proc print data=final.||'trim(memname)||';var _all_; run;");
run;

*Using Proc Contents and Call Execute;
proc contents data=libref._all_ out=contents(keep=memname);
run;

*Create a macro variable memname with list of all the datasets; 
proc sql;
select distinct memname into:memname from contents;
quit;
%put memname;

*Sorts all the datasets using Call Execute;
data _null_;
set dsn;
call execute ("proc sort data=libref.||'trim(memname)||';by usubjid; run;");
run;

*Prints all the datasets using Call Execute;
data _null_;
set dsn;
call execute ("proc print data=libref.||'trim(memname)||';var _all_; run;");
run;

*Using SASHELP Views and Call Execute to sort the dataset by usubjid;

*Sorts all the datasets using Call Execute;
data _null_;
set sashelp.vtable (where=(libname="LIBREF"));
call execute("proc sort data=libref.||'trim(memname)||';by usubjid;run;");
run;

*Prints all the datasets using Call Execute;
data _null_;
set sashelp.vtable (where=(libname="LIBREF"));
call execute("proc print data=libref.||'trim(memname)||';by _all_;run;");
run;

*If you are not printing/sorting all the datasets in the library here is code for that.The Following code only prints 4 datasets (AE, Conmed, Demog , Lab and Medhist);

data _null_;
do dsname='ae', 'conmed', 'demog', 'lab', 'medhist';
call execute("Proc print data=libref.||'trim(dsname))||'; var _all_; run;");
end;
run;

Write a Letter using SAS/ Emailing with SAS

SAS can do many things which most of us don’t have a clue. Here is one example….

Writing a letter:

filename formltr 'C:\Documents and Settings\sreddy\Desktop\formltr.rtf';

data address;
infile datalines;
input @ 1 stno
 @ 6 lane $12.
@19 aptno $7.
@27 city $9.
@37 state $2.
@40 zip ;
datalines;
2550 Augusta Blvd Apt#203 Fairfield OH 45014
;
run;

data _null_;
retain lm 5;
set address;
file formltr;* print notitles;
put _page_;
adr1 = trim(stno) ' ' trim(lane);
put @lm adr1;
adr2 = trim(aptno);
put @lm adr2;
adr3 = trim(city) ||', '|| trim(state) ||' '|| trim(zip);
put @lm adr3;
adr4 = trim('Dear')|| ' ' ||trim('SAS') || ' ' || trim('Users,');
put / @lm adr4;
put / @lm 'StudySAS Blog offers a lot of information regarding tips and tutorials on various topics ' ;
put @lm 'in SAS. It covers basics to get started to more in-depth topics like Macros and Proc SQL.';
put @lm 'It is a great site to browse to help broaden and deepen your SAS knowledge in a variety';
put @lm 'of areas.';
put / @lm 'Thanks for visiting StudySAS Blog. ';
put //// @lm 'Sarath Annapareddy';
run;
  
  • lm: represents left margin
  • / : forward slash symbol ( / ) skips a line.
  • If you want to skip ‘N’ number of lines use ‘N’ number of flashes after the PUT statement.
  • The trim function and concatenation operator (||) are important here because without these you will get extra spaces which we probably don't want see in our letter.
The Above SAS program will create a rtf in the specified location with the following information.



Emailing with SAS




How to send an email using SAS:

filename mymail email sastest@abc.com subject="Sending Email using SAS" from=abctest@gmail.com' attach="C:\Documents and Settings\sreddy\Desktop\formltr.rtf";


data _null_;
file mymail;
put 'Hello there, Please review the attached letter.';
put 'Thanks,';
put 'Sarath';
run;
quit;

Disclosure:

In the spirit of transparency and innovation, I want to share that some of the content on this blog is generated with the assistance of ChatGPT, an AI language model developed by OpenAI. While I use this tool to help brainstorm ideas and draft content, every post is carefully reviewed, edited, and personalized by me to ensure it aligns with my voice, values, and the needs of my readers. My goal is to provide you with accurate, valuable, and engaging content, and I believe that using AI as a creative aid helps achieve that. If you have any questions or feedback about this approach, feel free to reach out. Your trust and satisfaction are my top priorities.