Saturday, January 26, 2013

How to use the MISSING(), NMISS(), and CMISS() functions



SAS provides several functions for testing missing values; this post focuses on MISSING(), CMISS(), and NMISS(). NMISS() is reserved for numeric variables, while MISSING() and CMISS() accept either character or numeric variables. CMISS() and NMISS() count the number of arguments that have missing values, whereas MISSING() checks whether a single variable is missing. Together they offer a simple way to check for missing values, and they let you write fewer lines of code by replacing long IF statements when you need to check several variables at once.

MISSING() is very useful when you need to check whether a variable has a missing value but are not sure whether it is character or numeric. It works for both types, and it also recognizes the special numeric missing values (.A, .B, .C, ._, etc.). MISSING() returns a numeric result: 1 if the value is missing and 0 if it is present. In a condition, MISSING(varname) is the same as MISSING(varname)=1, and MISSING(varname)=0 is true when the data point is present.

The MISSING function is particularly useful if you use special missing values, since 'if varname=.' matches only the ordinary missing value (.) and will not identify the special missing values.
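As a quick illustration of that difference, here is a minimal sketch. The variable names (a, b, c and the flag variables) are made up for this example; the results are written to the log.

```sas
data _null_;
  a = .;     /* ordinary numeric missing value */
  b = .A;    /* special numeric missing value  */
  c = ' ';   /* character missing value        */

  a_eq_dot = (a = .);      /* 1: a is the ordinary missing value      */
  b_eq_dot = (b = .);      /* 0: .A is not equal to .                 */
  b_miss   = missing(b);   /* 1: MISSING() catches special missings   */
  c_miss   = missing(c);   /* 1: MISSING() also works on characters   */

  put a_eq_dot= b_eq_dot= b_miss= c_miss=;
  /* log shows: a_eq_dot=1 b_eq_dot=0 b_miss=1 c_miss=1 */
run;
```

The comparison `b = .` silently misses the special missing value, while MISSING(b) flags it.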

NOTE: SAS does not represent missing values the same way for every variable type. A single period (.) represents the ordinary numeric missing value. A blank enclosed in single or double quotes (' ' or " ") represents the character missing value. A single period followed by a letter or an underscore (e.g., .A, .B, .Z, ._) represents a special numeric missing value. Please note that special missing values are available for numeric variables only.

The NMISS() function counts the number of arguments with missing values in the specified list of numeric variables. It is very useful when you want to make sure that at least one variable in a list is not missing: compare the count it returns with the number of variables in the list.
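Here is one way that check might look. This is only a sketch: the dataset name (scores) and the variables (test1-test3, n_missing, any_present) are hypothetical.

```sas
data scores;
  input id test1 test2 test3;
  /* number of missing values among the three scores */
  n_missing = nmiss(of test1-test3);
  /* fewer missing values than variables means at least one score is present */
  any_present = (n_missing < 3);
  datalines;
1 90 . 85
2 . . .
3 70 80 75
;
run;

proc print data=scores noobs;
run;
```

Subject 1 gets n_missing=1 and any_present=1, subject 2 gets n_missing=3 and any_present=0, and subject 3 gets n_missing=0 and any_present=1.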

CMISS() is available starting with SAS 9.2 and SAS Enterprise Guide 4.3 and is similar to the NMISS() function. The only difference is that it counts the number of arguments that are missing for both character and numeric variables.
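A small sketch of CMISS() with a mix of character and numeric arguments, next to NMISS() on the numeric ones only. All variable names here are invented for the example.

```sas
data _null_;
  length name $8 site $4;
  name = 'Smith';
  site = ' ';     /* character missing */
  age  = .;       /* numeric missing   */
  wt   = 72.5;

  n_all = cmiss(name, site, age, wt);   /* 2: site and age are missing */
  n_num = nmiss(age, wt);               /* 1: only age is missing      */

  put n_all= n_num=;
  /* log shows: n_all=2 n_num=1 */
run;
```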

To summarize: NMISS() returns the number of its arguments that have missing values and works with multiple numeric values, whereas MISSING() works with only one value, which can be either numeric or character.

Examples:
* count the number of the variables A, B, and C which have missing values;
count=nmiss(A, B, C);
count=nmiss(of A B C);

* count the number of the variables from Var1 to Var10 which have missing values;
count=nmiss(of var1-var10);


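Examples (the result is shown in the comment after each statement):
x1=nmiss(1,0,.,2,5,.);   * returns 2;
x2=nmiss(1,0);           * returns 0;
x3=nmiss(of x1-x2);      * returns 0 because x1 and x2 are both non-missing;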

For more details, refer to the page "USING the CMISS, NMISS and MISSING FUNCTIONS".
For more details regarding the special missing values, please also refer to Special Missing Values in SAS (http://studysas.blogspot.com/2010/04/special-missing-values.html).

References:
1) Missing values in SAS (http://www.pauldickman.com/teaching/sas/missing.php)
2) MISSING! - Understanding and Making the Most of Missing Data, Suzanne M. Humphreys, PRA International, Victoria, BC (Canada), SUGI 31
3) Special Missing Values in SAS (http://studysas.blogspot.com/2010/04/special-missing-values.html)
4) Usage Note 36480, KNOWLEDGE BASE / SAMPLES & SAS NOTES, support.sas.com
5) SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition
6) Carpenter's Guide to Innovative SAS Techniques, Art Carpenter (page 99)




Saturday, January 12, 2013

Study day calculation (--DY variable in SDTM)




USE OF THE “STUDY DAY” VARIABLES

The permissible Study Day variables (--DY, --STDY, and --ENDY) describe the relative day of the observation starting with the reference date as Day 1. They are determined by comparing the date portion of the respective date/time variables (--DTC, --STDTC, and --ENDTC) to the date portion of the Subject Reference Start Date (RFSTDTC from the Demographics domain).

The Subject Reference Start Date (RFSTDTC) is designated as Study Day 1. The Study Day value is incremented by 1 for each date following RFSTDTC. Dates prior to RFSTDTC are decremented by 1, with the date preceding RFSTDTC designated as Study Day -1 (there is no Study Day 0). This algorithm for determining Study Day is consistent with how people typically describe sequential days relative to a fixed reference point, but creates problems if used for mathematical calculations because it does not allow for a Day 0. As such, Study Day is not suited for use in subsequent numerical computations, such as calculating duration. The raw date values should be used rather than Study Day in those calculations.
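To see why Study Day should not be used for durations, here is a small worked sketch. The dates and variable names (rfstdt, stdt, endt and so on) are made up for illustration, with an event that starts before and ends after the reference date.

```sas
data _null_;
  rfstdt = '10JAN2013'd;   /* hypothetical reference start date (Study Day 1) */
  stdt   = '08JAN2013'd;   /* event starts two days before the reference date */
  endt   = '12JAN2013'd;   /* event ends two days after the reference date    */

  /* Study day: add 1 only on or after the reference date, so there is no Day 0 */
  stdy = stdt - rfstdt + (stdt >= rfstdt);   /* -2 */
  endy = endt - rfstdt + (endt >= rfstdt);   /*  3 */

  dur_from_dy    = endy - stdy + 1;   /* 6: one day too many, Day 0 does not exist */
  dur_from_dates = endt - stdt + 1;   /* 5: correct, 08JAN through 12JAN           */

  put stdy= endy= dur_from_dy= dur_from_dates=;
  /* log shows: stdy=-2 endy=3 dur_from_dy=6 dur_from_dates=5 */
run;
```

Whenever the interval crosses the reference date, the duration computed from Study Day values is off by one, which is why the raw dates should be used instead.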

Reference: Study Data Tabulation Model Implementation Guide v3.1.2 (Page No 40).

You will find that you need to create --DY and/or --STDY/--ENDY variables in almost all the SDTM domains. Because the derivation is the same each time, it makes sense to write a macro and use it across all the domains...

/****************************************************************

*Study Number              :ABCD_0123
*Sponsor Protocol Number   : ABC1004
*Program Name              : studyday.sas
*Program Location          : X:\PROJECT\DEPT\ABC1004\Progs\macros
*Description               : StudyDAY Macro 
*Program Author            : Sarath Annapareddy
*Creation Date             : 13-Jul-2012
*Macro Parameters: 
  dsn   : Dataset that contains the --DTC variable for which the study day is calculated.
  var   : --DTC variable for which the study day is calculated.
  dy    : Name of the study day variable to create (e.g. AESTDY).
  rfdate: Reference date variable the study day is counted from (e.g. RFSTDTC).

*Notes: Macro must be used outside the data step.

****************************************************************/
/*************            Setup Section            ************/
/**************************************************************/




%macro make_studyday(dsn,var,dy,rfdate);

*Getting the Baseline or Reference start date from DM dataset;
proc sort data=interim.dm out=dm(keep=usubjid rfstdtc);
    by usubjid;
run;

proc sort data=&dsn;
    by usubjid;
run;

data &dsn;
    merge &dsn (in=a) dm;
    by usubjid;
    if a;   /* keep only the records that exist in the input dataset */

    /* Numeric SAS dates from the date portion (first 10 characters) of the character dates */
    &rfdate._n=input(substr(&rfdate,1,10),anydtdte10.);
    &var._n=input(substr(&var,1,10),anydtdte10.);

    /* Study day derivation: 1 is added on or after the reference date, so there is no Day 0 */
    if nmiss(&var._n,&rfdate._n)=0 then &dy=&var._n-&rfdate._n+(&var._n>=&rfdate._n);
run;
%mend make_studyday;



A sample macro call of this SAS macro for the Adverse Events (AE) domain might look like this:

%make_studyday(ae,aestdtc,aestdy,rfstdtc);
%make_studyday(ae,aeendtc,aeendy,rfstdtc);
%make_studyday(ae,aedtc,aedy,rfstdtc);





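For pre-dose dates (event/visit date before the first dose date):
         studyday = the event/visit date - first dose date
For post-dose dates (event/visit date on or after the first dose date):
         studyday = the event/visit date - first dose date + 1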
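The two rules can be tried out on a few dates. This is a minimal sketch rather than the macro itself: the dataset name (studyday_demo), the sample dates, and the reference date are hypothetical, and the yymmdd10. informat is used here as an assumption for complete ISO 8601 dates, in place of the anydtdte10. informat used in the macro.

```sas
data studyday_demo;
  length aestdtc $19;
  input aestdtc $;

  /* hypothetical first dose (reference) date */
  rfdt = input('2012-07-13', yymmdd10.);
  /* yymmdd10. reads only the first 10 characters, i.e. the date portion */
  aedt = input(aestdtc, yymmdd10.);

  if nmiss(aedt, rfdt) = 0 then do;
    if aedt < rfdt then aestdy = aedt - rfdt;   /* pre-dose rule  */
    else aestdy = aedt - rfdt + 1;              /* post-dose rule */
    /* same result from the single expression used in the macro */
    check = aedt - rfdt + (aedt >= rfdt);
  end;

  format rfdt aedt date9.;
  datalines;
2012-07-11
2012-07-12
2012-07-13T09:00
2012-07-14
;
run;

proc print data=studyday_demo noobs;
run;
```

The four records get study days -2, -1, 1 and 2, so the day before the first dose is Day -1 and the first dose date itself is Day 1.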






