Posts

How to use MISSING(), NMISS() and the CMISS() functions

SAS provides several functions to test for missing values; this post focuses on MISSING(), CMISS(), and NMISS(). The NMISS() function is reserved for numeric variables, while MISSING() and CMISS() can be used with either character or numeric variables. CMISS() and NMISS() count the number of arguments that have missing values, whereas MISSING() checks whether a single variable is missing. Together, these functions give you a simple way to check for missing values, and they let you write fewer lines of code by avoiding long IF statements when you need to check several variables at once. MISSING() is especially useful when you need to check whether a variable has a missing value but are not sure whether it is character or numeric. The MISSING function works for either character or numeric variables and it also checks for the speci...
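As a quick illustration of the difference between the three functions, here is a minimal sketch. The variable names (x, y, c, d) are made up for this example and do not come from the original post.

```sas
data _null_;
    x = .;        /* numeric, missing       */
    y = 5;        /* numeric, not missing   */
    c = ' ';      /* character, missing     */
    d = 'abc';    /* character, not missing */

    miss_x = missing(x);           /* 1: works on a numeric variable          */
    miss_c = missing(c);           /* 1: works on a character variable        */
    n_num  = nmiss(x, y);          /* 1: counts missing numeric arguments     */
    n_any  = cmiss(x, y, c, d);    /* 2: counts missing args of either type   */

    put miss_x= miss_c= n_num= n_any=;
run;
```

A single call such as cmiss(x, y, c, d) > 0 can stand in for a chain of separate IF conditions, one per variable.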

Studyday calculation ( --DY Variable in SDTM)

USE OF THE “STUDY DAY” VARIABLES The permissible Study Day variables (--DY, --STDY, and --ENDY) describe the relative day of the observation starting with the reference date as Day 1. They are determined by comparing the date portion of the respective date/time variables (--DTC, --STDTC, and --ENDTC) to the date portion of the Subject Reference Start Date (RFSTDTC from the Demographics domain). The Subject Reference Start Date (RFSTDTC) is designated as Study Day 1. The Study Day value is incremented by 1 for each date following RFSTDTC. Dates prior to RFSTDTC are decremented by 1, with the date preceding RFSTDTC designated as Study Day -1 (there is no Study Day 0). This algorithm for determining Study Day is consistent with how people typically describe sequential days relative to a fixed reference point, but creates problems if used for mathematical calculations because it does not allow for a Day 0. As such, Study Day is not suited for use in subsequent numerical computations...
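The rule described above (Day 1 on the reference date, no Day 0) can be sketched in a DATA step as follows. This is only an illustration: the dataset name dy_demo, the use of AESTDTC as the observation date, and the sample dates are assumptions, and incomplete dates are simply left without a study day.

```sas
data dy_demo;
    length rfstdtc aestdtc $19;
    input rfstdtc $ aestdtc $;

    /* Compare date portions only; ?? suppresses notes for partial dates */
    ref_dt = input(substr(rfstdtc, 1, 10), ?? yymmdd10.);
    obs_dt = input(substr(aestdtc, 1, 10), ?? yymmdd10.);

    if not missing(ref_dt) and not missing(obs_dt) then do;
        /* On or after the reference date: add 1 so RFSTDTC is Day 1 */
        if obs_dt >= ref_dt then aestdy = obs_dt - ref_dt + 1;
        /* Before the reference date: no +1, so the prior day is Day -1 */
        else aestdy = obs_dt - ref_dt;
    end;

    format ref_dt obs_dt date9.;
    datalines;
2020-01-10T08:00 2020-01-10
2020-01-10T08:00 2020-01-12
2020-01-10T08:00 2020-01-09
;
run;

proc print data=dy_demo;
    var rfstdtc aestdtc aestdy;   /* expect AESTDY = 1, 3, -1 */
run;
```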

Creating Custom or Non-Standard CDISC SDTM Domains

Here is a nice article about creating custom SDTM domains: Creating Custom or Non-Standard CDISC SDTM Domains. Within the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM), standard domains are split into four main types: special purpose, relationships, trial design and general observation classes. General observation classes cover the majority of observations collected during a study and can be divided among three general classes: The Interventions class captures investigational, therapeutic and other treatments that are administered to the subject (with some actual or expected physiological effect) either as specified by the study protocol (e.g., “exposure”), coincident with the study assessment period (e.g., “concomitant medications”), or other substances self-administered by the subject (such as alcohol, tobacco, or caffeine). The Events class captures planned protocol milestones such as randomization and study co...

ENCODING=Dataset Option

Let me explain the reason for writing this post. My coworker was having a problem reading in a SAS dataset that he got from the Sponsor. It was a SAS dataset encoded with UTF-8. He tried to read in the raw data using a LIBNAME statement:

libname rawdata '/sas/SAS913/SASDATA/CLIENT/ABC123/raw';
data datasetname;
    set rawdata.datasetname;
run;

When he ran the SAS code above, SAS stopped at the current block and returned an error that looks like this:

ERROR: Some character data was lost during transcoding in the dataset RAWDATA.DATASETNAME.
NOTE: The data step has been abnormally terminated.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: SAS set option OBS=0 and will continue to check statements. This may cause NOTE: No observations in data set.
NOTE: There were 20314 observations read from the data set RAWDATA.DATASETNAME.
WARNING: The data set WORK.DATASETNAME may b...
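For context, the ENCODING= data set option named in the title can be applied on the SET statement. The sketch below is only one possible way to use it, not necessarily the fix the full post arrives at: it assumes the rawdata libref above is already assigned, and ENCODING=ANY tells SAS to skip transcoding, so any non-ASCII characters are read as-is and may display incorrectly in a session that uses a different encoding.

```sas
/* Assumes the RAWDATA libref from the LIBNAME statement above is assigned */
data datasetname;
    /* ENCODING=ANY suppresses transcoding for this read only */
    set rawdata.datasetname (encoding=any);
run;
```

Running PROC CONTENTS on the source dataset shows its encoding, which helps confirm whether transcoding is really the cause of the error.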

Create a .CSV file of SAS dataset without column names or header row?

SAS places the variable names in row 1 when you create an Excel or .CSV file from a SAS dataset. I have found a tip to tell SAS not to put variable names in row 1 of the .CSV file. A sasCommunity.org page has put together nice information on how to do this:

1 Run PROC EXPORT with PUTNAMES=NO
2 Run PROC EXPORT and recall and edit the code
3 Run PROC EXPORT and use a DATA step to rewrite the file without the first row
4 DATA _NULL_ with a PUT statement
5 DATA _NULL_ with a PUT statement, all fields quoted
6 ODS CSV and PROC REPORT with suppressed column headers
7 The %ds2csv SAS Institute utility macro
8 The CSV tagset and the table_headers="NO" option

Run PROC EXPORT with PUTNAMES=NO

Sample program

proc export data=data_to_export outfile='C:\data_exported.csv' dbms=csv ...
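As a small illustration of option 4 in the list above (DATA _NULL_ with a PUT statement), here is a minimal sketch. It uses the SASHELP.CLASS sample dataset and a temporary fileref named csvout, both chosen only for this example; in practice you would point the fileref at a real path.

```sas
/* Temporary output file; replace with a physical path as needed */
filename csvout temp;

data _null_;
    set sashelp.class;
    /* DSD writes comma-separated values; no header row is written */
    file csvout dsd;
    put name sex age height weight;
run;
```

Because the DATA step writes only what the PUT statement lists, the variable names never reach the file, and the column order is fully under your control.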

ERROR 29-185: Width Specified for format ---- is invalid

You see the "ERROR 29-185: Width Specified for format ---- is invalid" message in the log when you specify the DATE format with an invalid width. The DATE format will not produce a date if the width is too long or too short. Valid widths are 5-9 in SAS 9.1.x versions. If you use a newer version (SAS 9.2), you won't see this error message in the log (I am assuming that this is fixed in SAS 9.2). Try using the date9. format instead of date11. if you are using a SAS 9.1.x version (either Windows or Unix).

data _null_;
    date = '23-SEP-2004'd;
    put date date11.;  * This statement gives you an error in SAS 9.1.2/9.1.3 versions;
    put date date9.;
run;

My 5 Important reasons to use Proc SQL

• Proc SQL requires fewer lines of SAS code than the data step and/or Proc steps
• Frequency counting can be done in no time, which is very helpful during QC or validation
• Proc SQL can merge datasets together using different variable names, unlike the data step
• Proc SQL can merge many datasets together in the same step on different variables
• Proc SQL allows you to join more than two datasets together at the same time on different levels
• A Proc SQL join does not overlay the duplicate by-column, whereas the Merge statement of the data step does

Data step vs Proc SQL

• Data step Merge
– Pre-sorting of the datasets by the by-variable is needed before the merging process
– Requires common variable names
– May need a few more lines of code than Proc SQL
• PROC SQL Join process works different than the typ...
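To make the point about joining on different variable names concrete, here is a minimal sketch. The datasets (demo, visits) and their variables are made up for illustration only.

```sas
data demo;
    input usubjid $ age;
    datalines;
S01 34
S02 51
;
run;

data visits;
    input subjid $ visit;
    datalines;
S01 1
S01 2
S02 1
;
run;

proc sql;
    /* Key is USUBJID in one table and SUBJID in the other;      */
    /* no pre-sorting or renaming is needed before the join      */
    create table joined as
    select d.usubjid, d.age, v.visit
    from demo as d
         inner join visits as v
         on d.usubjid = v.subjid
    order by d.usubjid, v.visit;
quit;
```

The equivalent data step MERGE would need both datasets sorted by the key and one of the key variables renamed (for example with a RENAME= data set option) so that the BY variable has the same name in both.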