STUDYSAS BLOG

Posts

How to avoid data set merging problems when common BY variable has different lengths?

- November 14, 2013

When you merge two datasets on a common BY variable that has a different length in each dataset, the merge can produce unexpected results. If you use SAS 9.2, as I do, the DATA step reports the following warning: WARNING: Multiple lengths were specified for the BY variable ****** by input data sets. This may cause unexpected results. It is good that, starting with SAS 9.2, the DATA step issues a warning to inform the programmer. In earlier versions, this potential disaster is difficult to notice. When we see this message in the SAS log, we might be inclined to ignore it because it is only a warning, never realizing the potential danger. Instead, we should take it seriously, because SAS means exactly what it says: the merge may cause unexpected results. In some cases merge won’t even happen between datasets and so...
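The excerpt is cut off before the post's own fix, so what follows is only a minimal sketch of one common remedy, not necessarily the author's method: declare the BY variable with the longest length before the MERGE statement. The datasets WORK.A and WORK.B and the variables ID, X and Y are hypothetical names made up for illustration.

```sas
/* Hypothetical inputs: ID is $3 in one dataset and $5 in the other */
data work.a;
    length id $3;
    id = 'ABC';
    x = 1;
run;

data work.b;
    length id $5;
    id = 'ABCDE';
    y = 2;
run;

data work.merged;
    /* Assumption: setting the longest length BEFORE the MERGE statement  */
    /* gives the BY variable one consistent length, so values coming from */
    /* WORK.B are not truncated to the shorter length defined in WORK.A.  */
    length id $5;
    merge work.a work.b;
    by id;
run;
```

Without the LENGTH statement, ID would take its length from WORK.A, the first dataset listed, so 'ABCDE' would be cut to 'ABC' and could match the wrong record.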

Basic Differences Between Proc MEANS and Proc SUMMARY

- July 25, 2013

Proc MEANS and Proc SUMMARY are two procedures that compute descriptive statistics for numeric variables, but there are differences between them. 1) By default, Proc MEANS produces printed output in the LISTING window or other open destination, whereas Proc SUMMARY does not. 2) Proc SUMMARY only produces descriptive statistics for the variables specified in the VAR statement, whereas Proc MEANS by default computes descriptive statistics for all numeric variables even without a VAR statement. Here is a post which details the differences. Excerpt: Proc SUMMARY and Proc MEANS are essentially the same procedure. Both procedures compute descriptive statistics. The main difference concerns the default type of output they produce. Proc MEANS by default produces printed output in the LISTING window or other open destination whereas Proc SUMMARY d...
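The two differences can be seen side by side in a short sketch. It uses SASHELP.CLASS, a sample dataset that ships with SAS; the output dataset name WORK.CLASS_STATS is a hypothetical choice for this example.

```sas
/* Proc MEANS: no VAR statement, so all numeric variables are analyzed, */
/* and the results are printed to the open output destination.          */
proc means data=sashelp.class;
run;

/* Proc SUMMARY: nothing is printed by default, so the statistics are */
/* written to an output dataset instead.                              */
proc summary data=sashelp.class;
    var height weight;
    output out=work.class_stats mean= / autoname;
run;

/* Adding the PRINT option makes Proc SUMMARY print like Proc MEANS. */
proc summary data=sashelp.class print;
    var height weight;
run;
```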

Exploring the Analysis Data Model – ADaM Datasets

- March 14, 2013

Today, I stumbled upon a blog which is interesting and resourceful. I liked the article so much that I want to share it with all my friends here. Here is the direct link for the post to download or to review: Actual Article: The Analysis Data Model (ADaM) is a standard released by the Clinical Data Interchange Standards Consortium (CDISC) and has quickly become widely used in the submission of clinical trial information. ADaM has very close ties to another of CDISC's released standards, the Study Data Tabulation Model (SDTM). The main difference between these two CDISC standards is the way in which the data is displayed. SDTM provides a standard for the creation and mapping of collected data from raw sources, whereas ADaM provides a standard for the creation of analysis-ready data, often using SDTM data as the source. The purpose of the analysis-ready ADaM data is to provide the programmer with a means to create tables, listings and figures wit...

How to use MISSING(), NMISS() and the CMISS() functions

- January 26, 2013

SAS provides several functions to test for missing values, but this post focuses on the MISSING(), CMISS() and NMISS() functions. The NMISS() function is reserved for numeric variables. The MISSING() and CMISS() functions can be used with either character or numeric variables. The CMISS() and NMISS() functions count the number of arguments with missing values, whereas the MISSING() function checks whether or not a single variable is missing. Together, these functions provide a simple way to check for missing values, and they let you write fewer lines of code by avoiding large IF statements when you need to check several variables at the same time. The MISSING() function is very useful when you need to check whether a variable has a missing value but are not sure if it is character or numeric. MISSING function works for either character or numeric variables and it also checks for the speci...
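A small DATA step makes the difference between the three functions concrete. This is an illustrative sketch; all variable names and values are made up for the example.

```sas
data _null_;
    /* Hypothetical values: two numeric missings (one special), one blank character */
    age    = .;
    weight = 70;
    height = .A;       /* special numeric missing value */
    name   = ' ';
    city   = 'Boston';

    m_num = missing(age);      /* 1: ordinary numeric missing */
    m_chr = missing(name);     /* 1: blank character value    */
    m_spc = missing(height);   /* 1: special missing value    */

    n_num = nmiss(age, weight, height);              /* 2: numeric arguments only    */
    n_all = cmiss(age, weight, height, name, city);  /* 3: numeric and character mix */

    put m_num= m_chr= m_spc= n_num= n_all=;
run;
```

A check such as `if cmiss(age, weight, height, name, city) > 0` can then replace a long chain of separate IF conditions.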

Study day calculation (--DY Variable in SDTM)

- January 12, 2013

USE OF THE “STUDY DAY” VARIABLES The permissible Study Day variables (--DY, --STDY, and --ENDY) describe the relative day of the observation starting with the reference date as Day 1. They are determined by comparing the date portion of the respective date/time variables (--DTC, --STDTC, and --ENDTC) to the date portion of the Subject Reference Start Date (RFSTDTC from the Demographics domain). The Subject Reference Start Date (RFSTDTC) is designated as Study Day 1. The Study Day value is incremented by 1 for each date following RFSTDTC. Dates prior to RFSTDTC are decremented by 1, with the date preceding RFSTDTC designated as Study Day -1 (there is no Study Day 0). This algorithm for determining Study Day is consistent with how people typically describe sequential days relative to a fixed reference point, but creates problems if used for mathematical calculations because it does not allow for a Day 0. As such, Study Day is not suited for use in subsequent numerical computations...
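The rule above (Day 1 on the reference date, no Day 0) can be sketched in a DATA step as follows. The dataset WORK.AE_DY, the sample subject and the sample dates are hypothetical; AESTDTC and AESTDY stand in for any --STDTC and --STDY pair. The sketch assumes complete ISO 8601 dates and leaves the study day missing when either date is partial.

```sas
data work.ae_dy;
    length usubjid $10 rfstdtc aestdtc $19;
    input usubjid $ rfstdtc $ aestdtc $;

    /* Use only the date portion (first 10 characters) of each value; */
    /* ?? suppresses log notes if a value is not a valid date.        */
    if length(rfstdtc) >= 10 then refdt = input(substr(rfstdtc, 1, 10), ?? yymmdd10.);
    if length(aestdtc) >= 10 then obsdt = input(substr(aestdtc, 1, 10), ?? yymmdd10.);

    /* On or after the reference date: difference + 1, so RFSTDTC is Day 1. */
    /* Before the reference date: plain difference, so the day before is -1 */
    /* and there is never a Day 0.                                          */
    if not missing(refdt) and not missing(obsdt) then
        aestdy = obsdt - refdt + (obsdt >= refdt);

    format refdt obsdt date9.;
    datalines;
S001 2013-01-10 2013-01-10
S001 2013-01-10 2013-01-12
S001 2013-01-10 2013-01-09
;
run;
/* AESTDY for the three rows: 1, 3, -1 */
```

The comparison `(obsdt >= refdt)` evaluates to 1 or 0 in SAS, which supplies the extra day only for dates on or after the reference date.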

Creating Custom or Non-Standard CDISC SDTM Domains

- November 29, 2012

Here is a nice article about creating custom SDTM domains. Creating Custom or Non-Standard CDISC SDTM Domains: Within the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM), standard domains are split into four main types: special purpose, relationships, trial design and general observation classes. General observation classes cover the majority of observations collected during a study and can be divided among three general classes: The Interventions class captures investigational, therapeutic and other treatments that are administered to the subject (with some actual or expected physiological effect) either as specified by the study protocol (e.g., “exposure”), coincident with the study assessment period (e.g., “concomitant medications”), or other substances self-administered by the subject (such as alcohol, tobacco, or caffeine). The Events class captures planned protocol milestones such as randomization and study co...

ENCODING=Dataset Option

- June 20, 2012

Let me explain the reason for writing this post. My coworker was having a problem reading in a SAS dataset that he got from the Sponsor. It was a SAS dataset encoded with UTF-8 and other coding related stuff. He tried to read the raw data using a LIBNAME statement:

    libname rawdata '/sas/SAS913/SASDATA/CLIENT/ABC123/raw';
    data datasetname;
        set rawdata.datasetname;
    run;

When he runs the SAS code above, SAS stops at the current block and returns an error that looks like this:

    ERROR: Some character data was lost during transcoding in the dataset RAWDATA.DATASETNAME.
    NOTE: The data step has been abnormally terminated.
    NOTE: The SAS System stopped processing this step because of errors.
    NOTE: SAS set option OBS=0 and will continue to check statements. This may cause NOTE: No observations in data set.
    NOTE: There were 20314 observations read from the data set RAWDATA.DATASETNAME.
    WARNING: The data set WORK.DATASETNAME may b...
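The excerpt ends before the post's own solution, so the following is only a sketch of how the ENCODING= data set option named in the title is typically applied. It assumes the RAWDATA libref from the post is already assigned; choosing ENCODING=ANY here is an assumption, not necessarily what the author did.

```sas
/* Assumes the RAWDATA libref shown in the post is already assigned. */
data datasetname;
    /* ENCODING=ANY tells SAS not to transcode the character data     */
    /* while reading, which avoids the transcoding error, but the     */
    /* bytes are read as-is and may display incorrectly if the        */
    /* session encoding differs from the file encoding.               */
    set rawdata.datasetname (encoding=any);
run;
```

An alternative under the same assumptions is to name the source encoding explicitly, for example `(encoding='utf-8')`, so that SAS knows how to interpret the file.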

Posts

NESUG 2011 Publication