Posts

10 Essential SAS Programming Tips for Boosting Your Efficiency

As a SAS programmer, you're always looking for ways to streamline your code, improve efficiency, and enhance the readability of your programs. Whether you're new to SAS or a seasoned pro, these tips will help you optimize your workflows and make the most of your programming efforts. Here are ten essential SAS programming tips to elevate your coding skills:

Harness the Power of PROC SQL for Efficient Data Manipulation

PROC SQL can be a game-changer when it comes to handling complex data manipulations. It allows you to merge datasets, filter records, and create summary statistics all within a few lines of code, making your data processing more concise and effective.

    proc sql;
      /* average salary per department, keeping departments above 50000 */
      select Department, mean(Salary) as Avg_Salary
        from employees
        group by Department
        having calculated Avg_Salary > 50000;
    quit;

Simplify Repetitive Tasks with ARRAY

Repetitive calc...

Separating Unique and Duplicate Observations Using PROC SORT in SAS 9.3 and Newer Versions

Today, I stumbled upon a post in which the author discusses two new options available in SAS 9.3 and later versions. These options (NOUNIQUEKEYS and UNIQUEOUT) allow you to sort a dataset and separate out the duplicate records in one step using PROC SORT. Direct Link: Separating Unique and Duplicate Observations Using PROC SORT in SAS 9.3 and Newer Versions. Christopher J. Bost published a paper at SAS Global Forum 2013 on the same options: Dealing with Duplicates
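As a rough illustration of the one-step approach described above, here is a minimal sketch. The dataset VISITS, its variables (SUBJID, VISITNUM, RESULT), and the output names DUPS and SINGLES are all hypothetical and chosen only for this example; they do not come from the linked post or paper.

```sas
/* Hypothetical sample data: two key combinations occur more than once */
data visits;
  input subjid visitnum result;
  datalines;
101 1 5.2
101 2 5.4
101 2 5.9
102 1 4.8
103 1 6.1
103 1 6.3
;
run;

/* One PROC SORT step (SAS 9.3 or later):                          */
/*   NOUNIQUEKEYS keeps only observations whose BY values repeat   */
/*   UNIQUEOUT=   receives the observations with unique BY values  */
proc sort data=visits
          nouniquekeys
          uniqueout=singles
          out=dups;
  by subjid visitnum;
run;
```

With this sample data, DUPS should hold the four observations that share a SUBJID/VISITNUM combination, and SINGLES the two observations whose key combination occurs only once.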

FDA's Official List of Validation Rules for SDTM compliance

Yesterday, the FDA published its first official list of validation rules for CDISC SDTM. These long-awaited rules cover both conformance and quality requirements, as described in the FDA Study Data Technical Conformance Guide. Conformance validation rules help ensure that the data conform to the standards, while quality checks help ensure the data support meaningful analysis. For the official list of rules, here is the direct link to the FDA website: http://www.fda.gov/forindustry/datastandards/studydatastandards/default.htm The FDA is asking sponsors to validate their study data before submission using these published validation rules, and either to correct any validation issues or to explain in the Study Data Reviewer's Guide why they could not be corrected. This recommended pre-submission validation step is intended to minimize the presence of validation issues at the time of submission. OpenCDISC is offering a webinar on the official list of validation rules. They are offeri...

How to avoid data set merging problems when common BY variable has different lengths?

When you merge two datasets on a common BY variable that has a different length in each dataset, the merge can produce unexpected results. If you use SAS 9.2, as I do, the DATA step will issue the following warning: WARNING: Multiple lengths were specified for the BY variable ****** by input data sets. This may cause unexpected results. It is good that, starting with SAS 9.2, the DATA step at least issues a warning to inform the programmer. In earlier versions, this potential disaster is difficult to notice. When you see this WARNING in the SAS log, you might be inclined to ignore it because it is just a warning, never realizing the potential danger. Instead, you should investigate it, because SAS will do exactly what it states: it may cause unexpected results. In some cases merge won’t even happen between datasets and so...

Basic Differences Between Proc MEANS and Proc SUMMARY

Proc MEANS and Proc SUMMARY are two procedures used to compute descriptive statistics for numeric variables, but there are differences between them. (1) By default, Proc MEANS produces printed output in the LISTING window or other open destination, whereas Proc SUMMARY does not. (2) Proc SUMMARY produces descriptive statistics only for the variables specified in the VAR statement, whereas Proc MEANS by default computes descriptive statistics for all numeric variables even without a VAR statement. Here is a post which details the differences: Direct Link:   Excerpt: Proc SUMMARY and Proc MEANS are essentially the same procedure. Both procedures compute descriptive statistics. The main difference concerns the default type of output they produce. Proc MEANS by default produces printed output in the LISTING window or other open destination whereas Proc SUMMARY d...

Exploring the Analysis Data Model – ADaM Datasets

Today, I stumbled upon a blog that is interesting and resourceful. I liked the article so much that I want to share it with all my friends here. Here is the direct link for the post to download or to review: Actual Article: The Analysis Data Model (ADaM) is a standard released by the Clinical Data Interchange Standards Consortium (CDISC) and has quickly become widely used in the submission of clinical trial information. ADaM has very close ties to another of CDISC's released standards, the Study Data Tabulation Model (SDTM). The main difference between these two CDISC standards is the way in which the data is displayed. SDTM provides a standard for the creation and mapping of collected data from raw sources, whereas ADaM provides a standard for the creation of analysis-ready data, often using SDTM data as the source. The purpose of the analysis-ready ADaM data is to provide the programmer with a means to create tables, listings and figures wit...

How to use MISSING(), NMISS() and the CMISS() functions

SAS provides several functions to test for missing values, but in this post we will focus on the MISSING(), CMISS() and NMISS() functions. The NMISS() function is reserved for numeric variables. The MISSING() and CMISS() functions can be used with either character or numeric variables. The CMISS() and NMISS() functions count the number of arguments with missing values, whereas the MISSING() function checks whether or not a single variable is missing. Together, these functions provide a simple way to check for missing values, and they let you write fewer lines of code by avoiding large IF statements when you need to check several variables for missing values at the same time. The MISSING() function is very useful when you need to check whether a variable has a missing value but are not sure whether it is character or numeric. The MISSING() function works for either character or numeric variables and it also checks for the speci...