Posts

Transcoding Problem: Option (correctencoding=wlatin1)

Have you ever tried to convert the default encoding to WLATIN1 (the Windows SAS session encoding)? Let me tell you the story behind this post. Today I was asked to send SAS datasets to one of our clients. I transferred the datasets and, immediately afterwards, got an email from the client saying that the encoding of the SAS datasets was different this time compared with the last transfer. He said it was causing problems in his PROC COMPARE process. Oops, bummer. The client's email got me a little worried, so I checked the PROC CONTENTS output and saw the change in the encoding. I investigated the issue and found out that Unicode SAS with UTF-8 encoding uses 1 to 4 bytes to store each character. It is possible for 8-bit characters to expand by 2 to 3 bytes when transcoding occurs, which causes the truncation error. Because of the truncation problem I was asked to change the encoding back to WLATIN1 so that the character data present in the SAS datasets repre...
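As a rough sketch of the option named in the post title: the CORRECTENCODING= option belongs to the MODIFY statement of PROC DATASETS. The library path and the data set name demog below are hypothetical placeholders, not taken from the post.

```sas
/* Hypothetical library path and data set name, for illustration only */
libname mylib 'path-to-library';

/* Reset the encoding indicator stored in the data set header */
proc datasets library=mylib nolist;
  modify demog / correctencoding=wlatin1;
quit;

/* Check the Encoding field in the PROC CONTENTS output */
proc contents data=mylib.demog;
run;
```

Note that CORRECTENCODING= only changes the encoding indicator recorded in the file's descriptor information; it does not transcode the stored character data, so it fits cases where the indicator does not match the actual encoding of the data.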

SDTM Compliance Checks

Validation checks or tools to check the compliance of SDTM data. JANUS is a standard database model based on CDISC's SDTM standard. JANUS is used by the FDA to store submitted SDTM clinical data. As part of the data definition file submission, pharmaceutical companies have to submit SAS datasets in transport file (.xpt) format along with the annotated CRF and the Define.xml file. The reason is that the clinical data must load properly into the JANUS database, which is maintained by the FDA. It is very easy for FDA reviewers to review the clinical data once it is loaded into their JANUS database. They can even produce ad hoc reports and perform cross-study reviews at the same time. The FDA runs compliance checks on the submitted data to make sure the data was collected as per the SDTM standard. The FDA checks the compliance of the data by running WebSDM™ (developed by PhaseForward). WebSDM™ is an SDTM compliance validation tool that performs a set of SDTM c...

How to remove carriage return and linefeed characters within quoted strings.

HANDLING SPECIAL EMBEDDED CHARACTERS Managing and reporting data in a DBMS that contains very long text fields is not easy. It can be frustrating if the text field has special embedded symbols such as tabs, carriage returns ('0D'x), line feeds ('0A'x) and page breaks. But here is some simple SAS code that takes care of those issues. The normal line end for Windows text files is a carriage return character followed by a line feed character, so the syntax for taking out all carriage return ('0D'x) and line feed ('0A'x) characters is comment=compress(comment,'0D0A'x); or comment=tranwrd(comment,'0D0A'x,''); If you just want to take out the carriage return, use this code: comment=tranwrd(comment,'0D'x,''); You could also try this one: comment=compress(comment, ,"kw"); * k is for keep, w is for "write-a...
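A minimal sketch of the COMPRESS approach from the excerpt above, using a made-up test string (the data set and variable names are illustrative assumptions). It also shows TRANSLATE as an alternative when each control character should become a blank instead of being deleted.

```sas
data clean;
  length comment $40;
  /* Build a test value with an embedded CR+LF pair */
  comment = 'line one' || '0D0A'x || 'line two';

  /* COMPRESS deletes every CR and every LF, wherever they occur */
  no_crlf = compress(comment, '0D0A'x);

  /* TRANSLATE maps each CR and each LF to a blank instead */
  spaced = translate(comment, '  ', '0D0A'x);

  len_before = length(comment);   /* 18 */
  len_after  = length(no_crlf);   /* 16 */
run;

proc print data=clean;
run;
```

One difference worth keeping in mind: COMPRESS treats '0D0A'x as a list of individual characters to remove, while TRANWRD looks for the exact two-byte CR+LF sequence, so a lone line feed is left in place by TRANWRD(comment,'0D0A'x,'').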

Counting the number of missing and non-missing values for each variable in a data set.

/* create sample data */
data one;
  input a $ b $ c $ d e;
  cards;
a . a 1 3
. b . 2 4
a a a . 5
. . b 3 5
a a a . 6
a a a . 7
a a a 2 8
;
run;

/* create a format to group missing and non-missing */
proc format;
  value $missfmt ' '='missing' other='non-missing';
  value missfmt .='missing' other='non-missing';
run;

%macro lst(dsn);
  /** open dataset **/
  %let dsid=%sysfunc(open(&dsn));
  /** cnt will contain the number of variables in the dataset passed in **/
  %let cnt=%sysfunc(attrn(&dsid,nvars));
  %do i = 1 %to &cnt;
    /** create a different macro variable for each variable in dataset **/
    %let x&i=%sysfunc(varname(&dsid,&i));
    /** list the type of the current variable **/
    %let typ&i=%sysfunc(vartype(&dsid,&i));
  %end;
  /** close dataset **/
  %let rc=%sysfunc(close(&dsid));
  %do i = 1 %to &cnt;
    /* loop through each variable in PROC FREQ and ...

When do I use a WHERE statement instead of an IF statement to subset a data set?

When programming in SAS, there is almost always more than one way to accomplish a task. Beginning programmers may think that there is no difference between using the WHERE statement and the IF statement to subset a data set. Knowledgeable programmers know that, depending on the situation, one statement is sometimes more appropriate than the other. For example, if your subset condition includes automatic variables or new variables created within the DATA step, then you must use the IF statement instead of the WHERE statement. This tip shows you how and when to apply the WHERE and IF statements to get correct and reliable results. It also reviews the similarities as well as the differences between these two SAS programming approaches. Detailed differences in program efficiency between the two approaches are not covered in this tip. For more details refer to http://support.sas.com/kb/24/286.html
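A small sketch of the rule described above, using made-up sample data (the data set and variable names are assumptions for illustration). WHERE works on variables that already exist in the input data set, while a subsetting IF can also test a new variable or the automatic variable _N_.

```sas
data work.sample;
  input id score;
  datalines;
1 55
2 82
3 91
;
run;

/* WHERE is applied as rows are read, so it can only reference
   variables that already exist in the input data set */
data work.high_where;
  set work.sample;
  where score > 80;
run;

/* A subsetting IF is evaluated inside the DATA step, so it can use
   a newly created variable (bonus) and the automatic variable _N_ */
data work.high_if;
  set work.sample;
  bonus = score + 10;
  if bonus > 90 and _n_ > 1;
run;
```

Replacing the IF in the second step with WHERE bonus > 90 would fail, because bonus is not a variable in work.sample.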

Transporting SAS Files using Proc Copy and/or Proc Cport/Proc Cimport

When moving SAS datasets or catalogs from one type of computer to another, there are several things to consider, such as the operating systems of the two computers, the versions of SAS and the type of communication link between the computers. The easiest way to move SAS datasets from one system to another is to: create a transport file using any SAS version; move the transport file to the new system; import the transport file on the new system. Transport datasets are binary files with 80-byte records, made from SAS datasets. Both PROC COPY and PROC CPORT can create transport datasets, but they create different types of transport files. Transport files can be created and read using either PROC COPY or PROC CPORT and PROC CIMPORT, but you cannot mix and match. Transport files created with PROC COPY must be read with PROC COPY; those created by PROC CPORT must be read with PROC CIMPORT. PROC COPY uses an engine (i.e. XPORT) to create a SAS transport file. PROC COPY is used to tra...
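The two routes described above can be sketched as follows. All paths, librefs and the data set name demog are hypothetical placeholders.

```sas
/* Hypothetical source library, target library and file locations */
libname source 'path-to-source-library';
libname target 'path-to-target-library';

/* Route 1: XPORT engine with PROC COPY */
libname xptfile xport 'path-to-transport/demog.xpt';

proc copy in=source out=xptfile memtype=data;
  select demog;
run;

/* Read it back with PROC COPY (normally on the receiving system) */
proc copy in=xptfile out=target;
run;

/* Route 2: PROC CPORT to write, PROC CIMPORT to read */
filename cptfile 'path-to-transport/source.cpt';

proc cport library=source file=cptfile memtype=data;
run;

proc cimport library=target infile=cptfile;
run;
```

The file written through the XPORT engine cannot be read by PROC CIMPORT, and the file written by PROC CPORT cannot be read through the XPORT engine, which is the "no mix and match" rule from the text.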

How to generate the month name from a numeric date value

Task: I have a SAS date and want to create a variable with the month name. Here is how to do it. Use the MONNAMEw. format, which is simple and easy. You need to be using SAS 9.x to make it work.

/* Use the MONNAMEw. format */
data month;
  input date : mmddyy8.;
  month_name = put(date, monname3.);
  datalines;
01/15/04
02/29/04
07/04/04
08/18/04
12/31/04
;
run;

proc print;
run;
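As a variation on the example above, a wider format width gives the full month name. This is a small sketch with a made-up date; STRIP is used here as a precaution to drop any padding blanks from the formatted value.

```sas
data month_full;
  date = '15JAN2004'd;

  /* Width 9 is wide enough for the longest month name (September) */
  full_name  = strip(put(date, monname9.));

  /* Width 3 truncates the name to its first three letters */
  short_name = put(date, monname3.);

  /* MONTH returns the month number if a numeric value is needed */
  month_num  = month(date);

  format date date9.;
run;

proc print data=month_full;
run;
```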