
ENCODING= Dataset Option

Let me explain why I am writing this post. My coworker was having a problem reading in a SAS dataset that he got from the sponsor. The dataset was encoded in UTF-8. He tried to read in the raw data using a LIBNAME statement:

    libname rawdata '/sas/SAS913/SASDATA/CLIENT/ABC123/raw';

    data datasetname;
      set rawdata.datasetname;
    run;

When he ran the code above, SAS stopped at that step and returned an error that looks like this:

    ERROR: Some character data was lost during transcoding in the dataset RAWDATA.DATASETNAME.
    NOTE: The data step has been abnormally terminated.
    NOTE: The SAS System stopped processing this step because of errors.
    NOTE: SAS set option OBS=0 and will continue to check statements. This may cause NOTE: No observations in data set.
    NOTE: There were 20314 observations read from the data set RAWDATA.DATASETNAME.
    WARNING: The data set WORK.DATASETNAME may b...
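A minimal sketch of how the ENCODING= dataset option named in the title can be applied to the read above. It assumes the same library path as in the post; whether ENCODING=ANY is the exact value the full post settles on is an assumption, since the excerpt is cut off. ENCODING=ANY tells SAS to read the data without transcoding it.

```sas
/* Same library as in the post */
libname rawdata '/sas/SAS913/SASDATA/CLIENT/ABC123/raw';

data datasetname;
  /* ENCODING=ANY: read the data as-is, with no transcoding       */
  /* (assumed value; the full post may use a different encoding)  */
  set rawdata.datasetname (encoding=any);
run;
```

Because no transcoding takes place, characters that the session encoding cannot represent may display incorrectly, so it is worth checking the character variables after the read.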

Create a .CSV file of a SAS dataset without column names or a header row?

SAS places the variable names in row 1 when you create an Excel or .CSV file from a SAS dataset. I have found a tip for telling SAS not to write the variable names to row 1 of the .CSV file. The page I found lists the following ways to do this:

1. Run PROC EXPORT with PUTNAMES=NO
2. Run PROC EXPORT and recall and edit the code
3. Run PROC EXPORT and use a DATA step to rewrite the file without the first row
4. DATA _NULL_ with a PUT statement
5. DATA _NULL_ with a PUT statement, all fields quoted
6. ODS CSV and PROC REPORT with suppressed column headers
7. The %ds2csv SAS Institute utility macro
8. The CSV tagset and the table_headers="NO" option

Run PROC EXPORT with PUTNAMES=NO

Sample program:

    proc export data=data_to_export
                outfile='C:\data_exported.csv'
                dbms=csv      ...
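As an illustration of method 4 in the list above (DATA _NULL_ with a PUT statement), here is a minimal sketch. It uses the SASHELP.CLASS sample dataset, and the output path is hypothetical; neither comes from the original post.

```sas
/* Write a CSV with no header row using DATA _NULL_ and PUT.   */
/* SASHELP.CLASS and the output path are illustrative only.    */
data _null_;
  set sashelp.class;
  /* DSD writes comma-separated values and quotes any value    */
  /* that itself contains the delimiter                        */
  file 'C:\class_noheader.csv' dsd dlm=',';
  put name sex age height weight;
run;
```

Since the PUT statement writes only data values, no row of variable names is ever produced.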

ERROR 29-185: Width Specified for format ---- is invalid

You see the message "ERROR 29-185: Width Specified for format ---- is invalid" in the log file when you specify the DATE format with an invalid width. The DATE format will not produce a date if the width is too long or too short. Valid widths are 5 to 9 in SAS 9.1.x versions. If you use a newer version (SAS 9.2), you won't see this error message in the log (I am assuming that this was fixed in SAS 9.2). If you are using SAS 9.1.x (either Windows or Unix), try using the format date9. instead of date11.:

    data _null_;
      date = '23-SEP-2004'd;
      put date date11.;  * This statement gives you an error in SAS 9.1.2/9.1.3 versions;
      put date date9.;
    run;

My 5 Important reasons to use Proc SQL

• Proc SQL requires fewer lines of SAS code than the data step and/or Proc steps.
• Frequency counting can be done in no time, which is very helpful during QC or validation.
• Proc SQL can merge datasets together using different variable names, unlike the data step.
• Proc SQL can merge many datasets together in the same step on different variables.
• Proc SQL allows you to join more than two datasets together at the same time on different levels.
• A Proc SQL join does not overlay the duplicate by-column, whereas the MERGE statement of the data step does.

Data step vs Proc SQL

• Data step Merge
  – The datasets must be pre-sorted by the by-variable before merging
  – Requires common variable names
  – May need a few more lines of code than Proc SQL
• The PROC SQL join process works differently than the typ...
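To illustrate two of the points above (joining on differently named variables and quick frequency counts), here is a small sketch. The datasets DEMOG and VISITS and their variables are made up for the example and are not from the original post.

```sas
/* Hypothetical demo data: the key is SUBJID in one dataset */
/* and PATID in the other                                   */
data demog;
  input subjid $ age;
  datalines;
101 34
102 45
103 29
;
run;

data visits;
  input patid $ visit;
  datalines;
101 1
101 2
103 1
;
run;

proc sql;
  /* Join on differently named keys; no pre-sorting is needed */
  create table combined as
  select a.subjid, a.age, b.visit
  from demog as a
  left join visits as b
    on a.subjid = b.patid
  order by a.subjid, b.visit;

  /* Frequency count: number of non-missing visits per subject */
  select subjid, count(visit) as n_visits
  from combined
  group by subjid;
quit;
```

A data step MERGE would need both datasets sorted and the key renamed to a common name before the same result could be produced.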

Transcoding Problem: Option (correctencoding=wlatin1)

Have you ever tried to convert the default encoding to WLATIN1 (the Windows SAS session encoding)? Let me tell you the story behind this post. Today I was asked to send SAS datasets to one of our clients. I transferred the datasets, and immediately afterwards I got an email from the client saying that the encoding of the SAS datasets was different this time compared with the last transfer. He said it was causing problems in his Proc Compare process. Oops... bummer. The client's email got me a little worried. I checked the Proc Contents details and saw the change in the encoding. I investigated the issue and found out that Unicode SAS with UTF-8 encoding uses 1 to 4 bytes to handle Unicode data. It is possible for 8-bit characters to be expanded by 2 to 3 bytes when transcoding occurs, which causes the truncation error. Because of the truncation problem, I was asked to change the encoding back to WLATIN1 so that the character data present in the SAS datasets repre...
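The option in the title, CORRECTENCODING=, is set on the MODIFY statement of PROC DATASETS. Below is a minimal sketch on a throwaway dataset; the dataset name MYDATA and its contents are hypothetical. Note that CORRECTENCODING= only changes the encoding value recorded in the dataset header; it does not transcode the stored data.

```sas
/* Hypothetical demo dataset in the WORK library */
data work.mydata;
  length comment $20;
  comment = 'sample text';
run;

/* Relabel the encoding recorded in the dataset header */
proc datasets library=work nolist;
  modify mydata / correctencoding=wlatin1;
quit;

/* The Encoding field in the output should now show wlatin1 */
proc contents data=work.mydata;
run;
```

If the character data really has to be converted rather than relabeled, writing a copy with the ENCODING= dataset option on the output dataset, for example data out (encoding=wlatin1), is the usual alternative.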

SDTM Compliance Checks

Validation checks or tools to check the compliance of SDTM data

JANUS is a standard database model based on CDISC's SDTM standard. The FDA uses JANUS to store submitted SDTM clinical data. As part of the data definition file submission, pharmaceutical companies have to submit SAS datasets in transport file (.xpt) format along with the annotated CRF and the Define.xml file. The reason for this is so that the clinical data can be loaded properly into the JANUS database, which is maintained by the FDA. Once the data is loaded into JANUS, it is very easy for FDA reviewers to review it. They can even produce ad hoc reports and perform cross-study reviews at the same time. The FDA runs compliance checks on the submitted data to make sure the data was collected as per the SDTM standard. The FDA checks the compliance of the data by running WebSDM™ (developed by PhaseForward). WebSDM™ is an SDTM compliance validation tool that performs a set of SDTM c...

How to remove carriage return and linefeed characters within quoted strings.

HANDLING SPECIAL EMBEDDED CHARACTERS

Managing and reporting data from a DBMS that contains very long text fields is not easy. It can be frustrating if the text field has special embedded characters such as tabs, carriage returns ('0D'x), line feeds ('0A'x) and page breaks. But here is some simple SAS code that takes care of those issues. The normal line end for Windows text files is a carriage return character followed by a line feed character, so the syntax for taking out all carriage return ('0D'x) and line feed ('0A'x) characters is:

    comment = compress(comment, '0D0A'x);

or

    comment = tranwrd(comment, '0D0A'x, '');

Note that TRANWRD matches only the carriage return and line feed pair as a unit, and a zero-length replacement is treated as a single blank, so each pair is replaced by a space rather than removed. If you just want to take out the carriage return, use this code:

    comment = tranwrd(comment, '0D'x, '');

You could also try this one:

    Comment=compress(Comment, ,"kw"); * k is for keep, w is for "write-a...
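A small self-contained sketch of the COMPRESS approach, so the effect can be checked in the log. The test string and the variable names other than comment are made up for the example.

```sas
data _null_;
  length comment cleaned $40;
  /* Build a value with an embedded carriage return + line feed */
  comment = 'first line' || '0D0A'x || 'second line';
  /* Remove every CR ('0D'x) and LF ('0A'x) character */
  cleaned = compress(comment, '0D0A'x);
  len_before = length(comment);
  len_after  = length(cleaned);
  /* The lengths should differ by the two characters removed */
  put len_before= len_after=;
  put cleaned=;
run;
```

COMPRESS treats its second argument as a list of individual characters, which is why it removes lone carriage returns and lone line feeds as well as the pair.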