Posts

Showing posts from February, 2009

How to create a comma separated file (.csv) of a SAS dataset?

IN SAS programming, we often require outputting the dataset in different formats like EXCEL and CSV etc and here are the five different ways to export the SAS dataset into .csv file. Example: data new ; infile datalines dsd dlm=' ' missover; input a b c d; datalines; 3 5 1 1 4 1 . . 5 8 3 2 6 0 4 4 ; run ; By putting MISSOVER in the infile statement we are telling SAS to do not look for the data in the next lane if it runs out of the data, instead keep missing values for any remaining variables. DSD and DLM options should be included always in the infile statement, if we include the dlm=’ ‘ in the infile statement then SAS will put one digit for each variable even though we haven’t assigned any length to variable. DSD option will tell SAS to consider a missing value if 2 delimiters are present side by side in any observation. When we ran the above program in SAS, we create a SAS dataset name ‘ NEW’ in the work directory and if we want to create a ...

How to Import Excel files into SAS

Reading from Excel Spreadsheets: Microsoft Excel spreadsheets can be read from SAS in several ways. Two of these will be demonstrated here. First, PROC IMPORT allows direct access to Excel files through SAS/Access to PC File Formats or access to Comma-Separated (CSV) files through Base SAS. The second method uses the Excel LIBNAME engine. PROC IMPORT The IMPORT procedure reads from external sources and creates a SAS data set. Two sources are Excel spreadsheets and CSV files. A particular SAS/Access product may be required for certain sources, however. In our example, SAS/Access to PC File Formats is required to read an Excel file, but a CSV file can be accessed with Base SAS. General Syntax for PROC IMPORT: PROC IMPORT DATAFILE=" c:\sas\ego.csv " OUT=jeeshim.egov DBMS=CSV REPLACE; For Excel you use the DATAFILE =”filename” option to specify the Excel file to be read. (The TABLE=”tablename” option would be applicable if you were reading from a database such as Microso...

THE SPECIAL “??” FORMAT MODIFIER

The following excerpt is from SAS OnlineDoc documentation: ? or ?? Direct link: http://www.nesug.org/Proceedings/nesug01/at/at1013.pdf The optional question mark (?) and double question mark (??) format modifiers suppress the printing of both the error messages and the input lines when invalid data values are read. The ? modifier suppresses the invalid data message. The ?? modifier also suppresses the invalid data message and, in addition, prevents the automatic variable _ERROR_ from being set to 1 when invalid data are read. Below is an example of using ?? to determine whether a variable contains non-numeric values or not: data _null_; x = “ 12345678 ”; if (input (x, ?? 8. ) eq . ) then put ‘ non-numeric’ ; else put ‘numeric’ ; run ; Running SAS would return “Numeric” in the above example. If we used X=”123a5678”, SAS would return “Non-Numeric”. Note that the input format in the above example is “8.” So only the first 8 bytes of the character string are checked...

PROC SQL basics, tips and techniques and sample code programs

Proc SQL: Power of SAS SQL: • SQL looks at datasets differently from SAS – SAS looks at a dataset one record at a time, using an implied loop that moves from the first record to the last – SQL looks at all the records, as a single object • Because of this difference SQL can easily do few things that are more difficult to do in SAS • There are a number of built in functions in SQL that can be used in a select statement • Because of how SQL handles a dataset, these functions work over the entire dataset • Functions: – Count: Counts Values – Sum: Sums Values – Max: Identifies the largest value – Min: Identifies the smallest value – Mean: Averages the values Read more at www.cognigencorp.com/perspective/tipsNtricks.pub/1/PROC%20SQL%20Talk_12_.ppt – SAS SQL Introduction to Proc SQL AN INTRODUCTION TO PROC SQL® PROC SQL: When and How to Use It? Proc SQL – A Primer for SAS Programmers Understanding PROC SQL Creating Macro Variables with Proc SQL DATA Step vs. PROC SQL: What’s a neophyte to d...

Length of Numeric variables GT 8 in SAS| StudySAS BLOG

Q&A: numeric variables length more than 8? We all know that the default length of the numeric variables in SAS is 8 and if suppose I want to store a number lets say (12345678910, which has a length 11 to numeric variable) to variable total, what should I do? What if the numeric variable digits are more than 12 digits and i want to store them all without any E values? ANS) The default length of numeric variables in SAS is 8 and all the numbers that we see in the sas datasets are called as floating numbers(floating point binary) and not a regular sequence numbers form 1 to 10. When we are using SAS/Windows as our operating system and then the minimum length for any numeric variable should be 3(not 1 as we get confused all the time). So if a variable contas less than 3 digits means it is stored with less space. The reason is, since a numeric variable will need a power and and the sign(+ or -), if SAS want to store a numberit defenitely needs a minimum of 3 bytes. Depending ...

Options VALIDVARNAME=UPCASE

VALIDVARNAME= V7 UPCASE ANY VALIDVARNAME= option is generally used in SAS whenever we want to control the SAS variable names in the dataset. VALIDVARNAME= V7 UPCASE ANY The default option will be VALIDVARNAME=V7 until we specify as UPPERCASE or ANY . When we mention options VALIDVARNAME=V7 , that means we are telling SAS to change the name of the Database column (etc EXCEL sheet column) to valid SAS name with certain rules keeping in mind. Here are those rules that SAS needs to follow, when it changes the DBMS column name to valid SAS name. Only 32 mixed case (lower or uppercase) characters are allowed in each variable. Names should be starting with an underscore or an alphabet (either uppercase or lower case character). Invalid characters in the DBMS column (ex. $) should be changed to underscores. See the SAS Language Reference: Dictionary to get more details about the rules. VALIDVARNAME=UPCASE When we mention options VALIDVARNAME=UPPERCASE we are telling SAS to change the column...

How to merge data sets with a common variable?

Here is the simple way of merging the data sets with a common variable if the datasets has the same prefix name. For example: col1-col10, dsn1-dsn 7 , or data1 to data10 with common variable of ID. Considering we have 10 datsets and all of them having the same prefix data; %macro mymerge (n); data merged; merge %do i = 1 % to &n; data&i %end; ; /* this additional ';' is necessary, the first ';' is for the "%end", while the second ';' is for "Merge"*/; by id; run; %mend; %mymerge(10)

Merging the data sets using macro code

Merging the data sets with a common variable if the datasets has the same prefix name? For example: col1-col10 dsn1-dsn 7 data1 to data6 with common variable of Usubjid. here is the example, I have 7 datasets i need to merge and each of them having the common variable(usubjid) to merge, and all the datasets having the same prefix dsn(dsn1 to dsn7). % macro allmerge (n); data combine ; merge %do i = 1 % to &n; dsn&i %end; ; /* this additional ';' is necessary, the first ';' is for the "%end", while the second ';' is for "Merge" */ by usubjid; run ; %mend; %allmerge (7)

Resolving and Using &&var&i Macro Variables

Here are the 2 important questions always comes up in our minds ,(& vs && vs &&& and single dot and double dots ) when we are dealing with macros for the first time and here are the answers for them. I did find a very good regarding the above topics in the one of the SAS forums and IAN WHITLOCK explained it very clear. direct link : or when to use &,&&,and &&&,how do we distinguish? &MACVAR references a macro variable. The rule is that the scanner reads from left to right. Two ampersands resolve to one and scanner continues. If the resulting expression has ampersands then that expression is scanned again. So &&x resolves to &x resolves to value of X &&&x resolves to & followed by the value of X which then resolves to the value of the variable indicated. If you are familiar with TSO CLISTS (even if you are not), they are very similar to SAS macro. SAS was originally based on PL1, so both SAS syntax an...

How can I count number of observations per subject in a data set?

Image
We always have this question in mind, while we do the SAS programming and here is the simple answer for that, we just need to use SUM statement and the FIRST .variable in the SET statement and then the RETAIN statement to calculate the observations count per subject. By doing some minor modification we can calculate observations count per subject per visit also. (Just include visit variable in the BY variable list in PROC sort and First . variable list in datastep with SET statement). For example: data dsn ; input patid implants ; datalines; 1 3 1 1 1 2 1 1 2 1 2 2 3 1 4 2 3 1 4 5 2 3 1 6 ; run ; proc sort data=dsn ; by patid ; run ; data dsn1 ; set dsn ; by patid ; cnt+1 ; if first.patid then cnt=1 ; run ; proc sort data= dsn1 ; by patid descending cnt ; run ; data dsn2 ; set dsn1 ; by patid ; retain totcnt ; if first.patid then totcnt=cnt ; output; run; proc print data=dsn2 ; run; Output:

Mastering Duplicates Removal in SAS: A Comprehensive Guide to Using PROC SQL, DATA STEP, and PROC SORT

Removing Duplicate Observations in SAS: A Comprehensive Guide Removing Duplicate Observations in SAS: A Comprehensive Guide In data analysis, it's common to encounter datasets with duplicate records that need to be cleaned up. SAS offers several methods to remove these duplicates, each with its strengths and suitable scenarios. This article explores three primary methods for removing duplicate observations: using PROC SQL , the DATA STEP , and PROC SORT . We will provide detailed examples and discuss when to use each method. Understanding Duplicate Observations Before diving into the methods, let's clarify what we mean by duplicate observations. Duplicates can occur in different forms: Exact Duplicates: All variables across two or more observations have identical values. Key-Based Duplicates: Observations are considered duplicates based on the values of specific key variables (e.g., ID, Date). The ...

How to scan more than 20 records to determine variable attributes

Usage Note 1075: How to scan more than 20 records to determine variable attributes in EFI In Versions 7 and 8 of the SAS System, by default the Import Wizard, PROC IMPORT and the External File Interface (EFI) scan 20 records to determine variable attributes when reading delimited text files. Changing the default setting can only be done for EFI in Version 7, Release 8 and Release 8.1. Beginning in Release 8.2 changing the default setting is applicable to the Import Wizard, PROC IMPORT and EFI. Follow the steps below to change the default behavior: 1) Type regedit on the command line (white box with a check mark) 2) When the Registry Editor window opens, double click on the PRODUCTS icon 3) Double click on the BASE icon 4) Double click on the EFI icon 5) In the window on the right the Contents of EFI will be populated with EFI options 6) Double click on GuessingRows 7) When the new window opens with the old value of 20 , delete it, enter the new value, and clic...