Posts

HOW TO USE THE SCAN FUNCTION:

SCAN(string, n, delimiters) returns the nth word from the character string string, where words are delimited by the characters in delimiters. It is used to extract words from a character value when the relative order of the words is known but their starting positions are not.

    NewVar = SCAN(string, n <, delimiters>);

This returns the nth 'word' in the string.

When the SCAN function is used:
- the length of the created variable is 200 bytes if it is not previously defined with a LENGTH statement
- delimiters before the first word have no effect
- any character or set of characters can serve as delimiters

Points to remember while using the SCAN function:
- a missing value is returned if there are fewer than n words in string
- two or more contiguous delimiters are treated as a single delimiter
- if n is negative, SCAN selects the word in the character string starting from the end of stri...
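As a minimal sketch of the points above, the DATA step below pulls words out of a single made-up name. The data set name, variable names, and sample value are all assumptions chosen for illustration; the LENGTH statement is there so the result variables do not depend on the default length.

```sas
data scan_demo;
    /* Explicit lengths, so the default length for SCAN results does not apply */
    length last first final none $ 20;

    name  = 'Edison, Thomas Alva';   /* hypothetical sample value */

    /* Comma and blank both act as delimiters; the contiguous */
    /* ", " pair is treated as a single delimiter.            */
    last  = scan(name,  1, ', ');    /* first word             */
    first = scan(name,  2, ', ');    /* second word            */
    final = scan(name, -1, ', ');    /* negative n: count from the end */
    none  = scan(name,  5, ', ');    /* fewer than 5 words: missing    */
run;

proc print data=scan_demo noobs;
run;
```

Here the string has three words, so the request for a fifth word comes back blank rather than raising an error.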

How to use the PROPCASE function

The "old" way to capitalize the first letter of words was to use LOWCASE, UPCASE, and the SUBSTR function, like this:

    DATA CAPITALIZE;
       INFORMAT FIRST LAST $30.;
       INPUT FIRST LAST;
       FIRST = LOWCASE(FIRST);
       LAST = LOWCASE(LAST);
       SUBSTR(FIRST, 1, 1) = UPCASE(SUBSTR(FIRST, 1, 1));
       SUBSTR(LAST, 1, 1) = UPCASE(SUBSTR(LAST, 1, 1));
    DATALINES;
    ronald cODy
    THomaS eDISON
    albert einstein
    ;
    PROC PRINT DATA=CAPITALIZE NOOBS;
       TITLE "Listing of Data Set CAPITALIZE";
    RUN;

With the PROPCASE function in SAS 9.1, it's much easier.

    DATA PROPER;
       INPUT NAME $60.;
       NAME = PROPCASE(NAME);
    DATALINES;
    ronald cODy
    THomaS eDISON
    albert einstein
    ;
    PROC PRINT DATA=PROPER NOOBS;
       TITLE "Listing of Data Set PROPER";
    RUN;

Source: www.support.sas.com

Example:

    data test;
       x = lowcase(' MY NaMe iS SARaTh ');
       y = propcase(x);
       z = propcase(lowcase(' ...

How to capitalize the first letter of every word in a string

Convert a text string into mixed case.

Note: Beginning in SAS 9.1, this task is easily accomplished with the PROPCASE function. See Sample 2 on the Full Code tab.

    /* Sample 1: COMPBL, LOWCASE, SCAN, INDEX, UPCASE, SUBSTR */
    data sample;
       input name $char50.;
       /* Lowercase the entire string, remove consecutive blanks */
       newname = compbl(lowcase(name));
       length next $ 20;
       i = 0;
       next = scan(newname, 1, ' ');
       do while (next ne ' ');
          i + 1;
          /* Scan off each 'word' based upon a space, locate the position */
          /* of the first letter in the original string, UPCASE the first */
          /* letter and use SUBSTR to replace the byte.                   */
          pos = indexw(newname, trim(next));
          substr(newname, pos, 1) = upcase(substr(newname, pos, 1));
          next = scan(newname, i, ' ');
       end;
       keep name newname;
    datalines;
    Jane DOE
    min ning chou
    HENRIK HANSSON
    D ETCHEVERRY, Charo B
    ;
    proc print;
    run;

    /* Sample 2: PROPCASE (available in SAS 9.1) */...

SOUNDEX function

Combine data sets based upon similar values

Encode character strings using SOUNDEX to aid in combining the data based upon similar but not exact values.

Note: The SOUNDEX algorithm is English-biased. For more details about SOUNDEX, please refer to the SAS Language Reference: Dictionary, under Functions.

Source: support.sas.com
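One way this idea could be sketched is shown below: each data set gets a SOUNDEX code, and the two are joined on that code instead of on the raw name. The data set names, variable names, and sample spellings are invented for illustration and are not taken from the SAS sample itself.

```sas
/* First list of names (hypothetical data) */
data list_a;
    length name $ 20;
    input name $;
    code = soundex(name);   /* phonetic code used as the match key */
    datalines;
Smith
Jonson
;

/* Second list, with slightly different spellings (hypothetical data) */
data list_b;
    length name $ 20;
    input name $;
    code = soundex(name);
    datalines;
Smyth
Johnson
;

/* Join on the SOUNDEX code rather than on the exact spelling */
proc sql;
    select a.name as name_a, b.name as name_b, a.code
    from list_a as a inner join list_b as b
        on a.code = b.code;
quit;
```

Because the match is phonetic, unrelated names can also share a code, so the joined rows are best treated as candidates to review rather than confirmed matches.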

Options in SAS' INFILE Statement

There are a number of options available for the INFILE statement. Below you will find discussion of the following options: DLM='character', DSD, MISSOVER, and FIRSTOBS=value.

DLM='character'

When I prepare a data file for list input to SAS, I use a blank space as the delimiter. The delimiter is the character which must appear between the score for one variable and that for the next variable. One can, however, choose to use a delimiter other than a blank space. For example, the comma is a commonly used delimiter. If you are going to use a delimiter other than a blank space, you must tell SAS what the delimiter is. Here is an example of a couple of data lines in a comma-delimited file:

    4,2,8010,2,4,2,4,4,2,2,2,2,2,2,4,4,2,4,2,2,CDFR,22,900,5,4,1
    4,2,8011,1,2,3,1,3,4,4,4,1,2,2,4,2,3,4,3,1,psychology,24,360,4,3,1

Here is the INFILE statement which identified the delimiter as being a comma:

    infile 'd:\Research-Misc\Hale\Hale.csv' ...
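To see the four options named above working together, here is a small self-contained sketch that reads in-stream data instead of an external file. The data set name, variable names, and data lines are hypothetical and far simpler than the file described in the post.

```sas
data infile_demo;
    /* DLM=','   : fields are separated by commas                    */
    /* DSD       : two adjacent commas mean a missing value          */
    /* MISSOVER  : a short line leaves the remaining variables       */
    /*             missing instead of reading from the next line     */
    /* FIRSTOBS=2: start reading at line 2, skipping the header line */
    infile datalines dlm=',' dsd missover firstobs=2;
    input id major :$20. age score;
    datalines;
id,major,age,score
1,psychology,24,360
2,,22
;

proc print data=infile_demo noobs;
run;
```

In the second data line, major is left blank because of DSD, and score is left missing because of MISSOVER.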

Finding the number of observations in SAS dataset

There are a number of ways of finding out the number of observations in a SAS data set and, while they are documented in a number of different places, I have decided to collect them together in one place. At the very least, it means that I can find them again.

First up is the most basic and least efficient method: read the whole data set, increment a counter, and pick up its last value. The END option allows you to find the last value of count without recourse to FIRST.x/LAST.x logic.

    data _null_;
       set test end=eof;
       count+1;
       if eof then call symput("nobs", count);
    run;

The next option is a more succinct SQL variation on the same idea. The colon prefix denotes a macro variable whose value is to be assigned in the SELECT statement; there should be no surprise as to what the COUNT(*) does…

    proc sql noprint;
       select count(*) into :nobs from test;
    quit;

Continuing the SQL theme, accessing the dictionary tables is another route to the same end and has the advantage of need...
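For the dictionary-table route, a minimal sketch might look like the following. It is not the author's code: it assumes the data set is called TEST and lives in the WORK library, and it builds a small TEST data set first so that the step can run on its own.

```sas
/* Hypothetical five-row data set to count */
data test;
    do i = 1 to 5;
        output;
    end;
run;

/* Look up the observation count in the DICTIONARY.TABLES metadata. */
/* LIBNAME and MEMNAME values are stored in upper case.             */
proc sql noprint;
    select nobs into :nobs
    from dictionary.tables
    where libname = 'WORK' and memname = 'TEST';
quit;

/* Reassigning with %LET strips the leading blanks left by INTO */
%let nobs = &nobs;
%put Number of observations: &nobs;
```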

Convert values from character to numeric or from numeric to character.

Convert variable values using either the INPUT or PUT function.

Convert a character value to a numeric value by using the INPUT function. Specify a numeric informat that best describes how to read the data value into the numeric variable. When changing types, a new variable name is required. If you need to keep the original variable name for the new type, use the RENAME= option as illustrated in Sample 2.

    data char;
       input string : $8. date : $6.;
       numeric = input(string, 8.);
       sasdate = input(date, mmddyy6.);
       format sasdate mmddyy10.;
    datalines;
    1234.56 031704
    3920 123104
    ;
    proc print;
    run;

    data now_num;
       input num date : mmddyy6.;
    datalines;
    123456 110204
    1000 120504
    ;
    run;

    data now_char;
       set now_num (rename=(num=oldnum date=olddate));
       num = put(oldnum, 6. -L);
       date = put(olddate, date9.);
    run;
    proc print;
    run;

Source: support.sas.com

Here ...