Posts

Showing posts from August, 2008

How to Create a new data set from multiple data sets based upon sorted order

Image
Create a new data set from multiple data sets based upon sorted order Use the SET and BY statements to interleave data sets. Note: Interleaving uses a SET statement and a BY statement to create a new, sorted data set from multiple data sets. The number of observations in the new data set is the sum of the number of observations in the original data sets. The observations in the new data set are arranged by the values of the BY variable(s) and, within each BY-Group, the order of the data sets in which they occur, including duplicates. To interleave, data sets need to be in sorted order or indexed on the BY variables. Output: source: support.sas.com

Dynamically generate SET statement to combine multiple data sets

Dynamically generate SET statement to combine multiple data sets You can manually enter the data set names or use MACRO logic to generate the repetitive data set names when combining many data sets on a SET statement. Note: Variable name lists are not valid for data sets. In other words, if you have WORK.DS1, WORK.DS2, and WORK.DS3, you can NOT specify WORK.DS1-WORK.DS3 on the SET statement. /* names use the naming convention of DSn, where n is an incrementing /* number. data ds1; x=1; run; data ds2; x=2; run; data ds3; x=3; run; /* Build a macro called NAMES with two parameters. The first parameter / * is the 'prefix' of the naming pattern. The second parameter is the / * maximum number of data sets you want to generate on the SET statement. */ %macro names(prefix,maxnum); %do i=1 %to &maxnum; &prefix&i %end; ; %mend names; /* Call the macro on the SET statement */ data all; set %names(DS,3); run; proc print data=all; title "Appended results"; run; run; A...

How to determine which data set contributed an observation

Determine which data set contributed an observation Use the IN= option to create a boolean variable that is set to one or 'true' to indicate whether the data set contributed data to the current observation. When the IN= variable's value is 1, assign the data set's name into a new variable. data one; input string $;datalines; apple banana coconut; run; data two; input string $; datalines; anagram bottle clown dog; run; data combo; set one(in=o) two(in=t); if o then origin='one'; else origin='two'; run; proc print data=combo;run; run; RESULTS: Obs string origin 1 apple one 2 banana one 3 coconut one 4 anagram two 5 bottle two 6 clown two 7 dog two source: www.support.sas.com

How can I convert a numeric date variable to a character variable using PROC SQL?

Helpful documents for Proc SQL: Inside PROC SQL's Query Optimizer PROC SQL -- The Long and The Short of It Performance Enhancements to PROC SQL in Version 7 of the SAS® System Using the SAS/ACCESS Libname Technology to Get Improvements in Performance and Optimizations in SAS/SQL Queries How can I convert a numeric date variable to a character variable using PROC SQL? The PUT function can be used within PROC SQL to create a character date string from a numeric SAS date value. data numdate; ******************************************************; data numdate; date=TODAY(); run; proc sql; create table new as select PUT ( date , date9.) as newdate from numdate ; quit ; ************************************************************; source: www.support.sas.com How can I convert a character date variable to a numeric date variable with PROC SQL? The INPUT function can be used within PROC SQL to create a numeric date variable from a character date string. ******...

CLASS Statement

When should I put a variable in the CLASS statement? What does the CLASS statement do? The CLASS statement is used to indicate which variables in the model are categorical variables. In most modeling procedures, such a variable is then treated as a nominal (unordered) categorical predictor variable. A set of numeric indicator ("dummy") variables is created internally to represent the levels of the variable. Because the indicator variables are used for fitting the model, the original variable does not need to be numeric. The resulting model has multiple parameter estimates (one for each indicator variable). Each parameter compares one level of the predictor with a reference level, typically the last level in sorted order. A joint test of all the estimated parameters for the predictor is a test for any differences among the levels and is therefore a test of the predictor's overall effect. In contrast, a variable name that appears in the MODEL statement but not in the CL...

List of the Domains (datasets) and the variables in it: (CDISC perspective)

List of the Variables in Each Dataset:

TLF samples

TABLES: Screening Failures Subject Disposition Subject Disposition by Visit Premature Discontinuation from Study Medication Subject Disposition by Center Protocol Deviations Demographics and Baseline Characteristics Medical and Surgical History Gynecological History Screening Pap Smear, Mammography, and Serum Pregnancy Test Results Number (%) of Subjects Who Took Concomitant Medication Duration of Treatment in Days and Study Drug Compliance Number (%) of Subjects with Treatment-Emergent Adverse Events by Body System and Preferred Term Number (%) of Subjects with Common (>=2%) Treatment-Emergent Adverse Events by Preferred Term, sorted by Descending Order of Incidence in the Trt-x Number (%) of Subjects with Treatment-Emergent Adverse Events by Body System, Preferred Term, and Maximum Intensity Number (%) of Subjects with Treatment-Emergent Thromboembolic Events Number (%) of Subjects with Drug-Related Treatment-Emergent Adverse Events by Body System and Preferred Term Number (%) of ...

Different phases I-IV of a clinical trial

Clinical Trials/Studies in Humans The clinical testing (investigation) of experimental drugs (previously unproven in humans, therefore "experimental") in humans is normally done in three phases (Phase I, II and III) with more and more people included in each subsequent phase. Before moving to the next phase of development the data are carefully analyzed to ensure the experimental drug is at least safe and well tolerated. After successful completion of Phase I-III testing, a company will submit the results of all of the studies to the FDA or TPD to obtain a New Drug Approval (NDA). Once the FDA or TPD grants a company with a NDA, the company can market the drug (medication) to the public. Additional testing (post-marketing or late phase III/phase IV) to look at the ongoing-term safety continues. Phase I Studies Phase I studies are primarily concerned with the drug's safety, and are the first time the drug is tested in humans. These studies are typically done in a ...

How to Read Delimited Text Files into SAS

This sample program shows you how to read a delimited text file into SAS. A text file is often referred as raw data that can be prepared in a variety of formats (e.g., csv [comma-separated values], tab-delimited, or spacedelimited). When reading text files with the DATA step, two statements are used: INFILE and INPUT statements. The INFILE statement is used to specify the physical file being read. You can use a FILENAME statement in conjunction with an INFILE statement (see Example.1). Or, you can specify the full path to the file in the INFILE statement (see example.2). A set of options are available in the INFILE statement: DLM, DSD, and LRECL. The DLM option allows you to tell SAS what character is used as the delimiter in the text file. If this option is not specified, SAS assumes the delimiter is a space. Common delimiters include commas, vertical pipes, semi-colons, and tabs. The DSD option has three functions when reading delimited files. The first function is to remove any quot...

How to Determine if a variable exists in the dataset or not

Programatically determine if a variable exists in a data set Use the OPEN and VARNUM functions to determine if a variable exists in a SAS data set. The OPEN function opens a SAS data set and returns a data set identifier. The VARNUM function returns the variable's position in the data set. If the variable does not exist then VARNUM returns a 0. /* Create sample data */ data test; input fruit $ count; datalines; apple 12 banana 4 coconut 5 date 7 eggs 9 fig 5 ; /* Specify the data set to open in the OPEN function. Specify the */ /* variable to be located in the second parameter of the VARNUM */ /* function. If the variable does not exist, the value of check */ /* will be zero. */ data _null_; dsid=open('test'); check=varnum(dsid,'count'); if check=0 then put 'Variable does not exist'; else put 'Variable is located in column ' check +(-1) '.'; run; OUTPUT: /* Partial LOG output */ 311 data _null_; 312 dsid=open('test'); 313 check=varnum...

What to do when we want only the even number observations

Output only even number observations. Note: The MOD function returns the remainder when values are divided. In this sample, when dividing i by 2, there will be no remainder for the even observations (2,4,6,8 and 10). If there is a remainder, the current observation has to be an odd numbered observation. This program outputs only the even numbered observations to a new data set. /* Create sample data set */ data one; do i=1 to 10; output; end; run; data two; set one; /* The MOD function returns the remainder from the division of */ /* argument-1 by argument-2. If the remainder is zero, when */ /* the second argument is 2, then the first argument must be */ /* even...therefore, output the observation. */ if mod(i,2)=0 then output; run; proc print; run; OUTPUT: Obs i 1 2 2 4 3 6 4 8 5 10 source:www.support.sas.com

How to verify the existence of the external file:

Verify the existence of an external file Conditionally execute code to read in a file only when the file exists. Note: Although your operating environment utilities may recognize partial physical filenames, you must always use fully qualified physical filenames with FILEEXIST. This example verifies the existence of an external file. If the file exists, read in the file using INFILE and INPUT statements. If the file does not exist, display a message in the SAS log that states the file does not exist. Note that in a macro statement you do not enclose character strings in quotation marks. /* If the file passed to the macro does exist, read in the file and create */ /* a character variable called VAR with a default length of 8 bytes. If */ /* file named in the macro call does not exist, write "FILE DOES NOT EXIST..." */ /* to the log. */ %macro in_file(file); %if %sysfunc(fileexist(&file))=1 %then %do; data a; infile "&file"; input var $; run; %end; %else %do; d...

Accurately Calculating Age with Only One Line of Code

direct link: This tip was written by William Kreuter , a senior computer specialist at the University of Washington in Seattle. He has used SAS software in public health research since 1981, and now specializes in manipulating large data sets for the School of Public Health and the School of Medicine. He can be reached at billyk@u.washington.edu. A frequent need of SAS software users is to determine a person's age, based on a given date and the person's birthday. Although no simple arithmetic expression can flawlessly return the age according to common usage, efficient code to do so can be written using SAS software's date functions. This article, by SAS software user William Kreuter, presents a one-line solution for this purpose. Put SAS Date Functions to Work for You Many kinds of work require the calculation of elapsed anniversaries. The most obvious application is finding a person's age on a given date. Others might include finding the number of years since any ...

Search a character expression for a string, specific character, or word:INDEX/INDEXC Functions

Image
Choose appropriate INDEX function to find target strings, individual letters, or strings on word boundaries. Note: Sample 1 uses INDEX to search for the first occurrence of a 'word' anywhere in a string. If the string is not found, the result is zero. Sample 2 uses INDEXC to locate the first occurence of any character specified in the excerpt. If no target is found, the result is zero.** Sample 3 uses INDEXW to find the target excerpt in a string on a word boundary. If the word is not found, the result is zero. RESULTS: Sample 1: INDEX Sample 2: INDEXC Sample 3: INDEXW source:www.support.sas.com

How to use the MISSING function when you don't know if the variable is characer or numeric

Image
Using the MISSING function from SAS Functions by Example By Ron Cody Ever need to check for a missing value, but you're not sure if the variable is character or numeric? No problem when you use the MISSING function. This function takes either character or numeric variables and it checks for the .A, .B, ._ numeric missing values as well. For example: DATA MISSING; INPUT CHAR $ X Y; IF MISSING (CHAR) THEN N_CHAR + 1; IF MISSING (X) THEN N_X + 1; IF MISSING (Y) THEN N_Y + 1; DATALINES ; CODY 5 6 . . . WHATLEY .A ._ LAST 10 20 ; PROC PRINT DATA=MISSING NOOBS ; TITLE " Listing of MISSING "; RUN ; A listing of MISSING , below, shows that the MISSING function works correctly with character and numeric values, including all the alternative numeric missing values: Listing of MISSING source: www.support.sas.com

HOW TO USE THE SCAN FUNCTION:

USING THE SCAN FUNCTION: SCAN (string,n,delimiters): returns the nth word from the character string string, where words are delimited by the characters in delimiters.  It is used to extract words from a  character value when the relative order of words is known, but their starting positions are not. NewVar= SCAN (string,n<, delimiters >); -returns the nth ‘word’ in the string   When the SCAN function is used: 􀂃 the length of the created variable is 200 bytes if it is not previously defined with a LENGTH statement 􀂃 delimiters before the first word have no effect When the SCAN function is used, 􀂃 any character or set of characters can serve as delimiters Points to remember while using SCAN Function: 􀂃 a missing value is returned if there are fewer than n words in string 􀂃 two or more contiguous delimiters are treated as a single delimiter 􀂃 if  n is negative, SCAN selects the word in the character string starting from the end of stri...

How to use the PROPCASE function

Using the PROPCASE function The "old" way to capitalize the first letter of words was to use LOWCASE, UPCASE, and the SUBSTR function, like this: DATA CAPITALIZE; INFORMAT FIRST LAST $30.; INPUT FIRST LAST; FIRST = LOWCASE (FIRST); LAST = LOWCASE (LAST); SUBSTR( FIRST , 1 , 1 ) = UPCASE(SUBSTR( FIRST , 1 , 1 )); SUBSTR( LAST , 1 , 1 ) = UPCASE(SUBSTR( LAST , 1 , 1 )); DATALINES; ronald cODy THomaS eDISON albert einstein ; PROC PRINT DATA=CAPITALIZE NOOBS; TITLE "Listing of Data Set CAPITALIZE"; RUN ; With the PROPCASE function in SAS 9.1, it's much easier. DATA PROPER ; INPUT NAME $60.; NAME = PROPCASE (NAME); DATALINES; ronald cODy THomaS eDISON albert einstein ; PROC PRINT DATA=PROPER NOOBS; TITLE "Listing of Data Set PROPER"; RUN; source:www.support.sas.com Example: data test ; x= lowcase ( ' MY NaMe iS SARaTh ' ) ; y= propcase (x) ; z= propcase ( lowcase ( ' ...

How to capitalize the first letter of every word in a string

Image
Capitalize the first letter of every word in a string Convert a text string into mixed case. Note: Beginning in SAS 9.1, this task is easily accomplished with the PROPCASE function. See Sample 2 on the Full Code tab. /* Sample 1: COMPBL, LOWCASE, SCAN, INDEX, UPCASE, SUBSTR */ data sample; input name $char50.; /* Lowercase the entire string, remove consecutive blanks */ newname= compbl ( lowcase (name)); length next $ 20; i=0; next=scan(newname,1, ' ' ); do while (next ne ' ' ); i+1; /* Scan off each 'word' based upon a space, locate the position */ /* of the first letter in the original string, UPCASE the first */ /* letter and use SUBSTR to replace the byte. */ pos= indexw (newname,trim(next)); substr (newname,pos,1)= upcase ( substr (newname,pos,1)); next= scan (newname,i, ' ' ); end; keep name newname; datalines; Jane DOE min ning chou HENRIK HANSSON D ETCHEVERRY, Charo B ; proc print ; run; /* Sample 2: PROPCASE (available in SAS 9.1) */...

Options in SAS' INFILE Statement

Options in SAS' INFILE Statement There are a number of options available for the INFILE statement. Below you will find discussion of the following options: DLM='character', DSD, MISSOVER, and FIRSTOBS=value. DLM='character' When I prepare a data file for list input to SAS, I use a blank space as the delimiter. The delimiter is the character which must appear between the score for one variable and that for the next variable. One can, however, choose to use a delimiter other than a blank space. For example, the comma is a commonly used delimiter. If you are going to use a delimiter other than a blank space, you must tell SAS what the delimiter is. Here is an example of a couple of data lines in a comma delimited file: 4,2,8010,2,4,2,4,4,2,2,2,2,2,2,4,4,2,4,2,2,CDFR,22,900,5,4,1 4,2,8011,1,2,3,1,3,4,4,4,1,2,2,4,2,3,4,3,1,psychology,24,360,4,3,1 Here is the INFILE statement which identified the delimiter as being a comma: infile 'd:\Research-Misc\Hale\Hale.csv' ...

Finding the number of observations in SAS dataset

There are a number of ways of finding out the number of observations in a SAS data set and, while they are documented in a number of different places, I have decided to collect them together in one place. At the very least, it means that I can find them again. First up is the most basic and least efficient method: read the whole data set and increment a counter a pick up its last value. The END option allows you to find the last value of count without recourse to FIRST.x/LAST.x logic. data _null_ ; set test end=eof; count+1; if eof then call symput( "nobs" ,count); run; The next option is a more succinct SQL variation on the same idea. The colon prefix denotes a macro variable whose value is to be assigned in the SELECT statement; there should be no surprise as to what the COUNT(*) does… proc sql noprint; select count(*) into :nobs from test; quit; Continuing the SQL theme, accessing the dictionary tables is another route to the same end and has the advantage of need...

Convert values from character to numeric or from numeric to character.

Image
Convert values from character to numeric or from numeric to character\Convert variable values using either the INPUT or PUT function. Convert a character value to a numeric value by using the INPUT function. Specify a numeric informat that best describes how to Read the data value into the numeric variable. When changing types a new variable name is required. If you need to keep the original variable name for the new type, use the RENAME = option as illustrated in Sample 2. data char; input string : $8. date : $6.; numeric= input (string, 8.); sasdate= input (date, mmddyy6.); format sasdate mmddyy10.; datalines ; 1234.56 031704 3920 123104; proc print; run ; data now_num; input num date: mmddyy6.; datalines ; 123456 110204 1000 120504;   run; data now_char ; set now_num (rename=(num=oldnum date=olddate)); num = put (oldnum,6. -L); date = put ( olddate , date9 .); run ; proc print ; run ; Source: support.sas.com Here ...