Posts

Mastering Duplicates Removal in SAS: A Comprehensive Guide to Using PROC SQL, DATA STEP, and PROC SORT

- February 03, 2009

Removing Duplicate Observations in SAS: A Comprehensive Guide

In data analysis, it's common to encounter datasets with duplicate records that need to be cleaned up. SAS offers several methods to remove these duplicates, each with its strengths and suitable scenarios. This article explores three primary methods for removing duplicate observations: using PROC SQL, the DATA STEP, and PROC SORT. We will provide detailed examples and discuss when to use each method.

Understanding Duplicate Observations

Before diving into the methods, let's clarify what we mean by duplicate observations. Duplicates can occur in different forms:

- Exact Duplicates: All variables across two or more observations have identical values.
- Key-Based Duplicates: Observations are considered duplicates based on the values of specific key variables (e.g., ID, Date).

The ...
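As a rough illustration of the three approaches named above, here is a minimal sketch. The dataset VISITS, its variables ID and VISIT, and the output dataset names are all hypothetical and chosen only for this example; they do not come from the full article.

```sas
/* Hypothetical sample data: ID 101 appears twice with identical values
   (an exact duplicate); ID 102 appears twice with different VISIT values
   (a key-based duplicate on ID). */
data visits;
    input id visit $;
    datalines;
101 A
101 A
102 A
102 B
103 A
;
run;

/* PROC SORT: NODUPKEY keeps the first observation in each BY group,
   so it removes key-based duplicates on ID. */
proc sort data=visits out=by_key_sort nodupkey;
    by id;
run;

/* PROC SQL: SELECT DISTINCT drops rows in which every selected column
   matches another row, so it removes exact duplicates only. */
proc sql;
    create table exact_sql as
    select distinct *
    from visits;
quit;

/* DATA step: after sorting by the key, FIRST.ID is 1 on the first row
   of each ID group; the subsetting IF keeps only those rows. */
proc sort data=visits out=sorted;
    by id;
run;

data by_key_ds;
    set sorted;
    by id;
    if first.id;
run;
```

In this sketch the PROC SORT and DATA step versions treat ID as the key, while the PROC SQL version compares whole rows, so the two 102 rows survive there.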

How to scan more than 20 records to determine variable attributes

- February 02, 2009

Usage Note 1075: How to scan more than 20 records to determine variable attributes in EFI

In Versions 7 and 8 of the SAS System, by default the Import Wizard, PROC IMPORT and the External File Interface (EFI) scan 20 records to determine variable attributes when reading delimited text files. Changing the default setting can only be done for EFI in Version 7, Release 8 and Release 8.1. Beginning in Release 8.2, changing the default setting is applicable to the Import Wizard, PROC IMPORT and EFI. Follow the steps below to change the default behavior:

1) Type regedit on the command line (white box with a check mark)
2) When the Registry Editor window opens, double click on the PRODUCTS icon
3) Double click on the BASE icon
4) Double click on the EFI icon
5) In the window on the right the Contents of EFI will be populated with EFI options
6) Double click on GuessingRows
7) When the new window opens with the old value of 20, delete it, enter the new value, and clic...

SAS sample programs

- January 03, 2009

SAS sample code programs:

- Macro for sorting the variables
- How to convert character date values into numeric date values using DATASTEP/PROC SQL and ARRAYS
- How to detect missing values using Arrays
- First. & Last. Variables
- How to determine the last observation in the dataset
- How to determine whether a numeric or character value exists within a group of variables
- Lag Function: How to obtain information from the previous observation
- How to create a new dataset from multiple datasets based on the sorted order
- Dynamically generate SET statement to combine multiple datasets
- How to determine which dataset contributed an observation
- How to determine if a variable exists in a dataset or not
- How to Read Delimited Text Files into SAS
- How to keep only even number observations in the dataset
- How to verify the existence of an external file
- Accurately calculating age in one line code
- How to use INDEX/INDEXC functions
- Finding the number of observations in the dataset
- How to capita...

BASE SAS CERTIFICATION SUMMARY FUNCTIONS

- January 02, 2009

Certification Summary---Functions

source: http://wiki.binghamton.edu/index.php/Certification_Summary---Functions

SAS functions can be used to convert data and to manipulate the values of character variables. A function is written by specifying the function name, then its arguments in parentheses. Arguments can include variables, constants, or expressions. Although arguments are typically separated by commas, they can also be specified as variable lists or arrays.

Contents
1 YEAR, QTR, MONTH and DAY Functions
2 WEEKDAY Function
3 MDY Function
4 DATE and TODAY Function
5 SUBSTR Function
6 INDEX Function
7 UPCASE Function
8 LOWCASE Function
9 INT Function
10 ROUND Function
11 TRIM Function
12 Some Sample Certification Examples
13 Points to remember

SAS Tip: How to round the numbers: It's often a requirement to round the values of one variable to a precision defined in a different variable - either another data set variable, or a macro ...
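To make the function list above concrete, the following is a small sketch that calls several of the listed functions in one DATA step. The dataset name FUNC_DEMO and every variable name and value are hypothetical, chosen only for illustration. The last part shows one way to round a value to a precision held in another variable, by passing that variable as the second argument of ROUND.

```sas
data func_demo;
    /* Hypothetical values chosen only for illustration. */

    /* MDY builds a SAS date value from month, day, and year. */
    visit_date = mdy(4, 16, 2007);

    /* Date parts extracted from the SAS date value. */
    yr = year(visit_date);
    q  = qtr(visit_date);
    mn = month(visit_date);
    dy = day(visit_date);
    wd = weekday(visit_date);   /* 1 = Sunday, ..., 7 = Saturday */

    /* Character functions: find the comma, then take the text before it. */
    fullname  = 'Smith, John';
    comma_pos = index(fullname, ',');
    lastname  = substr(fullname, 1, comma_pos - 1);
    upper     = upcase(lastname);
    lower     = lowcase(lastname);

    /* Numeric functions: INT drops the fractional part; ROUND rounds to
       the nearest multiple of its second argument, which can itself be
       a variable. */
    value   = 12.3456;
    unit    = 0.01;
    whole   = int(value);
    rounded = round(value, unit);

    /* Write the results to the log. */
    put visit_date= date9. yr= q= mn= dy= wd=;
    put lastname= upper= lower=;
    put whole= rounded=;
run;
```

Because UNIT is an ordinary variable, the same ROUND call would work if the precision came from another dataset variable on each observation.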

SAS sample programs

- January 02, 2009

Reading/Writing Files

- Making a fixed format file
- Making a SAS Cport file
- Reading a SAS Cport file
- Reading multiple raw data files, Version 8
- Reading multiple raw data files (version 6.x)
- Using a SAS macro to "set" multiple files

Other

- Imputing the median
- Checking for duplicate Ids
- Macro to compute a rolling standard deviation
- Changing the length of a character variable
- Replacing strings
- Concatenating string variables using CAT functions
- Simple macro to do repeated procs
- Eliminate useless variables
- Matching husbands and wives
- Creating a wide table from a long dataset using PROC TABULATE
- How can I "fill down" a variable?
- Creating a long list of variable names based on an abbreviated one
- Filling in Missing Values Over Time
- Dummy Coding a Categorical Variable Using a Macro Program
- A few SAS macro programs for renaming variables dynamically

source: http://oregonstate.edu/dept/statistics/sasclass/examples.htm

Creating SAS datasets

- Read a SAS da...

Macro for sorting the datasets

- January 02, 2009

MACRO FOR SORTING:

Rather than writing a PROC SORT step every time, you can define the following macro and call it when you req... to sort any SAS dataset.

EXAMPLE1:

%macro srt(dtn,keyvar);
    proc sort data=&dtn;
        by &keyvar;
    run;
%mend srt;

%srt(ie,PT IEORRES);
*The above call tells SAS to sort the IE dataset by the variables PT and IEORRES, in that order;

EXAMPLE2:

*Sometimes we need to create an output dataset during the sorting process, i.e. in the PROC SORT step. In such a case use the macro below;

%MACRO SORT(IN=,VAR=,OUT=);
    PROC SORT DATA=&IN OUT=&OUT;
        BY &VAR;
    RUN;
%MEND SORT;

%SORT(IN=CEC1,VAR=PT,OUT=CEC2);
%SORT(IN=DERIVE.HEADER,VAR=PT,OUT=HEADER1);

EXAMPLE3:

*Sometimes we need to use the NODUPKEY option to delete the duplicate observations in the dataset. In such a case use the macro below;

%MACRO SORT(IN,VAR,OUT,OPTN);
    PROC SORT DATA =...

How to convert numeric date values into character and from character date values into numeric using DATASTEP, PROC SQL and ARRAYS

- January 02, 2009

1) Converting character date values into numeric:

/* I) Using the DATASTEP: */

/* 1) Two-digit years */
Data dates;
    input cdate $9.;
    cards;
16-apr-07
01-01-07
02-jun-07
13-sep-07
;
run;

Data Convert;
    set dates;
    Date = input(cdate, ANYDTDTE9.);
    format date date7.;
run;

/* 2) Four-digit years: the values are 11 characters wide, so read them with $11. */
Data dates;
    input cdate $11.;
    cards;
16-apr-2007
01-01-2007
02-jun-2007
13-sep-2007
;
run;

Data Convert;
    set dates;
    Date = input(cdate, ANYDTDTE11.);
    format date date9.;
run;

*II) Using Proc SQL;

*Numeric date variable can be converted to character date variable by using the PUT function within PROC SQL.;
proc sql;
    create table date_char as
    select PUT(date, date9.) as ndate
    from date_num;
quit;

*Character date variable can be converted to numeric date variable by using the INPUT function within PROC SQL.;
Proc sql;
    create table date_num as
    select INPUT(date, mmddyy10.) as ndate format=mmddyy10.
    from date_char;
quit;

Or

Proc sql;
    create table date_num as sel...