
Delete observations from a SAS data set when all or most of the variables have missing data

/* Sample data set */
data missing;
input n1 n2 n3 n4 n5 n6 n7 n8 c1 $ c2 $ c3 $ c4 $;
datalines;
1 . 1 . 1 . 1 4 a . c .
1 1 . . 2 . . 5 e . g h
1 . 1 . 3 . . 6 . . k i
1 . . . . . . . . . . .
1 . . . . . . . c . . .
. . . . . . . . . . . .
;
run;
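
*A quick diagnostic, offered as a sketch rather than part of the original code: count the missing values in each observation before deleting anything. The data set name MISS_COUNTS and the variable N_MISSING are hypothetical names chosen for illustration;
data miss_counts;
set missing;
*CMISS accepts numeric and character arguments. Only the variables listed in the call are counted;
n_missing = cmiss(of n1-n8, of c1-c4);
run;

proc print data=miss_counts;
var n_missing;
run;
*For the sample data above, N_MISSING should be 5, 5, 6, 11, 10 and 12 for observations 1 to 6;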

*If you want to delete an observation only when every variable is missing, use the following code;

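*Approach 1: Using the COALESCE and COALESCEC functions inside the data step;

data drop_misobs;

set missing;
*COALESCE takes numeric arguments only, so COALESCEC is used for the character variables;
if missing(coalesce(of _numeric_)) and missing(coalescec(of _character_)) then delete;
run;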

 
Pros:
*Simple code
Cons:
*As written, this code checks every variable, so it does not work if we want to delete observations based on specific variables and not all of them.

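*Approach 2: Using the N/NMISS/CMISS functions inside the data step;

data drop_missing;

set missing;
*Counts the nonmissing numeric values with the N function. N and NMISS accept numeric arguments only, so the character variables are checked with CMISS;

if n(n1, n2, n3, n4, n5, n6, n7, n8)=0 and cmiss(c1, c2, c3, c4)=4 then delete;
run;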

data drop_missing;

set missing;
*Counts the missing values with the CMISS function, which handles both numeric and character variables (NMISS is for numeric arguments only);

if cmiss(n1, n2, n3, n4, n5, n6, n7, n8, c1, c2, c3, c4)=12 then delete; *12 is the total number of variables in the data set;
run;
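
*The title mentions observations where most of the variables are missing, which the steps above do not cover. A minimal sketch, assuming a hypothetical cutoff of 75 percent and that the data set has at least one numeric and one character variable. The names DROP_MOSTLY_MISSING, NUMS, CHARS and PCT_MISSING are illustrative;
data drop_mostly_missing;
set missing;
array nums {*} _numeric_;
array chars {*} _character_;
*DIM returns the number of elements in each array, so the total of 12 is not hard-coded;
pct_missing = cmiss(of nums[*], of chars[*]) / (dim(nums) + dim(chars));
if pct_missing > 0.75 then delete;
drop pct_missing;
run;
*With the sample data this should delete observations 4, 5 and 6, which have 11, 10 and 12 of their 12 values missing;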

*If you want to delete records based on only some of the variables and do not want to type all the variable names in the IF-THEN statement, use the following code;

*Task: Delete observations from the data set if all variables except N1 and C1 have missing data;

proc contents data=missing out=contents(keep=memname name);
run;


*Create a macro variable NAMES with the list of variable names in the data set;
proc sql noprint;
select distinct name into :names separated by ','
from contents
where memname='MISSING' and upcase(name) not in ('N1','C1'); *Excluding 2 variables in the data set;
quit;


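data remove_missing;
set missing;
*The list holds character variables too, so CMISS is used instead of N. COUNTW gives the number of names in the list;
if cmiss(&names) = countw("&names", ",") then delete;
run;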
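
*An alternative sketch for building the list: query DICTIONARY.COLUMNS directly instead of running PROC CONTENTS first. This assumes the data set MISSING is in the WORK library. The macro variable NAMES2 and the data set REMOVE_MISSING2 are hypothetical names;
proc sql noprint;
select name into :names2 separated by ','
from dictionary.columns
where libname='WORK' and memname='MISSING' and upcase(name) not in ('N1','C1');
quit;

data remove_missing2;
set missing;
*Delete the observation when every variable in the list is missing;
if cmiss(&names2) = countw("&names2", ",") then delete;
run;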

