

Change all missing values of all variables into zeros/putting zeros in place of missing values for variables

Have you ever been asked how to convert the missing values of all variables to zeros? If so, here is the answer.

In this example I have used an array to do this.

The variable list includes ID and SCORE1 to SCORE6. Using a simple array, we can change all the missing values of the variables score1 to score6 to 0.

data old;
input ID SCORE1 SCORE2 SCORE3 SCORE4 SCORE5 SCORE6;
cards;
24 100 97 . 100 85 85
28 . 87 98 100 . 90
60 100 . . 100 100 100
65 100 98 100 . 90 100
70 99 97 100 100 95 100
40 97 99 98 . 100 95
190 100 . 97 100 100 90
196 100 100 . 100 100 100
210 . 85 . 90 80 95
;
 

run;

*1st Method;
data new;
set old;
array zero score1-score6;
do over zero;
if zero=. then zero=0;
end;
run;


*2nd Method: _numeric_ covers every numeric variable, including ID;
data new;

set old;
array nums _numeric_;

do over nums;
if nums=. then nums=0;
end;
run;



proc print;
Title 'Missing values changed to zero using arrays and a do loop';

run;
Output:



ID SCORE1 SCORE2 SCORE3 SCORE4 SCORE5 SCORE6

24 100 97 0 100 85 85
28 0 87 98 100 0 90
60 100 0 0 100 100 100
65 100 98 100 0 90 100
70 99 97 100 100 95 100
40 97 99 98 0 100 95
190 100 0 97 100 100 90
196 100 100 0 100 100 100
210 0 85 0 90 80 95




 Missing values changed to zero using arrays and a do loop.


What if we don't want to convert every missing value to zero, that is, some missing values should become 0 and others 1?

The following code converts each missing value to either 1 or 0 depending on the value of ID. If ID is less than or equal to 70, the missing value is converted to 1; if ID is greater than 70, it is converted to 0.

Here is the sample code:

data new1;
set old;
array RS(6) score1-score6 ; 
do i=1 to 6;
if ID le 70 then do;
if RS(i)=. then RS(i)=1; 

end;
else if id gt 70 then do;
if RS(i)=. then RS(i)=0; 

end;
end;
run;

 

/*Macro converts all missing values for numeric variables into 0*/
%macro replaceMissing(ds);
DATA &ds.;
SET &ds.;
ARRAY ZERO _NUMERIC_;
DO OVER ZERO;
if ZERO=. then ZERO=0;
end;

run;
%mend replaceMissing;
%replacemissing(dsn);


********************************************************************;
data missing;
set sashelp.column;
array chars _character_;
do over chars;
if chars='' then chars='Missing';
end;
array nums _numeric_;
do over nums;
if nums=. then nums=0;
end;
run;
********************************************************************;
The above code converts missing values of all character variables in the sashelp.column dataset to 'Missing' (truncated when a variable is shorter than 7 characters). It also converts missing values of all numeric variables in the sashelp.column dataset to 0.




LAG Function: How to obtain information from previous observation(s)

SAS® programmers often need to carry the value of a variable from the current observation to the next one. The LAG function can be very helpful here. A LAGn (n=1-100) function returns the value of the nth previous execution of the function. It is easy to assume that the LAGn functions return values of the nth previous observation, but that is true only when the function is executed on every observation.
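To make the difference between "previous execution" and "previous observation" concrete, here is a minimal sketch. The dataset name lagtrap and the variables lag_in_if, prev and lag_ok are hypothetical names chosen only for this illustration. It compares calling LAG inside an IF condition with calling it on every row and using the result conditionally.

```sas
data lagtrap;
  input x;
  /* LAG runs only on even x, so its queue only ever sees even values */
  if mod(x, 2) = 0 then lag_in_if = lag(x);
  /* LAG runs on every row; the result is then used conditionally */
  prev = lag(x);
  if mod(x, 2) = 0 then lag_ok = prev;
  datalines;
1
2
3
4
5
;
run;

proc print data=lagtrap;
  title 'LAG inside an IF versus LAG on every row';
run;
```

On the row where x=4, lag_in_if is 2 (the value from the previous time that LAG call ran), while lag_ok is 3 (the value from the previous observation). Each LAG call in the program keeps its own queue, which is why the two results differ.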


Using the LAG function to obtain information from previous observation(s)

**********************************************************;
/* Sample 1: Create a single lag of one variable */


data one;
input x;
lagonce=lag(x);
datalines;
1
2
3
4
5
;
proc print data=one;
title 'Sample1: Single lag of one variable';
run;

***************************************************************;
/* Sample 2: Create multiple lags of one variable */


data two;
input x;
lag1=lag(x);
lag2=lag2(x);
datalines;
1
2
3
4
5
;
proc print data=two;
title 'Sample 2: Multiple lags of one variable';
run;
***************************************************************;
/* Sample 3: Create a single lag of one variable within a BY-Group */
/* See also: */
/* Sample 140: Obtaining the previous value of a variable within */
/* a BY-Group */
/* Sample 108: Use the LAG function to conditionally carry */
/* information down a data set */



data three;
input group $ x;
datalines;
a 1
a 2
a 3
b 1
b 2
b 3
b 4
;
data final;
set three;
by group;
lagx=lag(x);
/* Note the LAG function is executed outside the IF condition. */
/* On the first member of the BY-Group, the variable created */
/* with the LAG function is reset to missing. */

if first.group then lagx=.;
run;
proc print data=final;
title 'Sample 3: Single lag of one variable within a BY-Group';
run;

RESULTS:
Sample1: Single lag of one variable
Obs x lagonce
1 1 .
2 2 1
3 3 2
4 4 3
5 5 4

Sample 2: Multiple lags of one variable
Obs x lag1 lag2
1 1 . .
2 2 1 .
3 3 2 1
4 4 3 2
5 5 4 3

Sample 3: Single lag of one variable within a BY-Group
Obs group x lagx
1 a 1 .
2 a 2 1
3 a 3 2
4 b 1 .
5 b 2 1
6 b 3 2
7 b 4 3

source: http://support.sas.com/kb/25/938.html


Without Using LAG Function:
*****************************************************************************;


Example2:
data lagcheck;

input a b ;
datalines;
1 1
. 2
. 3
. 4
. 5
2 6
. 7
. 8
3 9
. 10
. 11
. 12
. 13
. 14
;
run;
*Method1;
data lagcheck;
set lagcheck;
n=_n_;
if missing(a) then do;
do until (not missing(a));
n=n-1;
set lagcheck(keep=a) point=n;
end;
end;
run;
* Note: Remember 2 Set statements;
**********************************************************;
*Method2;
data lagcheck;
set lagcheck;
retain lasta;
if not(missing(a)) then lasta=a;
if missing(a) then a=lasta;
drop lasta;
run;

***************************************************************;
* Here is another example from the SAS-L archives, from the thread Re: A Confusion about how to filling out empty cells with duplicates, with an interesting solution using the UPDATE statement;

data have;
* INFILE is placed before INPUT so that TRUNCOVER is in effect when each line is read;
infile datalines truncover;
input Subject number1 number2;
datalines;
10001 212
10001 . 10
10002 555
10002
10002
10002 . 11
10003 11
10003
10003 . 12
10003
;;;;
run;


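data need;
* First loop: UPDATE ignores missing transaction values, so after the last row;
* of a subject the latest non-missing value of each variable is held;
do _n_ = 1 by 1 until(last.subject);
update have(obs=0) have;
by subject;
end;
* Second loop: write that combined row once per original row of the subject;
do _n_ = 1 to _n_;
output ;
end ;
run;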
**********************************************************************;
