Wednesday, October 29, 2008

LAG Function: How to obtain information from previous observation(s)

Often times SAS® programmers need to retain the value of a variable in the current observation to the next observation. The LAG function  can be very helpful here. A LAGn (n=1-100) function returns the value of the nth previous execution of the function. It is easy to assume that the LAGn functions return values of the nth previous observation.


Using the LAG function to obtain information from previous observation(s)

**********************************************************;/* Sample 1: Create a single lag of one variable */


data one;
input x;
lagonce=lag(x);
datalines;
1
2
3
4
5
;
proc print data=one;
title 'Sample1: Single lag of one variable';
run;

***************************************************************;/* Sample 2: Create multiple lags of one variable */


data two;
input x;
lag1=lag(x);
lag2=lag2(x);
datalines;
1
2
3
4
5
;
proc print data=two;
title 'Sample 2: Multiple lags of one variable';
run;
***************************************************************;/* Sample 3: Create a single lag of one variable within a BY-Group */
/* See also: */
/* Sample 140: Obtaining the previous value of a variable within */a BY-Group */
/* Sample 108: Use the LAG function to conditionally carry */
/* information down a data set */



data three;
input group $ x;
datalines;
a 1
a 2
a 3
b 1
b 2
b 3
b 4
;
data final;
set three;
by group;
lagx=lag(x);
/* Note the LAG function is executed outside the IF condition. */
/* On the first member of the BY-Group, the variable created */
/* with the LAG function is reset to missing. */

if first.group then lagx=.;
run;
proc print data=final;
title 'Sample 3: Single lag of one variable within a BY-Group';
run;

RESULTS:
Sample1: Single lag of one variable
Obs x lagonce
1 1 .
2 2 1
3 3 2
4 4 3
5 5 4

Sample 2: Multiple lags of one variable
Obs x lag1 lag2
1 1 . .
2 2 1 .
3 3 2 1
4 4 3 2
5 5 4 3

Sample 3: Single lag of one variable within a BY-Group
Obs group x lagx
1 a 1 .
2 a 2 1
3 a 3 2
4 b 1 .
5 b 2 1
6 b 3 2
7 b 4 3

source: http://support.sas.com/kb/25/938.html


Without Using LAG Function:
*****************************************************************************;


Example2:
data lagcheck;

input a b ;
datalines;
1 1
. 2
. 3
. 4
. 5
2 6
. 7
. 8
3 9
. 10
. 11
. 12
. 13
. 14
;
run;
*Method1;
data lagcheck;
set lagcheck;
n=_n_;
if missing(a) then do;
do until (not missing(a));
n=n-1;
set lagcheck(keep=a) point=n;
end;
end;
run* Note: Remember 2 Set statements;
**********************************************************;
*Method2;
data lagcheck;
set lagcheck;
retain lasta;
if not(missing(a)) then lasta=a;
if missing(a) then a=lasta;
drop lasta;
run;

***************************************************************;
* Here is another example given in SAS-L archives about Re: A Confusion about how to filling out empty cells with duplicates. and interesting solutiion using UPDATE Statement;

data have;
input Subject number1 number2;
infile datalines truncover;
datalines;
10001 212
10001 . 10
10002 555
10002
10002
10002 . 11
10003 11
10003
10003 . 12
10003

;;;;
run;


data need;
do _n_ = 1 by 1 until(last.subject);
update have(obs=0) have;
by subject;
end;
do _n_ = 1 to _n_;
output ;
end ;
run;
**********************************************************************;

1 comments:

Eric Lewis said...

Analytic function query more than one row in a table at a time without having to join the table. It returns values from a previous row in the table. To return a value from the next row, try using the lead function.

Post a Comment

ShareThis