Monday, February 9, 2009

Options VALIDVARNAME=UPCASE

VALIDVARNAME= V7 | UPCASE | ANY

The VALIDVARNAME= option is used in SAS whenever we want to control which variable names are allowed in a dataset.

The default is VALIDVARNAME=V7 unless we specify UPCASE or ANY.

When we mention options VALIDVARNAME=V7, we are telling SAS to change the name of a database column (e.g. an Excel sheet column) to a valid SAS name according to certain rules.

Here are the rules that SAS follows when it changes a DBMS column name to a valid SAS name.

Up to 32 mixed-case (lowercase or uppercase) characters are allowed in each variable name.

Names must start with an underscore or a letter (either uppercase or lowercase).

Invalid characters in the DBMS column name (ex. $) are changed to underscores.


See the SAS Language Reference: Dictionary to get more details about the rules.
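
The rules above can be tried out without a database connection. The following is only a rough sketch: it uses the NVALID function to test a few made-up column names against the V7 rules, and a PRXCHANGE call to imitate the clean-up. The dataset name check_names, the sample names and the leading-underscore handling for names that start with a digit are my own assumptions for illustration, not the exact logic SAS/ACCESS uses internally.

```sas
/* Sketch only: test names against the V7 rules and imitate the clean-up. */
data check_names;
   length colname fixed $32;
   input colname $char32.;
   /* NVALID returns 1 when the string is a valid V7 name, otherwise 0 */
   is_v7 = nvalid(colname, 'v7');
   /* Replace every character that is not a letter, digit or underscore */
   fixed = prxchange('s/[^A-Za-z0-9_]/_/', -1, trim(colname));
   /* Assumption: prefix an underscore when the name starts with a digit */
   if anydigit(fixed) = 1 then fixed = cats('_', fixed);
   datalines;
lab_rslt
laboratory result$
2nd visit
;
run;

proc print data=check_names;
run;
```

In the printed output, is_v7 should be 1 only for lab_rslt, and the space and the $ in the second name should appear as underscores in fixed.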

VALIDVARNAME=UPCASE
When we mention options VALIDVARNAME=UPCASE, we are telling SAS to apply the same rules as V7 and also convert the column name to uppercase, irrespective of the case used in the DBMS column name.
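
A minimal sketch of the effect, using a plain DATA step instead of a DBMS connection. The dataset name demo and the variable names are made up for illustration.

```sas
options validvarname=upcase;

/* The names are typed in mixed case and lowercase on purpose */
data demo;
   Lab_Rslt = 4.2;
   lab_test = 'ALT';
run;

/* SHORT prints only the list of variable names */
proc contents data=demo short;
run;

/* Put the option back to the default used in this post */
options validvarname=v7;
```

The variable list from PROC CONTENTS should show both names in uppercase, even though they were not typed that way.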


And whenever we want to keep the same characters in the SAS dataset that are in the DBMS column name (ex. the equal sign (=), the asterisk (*) or the forward slash (/)), we have to mention options

VALIDVARNAME=ANY
This allows any characters that are in the DBMS column name to be kept in the SAS variable name. Names that contain such characters must then be written as name literals, for example 'laboratory result$'n.
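
Here is a small sketch of how such names are written once VALIDVARNAME=ANY is set. It uses a DATA step instead of a DBMS table, and the dataset name lab_any and the two variable names are hypothetical.

```sas
options validvarname=any;

/* Names with blanks or special characters are written as 'name'n literals */
data lab_any;
   'laboratory result$'n = 4.2;
   'lab/test*'n = 'ALT';
run;

proc print data=lab_any;
   var 'laboratory result$'n 'lab/test*'n;
run;

/* Put the option back to the default used in this post */
options validvarname=v7;
```

The same 'name'n form is needed wherever these variables are referenced later, so it is worth using ANY only when the original column names really have to be preserved.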

To understand the concept better, here is an example:

Example
The following example shows how the Pass-Through Facility works with
VALIDVARNAME=UPCASE.

options validvarname=upcase;
proc sql;
connect to oracle as tables(user=USERID orapw=password path='INSTANCE');
create table lab as
select laboratory_result_ as lab_rslt, laboratory_test_ as lab_test
from connection to tables
(select "laboratory result$", "laboratory test$"
from DBMStable);
disconnect from tables;
quit;

When we check the output, we observe that the DBMS column names have been cleaned up by the V7 rules and converted to uppercase. Ex: "laboratory result$" comes back as LABORATORY_RESULT_ and "laboratory test$" as LABORATORY_TEST_, because the space and the $ are replaced with underscores. The outer SELECT then renames them to the shorter LAB_RSLT and LAB_TEST.


6 comments:

karthik said...

hi
can you let me know how we can include an id variable representing the observation number in proc sql.

what i'm saying is the equivalent step in sql for the following code

data a;
set a;
id=_n_;
run;

sarath said...

Proc sql;
create table a1 as
select monotonic() as ID, *
from a;
quit;

As we are using proc sql here, the output dataset name should be different from the input dataset name.
So instead of keeping 'a' as the output dataset I've used a1. Note that monotonic() is an undocumented function, so check the ID values it returns.

sarath

Unknown said...

sarath, why should we use proc sql for combining datasets? why shouldn't we combine datasets by concatenating, merging and interleaving instead? tell me in which cases we use proc sql and in which cases we use merging, interleaving etc.

sarath said...

Neelima,

We can use PROC SQL for almost all situations (not only merging/interleaving).

Most of what you do with the SAS DATA step can also be achieved with PROC SQL.

PROC SQL is sleek and simple.

There is no situation where we have to use PROC SQL. In fact, many of my co-workers (even though they have at least 5-6 years of experience) always hesitate to use PROC SQL.

Unknown said...

hi, can anyone tell me how we can get the attributes of the first five datasets in a library?

sarath said...

Hi Sudeshna,

Here is the automated macro code to get the attributes of the first five datasets from any library.

libname sdtm 'G:\SASDATA\_DIS\P9002_0109\Validation\data';

data contents(compress=no); *Create a dummy dataset;
run;

%macro cmpare (lib=,n=);
/* Store each member name of the library in DSNAME1, DSNAME2, ... */
data _null_;
set sashelp.vstabvw;
where libname=upcase("&lib") and memname ne " ";
call symputx (cats('dsname',_n_), memname);
run;

%do i=1 %to &n;
/* Write the attributes of one dataset to a WORK dataset of the same name */
proc contents data=&lib..&&dsname&i. out=&&dsname&i. noprint;
run;

/* Stack those attributes onto the running CONTENTS dataset */
data contents;
set &&dsname&i. contents ;
keep memname name label length type format informat;
if memname ne '';
run;
%end;

%mend cmpare;
%cmpare(lib=SDTM,n=5);

SDTM is the library name and n=5 is the number of datasets whose attributes we want to check.

All the best.

Sarath
