Sunday, August 16, 2009

Macro IN Operator

Have you ever come across a situation where you have to write a macro program where a macro variable has more than one value? Writing a macro program in this case involves so many different conditions and to connect each condition you generally use OR operator as below…

%macro test;
%if &dsn=ae or %if &dsn=ds or %if &dsn=co or %if &dsn=cm %then %do; Some---SAS—Statements;
%end;
%test;

You can use a simple IN operator inside the datastep and make the code very simple… like as follows…

data dsn;
set old;
if &dsn in ("ae","ds","co","cm") then do;
Some---SAS—Statements;
end;
run;

Can we use the same IN operator inside the macro…

If you are using any SAS version prior to 9.2 … the answer is.. You cannot use IN operator inside the macro.

But…..

In SAS 9.1.3 version:
You can use character # (new binary comparison operator) as an alternate operator to mnemonic IN operator.

Using # operator in the above code:

%macro test;
%if &dsn # ae ds co cm %then %do;
Some---SAS—Statements;
%end;
%test;

Even if you use it, SAS will give you an ERROR message saying the Operator is not recognized.

In SAS 9.2 version:
You can directly use the IN operator inside your macro code as like in simple datastep in SAS 9.2.

Using IN operator in the above code:

%macro test;
%if &dsn in ae ds co cm %then %do;
Some---SAS—Statements;
%end;
%test;

Note:
No need of % sign in front of IN operator.
In SAS 9.2 both IN and # work if you use the system option minoperator inside your macro call.

In SAS Version 9.2:

%macro test/minoperator;
%if &dsn # ae ds co cm %then %do;
Some---SAS—Statements;
%end;
%test;

%macro test;
%if &dsn in ae ds co cm %then %do;
Some---SAS—Statements;
%end;
%test;


Both works fine…..

MINOPERATOR option tells SAS to recognize the word 'IN' or special symbol # by the SAS macro facility as an infix operator when evaluating logical or integer expressions.


Here is another way of writing the macro code with delimiters.

Use MINDELIMITER option to change the default delimiter from space to any other, in this case it is comma (,).

options mindelimiter;

%macro test/mindelimiter=',';
%if &dsn in ae,ds,co,cm %then %do;
Some---SAS—Statements;
%end;
%test;

Thursday, August 13, 2009

Macro Debugging Options:MPRINT, MLOGIC, SYMBOLGEN, MACROGEN, MFILE

Macro Debugging Options:

Debugging a macro code isn’t easy process. It is difficult to identify the problem in the SAS code just by seeing the ERROR message in the LOG file. It is lot easier to debug a macro if we use the following SAS options.

There are four system options that are helpful in debugging SAS Macros:

SYMBOLGEN, MPRINT, MLOGIC, and MFILE.
Each option adds different information to the SAS LOG File.

Like any other SYSTEM OPTIONS, you turn these on using the OPTIONS statement:

Options macrogen symbolgen mlogic mprint mfile;
You can turn them off by specifying the following OPTIONS statement:

Options nomacrogen NoSymbolgen nomlogic nomprint nomfile;


Both statements are needed in side SAS. Once you set any option and it will remain in effect throught the SAS session. So if you want to debug the macro use first OPTIONs Statemnt else use 2nd one.

Let’s look at each option, and the output they produce in more detail…


MPRINT:As we know that when we execute a macro code SAS doesn’t display it in the LOG file, but using the MPRINT option displays all the SAS statements of the resolved macro code.

MPRINT option prints one statement per line along with resolved macro code.

LOG FILE:
31 %macro concat;
32 data all;
33 set
34 %do i = 1 %to 4;
35 dsn&i
36 %end;
37 ;
38 run;
39 %mend;
40 %concat;
MPRINT(CONCAT): data all;
MPRINT(CONCAT): set dsn1 dsn2 dsn3 dsn4 ;


NOTE: There were 6 observations read from the data set WORK.DSN1.
NOTE: There were 6 observations read from the data set WORK.DSN2.
NOTE: There were 6 observations read from the data set WORK.DSN3.
NOTE: There were 6 observations read from the data set WORK.DSN4.
NOTE: The data set WORK.ALL has 24 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.06 seconds

MPRINT(CONCAT): run;

Note: Even though we wrote the macro code, the MPRINT option prints the resolved macro code into the LOG file as seen above.


MLOGIC:This option is very helpful when we deal with nested macros (Macro inside another macro). Often we use %DO loops and or %IF-%THEN-%ELSE statements inside the macro code and LOGIC option will display how the macro variable resolved each time in the LOG file as TRUE or FALSE .

To be more specific, MLOGIC option identifies and displays the macro logic. It even follows the macro execution pattern.

LOG FILE:
31 %macro concat;
32 data all;
33 set
34 %do i = 1 %to 4;
35 dsn&i
36 %end;
37 ; /* this additional ';' is necessary, the first ';' is for
38 the "%end", while the second ';' is for "set " */
39 run;
40 %mend;
41
42 %concat;
MLOGIC(CONCAT): Beginning execution.
MLOGIC(CONCAT): %DO loop beginning; index variable I; start value is 1; stop value is 4; by value is
1.
MLOGIC(CONCAT): %DO loop index variable I is now 2; loop will iterate again.
MLOGIC(CONCAT): %DO loop index variable I is now 3; loop will iterate again.
MLOGIC(CONCAT): %DO loop index variable I is now 4; loop will iterate again.
MLOGIC(CONCAT): %DO loop index variable I is now 5; loop will not iterate again.
NOTE: There were 6 observations read from the data set WORK.DSN1.
NOTE: There were 6 observations read from the data set WORK.DSN2.
NOTE: There were 6 observations read from the data set WORK.DSN3.
NOTE: There were 6 observations read from the data set WORK.DSN4.
NOTE: The data set WORK.ALL has 24 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.06 seconds

MLOGIC(CONCAT): Ending execution.

SYMBOLGEN:Often we use multiple ampersands (ex: &&dsn.&i) and SYMBOLGEN option prints the message in the LOG file about how the macro variable is resolved.

To be more specific, it prints message in the LOG whenever a macro variable get resolved.

LOG FILE:
31 %macro concat;
32 data all;
33 set
34 %do i = 1 %to 4;
35 dsn&i
36 %end;
37 ; /* this additional ';' is necessary, the first ';' is for
38 the "%end", while the second ';' is for "set " */
39 run;
40 %mend;
41
42 %concat;
SYMBOLGEN: Macro variable I resolves to 1
SYMBOLGEN: Macro variable I resolves to 2
SYMBOLGEN: Macro variable I resolves to 3
SYMBOLGEN: Macro variable I resolves to 4


NOTE: There were 6 observations read from the data set WORK.DSN1.
NOTE: There were 6 observations read from the data set WORK.DSN2.
NOTE: There were 6 observations read from the data set WORK.DSN3.
NOTE: There were 6 observations read from the data set WORK.DSN4.
NOTE: The data set WORK.ALL has 24 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.06 seconds


MFILE:
MFILE is useful when we want to create a newfile with the resolved macro code.
To create a newfile you have to specify FILENAME statement as follows:

Filename mprint ‘C:\Users\Sarath Annapareddy\Desktop\macroresol.sas’;
Options mprint mfile;


Note: Whenever the macro code executes, the resultant resolved macro code will be written to the macroresol.sas file.

LOG FILE:

31 %macro concat;
32 data all;
33 set
34 %do i = 1 %to 4;
35 dsn&i
36 %end;
37 ; /* this additional ';' is necessary, the first ';' is for
38 the "%end", while the second ';' is for "set " */
39 run;
40 %mend;
41
42 %concat;
NOTE: The macro generated output from MPRINT will also be written to external file C:\Users\Sarath
Annapareddy\Desktop\macroresol.sas while OPTIONS MPRINT and MFILE are set.

NOTE: There were 6 observations read from the data set WORK.DSN1.
NOTE: There were 6 observations read from the data set WORK.DSN2.
NOTE: There were 6 observations read from the data set WORK.DSN3.
NOTE: There were 6 observations read from the data set WORK.DSN4.
NOTE: The data set WORK.ALL has 24 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.06 seconds

macroresol.sas file.

data all;
set dsn1 dsn2 dsn3 dsn4 ;
run;


MACROGEN:

Option MACROGEN will turn on the macro expansion and is necessary when you use macros.


MPRINTNEST and MPRINTLOGIC:These options are available with SAS v9.0. These options can be useful when you use NESTED macros. You will see more information in the LOG file than what you usually see with MPRINT and MLOGIC options combine.

Both these options require the use of MPRINT and MLOGIC respectively.

Example code used here:

options macrogen mlogic mprint symbolgen mfile;
filename mprint 'C:\Users\Sarath Annapareddy\Desktop\macroresol.sas';

data dsn1;
input a @@;
cards;
1 2 3 5 6 7
;
run;

data dsn2;
input a @@;
cards;
4 5 6 9 8 6
;
run;

data dsn3;
input a @@ ;
cards;
21 22 23 24 25 26
;
run;

data dsn4;
input a @@;
cards;
10 11 12 13 14 15
;
run;

*Concatenating all the 4 datasets using a macro with the %DO LOOP;%macro concat;
data all;
set
%do i = 1 %to 4;
dsn&i
%end;
;
run;
%mend;

%concat;


  • LOG FILE:
Debugging Techniques:

Here are few techniques to debug a macro:

Check if %macro-%mend, %DO-%END and %IF-%THEN-%ELSE are used correctly in the code.
Check whether single or double quotes used for the macro variables.
Check for balancing of quotes.
Check whether you have used %LOCAL and or %GLOBAL in appropriate places.
Check the macro variable resolution using the %PUT statement whenever required.
Use SAS debugging options MACROGEN, MPRINT, MLOGIC and SYMBOLGEN
to make sure the macro code is executed as expected.


31 %macro concat;
32 data all;
33 set
34 %do i = 1 %to 4;
35 dsn&i
36 %end;
37 ; /* this additional ';' is necessary, the first ';' is for
38 the "%end", while the second ';' is for "set " */
39 run;
40 %mend;
41
42 %concat;

MLOGIC(CONCAT): Beginning execution.
MPRINT(CONCAT): data all;
NOTE: The macro generated output from MPRINT will also be written to external file C:\Users\Sarath
Annapareddy\Desktop\macoresol.sas while OPTIONS MPRINT and MFILE are set.
MLOGIC(CONCAT): %DO loop beginning; index variable I; start value is 1; stop value is 4; by value is
1.
SYMBOLGEN: Macro variable I resolves to 1
MLOGIC(CONCAT): %DO loop index variable I is now 2; loop will iterate again.
SYMBOLGEN: Macro variable I resolves to 2
MLOGIC(CONCAT): %DO loop index variable I is now 3; loop will iterate again.
SYMBOLGEN: Macro variable I resolves to 3
MLOGIC(CONCAT): %DO loop index variable I is now 4; loop will iterate again.
SYMBOLGEN: Macro variable I resolves to 4
MLOGIC(CONCAT): %DO loop index variable I is now 5; loop will not iterate again.
MPRINT(CONCAT): set dsn1 dsn2 dsn3 dsn4 ;
MPRINT(CONCAT): run;

NOTE: There were 6 observations read from the data set WORK.DSN1.
NOTE: There were 6 observations read from the data set WORK.DSN2.
NOTE: There were 6 observations read from the data set WORK.DSN3.
NOTE: There were 6 observations read from the data set WORK.DSN4.
NOTE: The data set WORK.ALL has 24 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.06 seconds


MLOGIC(CONCAT): Ending execution.

Saturday, August 8, 2009

CALL SYMPUT vs CALL SYMPUTX

Call Symput:

Use CALL SYMPUT is you need to assign a data step value to a macro variable.

Syntax: Call Symput (“Macro variable”, character value)

The first argument to the Symput routine is the name of the macro variable to be assigned to the value from the second argument.

The second argument is the character value that will be assigned to the macro variable. The second argument need to be always a character value or if a numeric value is to be used it should convert first into character variable before assigning it to macro variable. It may lead to problems, if you don’t do the conversion from numeric to character. In this case, SAS automatically converts numeric value of the variable to character value before assigning it to macro variable and prints a message in the LOG saying that conversion happened.

See the example:

data _null_;
count=1978;
call symput('count',count);
run;
%put &count;

19 data _null_;
20 count=1978;
21 call symput('count',count);
22 run;

NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
21:21
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds


23 %put &count;
1978

Even though macro variable count is resolved to the value i.e 1978, SAS printed a note saying that NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column). 21:21

To avoid that… you should do the conversion from numeric to character before assigning it to macro variable. Here is the syntax of that:

data _null_;
count=1978;
call symput('count',strip(put(count,8.)));
run;
%put &count;

29 data _null_;
30 count=1978;
31 call symput('count',left(put(count,8.)));
32 run;

NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds


33 %put &count;
1978

Note:

1) Even though we have created macro variable using the CALL SYMPUT but the same macro variable cannot be used in the same data step. The reason behind this is, macro code is compiled and executed before the data step code compiles and executes. So macro variable created by the CALL SYMPUT cannot available for the data step because macro variable compilation time occurs after or in the middle of the execution of the Data Step code.
(If a case arises where you have to access the same macro variable inside the data step, you can certainly do... by using two diff.. macro functions called RESOLVE or SYMGET)

2) SAS always aligns numeric values right and variable values get truncated as a part of this and to avoid that use the strip function to remove all the leading spaces as like in the above example.
3) If CALL SYMPUT is used outside the macro ( i.e open code) it creats global macro variable whereas it creats a local macro variable when it is used inside a macro.

CALL SYMPUTX:

SAS introduced CALL SYMPUTX in version 9 to address the pitfalls of CALL SYMPUT.

Advantages of CALL SYMPUTX over CALL SYMPUT include:

1) SYMPUTX automatically convert the numeric variables to character variables before assigning it to macro variable. (No need of manual conversion using PUT statement as given in the above example)

2) Call Symputx strips leading and trailing blanks. So no need of using STRIP or LEFT function to remove the leading spaces

Syntax: Call Symputx (“Macro Variable”, Character Value, Symbol Table)

First and second arguments are same as in CALL SYMPUT. The third argument (Symbol table) is optional and the valid value of it is G, L, and F. If we put G then the macro variable will be stored in the global symbol table, else if we specify L SAS will store the macro in the local symbol table, else if we don’t specify or specify F SAS follows the same rules as like for Call Symput.

Example:

data _null_;
count=1978;
call symputx('count',put(count,8.),’G’);
run;
%put &count;

29 data _null_;
30 count=1978;
31 call symputx('count',put(count,8.));
32 run;


NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds


33 %put &count;
1978

Note: If you use CALL SYMPUT instead of CALL SYMPUTX, the program executes in an identical manner, but a note is written to the SAS Log about conversion of numeric values to character values.


Here is the simple way to understand the diff between CALL SYMPUT AND CALL SYMPUTX:
Retrieved from: Using_the_SAS_V9_CALL_SYMPUTX_Routine from SAScommunity.org page:
Submitted by Michael A. Raithel.

The SAS V9 CALL SYMPUTX routine can save you keystrokes and lead to leaner, cleaner SAS programs.
Instead of using:
call symput('MACROVAR',trim(left(charvar)));

to load a SAS macro variable with a character string that might contain blanks, you could use SYMPUTX instead:
call symputX('MACROVAR',charvar);

Wednesday, July 29, 2009

Dictionary Tables and SASHELP Views:

Most of the SAS programmers think that the Metadata concepts (Dictionary tables and SASHELP views) are advanced SAS topics, in fact it is not.

I tried here to explain the concept precise and clear……

Here, I will cover the following topics……

Introduction to SAS metadata
How to see what’s in the tables
How to use the tables, using several real-world examples
Advantages

What are Dictionary Tables4?

Dictionary tables are created and automatically maintained by SAS system at run time to store information related to SAS data libraries, SAS system options, SAS catalogs, and external files associated with the currently running SAS session. CATALOGS, COLUMNS, EXTFILES, INDEXES, MEMBERS, MACRO, OPTIONS, TABLES, TITLES, AND VIEW are these objects which provide detailed information about the SAS files.

DICTIONARY tables are special read-only PROC SQL tables. They retrieve information about all the SAS data libraries, SAS data sets, SAS system options, and external files that are associated with the current SAS session.

PROC SQL automatically assigns the DICTIONARY libref. To get information from DICTIONARY tables, specify DICTIONARY.table-name in the FROM clause.

DICTIONARY.table-name is valid in PROC SQL only. However, SAS provides PROC SQL views, based on the DICTIONARY tables that can be used in other SAS procedures and in the DATA step. These views are stored in the SASHELP library and are commonly called "SASHELP views."

The following table lists some of the DICTIONARY tables and the names of their corresponding views3. Ref..... SAS help and Documentation for complete details:


Where to find/ how do we access Dictionary Tables?

Start a session. Open an interactive SAS session and use PRINT to display SASHELP.-VTABLE.

1. Start a SAS session
2. Using SAS explorer, navigate to LIBRARIES SASHELP VTABLE
3. Double-click to open it. Examine contents briefly
4. Close the VIEWTABLE window (don’t terminate the SAS session)

Note: Available only in Version 6.07 or later SAS versions

Dictionary Tables
􀂄 Always available
􀂄 Maintained by SAS
􀂄 Read only
􀂄 Have information on virtually every aspect of your SAS session

Two ways to access them
1. Dictionary tables
2. SASHELP views

What Tables are Available:
􀂄 V8 11 DICTIONARY tables • 17 SASHELP views

􀂄 V9.1.3• 22 DICTIONARY tables• 30 SASHELP views

COLUMNS/VCOLUMN:

Provides information about all variables in all data sets in all specified libraries. It gives us a lot of flexibility to investigate data sets.

Example 1:

Reading all variable names of one data set into one macro variable i.e &VAR

proc sql noprint;
Select name into: VAR separated by ","
from columns where libname="WORK” & memname="DEMO";
quit;


All variable names in data set DEMO are read into string &VAR &VAR = usubjid, subjid, age, gender, race, ..... , ...

TABLES/VTABLE:

Being Parallel with MEMBERS, TABLES provides more detailed information about SAS data files.

The following code is one example of outputting a summary table of SAS data files under certain specified directory.


TITLES/VTITLE:
TITLES: Contains messages about titles and footnotes which are used by current program.
TITLES: Provides a useful source of managing titles and footnotes in reports.

The TITLES table has an entry for each title and footnote line currently in effect. See the example below on how to use this table to save the current titles, and then restore them after running a report.

create table DICTIONARY.TITLES
(type char(1) label='Title Location',
number num label='Title Number',
text char(256) label='Title Text');



EXTFILES/VEXTFL:
The EXTFILES table has an entry for each external file registered (filerefs) in the session.

create table DICTIONARY.EXTFILES
(fileref char(8) label='Fileref',
xpath char(1024) label='Path Name',
xengine char(8) label='Engine Name');


To check What Tables are Available:

proc sql;
create table tables as
select * from dictionary.dictionaries;
quit;



What Views are Available:

Execute the following program to see what views are available:

proc sql;
create table views as
select distinct memname
from sashelp.vsview
where libname='SASHELP'and memname like 'V%';
quit;


Here are the SASHELP views which are available in SAS version 9.1.3.

MEMNAME
VALLOPT
VCATALG
VCFORMAT
VCHKCON
VCNCOLU
VCNTABU
VCOLUMN
VDCTNRY
VENGINE
VEXTFL
VFORMAT
VGOPT
VINDEX
VLIBNAM
VMACRO
VMEMBER
VOPTION
VREFCON
VREMEMB
VSACCES
VSCATLG
VSLIB
VSTABLE
VSTABVW
VSTYLE
VSVIEW
VTABCON
VTITLE
VVIEW


What is the Table Structure?

proc sql;
describe table DICTIONARY.OPTIONS;
quit;


NOTE: SQL table DICTIONARY.OPTIONS was created like:

create table DICTIONARY.OPTIONS
(optname char(32) label='Option Name',
opttype char(8) label='Option type',
setting char(1024) label='Option Setting',
optdesc char(160) label='Option Description',
level char(8) label='Option Location',
group char(32) label='Option Group');


What can we do with these dictionary tables and or SASHELP views?

Check Conflicting Variable Attributes:
(Comparing the test and production directories)

Identify Current Directory:
(Retrieving the executing filename and path of program programmatically)

Use Variables Meeting Criteria or and Identify Features Incompatible with Transport Format:

/*Example #1: Check if all variables in the analysis data sets did have names and*/
/*labels, and the lengths of these should not exceed 8 and 40 characters, respectively.*/

/*Example #2: Check if all analysis data sets did have a label, and the length of any label cannot exceed 40 characters. Also, the length of data set names should not*/

/*Example #3: Check if the length of any character value of variable exceeds 200 characters*/
Refer:

http://www.studysas.blogspot.com/2009/04/proc-comparesas-program-used-to-compare.html
http://www.studysas.blogspot.com/2009/04/maxvarlenmacro-check-length-of-all.html


/*Example #4: Check if variables with the same name across multiple data sets have the*/
/*same attributes, in order to ensure consistency*/

*Select variable names that are in multiple data sets;
ODS RTF FILE="columns.rtf";
TITLE "List of Columns in Multiple Data Sets";
PROC SQL;
select name length=10, memname length=28,
label length=50, type length=8,
length length=8, format length=8,
informat length=8
from dictionary.columns
where libname="PRODN" group by name
having count(*) > 1
order by name, memname;
QUIT;
ODS RTF CLOSE;

/*If we need to see variables with differences only, use the following:*/

OPTIONS orientation=Landscape;
ODS RTF FILE="columnsDiff.rtf";
TITLE "List of Columns in Multiple Data Sets with Discrepancy";
PROC SQL;
select name length=10, memname length=28, label length=50,
type length=8, length length=8, format length=8,
informat length=8
from dictionary.columns
where libname="PRODN"
group by name
having count(distinct label) > 1 or
count(distinct type) > 1 or
count(distinct length) > 1 or
count(distinct format) >1 or
count(distinct informat) > 1
order by name, memname;
QUIT;
ODS RTF CLOSE;



Put all the variables in Alphabetical Order:
/*Example #5: Put all the variables into Alphabetical order;*/

proc sql noprint;
select name into: new
separated by ','
from dictionary.columns where libname='WORK' and memname='MH'
order by name;
create table mh1 as select &new.
from mh;
run;


Note: Memname is the name of the dataset(MH). The above program reorder the variables alphabetically. We are taking help from dictionary.columns using proc sql to know and create a macro variable new to each variable in the dataset. Upon creating the macro variable, use the 2nd proc sql step…. to reorder the variables.

Count Observations/Variables in a Data Set:

Renaming all variables in a SAS dataset:

http://studysas.blogspot.com/2009/07/renaming-all-variables-in-sas-data-set.html


Resetting Titles:

data test;
a=1;
b=1;
c=123;
run;
ods listing close;
ods trace on;
ods escapechar='*';

ods rtf file="Title Check.rtf" bodytitle style=rtfout;

title1 j=r 'page 1 of 1';
title2 'Title check';
footnote1 program.sas;
footnote2 July 22nd 2009;

proc print data=test;
run;

ods rtf close;
ods trace off;
ods listing;

DATA __oldtitles;
set sashelp.vtitle;
RUN;

Proc Contents vs Dictionary Tables/SASHELP views2:

Getting the information by extracting directly from SASHELP /DICTIONARY TABLES can be very helpful. These allow us to have more power with controlling the output than with proc contents. The standard output of PROC CONTENTS provides information from one dataset or all the datasets in a single library. Some of the information provided by PROC CONTENTS, like location of dataset, position of the columns in a dataset, may not be necessary for the final user of the information, and it is not easy to eliminate. With the use of PROC SQL, selected information about one or more dataset can be retrieved from views and create datasets.

Advantages1:
Multiple libraries can be analyzed in a single SQL query. Each PROC CONTENTS step is limited to a single library reference. In addition, a second step is not required to reduce variable-level information to the table level.

This approach may be more efficient for libraries containing many variables. PROC CONTENTS may take much longer because of the variable-level nature of its output data set.

All SAS files - not just SAS data sets - can be analyzed through the DICTIONARY.MEMBERS table by selecting the appropriate values of the Memtype column. PROC CONTENTS only processes SAS data sets when creating an output data set.

Disadvantages:
Knowledge of SQL syntax is required. We can access SASHELP views using a simple datastep but it is much slower in generating the required information.

References:
1) Don't Be a SAS® Dinosaur: Modernize Your SAS Programs by Warren Repole : SAS Library Information: SAS Files

2) A data dictionary, a more complete alternative to Proc contents: Pharmasug 2002 proceedings (By Angelina Cecilia Casas, M.S. PPD Development, Research Triangle Park, NC):

3) SAS Help and Documentation

4) Where, When, and How to Use Dictionary Tables in SAS: NESUG 1999 Proceedings (By Jiang Jin, EDP Contract Services)

5) Simple Ways to Use PROC SQL and SAS DICTIONARY TABLES to Verify Data Structure of the Electronic Submission Data Sets: Pharmasug 2006 tutorials (By Christine Teng and Wenjie Wang, Merck Research Labs, Merck & Co., Inc., Rahway, NJ)

Wednesday, July 22, 2009

Even you can Use HASH and DOUBLE DASH: It’s that Simple……

In order to understand HASH and DOUBLE HASH concept in SAS you need to know about two different ranges of variables:

1) Numbered list:

When a set of variables have the same prefix, and the rest of the name is a consecutive set of numbers, we can use a single dash (-) to refer to an entire range. Some exs.. are….

VAR1 VAR2 VAR3 VAR4 VAR5

Shortcut list you can use to access all 5 variables is VAR1-VAR5

COL1 COL2 COL3 COL4 COL5 COL6 COL7

Shortcut list you can use to access all 7 variables is COL1-COL7


2) Name range list:
When you refer to a list of variables in the order in which they were defined in the SAS dataset, you can use a double dash (--) to refer to the entire range of variables in between them.

Ex: VAR1 VAR2 VAR3 COUNT VAR4 COL1 VAR5

Shortcut list you can use to access all 7 variables (including COL1, which has different prefix name than others) is VAR1- -VAR5

The general rule you should always remember for dash and double dash is:

Single dash is useful to access the consequently numbered variables.

Double dash is useful to access variables based on their order/position regardless of variable name.

Execute following program in SAS to understand the concept better.


data test;
var1=1;
var2=2;
var3=3;
COL=1;
var4=4;
var5=5;
run;

ods listing close;
ods rtf file="Single DASH output.rtf" style=rtfout;
title1 'Using Single DASH';

proc print data=test;
var var1-var5;
run;

ods rtf close;
ods trace off;
ods listing;


ods listing close;
ods rtf file="Double DASH output.rtf" style=rtfout;
title1 'Using DOUBLE DASH';

proc print data=test;
var var1--var5;
run;

ods rtf close;
ods trace off;
ods listing;


SINGLE DASH OUTPUT:

Obs var1 var2 var3 var4 var5


1 1 2 3 4 5

DOUBLE DASH OUTPUT:

Obs var1 var2 var3 COL var4 var5
1 1 2 3 1 4 5

Renaming All Variables in a SAS Data Set Using the SASHELP VIEWS

*Create a temporary dataset... DSN;
data dsn;
a=1;
b=2;
c=3;
d=4;
e=5;
f=6;
run;


%macro test(lib,dsn);

*/1)*/ data _null_;
set sashelp.vtable(where=(libname="&LIB" and memname="&DSN"));
call symput('nvars',nvar);
run;

*/2)*/ data dsn;
set sashelp.vcolumn(where=(libname="&LIB" and memname="&DSN"));
call symput(cats("var",_n_),name);
run;


*/3)*/ proc datasets library=&LIB;
modify &DSN;
rename
%do i = 1 %to &nvars;
&&var&i=Rename_&&var&i.
%end;
;
quit;
run;
%mend;

%test(WORK,DSN);

After submitting the above program... the output looks like this....

Output:
Rename_a Rename_b Rename_c Rename_d Rename_e Rename_f
1 2 3 4 5 6

Here is a way I know of.. to rename all the variables in the dataset;

It can be done using the SASHELP views as follows:

1) The 1st step of the program determines the total number of variables inside the dataset with the help of SASHELP.Vtable, Data _null_ step used to do the same …. A macro variable NVAR (Number of Variables) will be created after this step which can be accessed in the following steps.

2) What the 2nd step does is….assigns a unique IDNUM to all the variables of the dataset with the help of SASHELP.Vcolumn.

3) What the 3rd step does is … it uses Proc Datasets with the MODIFY statement to rename all the variables in the dataset. The DO LOOP is been used to resolve the 6 macro variables in a macro of a temporary dataset.

Thursday, July 9, 2009

Subscript or Superscript in the footers/titles of RTF output

Wondering how to create a subscripts and superscipt in title or footnotes of the rtf output... here is a way to do it....

Execute the following program to get an Idea about how to keep SUBSCRIPTS and SUPERSCRIPTS in footnotes and titles of rtf output..

************************************************************;
data test;
length
secnam $15;
input
sortord secnam $ pvalue;
cards;
1 demog 0.8812
2 ae 0.7112
3 disposition 0.8112
4 medicalhistory 0.9112
;

run;

ods listing close;
ods rtf file="Test output.rtf" style=rtfout;
ods escapechar='\';

proc report data = test missing split="$" spacing=0 headline nowd;
column sortord secnam pvalue;
define sortord / order noprint;
define secnam / order flow "Demographics$Variable\{super a}";
define pvalue / display flow "ANOVA$P-Value\{sub p}";
run;


ods rtf close;
ods trace off;
ods listing;
*************************************************************;

Variable\{super a}";
*Adds a superscript a to Demographics variable;
ANOVA$P-Value\{sub p}";*Adds a subscript p to ANOVA P-Value variable;

Make sure to use the same kind of brackets { }I have used in the sample code to get the superscripts and subscripts to get printed in the rtf output.

ESCAPECHAR need not be '\' as it is mentioned in the sample code... It could be '*' or '^' or '#' anything we can think of...

Snapshot of output: