Discover More Tips and Techniques on This Blog

Showing posts with label Label. Show all posts
Showing posts with label Label. Show all posts

PROC TRANSPOSE: How to Convert Variables(columns) into Observations(ROWS) and Observations(ROWS) into Variables(Columns)

During my early days as a SAS programmer, I used to get confused which statement in PROC TRANSPOSE used for what and why?

PROC TRANSPOSE syntax looks like a straightforward thing, but we need to look at important details it offers and without that we may get unexpected results.

Proc Transpose Options:
Proc Tranpose offers several options like OUT=, LABEL=, NAME=, and PREFIX=.
Each option is distinct from the others. "OUT=" option assigns an output dataset, "LABEL=" option assigns a nemae to the variable which has the label of the transposed variable. If we don’t use "LABEL=" option in the PROC TRANSPOSE syntax, the defalut “_LABEL_” will be assigned to the variable that is being transposed.

"NAME= " option works same as the "LABEL=" option, as if we use NAME=option in the TRANSPOSE syntax, which will assign the name to the variable that is being tranposed.

There is another option that we can use in the TRANSPOSE syntax that is "PREFIX=", which assigns the prefix for the transposed variables.

We have two different type of TRANSPOSE: “UP” and “DOWN”:

“UP” transpose change rows of a dataset to columns, whereas the “Down” transpose change columns to rows. We either use "UP" or "DOWN" tranpose depending upon the requirement.

Whenver we use PROC TRANSPOSE we need to ask some questions ourselves, since visualizing the PROC Transpose syntax is not that easy:

To develop the PROC TRANSPOSE syntax we need to know the answers to the above questions,

1) Which variable/variables need to get transposed?
A) Whatever the variable we mention in the ID statement.
2) What are the variables that need to stay same as in the input dataset?
A) Variables that are included in the BY statement.
3) Which variables values need to become the values for the transposed variable.
A) Variables that are included in the VAR statement.


Here I am taking a simple example:

I have a dataset called X {has 3 variables: QTR (char) mob (num) ncl (num)}

Data x;
input qtr$ mob ncl;

cards;
2006q1 0 4
2006q1 1 5
2006q1 2 4
2006q1 3 6
2006q2 0 7
2006q2 1 2
2006q2 2 8
2006q2 3 7
2006q3 0 2
2006q3 1 4
2006q3 2 8
2006q3 3 8
;

 run;


To get the required output use the following syntax: (Proc sort+Proc Tranpsoe+Arrays)

*Sorting needs to be done before we use PROC TRANSPOSE;
proc sort data=x;
by qtr mob;

run;
 

*Transposing the variable MOB using PROC TRANSPOSE; 
proc transpose data=x out=new prefix=mob;
var ncl;
by qtr mob;

id mob;
run;


*Array was used to convert all missing values to zero;
 
data new; 
set new;
array zero{4} mob0-mob3;
do i=1 to 4;
if zero(i)=. then zero(i)=0;
end;
drop i mob _name_;

run;


Output:





Example 2:

data grades;

input patid name $ class $ grade;
cards;
10 Alice E1 78
10 Alice E2 82
10 Alice E3 86
10 Alice E4 69
10 Alice P1 97
10 Alice F1 160
11 Barabara E1 88
11 Barabara E2 72
11 Barabara E3 86
11 Barabara E4 99

11 Barabara P1 100
11 Barabara F1 170
12 Jane E1 98
12 Jane E2 92
12 Jane E3 92
12 Jane E4 99
12 Jane P1 99
12 Jane F1 185
;
run;


*Using datastep;


data grades2;
set grades;
by patid;
if class='E1' then E1=grade;
else if class='E2' then E2=grade;
else if class='E3' then E3=grade;
else if class='E4' then E4=grade;
else if class='P1' then P1=grade;
else if class='F1' then F1=grade;
if last.patid then output;
retain E: P: F:;
drop class grade;
run;




*Using Arrays;

data grades1;
array t(*) e1 e2 e3 e4 p1 f1;
do i=1 to 6;
set grades;
t(i)=grade;
end;
drop i class grade;
run;




*Using Proc transpose;


proc transpose data=grades out=grades3(drop=_:);
by patid name;
var grade;
id class;
run;




*REVERSE PROCESS; Getting back the Original dataset;


*Using Arrays;

data grade;
set grades1;
array t(*) e1 e2 e3 e4 p1 f1;
do i=1 to 6;
grade=t(i);
output;
end;
drop e1 e2 e3 e4 p1 f1 i;
run;






*Using Proc transpose;


proc transpose data=grades1 out=grade(rename=(_name_=class));
by patid name;
var e1 e2 e3 e4 p1 f1;
run;






*Transposing multiple variables using ARRAYS;


*Example:; Arrays are very effective when you want to transpose multiple variables at a time.;


data test;
length petest $40 peorres $200;
set test(drop=petest);
array tra {*} peorres01-peorres12;
array tr {*} petest01-petest12;
array t {*} peclnsig01-peclnsig12;
do i=1 to 12;
peorres=tra(i);
petest=tr(i);
peclnsig=t(i);
output;
end;
drop peclnsig01-peclnsig12 peorres01-peorres12 petest01-petest12 i;
run;

****You need to use multiple proc transpose steps if you want to do the above process using Proc Transpose .

Displaying the Graphs (Bar charts) using PROC GCHART in SAS

Displaying the Graphs (Bar charts) using PROC GCHART in SAS :

Just a day ago, I've received a question on Graphs from one of the member of  my orkut community, which prompted me to create following example.

Below are 5 different types of Graphs were produced (including 3d graphs) using Proc Gchart.
Sample data set used for the examples:

data x;input quarter $ ncl ram;
cards;
q1 20000 2000
q1 30000 3000
q1 45000 2000
q2 23000 2003
q2 45000 4500
q3 46000 5600
q3 89000 600
q3 67000 4500
q4 45890 890
q4 46000 9800

;
run;

*Example1: Simple barchart;
/* Add a title */

title 'Sales Report';

/* Produce the bar chart */
proc gchart data=x;vbar quarter / sumvar=ncl nozero ;
format ncl dollar8.;
run;
quit;
Snapshot of the output:





*Example2: Producing 3D graphs with cylinder shape using Proc Gchart;

pattern c=blue;
axis1 minor=none label=('NCL Sum');


title 'Bar Chart Shape';

/* use the SHAPE= option to specify a cylinder */

proc gchart data=x;
vbar3d quarter / sumvar=ncl discrete raxis=axis1 shape=cylinder
cframe=yellow autoref width=5;
format ncl dollar8.;
run;
quit;


Snapshot of the output:

axis1 minor=none label=('NCL Sum') offset=(0,0);

*Example3: Producing 3D graphs with reference lines using Proc Gchart;
/* use the FRONTFREF option to place reference lines in front of the bars */

proc gchart data=x;
vbar3d quarter / sumvar=ncl discrete raxis=axis1
coutline=black cframe=yellow width=
5vref=30 inside=sum frontref;
format ncl dollar8.;
run;
quit;


Snapshot of the output:

*Example4: Producing graphs after annotating the value above each midpoint bar using Proc Gchart;
*Create annotate dataset to assign the value on each midpoint bar;


data annotate;
length function color style text $ 8;
retain function 'label' color 'black' when 'a' style 'swiss' xsys ysys '2' position '2' size 3 hsys '3' ;
set x;
by quarter;

/*calculating the cumulative totals for each quarter;*/
if first.quarter then final=ncl;
else if ^first.quarter then final+ncl;
midpoint=quarter;
if last.quarter then text=left(put(final,dollar8.));
y=ncl;
run;

/* Define colour patterns for bars */pattern1 c=blue;
pattern2 c=green;
pattern3 c=red;
pattern4 c=yellow;

/* Adding the title */
Title 'Sales per Quarter Report';
/* Produce the bar chart */proc gchart data=x;
vbar quarter /discrete sumvar=ncl
/* subgroup – Width – width of the bar...... Space – space between two bars
Annotate – write the total value on the top of each bar.*/
subgroup=quarter
width=7space=2.0annotate=anno
;
format ncl dollar8.;
run;
quit;


Snap shot of the output:



5)Here is another way to create a graph..
I've added the titles and footnotes.... to this graph..


goptions reset=global gunit=pct border cback=white


colors=(black red green blue) ftitle=swissb
ftext=swiss htitle=6 htext=4
offshadow=(1.5,1.5);



title1 'NCL vs Quarter';
footnote1 h=3 j=r 'Sarath';


axis1 label=none
origin=(24,);
axis2 label=none
minor=(number=1)
offset=(,0);
legend1 label=none
shape=bar(3,3)
cborder=black
cblock=gray
origin=(24,);



pattern1 color=lipk;
pattern2 color=cyan;
pattern3 color=lime;
pattern4 color=red;
pattern5 color=blue;


proc gchart data=x;
format ncl dollar8.;
vbar3d quarter / sumvar=ncl
subgroup=ncl
outside=sum
width=9
space=7
maxis=axis1
raxis=axis2
coutline=black
legend=legend1;
run;
quit;





Snapshot of the output:









SAS GRaph


Disclosure:

In the spirit of transparency and innovation, I want to share that some of the content on this blog is generated with the assistance of ChatGPT, an AI language model developed by OpenAI. While I use this tool to help brainstorm ideas and draft content, every post is carefully reviewed, edited, and personalized by me to ensure it aligns with my voice, values, and the needs of my readers. My goal is to provide you with accurate, valuable, and engaging content, and I believe that using AI as a creative aid helps achieve that. If you have any questions or feedback about this approach, feel free to reach out. Your trust and satisfaction are my top priorities.