Saturday, May 19, 2012

My 5 Important reasons to use Proc SQL

• Proc SQL often requires fewer lines of SAS code than the equivalent data step and/or Proc steps
• Frequency counts take a single query (COUNT(*) with GROUP BY), which is very helpful during QC or validation
• Proc SQL can join datasets on key variables that have different names, whereas a data step Merge needs a common by-variable name
• Proc SQL can join more than two datasets in the same step, with each join on a different variable
• A Proc SQL join does not overlay the duplicate by-column, whereas the Merge statement of the data step does.
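
As a rough illustration of the frequency-count and different-key-name points above, here is a minimal sketch. The datasets DEMOG and LABS and all of their variables are made up for this example; they are not from any real study.

```sas
/* Hypothetical demo data: DEMOG is keyed on USUBJID, LABS on SUBJID */
data demog;
    input usubjid $ sex $;
    datalines;
S01 M
S02 F
S03 F
;
run;

data labs;
    input subjid $ test $ result;
    datalines;
S01 ALT 30
S02 ALT 25
S02 AST 40
;
run;

proc sql;
    /* Frequency count for QC: one row per SEX with the number of subjects */
    select sex, count(*) as n
        from demog
        group by sex;

    /* Join on keys that have different names; no pre-sorting or renaming */
    create table demog_labs as
        select d.usubjid, d.sex, l.test, l.result
        from demog as d
             left join labs as l
             on d.usubjid = l.subjid;
quit;
```

A data step Merge would first need LABS.SUBJID renamed to USUBJID and both datasets sorted by it.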

Data step vs Proc SQL

•  Data step Merge
– The datasets must be pre-sorted by the by-variable before merging
– Requires common variable names
– May need a few more lines of code than Proc SQL

•  A PROC SQL join works differently from the typical data step Merge
– Duplicate matching columns are not overlaid
– Can join more than two datasets in one step, on different levels (you don't need to merge multiple datasets together using the same variable)
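
The contrast above can be sketched side by side. The datasets DM, AE and SITES and their variables are hypothetical, chosen only so that one join uses a shared name (USUBJID) and the other uses differently named keys (SITEID and SITE_NO).

```sas
/* Hypothetical datasets, deliberately entered unsorted */
data dm;
    input usubjid $ siteid $;
    datalines;
S02 B
S01 A
;
run;

data ae;
    input usubjid $ aeterm $;
    datalines;
S02 Nausea
S01 Headache
;
run;

data sites;
    input site_no $ country $;
    datalines;
A US
B DE
;
run;

/* Data step: sort both inputs first, then merge on the common name */
proc sort data=dm; by usubjid; run;
proc sort data=ae; by usubjid; run;

data dm_ae;
    merge dm(in=in_dm) ae;
    by usubjid;
    if in_dm;   /* keep only subjects present in DM */
run;

/* Proc SQL: three datasets in one step, each join on its own key */
proc sql;
    create table dm_ae_sites as
        select dm.usubjid, dm.siteid, s.country, ae.aeterm
        from dm
             left join ae
               on dm.usubjid = ae.usubjid
             left join sites as s
               on dm.siteid = s.site_no;
quit;
```

Getting COUNTRY onto the data step result would take a rename, another sort by SITEID, and a second Merge. The two approaches also diverge on many-to-many matches: the SQL join returns every combination of matching rows, while the Merge pairs rows in sequence within each by-group.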


Roger said...

True, but the left join performance can be very slow.

