Discover More Tips and Techniques on This Blog

Importance of Warnings and Notes messages from SAS log

The errors I will list here will be very few in number. They are errors that you will likely make at some time if you do not remain alert. These errors could have serious consequences so that is why I have described them as "deadly errors".

Missing "by" statement in merge:

Using "set" instead of "merge":

Existing "in" variable in merge:

"Flag" variables in data steps:

"Fixing" data during a merge:

Averaging averages:

Merging in Key Variables:

Read more at...........


Importance of Warnings and Notes messages from SAS log



A Utility Program for Checking SAS Log Files

Proc Sort NODUP vs NODUPKEY

The SORT Procedure (Proc Sort): Options




We can do many things using PROC SORT like

create a new data setsubset the data rename/ drop/ keep variablesformat, or label your variables etc



Options Available with Proc Sort:

  • OUT= OPTION
  • DESCENDING OPTION
  • DROP=, KEEP=, AND RENAME= OPTIONS
  •  FORMAT AND LABEL STATEMENTS
  • WHERE= OPTION OR WHERE STATEMENT
  • FIRSTOBS= AND OBS= OPTIONS
  • NODUPRECS AND NODUPKEY OPTIONS
  • DUPOUT


A common interview question for SAS jobs is "What is the difference between proc sort nodup and proc sort nodupkey?". The answer the interviewer is expecting is usually "proc sort nodup gets rid of duplicate records with the same sort key but proc sort nodupkey gets rid of other records with the same sort key". However, this is not correct.




Common Programming Mistake with Proc Sort NODUPRECS -


Equivalent of NODUPKEY in PROC SQL

Ian Whitlock Explains...

NODUPKEY is like FIRST. processing. Both depend on order which is an alien

concept to SQL. SQL depends on information stored as data, not in variable
names and not in order.

If for example you had the variables GROUP and SEQ where SEQ is a sequence
number within group, then you could use a GROUP BY GROUP clause with HAVING
SEQ=1. In short there must be something in the data values that indicates
which records you want.

The suggestion to use DISTINCT works when all records having the same value
of GROUP have all other relevant variables with equal values. This would be
equivalent to the NODUP option in PROC SORT.







Disclosure:

In the spirit of transparency and innovation, I want to share that some of the content on this blog is generated with the assistance of ChatGPT, an AI language model developed by OpenAI. While I use this tool to help brainstorm ideas and draft content, every post is carefully reviewed, edited, and personalized by me to ensure it aligns with my voice, values, and the needs of my readers. My goal is to provide you with accurate, valuable, and engaging content, and I believe that using AI as a creative aid helps achieve that. If you have any questions or feedback about this approach, feel free to reach out. Your trust and satisfaction are my top priorities.