
PRXMATCH Function

The PRXMATCH() function locates a pattern within a string. It takes two arguments: the first is a Perl regular expression (or a regular-expression ID returned by PRXPARSE) that describes what you are looking for, and the second is the character string to be searched. PRXMATCH() returns the position at which the match starts, or 0 if no match is found.
Syntax:

PRXMATCH (perl-regular-expression, source);
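
As a small illustration of the "regular expression ID" form of the first argument, here is a minimal sketch that compiles the pattern with PRXPARSE and passes the resulting ID to PRXMATCH. The variable names re and pos and the test string '12ab' are made up for this example.

```sas
data _null_;
  /* PRXPARSE compiles the pattern and returns a numeric pattern ID */
  re  = prxparse('/[a-zA-Z]/');
  /* The ID can be used in place of the quoted regular expression */
  pos = prxmatch(re, '12ab');
  /* The first letter is in position 3, so the log shows pos=3 */
  put pos=;
run;
```

For a one-off search, passing the quoted expression directly (as in the syntax above) is usually simpler.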

PRXMATCH is especially handy in two situations:
1) When you want to check whether a variable contains any alphabetic character (any letter from A to Z).
2) When you need to search a character variable for several different substrings.

Here is how PRXMATCH works in the first case.

*Return the position of the first letter in each string;

DATA finda2z;
INPUT ID $ 1-3 string $ 5-10;
prxmatch=prxmatch("/[a-zA-Z]/",string);
DATALINES;
001 ACBED
002 11
003 12
004 zx
005 11 2c
006 abc123
;
run;


proc print data=finda2z;
run;

Output:

Obs    ID     string    prxmatch
 1     001    ACBED        1
 2     002    11           0
 3     003    12           0
 4     004    zx           1
 5     005    11 2c        5
 6     006    abc123       1

*Here PRXMATCH returns the position of the first letter (a to z or A to Z) in the string, or 0 if the string contains no letter.

If you only want to flag which observations contain a match for the specified variable, use the following code.

prxmatch=prxmatch("/[a-zA-Z]/",string)>0;

*If a match is found, the value is 1; otherwise it is 0.



To keep the records that do not match the pattern, select those where PRXMATCH returns zero.
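
A minimal sketch of that idea, reusing the finda2z data set created above (the output data set name no_letters is just an illustrative choice):

```sas
data no_letters;
  set finda2z;
  /* Subsetting IF: keep only rows where no letter is found */
  if prxmatch("/[a-zA-Z]/", string) = 0;
run;

proc print data=no_letters;
run;
```

With the sample data above, only IDs 002 and 003 should remain, since their strings contain digits only.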

PRXMATCH() is also very helpful when you need to search a character variable for several different substrings.

Problem: Select observations where AETERM contains any of the substrings ‘nausea’, ‘vomiting’ or ‘fever’.

The old method is to combine several INDEX function calls with OR conditions, like this:

if index(aeterm,'nausea') > 0 or
   index(aeterm,'vomiting') > 0 or
   index(aeterm,'fever') > 0 ;

The PRXMATCH function can do all of this in one statement, with less typing and fewer lines of code.

if prxmatch ("m/nausea|vomiting|fever/i",aeterm) > 0 ;

The 'm' at the start of the perl-regular-expression is the Perl match operator; it tells PRXMATCH to perform a matching operation. It is optional when the delimiters are forward slashes.
The 'i' modifier tells SAS to ignore case, i.e., to treat "NAUSEA" the same as "nausea" while searching for a match.
Case sensitivity can also be controlled for just part of a pattern by using the inline forms (?i:) and (?-i:).
(?i:) - turns ON the case insensitive search for the enclosed part
(?-i:) - turns OFF the case insensitive search for the enclosed part

Pipes ‘|’ should be used to separate the search strings.
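
To see the whole thing run end to end, here is a small sketch. The data set names ae and ae_subset and the four sample AETERM values are invented for illustration only.

```sas
data ae;
  infile datalines truncover;
  input aeterm $30.;
  datalines;
Nausea
Mild FEVER
Headache
vomiting after meal
;
run;

data ae_subset;
  set ae;
  /* Keep rows containing any of the three terms, ignoring case */
  if prxmatch("m/nausea|vomiting|fever/i", aeterm) > 0;
run;

proc print data=ae_subset;
run;
```

With this sample data, 'Headache' is dropped and the other three rows are kept, because the 'i' modifier lets 'Nausea' and 'FEVER' match the lowercase search strings.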

Please refer to PRXMATCH in the Functions section of the SAS Language Reference: Dictionary in the Online SAS Documentation for more information.

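PRXMATCH special characters and their meanings:
^ - starts with (anchors the match at the start of the string)
$ - ends with (anchors the match at the end of the string)
\D - any non-digit
\d - any digit
? - the preceding item is optional (zero or one occurrence)
| - or
* - the preceding item repeats zero or more times
(?i:) - turns ON the case insensitive search
(?-i:) - turns OFF the case insensitive search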
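
As a rough sketch of how a few of these characters combine, the step below flags strings in the finda2z data set (created earlier) that consist of digits only. The names digits_only and all_digits are illustrative. Note the TRIM call: character variables in SAS are padded with trailing blanks, so without it the $ anchor would not line up with the last digit.

```sas
data digits_only;
  set finda2z;
  /* ^ and $ anchor the pattern to the whole value; \d+ means one or more digits */
  all_digits = prxmatch('/^\d+$/', trim(string)) > 0;
run;

proc print data=digits_only;
run;
```

With the sample data, all_digits should be 1 for IDs 002 and 003 and 0 for the rest, including '11 2c', which contains a blank and a letter.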



