Thursday, September 5, 2024

Mastering SAS: 6 Ways to Upcase All Variables in Your Dataset

How to Upcase All Variables in a SAS Dataset

How to Upcase All Variables in a SAS Dataset

When working with character data in SAS, you may often need to ensure that all text variables are in uppercase. Fortunately, SAS provides several methods to upcase variables in a dataset, depending on your specific needs. In this article, we will explore various ways to achieve this task and provide code snippets for each method.

1. Using DATA Step with UPCASE Function

The UPCASE function is the simplest way to convert all character variables to uppercase using a DATA step.


data upcase_dataset;
    set original_dataset;
    array char_vars _character_; 
    do i = 1 to dim(char_vars);
        char_vars[i] = upcase(char_vars[i]);
    end;
    drop i;
run;
    

This approach loops through all character variables and applies the UPCASE function to each of them.

2. Using PROC SQL with CASE Statement

In this method, PROC SQL is used with an explicit UPCASE call for each character variable. While this approach is less dynamic, it works well for small datasets.


proc sql;
    create table upcase_dataset as
    select upcase(var1) as var1,
           upcase(var2) as var2,
           upcase(var3) as var3
    from original_dataset;
quit;
    

3. Using PROC DATASETS and FORMAT

With PROC DATASETS, you can apply an uppercase format to all character variables in the dataset in one go.


proc datasets lib=work nolist;
    modify original_dataset;
    format _character_ $upcase.;
run;
quit;
    

4. Using a Macro with PROC SQL

This dynamic method uses a macro to automatically identify and upcase all character variables in the dataset. This is useful for larger datasets or when the number of character variables is unknown.


%macro upcase_all_vars(ds);
    proc sql noprint;
        select name into :char_vars separated by ' '
        from dictionary.columns
        where libname='WORK' and memname=upcase("&ds") and type='char';
    quit;

    data upcase_dataset;
        set &ds;
        %let count = %sysfunc(countw(&char_vars));
        %do i = 1 %to &count;
            %let var = %scan(&char_vars, &i);
            &var = upcase(&var);
        %end;
    run;
%mend;

%upcase_all_vars(original_dataset);
    

5. Using PROC FCMP

This approach involves creating a custom function using PROC FCMP and then applying it across all character variables.


proc fcmp outlib=work.functions.dataset_utils;
    function upcase_all(var $) $;
        return (upcase(var));
    endsub;
run;

options cmplib=work.functions;

data upcase_dataset;
    set original_dataset;
    array char_vars _character_;
    do i = 1 to dim(char_vars);
        char_vars[i] = upcase_all(char_vars[i]);
    end;
    drop i;
run;
    

6. Using HASH Object

An advanced approach is to use the SAS HASH object, which efficiently handles the upcase operation, especially when dealing with large datasets.


data upcase_dataset;
    if _n_ = 1 then do;
        declare hash h(dataset: 'original_dataset');
        h.defineKey('_N_');
        h.defineData('_ALL_');
        h.defineDone();
    end;

    set original_dataset;
    array char_vars _character_;
    do i = 1 to dim(char_vars);
        char_vars[i] = upcase(char_vars[i]);
    end;
    drop i;
run;
    

Conclusion

Each of the methods discussed above has its own strengths, depending on the size of the dataset and whether the variables are known in advance. Whether you're working with small or large datasets, SAS offers a variety of ways to upcase all variables. By selecting the most appropriate approach, you can simplify your workflow and ensure consistency across your data.

If you have any questions or need further assistance, feel free to leave a comment below!

Learn how to view SAS dataset labels without opening the dataset directly in a SAS session. Easy methods and examples included!

Quick Tip: See SAS Dataset Labels Without Opening the Data Quick Tip: See SAS Dataset Labels With...