Discover More Tips and Techniques on This Blog

Advanced SDTM Mapping Pitfalls and How to Avoid Them

Advanced SDTM Mapping Pitfalls and How to Avoid Them

Introduction

Mapping clinical data to SDTM domains is a complex process involving many technical and logical challenges. For experienced programmers, common issues often revolve around proper handling of controlled terminology, managing derived variables, ensuring consistency between domains, and maintaining relational integrity. This article explores some of the most common SDTM mapping pitfalls, with advanced solutions and SAS code examples, to help avoid regulatory submission errors.

Pitfall 1: Handling Derived Variables Incorrectly

One of the most common issues in SDTM mapping is incorrectly handling derived variables, which can lead to inaccuracies in key datasets such as EX (Exposure) and VS (Vital Signs).

Example:

A derived variable such as EXDUR (Exposure Duration) in the EX domain may not be properly calculated or may fail to align with the duration of the actual treatment.

Solution:

Use a SAS data step to derive exposure duration based on start and end dates (EXSTDTC and EXENDTC), ensuring proper ISO 8601 formatting and alignment with other datasets.


data ex_derived;
    set ex;
    /* Calculate the difference in days between EXSTDTC and EXENDTC */
    if not missing(EXSTDTC) and not missing(EXENDTC) then do;
        EXDUR = intck('day', input(EXSTDTC, yymmdd10.), input(EXENDTC, yymmdd10.));
    end;
    else EXDUR = .;
run;

proc print data=ex_derived;
    title "Derived EXDUR Based on Start and End Dates";
run;
    

Pitfall 2: Inconsistent Mapping of Medical History (MH) and Adverse Events (AE)

A common challenge is ensuring consistency between the AE (Adverse Events) and MH (Medical History) domains when both involve related clinical data. Programmers often struggle to map similar events across these domains, resulting in duplicated or inconsistent information.

Example:

A medical history event such as "Diabetes" may appear in both the MH and AE domains, but without proper mapping, this can create discrepancies.

Solution:

Create validation checks that ensure consistency between MH and AE, flagging records where conditions in MH appear without an associated AE or where the same event is inconsistently coded.


proc sql;
    create table check_ae_mh as 
    select a.USUBJID, a.MHTERM as Event, b.AETERM as AE_Event
    from mh as a
    left join ae as b
    on a.USUBJID = b.USUBJID and a.MHTERM = b.AETERM
    where missing(b.AETERM);
quit;

proc print data=check_ae_mh;
    title "Medical History Terms Without Corresponding Adverse Events";
run;
    

Pitfall 3: Issues with Handling Visit Windows

Assigning the correct visit window (such as VISITNUM, VISIT, or VISITDY) is critical in SDTM domains such as VS (Vital Signs), LB (Laboratory Tests), and EG (ECG). A common issue is assigning observations to the wrong visit window when the visit dates don’t match the planned visit windows.

Example:

If a patient visit happens on Day 15, but the expected window is Day 14, misalignment in visit windows can occur, leading to incorrect visit assignment.

Solution:

Create visit windowing rules using advanced date logic, ensuring that observations are correctly assigned to the proper visit window.


data vs_visit_window;
    set vs;
    if abs(VISITDY - planned_day) <= 2 then VISITWINDOW = 'On Time';
    else if VISITDY < planned_day then VISITWINDOW = 'Early';
    else VISITWINDOW = 'Late';
run;

proc print data=vs_visit_window;
    where VISITWINDOW ne 'On Time';
    title "Vital Signs with Early or Late Visits";
run;
    

Pitfall 4: Incorrect Handling of Controlled Terminology (CT)

Another common issue is failing to properly implement and regularly update controlled terminology for variables such as AEDECOD (Adverse Event Dictionary Term) or CMDECOD (Concomitant Medication Dictionary Term). Regulatory submissions often fail due to outdated or incorrect terminology.

Example:

Incorrectly mapped terms in the Concomitant Medications (CM) domain to outdated controlled terminology (e.g., WHO Drug) can result in a failed submission.

Solution:

Use a dynamic lookup to validate the controlled terms against the latest version of the dictionaries, flagging any mismatches or outdated terms.


proc sql;
    create table cm_term_check as 
    select a.*, b.whodrug_term
    from cm as a
    left join whodrug as b
    on a.CMDECOD = b.whodrug_term
    where missing(b.whodrug_term);
quit;

proc print data=cm_term_check;
    title "Mismatched Terms in Concomitant Medications Domain";
run;
    

Pitfall 5: Incorrect RELREC (Related Records) Relationships

Establishing correct relationships between records in different domains is essential when creating the RELREC domain. Misaligned or incomplete relationships can cause inconsistencies in your submission and prevent accurate linking between datasets.

Example:

A common issue is incorrectly linking the AE (Adverse Events) domain to the LB (Laboratory Tests) domain, where the related records aren't properly structured.

Solution:

Validate the relational integrity of records between AE and LB using the RDOMAIN and IDVARVAL variables in the RELREC domain.


proc sql;
    create table relrec_check as 
    select a.*, b.LBSEQ
    from relrec as a
    left join lb as b
    on a.USUBJID = b.USUBJID and a.IDVARVAL = b.LBSEQ
    where missing(b.LBSEQ);
quit;

proc print data=relrec_check;
    title "Mismatched RELREC Relationships Between AE and LB Domains";
run;
    

Pitfall 6: Handling Variables in Supplemental Qualifiers (SUPPxx)

Supplemental qualifiers (SUPPxx) domains are often used for storing non-standard variables, but incorrect or incomplete linking between the main domain and the SUPPxx domain can lead to issues during submission.

Example:

In the AE domain, supplemental qualifiers such as additional severity information may not be properly linked, causing discrepancies.

Solution:

Check for missing or improperly linked records between AE and SUPPAE domains, ensuring that each supplemental qualifier is accurately referenced.


proc sql;
    create table suppae_check as 
    select a.*, b.AESEQ
    from suppae as a
    left join ae as b
    on a.USUBJID = b.USUBJID and a.IDVARVAL = b.AESEQ
    where missing(b.AESEQ);
quit;

proc print data=suppae_check;
    title "Unlinked SUPP Qualifiers in AE Domain";
run;
    

Conclusion

Addressing these advanced SDTM mapping pitfalls requires a combination of regular validation, correct handling of controlled terminology, and careful management of relationships across datasets. By implementing automated SAS routines, you can streamline the process and ensure compliance with regulatory standards, preventing costly delays in submission.

Disclosure:

In the spirit of transparency and innovation, I want to share that some of the content on this blog is generated with the assistance of ChatGPT, an AI language model developed by OpenAI. While I use this tool to help brainstorm ideas and draft content, every post is carefully reviewed, edited, and personalized by me to ensure it aligns with my voice, values, and the needs of my readers. My goal is to provide you with accurate, valuable, and engaging content, and I believe that using AI as a creative aid helps achieve that. If you have any questions or feedback about this approach, feel free to reach out. Your trust and satisfaction are my top priorities.