Exploring the Analysis Data Model

Today, I stumbled upon a blog which is interesting and resourceful. I liked the article so much so want to share with all my friends here.

Here is the direct link for the post to download or to review:
Actual Article:

The Analysis Data Model (ADaM) is a standard released by the Clinical Data Interchange Standards Consortium (CDISC) and has quickly become widely used in the submission of clinical trial information. ADaM has very close ties to another of CDISCs released standards, Study Data Tabulation Model (SDTM).

The main difference between these two CDISC standards is the way in which the data is displayed. SDTM provides a standard for the creation and mapping of collected data from Raw sources, where as ADAM provides a standard for the creation of analysis-ready data, often using SDTM data as the source.

The purpose of the analysis-ready ADaM data is to provide the programmer with a means to create tables, listings and figures with minimal time and effort whilst ensuring a clear level of traceability in the derived values. This is a key factor of ADaM data as there is a need for a clear and unambiguous flow from the study tabulation data to the analysis data which supports the statistical analyses performed in a clinical study.

CDISC state the following key principles for Analysis Datasets:

facilitate clear and unambiguous communication and provide a level of traceability
be useable by currently available tools
be linked to machine-readable metadata
be analysis-ready

To perform statistical analysis on a study, data maybe required from many domains, such as labs, adverse events, demographics and subject characteristics. Bringing this data into ADaM datasets and performing any complex derivations required for display endpoints means that no further data manipulation is required to produce statistical outputs.

When creating the ADaM datasets the requirements of the analyses must be taken into consideration. This will ensure the desired numbers of datasets are produced – at the very least; a subject level dataset is required. Some of the data will be duplicated between domains, for example Age and Gender data. This is acceptable as this will aid the output creation or data review.

The naming convention for the datasets will follow “ADxxxx”, where the “xxxx” part will be sponsor-defined - AE for adverse events, LB for Laboratory results for example. The subject-level dataset, which will be discussed later, will be named “ADSL”. For the ADaM variables, the naming conventions should follow the standardized variable names defined in the ADaM Implementation Guide (ADaMIG). Any variables from the SDTM which are used directly in the ADaM dataset should keep the same variable name to avoid confusion. Sponsor-defined variable names will be given to any other analysis variables. Following these conventions will provide clarity for the reviewer.

As previously mentioned, a key requirement for ADaM data is a subject-level analysis dataset. This dataset and its documentation are always required – even if no other data is submitted. The subject-level dataset, or “ADSL” as it is named within ADaM conventions, contains a record for each subject with variables which display key information for subject disposition, demographic, and baseline characteristics.

Other variables within ADSL will contain planned or actual treatment group information as well key dates and times of the subjects study participation on the study. Not all variables within ADSL may be used directly for analysis but could be used in conjunction with other datasets for display or grouping purposes or possibly included simply as variables of interest for review.

To conclude, the CDISC summary of ADSL is as follows: “The critical variables in ADSL will include those that are either descriptive, known to affect the subject’s response to drug (in terms of either efficacy or safety), used as strata for randomization, or identify the subject or event as belonging to specific subgroups (e.g. population flags). For example, subjects may be randomized after being stratified by age group because it is believed that younger subjects respond differently to the study drug. In this situation, a subject’s age category would be considered a critical variable for a study and included in ADSL.”

I hope you guys liked it.

Sarath

Search This Blog

Exploring the Analysis Data Model – ADaM Datasets