Thursday, November 29, 2012

Creating Custom or Non-Standard CDISC SDTM Domains

Here is a useful article on creating custom SDTM domains.


Within the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM), standard domains are split into four main types: special purpose, relationships, trial design and general observation classes. The general observation classes cover the majority of observations collected during a study and are divided into three classes:
  • The Interventions class captures investigational, therapeutic and other treatments that are administered to the subject (with some actual or expected physiological effect) either as specified by the study protocol (e.g., “exposure”), coincident with the study assessment period (e.g., “concomitant medications”), or other substances self-administered by the subject (such as alcohol, tobacco, or caffeine).
  • The Events class captures planned protocol milestones such as randomization and study completion, and occurrences, conditions, or incidents independent of planned study evaluations occurring during the trial (e.g., adverse events) or prior to the trial (e.g., medical history).
  • The Findings class captures the observations resulting from planned evaluations to address specific tests or questions such as laboratory tests, ECG testing, and questions listed on questionnaires.
When creating a custom domain, one should first confirm that the data do not fit into any published domain. This can be done by checking against the reserved domain codes listed in the appendices of the SDTM implementation guide or by checking the CDISC website for any recently published domains. The following are not acceptable reasons for creating a custom domain:
  • The data share a common topic with, and are of the same nature as, the data in a published domain.
  • The data would be separated from an existing domain solely on the basis of time.
  • The data were collected, or are going to be used, for a different purpose. For example, if a lab parameter is collected for efficacy purposes, the data must be represented in the LB domain and not in a custom ‘efficacy’ domain.
  • The data were collected on separate CRF modules or pages but may fit into an existing domain.
  • Relationships between data that are hierarchical in nature need to be represented. RELREC can be used for this instead.
Once it is confirmed that the data do not fit into any published domain, it should be determined which of the three general observation classes best fits the topic of the data. The custom domain must fit into one of the three general observation classes. The next step is to choose a two-letter domain code for the custom domain. This code must not be the same as any domain code that is already published or under discussion. The domain codes X-, Y- and Z- are reserved for sponsor use, where the hyphen may be replaced by any letter or number. The domain code will be the name of the domain and will also replace the "--" prefix in the names of the variables taken from the class. The following steps can then be followed to create the custom domain:
  1. Select and include the required Identifier variables (STUDYID, DOMAIN, USUBJID and --SEQ) and any permissible Identifier variables (--GRPID, --REFID and --SPID).
  2. Include the Topic variable from the identified general observation class (--TRT for Interventions, --TERM for Events and --TESTCD for Findings).
  3. Select and include the relevant Qualifier variables from the identified general observation class only. These can be found in Sections 2.2.1, 2.2.2 and 2.2.3 of the Study Data Tabulation Model.
  4. Select and include the applicable Timing variables. These can be found in Section 2.2.5 of the Study Data Tabulation Model and relate to all general observation classes.
  5. Set the order of the variables within the domain: identifiers must be followed by topic variables, qualifiers and finally timing variables. The variables must then be ordered within these roles to match the order of variables given in Sections 2.2.1, 2.2.2, 2.2.3, 2.2.4 and 2.2.5 of the Study Data Tabulation Model. The variable order in the define.xml must also match the order of the variables within the domain.
  6. Adjust the labels of the variables only as needed to convey their meaning in the context of the data being submitted in the newly created domain. Use title case for all labels.
  7. Ensure that standard variables are being applied appropriately by comparing their use with the use of the same variables in standard domains.
  8. Ensure that no sponsor-defined variables are added to the domain. Any sponsor-defined variables should be in a Supplemental Qualifier dataset.
  9. Variable attributes within the domain and Supplemental Qualifier dataset must conform to the SAS Version 5 transport file conventions. For example, variable names must be no longer than 8 characters, variable labels must be no longer than 40 characters and data values must be no longer than 200 characters. Also, where possible, the domain should be less than 400 MB in size. If it is larger, one should contact the review division before splitting the domain, as the division may accept a larger file size.
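
The steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original article or of any CDISC tooling: the function names, the domain code "XA", the chosen Findings variables and the labels are all assumptions made for the example. It also assumes the sponsor picks an uppercase code from the reserved X-, Y-, Z- range, and that the caller passes qualifiers and timing variables already in the order given by the Study Data Tabulation Model.

```python
import re

# Topic variable for each general observation class (step 2).
TOPIC_VARIABLE = {"Interventions": "--TRT", "Events": "--TERM", "Findings": "--TESTCD"}

# Required Identifier variables (step 1).
REQUIRED_IDENTIFIERS = ["STUDYID", "DOMAIN", "USUBJID", "--SEQ"]


def is_sponsor_domain_code(code):
    """Assumption: a custom code is X, Y or Z followed by one letter or digit."""
    return re.fullmatch(r"[XYZ][A-Z0-9]", code) is not None


def build_variables(domain, obs_class, qualifiers, timing):
    """Return (name, role) pairs: identifiers, topic, qualifiers, then timing."""
    if not is_sponsor_domain_code(domain):
        raise ValueError(f"{domain!r} is not in the X-, Y-, Z- sponsor range")
    template = [(v, "Identifier") for v in REQUIRED_IDENTIFIERS]
    template.append((TOPIC_VARIABLE[obs_class], "Topic"))
    template += [(v, "Qualifier") for v in qualifiers]
    template += [(v, "Timing") for v in timing]
    # Replace the "--" placeholder with the two-character domain code.
    return [(name.replace("--", domain), role) for name, role in template]


def check_v5_limits(labels):
    """Check name and label lengths (step on SAS V5 transport limits).

    `labels` maps each variable name to its label.
    """
    problems = []
    for name, label in labels.items():
        if len(name) > 8:
            problems.append(f"{name}: name longer than 8 characters")
        if len(label) > 40:
            problems.append(f"{name}: label longer than 40 characters")
    return problems


# Hypothetical custom Findings domain "XA".
variables = build_variables(
    "XA",
    "Findings",
    qualifiers=["--TEST", "--ORRES", "--ORRESU"],
    timing=["VISITNUM", "--DTC"],
)
# Hypothetical labels, used only to exercise the length check.
labels = {name: f"Label For {name}" for name, _ in variables}
problems = check_v5_limits(labels)
```

Here `variables` holds the domain's variable names in role order with the prefix applied, and `problems` lists any names or labels that exceed the transport file limits. The sketch does not check data value lengths, file size or the within-role order, which still have to be verified against the model documents.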

Study Data Tabulation Model, Version 1.2; CDISC Submission Data Standards Team.
Study Data Tabulation Model Implementation Guide: Human Clinical Trials, Version 3.1.2; CDISC Submission Data Standards Team.