HTAN Biospecimen Data Standard

Overview

This page describes the data levels and data collection for the HTAN biospecimen data standard.

Description of Model

The HTAN biospecimen data model is designed to capture essential biospecimen data elements, including:

  • Acquisition method, e.g. autopsy, biopsy, or fine needle aspirate.
  • Topography code indicating the site within the body, e.g. based on ICD-O-3.
  • Collection information, e.g. time, duration of ischemia, and temperature.
  • Processing information for the parent biospecimen, e.g. fresh or frozen.
  • Clinical metadata for the biospecimen and its derivatives, e.g. a Histologic Morphology Code based on ICD-O-3.
  • Coordinates of each derivative biospecimen relative to its parent biospecimen.
  • Processing of derivative biospecimens for downstream analysis, e.g. dissociation, sectioning, or analyte isolation.
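As a rough illustration of how the elements listed above might fit together in one record, here is a minimal Python sketch. Every field name, and the lineage helper, is an assumption made for this example and is not taken from the HTAN schema. The ICD-O-3 values in the comments are sample codes only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Biospecimen:
    # All field names are hypothetical; they mirror the listed elements only loosely.
    biospecimen_id: str
    acquisition_method: str                            # e.g. "Biopsy"
    topography_code: Optional[str] = None              # e.g. ICD-O-3 "C50.9"
    histologic_morphology_code: Optional[str] = None   # e.g. ICD-O-3 "8500/3"
    ischemic_time_minutes: Optional[float] = None      # collection information
    preservation_method: Optional[str] = None          # e.g. "Fresh" or "Frozen"
    parent_id: Optional[str] = None                    # None when there is no parent biospecimen
    coordinates_in_parent: Optional[Tuple[float, float, float]] = None
    processing_steps: List[str] = field(default_factory=list)


def lineage(specimen_id: str, specimens: Dict[str, Biospecimen]) -> List[str]:
    """Return the ids from a biospecimen up through its chain of parents."""
    chain: List[str] = []
    current: Optional[str] = specimen_id
    while current is not None:
        if current in chain:
            raise ValueError(f"cycle in parent links at {current}")
        chain.append(current)
        current = specimens[current].parent_id
    return chain


tissue = Biospecimen("S1", "Biopsy", topography_code="C50.9",
                     preservation_method="Frozen")
section = Biospecimen("S2", "Biopsy", parent_id="S1",
                      coordinates_in_parent=(0.0, 0.0, 5.0),
                      processing_steps=["Sectioning"])
specimens = {s.biospecimen_id: s for s in (tissue, section)}
```

Calling `lineage("S2", specimens)` walks from the derivative section back to the tissue it was cut from, which is the parent/derivative relationship the coordinates refer to.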

The model consists of two tiers:

Data Level    Description

Tier 1        Base biospecimen data common to most assays and HTAN Research Network atlases

Tier 2        Assay-specific or atlas-specific extensions to the base model
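One way to picture the two tiers is as a base record that assay-specific or atlas-specific attributes are layered onto without redefining the base. The sketch below is only an illustration under that reading. The attribute names, the `TIER1_ATTRIBUTES` set, and the `extend_record` helper are all hypothetical and do not come from the standard.

```python
# Hypothetical Tier 1 attribute names; the real standard defines its own.
TIER1_ATTRIBUTES = {"biospecimen_id", "acquisition_method", "topography_code"}


def extend_record(tier1: dict, tier2: dict) -> dict:
    """Return a new record holding Tier 1 values plus Tier 2 extensions."""
    unknown = set(tier1) - TIER1_ATTRIBUTES
    if unknown:
        raise ValueError(f"not Tier 1 attributes: {sorted(unknown)}")
    clashes = set(tier2) & TIER1_ATTRIBUTES
    if clashes:
        raise ValueError(f"Tier 2 may not redefine Tier 1 attributes: {sorted(clashes)}")
    # Neither input is mutated; the merged record is a fresh dict.
    return {**tier1, **tier2}


base = {"biospecimen_id": "S2", "acquisition_method": "Biopsy"}
# "fixative_type" is a hypothetical assay-specific (Tier 2) attribute.
record = extend_record(base, {"fixative_type": "Formalin"})
```

Rejecting name clashes keeps the base model stable across atlases, so an extension can add attributes but cannot change what a Tier 1 attribute means.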

Biospecimen Tier 1

Baseline HTAN biospecimen data leverages existing common data elements (CDEs) from four sources:

Where no comparable CDE could be found in these sources for a specific attribute, an HTAN-specific attribute was created.

Biospecimen Tier 2

Tier 2 attributes are drawn from the caDSR system and, as in Tier 1, from HTAN-specific elements. They fall into the following categories:

  • Atlas-specific – Attributes that are atlas-specific but may be used by more than one atlas.
  • Histological Assessment – Biospecimen attributes specific to histological assessment.
  • Multiplex Image Staining – Biospecimen attributes specific to multiplex image staining.
  • Single Cell RNA / Single Nucleus RNA Seq – Biospecimen attributes specific to Single Cell RNA / Single Nucleus RNA Seq.
  • Bulk RNA & DNA Seq – Biospecimen attributes specific to Bulk RNA & DNA Seq.