HTAN Biospecimen Data

The HTAN biospecimen data model is designed to capture essential biospecimen data elements, including:

  • Acquisition method, e.g. autopsy, biopsy, fine needle aspirate, etc.
  • Topography Code, indicating site within the body, e.g. based on ICD-O-3.
  • Collection information, e.g. time, duration of ischemia, temperature, etc.
  • Processing information for the parent biospecimen, e.g. fresh, frozen, etc.
  • Biospecimen and derivative clinical metadata, e.g. Histologic Morphology Code based on ICD-O-3.
  • Coordinates of derivative biospecimens relative to their parent biospecimen.
  • Processing of derivative biospecimens for downstream analysis, e.g. dissociation, sectioning, analyte isolation, etc.
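
The parent/derivative relationship in this list can be sketched in a few lines of Python. This is a minimal illustration, not part of the HTAN tooling: it assumes each record carries the HTAN Biospecimen ID and HTAN Parent ID attributes listed in the table on this page, the `lineage` helper is hypothetical, and the IDs (including the participant ID HTA3_3000) are made up for the example.

```python
def lineage(biospecimen_id, parent_of):
    """Return the chain of IDs from a biospecimen up to its root parent."""
    chain = [biospecimen_id]
    seen = {biospecimen_id}
    # Follow parent links until we reach an ID that has no recorded parent.
    while chain[-1] in parent_of:
        parent = parent_of[chain[-1]]
        if parent in seen:
            raise ValueError(f"cycle detected at {parent}")
        chain.append(parent)
        seen.add(parent)
    return chain


# Illustrative records only; a parent may be another biospecimen
# or a research participant (assumed here to be HTA3_3000).
records = [
    {"HTAN Biospecimen ID": "HTA3_3000_1", "HTAN Parent ID": "HTA3_3000"},
    {"HTAN Biospecimen ID": "HTA3_3000_3", "HTAN Parent ID": "HTA3_3000_1"},
    {"HTAN Biospecimen ID": "HTA3_3000_4", "HTAN Parent ID": "HTA3_3000_1"},
]
parent_of = {r["HTAN Biospecimen ID"]: r["HTAN Parent ID"] for r in records}

print(lineage("HTA3_3000_3", parent_of))
# ['HTA3_3000_3', 'HTA3_3000_1', 'HTA3_3000']
```

Storing only the immediate parent on each record, as the data model does, keeps records small while still allowing the full derivation chain to be recovered.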

HTAN biospecimen metadata leverages existing common data elements from four sources:

Attributes

WARNING: Manifests provided on this page are for reference only. DO NOT USE THESE MANIFESTS FOR DATA SUBMISSION.

Directions

The interactive tables below are provided to help users understand the HTAN Data Model. The tables allow a user to view, search or download attributes either:

  1. in a specific manifest; or
  2. in all manifests represented on this page.

To view a specific manifest, click on the link in the Manifests tab. The manifest will appear in a new tab on the page. Navigate to the new tab to search for attributes or download the manifest.
To search for attributes among all manifests, navigate to the All Attributes tab and use the search box provided at the top of the tab. All attributes can also be downloaded as a CSV file.
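
For working with a downloaded attribute file offline, a keyword search can be approximated as below. This is only a sketch: it assumes the CSV uses column headers matching the table headers on this page (Attribute, Description, and so on), which may not match the actual export. The sample rows are abbreviated and the `search_attributes` helper is hypothetical.

```python
import csv
import io

# Hypothetical excerpt of an "All Attributes" CSV export. The column names
# are an assumption based on the table headers shown on this page.
SAMPLE_CSV = """Attribute,Manifest Name,Description,Required,Data Type
HTAN Biospecimen ID,Biospecimen,HTAN ID associated with a biosample,True,String
Timepoint Label,Biospecimen,Label to identify the time point,True,String
Ischemic Time,Biospecimen,"Duration of time, in seconds, before preservation",False,String
"""


def search_attributes(csv_text, keyword):
    """Return rows whose Attribute or Description contains keyword (case-insensitive)."""
    needle = keyword.lower()
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if needle in row["Attribute"].lower() or needle in row["Description"].lower()
    ]


matches = search_attributes(SAMPLE_CSV, "time")
print([row["Attribute"] for row in matches])
# ['Timepoint Label', 'Ischemic Time']
```

To search a real download, read the file's text with `open(path, newline="")` and pass it to the same function, adjusting the column names if they differ.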

Manifest: Biospecimen
Description: HTAN biological entity; this can be tissue, blood, analyte and subsamples of those

Each entry below lists, in order: Attribute, Manifest Name, Description, Required, Conditional If (where present), Data Type, Valid Values (where present).

HTAN Biospecimen ID
- Biospecimen
HTAN ID associated with a biosample based on the HTAN ID SOP (e.g. HTANx_yyy_zzz)
True
String
Source HTAN Biospecimen ID
- Biospecimen
This is the HTAN ID that may have been assigned to the biospecimen at the site of biospecimen origin (e.g. BU).
False
String
HTAN Parent ID
- Biospecimen
HTAN ID of parent from which the biospecimen was obtained. Parent could be another biospecimen or a research participant.
True
String
Timepoint Label
- Biospecimen
Label to identify the time point at which the clinical data or biospecimen was obtained (e.g. Baseline, End of Treatment, Overall survival, Final). NO PHI/PII INFORMATION IS ALLOWED.
True
String
Collection Days from Index
- Biospecimen
Number of days from the research participant's index date that the biospecimen was obtained. If not applicable, enter 'Not Applicable'.
True
String
Adjacent Biospecimen IDs
- Biospecimen
List of HTAN Identifiers (separated by commas) of adjacent biospecimens cut from the same sample; for example HTA3_3000_3, HTA3_3000_4, ...
False
String
Biospecimen Type
- Biospecimen
Biospecimen Type
True
String
- tissue biospecimen type
- blood biospecimen type
- analyte biospecimen type
- mouth rinse biospecimen type
- stool biospecimen type
- urine biospecimen type
- ascites biospecimen type
- sputum biospecimen type
- fluids biospecimen type
- bone marrow biospecimen type
- cells biospecimen type
Acquisition Method Type
- Biospecimen
Records the method of acquisition or source for the specimen under consideration.
True
String
- autopsy
- biopsy
- fine needle aspirate
- surgical resection
- punch biopsy
- shave biopsy
- excision
- re-excision
- sentinel node biopsy
- lymphadenectomy - regional nodes
- other acquisition method
- non induced sputum
- induced sputum
- bal (bronchial alveolar lavage)
- cytobrush
- blood draw
- fluid collection
- forceps biopsy
- core needle biopsy
- endoscopic biopsy
- not specified
Fixative Type
- Biospecimen
Text term to identify the type of fixative used to preserve a tissue specimen
True
String
- acetone
- alcohol
- formalin
- glutaraldehyde
- oct media
- rnalater
- saline
- 95% ethanol
- dimidoester
- carbodiimide
- dimethylacetamide
- para-benzoquinone
- paxgene tissue
- tcl lysis buffer
- np40 lysis buffer
- methacarn
- cryo-store
- carnoy's fixative
- polaxamer
- other
- none
- unknown
- unfixed
Storage Method
- Biospecimen
The method by which a biomaterial was stored after preservation or before another protocol was used.
True
String
- ambient temperature
- cut slide
- fresh
- frozen at -20c
- frozen at -70c
- frozen at -80c
- frozen at -150c
- frozen in liquid nitrogen
- frozen in vapor phase
- paraffin block
- rnalater at 4c
- rnalater at 25c
- rnalater at -20c
- refrigerated at 4 degrees
- refrigerated vacuum chamber
- 4c in vacuum chamber
- desiccant at 4c
- not applicable
- unknown
Processing Days from Index
- Biospecimen
Number of days from the research participant's index date that the biospecimen was processed. If not applicable, enter 'Not Applicable'.
True
String
Site Data Source
- Biospecimen
Text to identify the data source for the specimen/sample from within the HTAN center, if applicable. Any identifier used within the center to identify data sources. No PHI/PII is allowed.
False
String
Collection Media
- Biospecimen
Material into which the specimen is collected after the procedure
False
String
- dmem
- dmem+serum
- rpmi
- rpmi+serum
- pbs
- pbs+serum
- none
Mounting Medium
- Biospecimen
The solution in which the specimen is embedded, generally under a cover glass. It may be liquid, gum or resinous, soluble in water, alcohol or other solvents and be sealed from the external atmosphere by non-soluble ringing media
False
String
- aqueous water based
- non-aqueous solvent based
- xylene
- toluene
- antifade with dapi
- antifade without dapi
- pbs
- unknown
- not reported
Processing Location
- Biospecimen
Site within an HTAN center where specimen processing occurs, if applicable. Any identifier used within the center to identify processing location. No PHI/PII is allowed.
False
String
Histology Assessment By
- Biospecimen
Text term describing who (in what role) made the histological assessments of the sample
False
String
- pathologist
- research scientist
- other
- unknown
Histology Assessment Medium
- Biospecimen
The method of assessment used to characterize histology
False
String
- digital
- microscopy
- other
- unknown
Preinvasive Morphology
- Biospecimen
Histologic Morphology not included in ICD-O-3 morphology codes, for preinvasive lesions included in the HTAN
False
String
- melanocytic hyperplasia
- atypical melanocytic proliferation
- melanoma in situ - superficial spreading
- melanoma in situ - lentigo maligna type
- melanoma in situ - acral-lentiginous
- melanoma in situ - arising in a giant congenital nevus
- persistent melanoma in situ
- melanoma in situ - not otherwise classified
- scar - no residual melanoma
- invasive melanoma - superficial spreading
- invasive melanoma - nodular type
- invasive melanoma - lentigo maligna
- invasive melanoma - acral lentiginous
- invasive melanoma - desmoplastic
- invasive melanoma - nevoid
- invasive melanoma - other
- normal wda
- reserve cell hyperplasia
- squamous metaplasia - mature
- squamous metaplasia - immature
- mild dysplasia
- moderate dysplasia
- severe dysplasia
- squamous carcinoma in situ
- atypical adenomatous hyperplasia
- adenocarcinoma in situ - non mucinous
- adenocarcinoma in situ - mucinous
- benign tumor nos
- hamartoma
- carcinoma nos
- no diagnosis possible
Tumor Infiltrating Lymphocytes
- Biospecimen
Measure of Tumor-Infiltrating Lymphocytes [Number]
False
String
Degree of Dysplasia
- Biospecimen
Information related to the presence of cells that look abnormal under a microscope but are not cancer. Records the degree of dysplasia for the cyst or lesion under consideration.
False
String
- normal or basal cell hyperplasia or metaplasia
- mild dysplasia
- moderate dysplasia
- severe dysplasia
- carcinoma in situ
- unknown
Dysplasia Fraction
- Biospecimen
Resulting value to represent the number of pieces of dysplasia divided by the total number of pieces. [Text: max length 5]
False
String
Number Proliferating Cells
- Biospecimen
Numeric value that represents the count of proliferating cells determined during pathologic review of the sample slide(s).
False
String
Percent Eosinophil Infiltration
- Biospecimen
Numeric value to represent the percentage of infiltration by eosinophils in a tumor sample or specimen.
False
String
Percent Granulocyte Infiltration
- Biospecimen
Numeric value to represent the percentage of infiltration by granulocytes in a tumor sample or specimen.
False
String
Percent Inflam Infiltration
- Biospecimen
Numeric value to represent local response to cellular injury, marked by capillary dilatation, edema and leukocyte infiltration; clinically, inflammation is manifest by redness, heat, pain, swelling and loss of function, with the need to heal damaged tissue.
False
String
Percent Lymphocyte Infiltration
- Biospecimen
Numeric value to represent the percentage of infiltration by lymphocytes in a solid tissue normal sample or specimen.
False
String
Percent Monocyte Infiltration
- Biospecimen
Numeric value to represent the percentage of monocyte infiltration in a sample or specimen.
False
String
Percent Necrosis
- Biospecimen
Numeric value to represent the percentage of cell death in a malignant tumor sample or specimen.
False
String
Percent Neutrophil Infiltration
- Biospecimen
Numeric value to represent the percentage of infiltration by neutrophils in a tumor sample or specimen.
False
String
Percent Normal Cells
- Biospecimen
Numeric value to represent the percentage of normal cell content in a malignant tumor sample or specimen.
False
String
Percent Stromal Cells
- Biospecimen
Numeric value to represent the percentage of reactive cells that are present in a malignant tumor sample or specimen but are not malignant such as fibroblasts, vascular structures, etc.
False
String
Percent Tumor Cells
- Biospecimen
Numeric value that represents the percentage of infiltration by tumor cells in a sample.
False
String
Percent Tumor Nuclei
- Biospecimen
Numeric value to represent the percentage of tumor nuclei in a malignant neoplasm sample or specimen.
False
String
Fiducial Marker
- Biospecimen
Imaging specific: fiducial markers for the alignment of images taken across multiple rounds of imaging.
False
String
- nuclear stain - dapi
- fluorescent beads
- grid slides - hemocytometer
- adhesive markers
- other
- unknown
- not reported
Slicing Method
- Biospecimen
Imaging specific: the method by which the tissue was sliced.
False
String
- vibratome
- cryosectioning
- tissue molds
- sliding microtome
- sectioning
- other
- unknown
- not reported
Lysis Buffer
- Biospecimen
scRNA-seq specific: Type of lysis buffer used
False
String
Method of Nucleic Acid Isolation
- Biospecimen
Bulk RNA & DNA-seq specific: method used for nucleic acid isolation. E.g. Qiagen Allprep, Qiagen miRNAeasy. [Text - max length 100]
False
String
HTAN Parent Biospecimen ID
- Biospecimen
HTAN Biospecimen Identifier (e.g. HTANx_yyy_zzz) indicating the biospecimen(s) from which these files were derived; multiple parent biospecimens should be comma-separated
True
- Is lowest level is "Yes - Is lowest level"
String
Ischemic Time
- Biospecimen
Duration of time, in seconds, between when the specimen stopped receiving oxygen and when it was preserved or processed. Integer value.
False
- Biospecimen is "Urine"
- Biospecimen is "Analyte"
- Biospecimen is "Bone"
- Biospecimen is "Tissue"
String
Ischemic Temperature
- Biospecimen
Specify whether specimen experienced warm or cold ischemia.
False
- Biospecimen is "Urine"
- Biospecimen is "Analyte"
- Biospecimen is "Bone"
- Biospecimen is "Tissue"
String
- warm ischemia
- cold ischemia
- ambient air
- 4c wet ice
- negative -20c
- dry ice
- liquid nitrogen
- unknown
Histologic Morphology Code
- Biospecimen
The microscopic anatomy of normal and abnormal cells and tissues of the specimen as captured in the morphology codes of the International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3). Example - 8010/0
True
- Biospecimen is "Urine"
- Biospecimen is "Bone"
- Biospecimen is "Tissue"
String
Preservation Method
- Biospecimen
Text term that represents the method used to preserve the sample.
True
- Biospecimen is "Urine"
- Biospecimen is "Bone"
- Biospecimen is "Tissue"
String
- cryopreserved
- cryopreservation in liquid nitrogen - dead tissue
- cryopreservation in dry ice - dead tissue
- cryopreservation in liquid nitrogen - live cells
- formalin fixed paraffin embedded - ffpe
- formalin fixed-unbuffered
- formalin fixed-buffered
- fresh
- fresh dissociated and single cell sorted into plates in np40 buffer
- oct
- snap frozen
- frozen
- negative 80 deg c
- liquid nitrogen
- fresh dissociated
- fresh dissociated and single cell sorted
- fresh dissociated and single cell sorted into plates
- methacarn fixed paraffin embedded - mfpe
- unknown
- not reported
Analyte Type
- Biospecimen
The kind of molecular specimen analyte: a molecular derivative (e.g. RNA, DNA, or protein lysate) obtained from a specimen
True
- Biospecimen is "Analyte"
String
- cfdna analyte
- dna analyte
- rna analyte
- total rna analyte
- tissue block analyte
- tissue section analyte
- pbmcs or plasma or serum analyte
- cdna libraries analyte
- pbmcs
- plasma
- serum analyte
- lipid
- protein
- metabolite
Tissue Biospecimen Type
- Biospecimen
Tissue biospecimen
False
String
Blood Biospecimen Type
- Biospecimen
Blood biospecimen
False
String
Analyte Biospecimen Type
- Biospecimen
A molecular derivative (e.g. RNA, DNA, or protein lysate) obtained from a specimen
False
String
Urine Biospecimen Type
- Biospecimen
Urine biospecimen
False
String
Bone Marrow Biospecimen Type
- Biospecimen
Bone Marrow biospecimen
False
String
Fixation Duration
- Biospecimen
The length of time, from beginning to end, required to process or preserve biospecimens in fixative (measured in minutes)
True
- Biospecimen is "Analyte"
String
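
The Required, Conditional If, and Valid Values columns above can be read as simple checks on a record. The sketch below illustrates that reading for a hand-copied subset of attributes only. It is not a submission validator and does not reproduce the full data model. Two points are assumptions: that a condition such as Biospecimen is "Analyte" refers to the Biospecimen Type value "analyte biospecimen type", and that values are compared case-insensitively. The `validate` helper and the sample record are hypothetical.

```python
# Subset of unconditionally required attributes, copied from the table above.
REQUIRED = ["HTAN Biospecimen ID", "HTAN Parent ID", "Timepoint Label", "Biospecimen Type"]

# Subset of controlled vocabularies, copied from the Valid Values entries.
VALID_VALUES = {
    "Histology Assessment By": {"pathologist", "research scientist", "other", "unknown"},
    "Histology Assessment Medium": {"digital", "microscopy", "other", "unknown"},
}

# Attribute -> Biospecimen Type value that makes it required.
# Assumption: 'Biospecimen is "Analyte"' maps to "analyte biospecimen type".
CONDITIONAL = {"Analyte Type": "analyte biospecimen type"}


def validate(record):
    """Return a list of human-readable problems found in one record (a dict)."""
    problems = []
    for name in REQUIRED:
        if not record.get(name):
            problems.append(f"missing required attribute: {name}")
    for name, allowed in VALID_VALUES.items():
        value = record.get(name)
        if value and value.lower() not in allowed:
            problems.append(f"invalid value for {name}: {value}")
    biospecimen_type = record.get("Biospecimen Type", "").lower()
    for name, trigger in CONDITIONAL.items():
        if biospecimen_type == trigger and not record.get(name):
            problems.append(f"{name} is required when Biospecimen Type is {trigger}")
    return problems


# Illustrative record with one invalid value and one missing conditional attribute.
record = {
    "HTAN Biospecimen ID": "HTA3_3000_3",
    "HTAN Parent ID": "HTA3_3000_1",
    "Timepoint Label": "Baseline",
    "Biospecimen Type": "analyte biospecimen type",
    "Histology Assessment By": "technician",
}
for problem in validate(record):
    print(problem)
```

Returning a list of problems rather than raising on the first one lets a reviewer see every issue in a record at once.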