HTAN Spatial Transcriptomics

HTAN supports several spatial sequencing modalities. The set of supported modalities grows as new data are generated.

Spatial transcriptomic sequencing data are divided into four primary levels coupled with an auxiliary set of files used in or generated by processing workflows for spatial transcriptomics:

Level | Definition | Example Data
1 | Raw sequencing data | FASTQs
2 | Aligned primary data | Aligned BAMs
3 | Derived biomolecular data mapped to image positions | Barcodes, features, filtered and unfiltered matrices
4 | In progress | TBD
Auxiliary | Additional data such as imaging, QC, and JSON scale factors | TIFF, JPG, PNG, JSON, HTML, etc.

Attributes

WARNING: Manifests provided on this page are for reference only. DO NOT USE THESE MANIFESTS FOR DATA SUBMISSION.

Directions

The interactive tables below are provided to help users understand the HTAN Data Model. The tables allow a user to view, search or download attributes either:

  1. in a specific manifest; or
  2. in all manifests represented on this page.

To view a specific manifest, click on the link in the Manifests tab. The manifest will appear in a new tab on the page. Navigate to the new tab to search for attributes or download the manifest.
To search for attributes across all manifests, navigate to the All Attributes tab and use the search box provided at the top of the tab. All attributes can also be downloaded as a CSV file.
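The same kind of search can be run offline against the downloaded attributes file. The sketch below is a minimal illustration, not part of any HTAN tooling: it assumes the CSV carries `Attribute`, `Manifest Name`, and `Description` columns matching the table headers on this page, and it uses a small in-memory excerpt (`SAMPLE_CSV`, hypothetical) in place of the real download.

```python
import csv
import io


def search_attributes(csv_file, term):
    """Return rows whose Attribute or Description contains term (case-insensitive)."""
    needle = term.lower()
    hits = []
    for row in csv.DictReader(csv_file):
        # Column names are an assumption based on the table headers on this page.
        haystack = f"{row.get('Attribute') or ''} {row.get('Description') or ''}".lower()
        if needle in haystack:
            hits.append(row)
    return hits


# Hypothetical excerpt standing in for the downloaded file.
SAMPLE_CSV = """Attribute,Manifest Name,Description,Required,Data Type
UMI Tag,Slide-seq Level 2,SAM tag for the UMI field,True,String
Checksum,Slide-seq Level 2,MD5 checksum of the BAM file,True,String
Median UMI Counts per Spot,Slide-seq Level 3,The median number of UMI counts per tissue covered spot.,True,String
"""

for row in search_attributes(io.StringIO(SAMPLE_CSV), "umi"):
    print(row["Attribute"], "|", row["Manifest Name"])
```

To search a real download, pass a file opened with `open(path, newline="")` instead of the `io.StringIO` object, and adjust the column names if the file uses different headers.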

Manifest | Description
10x Visium Spatial Transcriptomics - RNA-seq Level 1 | Files contain raw RNA-seq data associated with spot/slide data.
10x Visium Spatial Transcriptomics - RNA-seq Level 2 | Alignment workflows downstream of Spatial Transcriptomics RNA-seq Level 1.
10x Visium Spatial Transcriptomics - RNA-seq Level 3 | Processed data files based on Spatial Transcriptomics RNA-seq Level 2 and Spatial Transcriptomics Auxiliary files.
10x Visium Spatial Transcriptomics - Auxiliary Files | Auxiliary data associated with spot/slide analysis (aligned images, quality control files, etc.) from Spatial Transcriptomics.
NanoString GeoMx DSP Spatial Transcriptomics Level 1 | Files contain raw data output from the NanoString GeoMx DSP Pipeline. These can include RCC or DCC files.
NanoString GeoMx DSP Spatial Transcriptomics Level 3 | Files contain processed data from the NanoString GeoMx DSP Pipeline. This level depends on GeoMx Level 1 and Imaging Level 2.
10X Genomics Xenium ISS Experiment | All data pertaining to the 10X Genomics Xenium In-Situ Hybridization experiment.
Nanostring CosMx SMI Experiment | RNA and Protein Panel assays applied as part of Nanostring CosMx Spatial Molecular Imager (SMI).
Slide-seq Level 1 | Raw sequencing files for the Slide-seq assay.
Slide-seq Level 2 | Aligned sequencing files and QC for the Slide-seq assay.
Slide-seq Level 3 | Gene matrices with features and barcodes for Slide-seq as well as spatial information (bead location files).
NanoString GeoMx DSP ROI DCC Segment Annotation Metadata | GeoMx ROI and Segment Metadata Attributes. The assayed biospecimen should be reported one per row with the associated ROI coordinates.
NanoString GeoMx DSP ROI RCC Segment Annotation Metadata | GeoMx ROI and Segment Metadata Attributes. The assayed biospecimen should be reported one per row with the associated ROI coordinates.
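Several of the manifest descriptions above state which earlier level a manifest builds on. As a rough sketch, those stated dependencies can be recorded in a small lookup and walked to find everything upstream of a given level. The short names below are taken from the descriptions; the key `GeoMx Level 3` is an assumption (the description says only "this level"), and manifests whose dependencies are not stated on this page are left out rather than guessed.

```python
# Dependencies exactly as stated in the manifest descriptions on this page.
# "GeoMx Level 3" as the name of the processed GeoMx level is an assumption.
DEPENDS_ON = {
    "Spatial Transcriptomics RNA-seq Level 2": [
        "Spatial Transcriptomics RNA-seq Level 1",
    ],
    "Spatial Transcriptomics RNA-seq Level 3": [
        "Spatial Transcriptomics RNA-seq Level 2",
        "Spatial Transcriptomics Auxiliary",
    ],
    "GeoMx Level 3": ["GeoMx Level 1", "Imaging Level 2"],
}


def upstream(manifest):
    """Return every manifest that the given one depends on, directly or indirectly."""
    seen = []
    stack = list(DEPENDS_ON.get(manifest, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.append(parent)
            stack.extend(DEPENDS_ON.get(parent, []))
    return seen


print(sorted(upstream("Spatial Transcriptomics RNA-seq Level 3")))
```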
Attribute
Manifest Name
Description
Required
Conditional If
Data Type
Valid Values
Filename
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- 10x Visium Spatial Transcriptomics - Auxiliary Files
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
- NanoString GeoMx DSP Spatial Transcriptomics Level 3
- 10X Genomics Xenium ISS Experiment
- Nanostring CosMx SMI Experiment
- Slide-seq Level 1
- Slide-seq Level 2
- Slide-seq Level 3
Name of a file
True
String
Run ID
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- 10x Visium Spatial Transcriptomics - Auxiliary Files
- Slide-seq Level 3
A unique identifier for this individual run (typically associated with a single slide) of the spatial transcriptomic processing workflow.
True
String
File Format
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- 10x Visium Spatial Transcriptomics - Auxiliary Files
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
- NanoString GeoMx DSP Spatial Transcriptomics Level 3
- 10X Genomics Xenium ISS Experiment
- Nanostring CosMx SMI Experiment
- Slide-seq Level 1
- Slide-seq Level 2
- Slide-seq Level 3
Format of a file (e.g. txt, csv, fastq, bam)
True
String
- hdf5
- bedgraph
- idx
- idat
- bam
- bai
- excel
- powerpoint
- tif
- tiff
- ome-tiff
- png
- doc
- pdf
- fasta
- fastq
- sam
- vcf
- bcf
- maf
- bed
- chp
- cel
- sif
- tsv
- csv
- txt
- plink
- bigwig
- wiggle
- gct
- bgzip
- zip
- seg
- html
- mov
- hyperlink
- svs
- md
- flagstat
- gtf
- raw
- msf
- rmd
- bed narrowpeak
- bed broadpeak
- bed gappedpeak
- avi
- pzfx
- fig
- xml
- tar
- r script
- abf
- bpm
- dat
- jpg
- locs
- sentrix descriptor file
- python script
- sav
- gzip
- sdf
- rdata
- hic
- ab1
- 7z
- gff3
- json
- sqlite
- svg
- sra
- recal
- tranches
- mtx
- tagalign
- dup
- dicom
- czi
- mex
- cloupe
- am
- cell am
- mpg
- m
- mzml
- scn
- dcc
- rcc
- pkc
- sf
- bedpe
HTAN Parent Biospecimen ID
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- 10x Visium Spatial Transcriptomics - Auxiliary Files
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
- NanoString GeoMx DSP Spatial Transcriptomics Level 3
- 10X Genomics Xenium ISS Experiment
- Nanostring CosMx SMI Experiment
- Slide-seq Level 1
- NanoString GeoMx DSP ROI DCC Segment Annotation Metadata
- NanoString GeoMx DSP ROI RCC Segment Annotation Metadata
HTAN Biospecimen Identifier (e.g. HTANx_yyy_zzz) indicating the biospecimen(s) from which these files were derived; multiple parent biospecimens should be comma-separated
True
- Is lowest level is "Yes - Is lowest level"
String
HTAN Data File ID
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- 10x Visium Spatial Transcriptomics - Auxiliary Files
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
- NanoString GeoMx DSP Spatial Transcriptomics Level 3
- 10X Genomics Xenium ISS Experiment
- Nanostring CosMx SMI Experiment
- Slide-seq Level 1
- Slide-seq Level 2
- Slide-seq Level 3
Self-identifier for this data file: the HTAN ID of this file, following the HTAN ID SOP (e.g. HTANx_yyy_zzz)
True
String
Read Indicator
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- Slide-seq Level 1
Indicate if this is Read 1 (R1), Read 2 (R2), Index Reads (I1), or Other
True
String
- r1
- r2
- r1&r2
- i1
- other
Spatial Read1
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- Slide-seq Level 1
Read 1 content description
True
String
- cdna
- spatial barcode and umi
Spatial Read2
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- Slide-seq Level 1
Read 2 content description
True
String
- cdna
- spatial barcode and umi
Spatial Library Construction Method
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- Slide-seq Level 1
Process which results in the creation of a library from fragments of DNA using cloning vectors or oligonucleotides with the role of adaptors [OBI_0000711]
True
String
- smart-seq2
- smart-seqv4
- 10xv1.0
- 10xv1.1
- 10xv2
- 10xv3
- 10xv3.1
- drop-seq
- indropsv2
- indropsv3
- trudrop
- nextera xt
Library Preparation Days from Index
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- Slide-seq Level 1
Number of days between the date the sample for the assay was received in the lab and the date the libraries were prepared for sequencing [number]. If not applicable, enter 'Not Applicable'
False
String
Sequencing Library Construction Days from Index
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- Slide-seq Level 1
Number of days between the date the sample for the assay was received in the lab and the day of sequencing library construction [number]. If not applicable, enter 'Not Applicable'
True
String
End Bias
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- Slide-seq Level 1
The end of the cDNA molecule that is preferentially sequenced, e.g. the 3 prime or 5 prime tag/end, or the full-length transcript
True
String
- 3 prime
- 5 prime
- full length transcript
Reverse Transcription Primer
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- Slide-seq Level 1
An oligo to which new deoxyribonucleotides can be added by DNA polymerase [SO_0000112]. The type of primer used for reverse transcription, e.g. oligo-dT or random primer. This allows users to identify content of the cDNA library input e.g. enriched for mRNA
True
String
- oligo-dt
- poly-dt
- feature barcoding
- random
Sequencing Platform
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- Slide-seq Level 1
A platform is an object aggregate that is the set of instruments and software needed to perform a process [OBI_0000050]. Specific model of the sequencing instrument.
True
String
- illumina next seq 500
- illumina next seq 550
- illumina next seq 2500
- illumina novaseq 6000
- illumina miseq
- 454 gs flx titanium
- ab solid 4
- ab solid 2
- ab solid 3
- complete genomics
- illumina hiseq x ten
- illumina hiseq x five
- illumina genome analyzer ii
- illumina genome analyzer iix
- illumina hiseq 2000
- illumina hiseq 2500
- illumina hiseq 4000
- illumina nextseq
- ion torrent pgm
- ion torrent proton
- ion torrent s5
- pacbio rs
- novaseq 6000
- novaseqs4
- ultima genomics ug100
- oxford nanopore minion
- gridion
- promethion
- pacbio sequel2
- revio
- illumina nextseq 1000
- illumina nextseq 2000
- other
- unknown
- not reported
Capture Area
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- 10x Visium Spatial Transcriptomics - Auxiliary Files
Area (or Capture Area): one of the four or two active regions where tissue can be placed on a Visium slide. Each area is intended to contain only one tissue sample. Slide areas are named consecutively from top to bottom: A1, B1, C1, D1 for Visium slides with a 6.5 mm Capture Area, and A, B for CytAssist slides with an 11 mm Capture Area. Both CytAssist slides with a 6.5 mm Capture Area and Gateway Slides contain only two slide areas, A1 and D1.
False
String
- a
- b
- c
- d
- a1
- b1
- c1
- d1
Slide Version
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
Version of imaging slide used. Slide version is critical for the analysis of the sequencing data as different slides have different capture area layouts.
False
String
- v1
- v2
- v3
- v4
Slide ID
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
- 10x Visium Spatial Transcriptomics - Auxiliary Files
- 10X Genomics Xenium ISS Experiment
- Nanostring CosMx SMI Experiment
For Visium, the unique identifier printed on the label of each Visium slide. The serial number starts with V, followed by a number from one through five, and ends with a dash and a three-digit number, such as 123. For CosMx, this is the loaded Flow Cell ID. For Xenium, this ID indicates the slide orientation, as it matches the relative location of the ID on the physical Xenium slide.
False
String
Image Re-orientation
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
Whether the image was re-oriented. To ensure good fiducial alignment and tissue spot detection, it is important to correct for any shift in image orientation.
False
String
- true
- false
Permeabilization Time
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
Fixed and stained tissue sections are permeabilized for different times. Each Capture Area captures polyadenylated mRNA from the attached tissue section. Measure is provided in minutes.
False
String
RIN
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
A numerical assessment of the integrity of RNA based on the entire electrophoretic trace of the RNA sample, including the presence or absence of degradation products [number].
False
String
DV200
- 10x Visium Spatial Transcriptomics - RNA-seq Level 1
Represents the percentage of RNA fragments that are >200 nucleotides in size [number].
False
String
Checksum
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- Slide-seq Level 2
MD5 checksum of the BAM file
True
String
HTAN Parent Data File ID
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- 10x Visium Spatial Transcriptomics - Auxiliary Files
- Slide-seq Level 2
- Slide-seq Level 3
HTAN Data File Identifier indicating the file(s) from which these files were derived
True
String
UMI Tag
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- Slide-seq Level 2
SAM tag for the UMI field; please provide a valid UMI tag (e.g. UB:Z or UR:Z)
True
String
Spatial Barcode Tag
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- Slide-seq Level 2
SAM tag for spot barcode field; please provide a valid spot barcode tag (e.g. CB:Z)
True
String
Applied Hard Trimming
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- Slide-seq Level 2
Whether hard trimming was applied
True
String
- yes - applied hard trimming
- no
Workflow Version
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- 10x Visium Spatial Transcriptomics - Auxiliary Files
- Slide-seq Level 2
- Slide-seq Level 3
Major version of the workflow (e.g. Cell Ranger v3.1)
True
String
Genomic Reference
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- Slide-seq Level 2
Exact version of the human genome reference used in the alignment of reads (e.g. GCF_000001405.39)
True
- Pseudo Alignment Used is "Yes - Pseudo Alignment Used"
String
Genomic Reference URL
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- Slide-seq Level 2
Link to human genome sequence (e.g. ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_34/GRCh38.primary_assembly.genome.fa.gz)
True
- Pseudo Alignment Used is "Yes - Pseudo Alignment Used"
String
Genome Annotation URL
- 10x Visium Spatial Transcriptomics - RNA-seq Level 2
- Slide-seq Level 2
Link to the human genome annotation (GTF) file (e.g. ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_34/gencode.v34.annotation.gtf.gz)
True
String
Visium File Type
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- 10x Visium Spatial Transcriptomics - Auxiliary Files
The file type generated for the Visium experiment.
True
String
- reference png
- reference jpg
- json scale factors
- probe dataset csv
- qc result html
- filtered mex
- unfiltered mex
- tissue_positions
- barcodes
- features
- fiducial image png
- fiducial image jpg
- detected image png
- detected jpg
- high res image
- low res image
Spots under tissue
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
The number of barcodes associated with a spot under tissue.
True
String
Mean Reads per Spatial Spot
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
The number of reads, both under and outside of tissue, divided by the number of barcodes associated with a spot under tissue.
True
String
Median Number Genes per Spatial Spot
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- Slide-seq Level 3
The median number of genes detected per spot under tissue-associated barcode. Detection is defined as the presence of at least 1 UMI count.
True
String
Sequencing Saturation
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- NanoString GeoMx DSP ROI DCC Segment Annotation Metadata
The fraction of reads originating from an already-observed UMI. This is a function of library complexity and sequencing depth. More specifically, this is the fraction of confidently mapped, valid spot-barcode, valid UMI reads that had a non-unique (spot-barcode, UMI, gene).
True
String
Proportion Reads Mapped
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
Proportion of mapped reads collected from samtools. Number
False
String
Proportion Reads Mapped to Transcriptome
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
Fraction of reads that mapped to a unique gene in the transcriptome. The read must be consistent with annotated splice junctions. These reads are considered for UMI counting.
True
String
Median UMI Counts per Spot
- 10x Visium Spatial Transcriptomics - RNA-seq Level 3
- Slide-seq Level 3
The median number of UMI counts per tissue-covered spot.
True
String
Synapse ID of GeoMx DSP PKC File
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
The Synapse ID(s) associated with the PKC mapping file for the assay. Multiple files are listed as comma-separated values.
True
String
GeoMx DSP NGS Sequencing Platform
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
A platform is an object aggregate that is the set of instruments and software needed to perform a process [OBI_0000050]. Specific model of the sequencing instrument.
False
String
GeoMx DSP NGS Library Selection Method
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
How RNA molecules are isolated.
False
String
GeoMx DSP NGS Library Preparation Kit Name
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
Name of Library Preparation Kit. String
False
String
GeoMx DSP Library Preparation Kit Vendor
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
Vendor of Library Preparation Kit. String
False
String
GeoMx DSP Library Preparation Kit Version
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
Version of Library Preparation Kit. String
False
String
Synapse ID of GeoMx Lab Worksheet File
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
Synapse ID(s) of Lab Worksheet Files output from the GeoMx DSP workflow. Multiple files are listed as comma-separated values.
False
String
Software and Version
- NanoString GeoMx DSP Spatial Transcriptomics Level 1
- 10X Genomics Xenium ISS Experiment
- Nanostring CosMx SMI Experiment
Name of software used to generate expression values. String
True
- Pseudo Alignment Used is "Yes - Pseudo Alignment Used"
String
GeoMx DSP Assay Type
- NanoString GeoMx DSP Spatial Transcriptomics Level 3
The assay type which was used for the GeoMx DSP pipeline.
True
String
- rna ncounter
- protein ncounter
- protein ngs
- rna ngs
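The attribute rules above lend themselves to simple pre-submission checks. The sketch below illustrates this for a handful of Level 2 attributes only; it is not an official HTAN validator. It assumes a manifest row is available as a dictionary keyed by attribute name, that valid values compare case-insensitively, and that tags follow the two-character `TAG:TYPE` form from the SAM format (as in the `UB:Z` and `CB:Z` examples). The function names and the `REQUIRED` subset are illustrative.

```python
import re

# An MD5 digest is 32 hexadecimal characters.
MD5_RE = re.compile(r"[0-9a-fA-F]{32}")
# SAM optional-field tag plus type, e.g. UB:Z or CB:Z.
SAM_TAG_RE = re.compile(r"[A-Za-z][A-Za-z0-9]:[AifZHB]")

# Valid values copied from the attribute table above.
HARD_TRIMMING = {"yes - applied hard trimming", "no"}
CAPTURE_AREAS = {"a", "b", "c", "d", "a1", "b1", "c1", "d1"}

# Subset of attributes marked Required = True for Level 2 manifests.
REQUIRED = [
    "Filename", "Checksum", "HTAN Parent Data File ID",
    "UMI Tag", "Spatial Barcode Tag", "Applied Hard Trimming",
]


def split_ids(value):
    """Split a comma-separated ID field into a list of trimmed IDs."""
    return [part.strip() for part in value.split(",") if part.strip()]


def check_level2_row(row):
    """Return a list of human-readable problems found in one manifest row."""
    problems = []
    for name in REQUIRED:
        if not row.get(name, "").strip():
            problems.append(f"missing required attribute: {name}")
    checksum = row.get("Checksum", "").strip()
    if checksum and not MD5_RE.fullmatch(checksum):
        problems.append("Checksum is not a 32-character hexadecimal MD5 digest")
    for name in ("UMI Tag", "Spatial Barcode Tag"):
        tag = row.get(name, "").strip()
        if tag and not SAM_TAG_RE.fullmatch(tag):
            problems.append(f"{name} is not of the form TAG:TYPE")
    trimming = row.get("Applied Hard Trimming", "").strip().lower()
    if trimming and trimming not in HARD_TRIMMING:
        problems.append("Applied Hard Trimming is not a valid value")
    # Capture Area is optional, so it is checked only when present.
    area = row.get("Capture Area", "").strip().lower()
    if area and area not in CAPTURE_AREAS:
        problems.append("Capture Area is not a valid value")
    return problems


example_row = {
    "Filename": "sample.bam",  # hypothetical file name
    "Checksum": "d41d8cd98f00b204e9800998ecf8427e",
    "HTAN Parent Data File ID": "HTANx_yyy_zzz",
    "UMI Tag": "UB:Z",
    "Spatial Barcode Tag": "CB:Z",
    "Applied Hard Trimming": "No",
    "Capture Area": "A1",
}
print(check_level2_row(example_row))
print(split_ids("HTANx_yyy_zzz, HTANx_yyy_www"))
```

Returning a list of problems rather than raising on the first one lets a submitter see every issue in a row at once; an empty list means none of these particular checks failed, not that the row is valid against the full data model.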