Algorithm Walkthrough¶
This page explains how SpaceTracer converts spatial transcriptomics alignments into lineage-informative somatic SNV evidence.
Pipeline design¶
SpaceTracer is organized as a directed acyclic graph (DAG) of processing steps. Each step produces context outputs consumed by downstream steps.
The current core execution flow (aligned to SpaceTracer/steps) is:
- step0_cluster.py -> cluster
- step1_bam_processing.py -> bam_processing
- step2_mpileup.py -> mpileup
- step3_UMI_combine.py -> umi_combine
- step3_cell_number.py -> cell_num
- step3_get_prior.py -> prior
- step4_genotyping.py -> genotyping
- step5_spatial_feature.py, step5_mappability_feature.py, step5_read_feature.py, step5_RNA_level_feature.py -> feature branches
- step5_phasing.py -> phasing
- step6_merge_all_features.py -> merge_feature
- step7_mutation_prediction.py -> mutation_prediction
For a detailed per-step reference (inputs, parameters, and outputs), see Step Reference Overview. For practical rerun/debug patterns, see Single-Step Debug Cookbook.
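To make the DAG idea concrete, here is a minimal sketch that groups the step names from the flow above into execution waves using Python's standard `graphlib`. The dependency edges are assumptions inferred from the listed order, not read from the SpaceTracer source, and `execution_waves` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

FEATURE_STEPS = {"spatial_feature", "mappability_feature", "read_feature", "RNA_feature"}

# Hypothetical dependency map: step name -> set of steps it depends on.
# These edges are assumptions inferred from the documented flow.
DEPENDENCIES = {
    "cluster": set(),
    "bam_processing": {"cluster"},
    "mpileup": {"bam_processing"},
    "umi_combine": {"mpileup"},
    "cell_num": {"mpileup"},
    "prior": {"mpileup"},
    "genotyping": {"umi_combine", "cell_num", "prior"},
    **{name: {"genotyping"} for name in FEATURE_STEPS},
    "phasing": {"genotyping"},
    "merge_feature": FEATURE_STEPS | {"phasing"},
    "mutation_prediction": {"merge_feature"},
}


def execution_waves(deps):
    """Group steps into waves; steps within one wave do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all steps whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


WAVES = execution_waves(DEPENDENCIES)
for index, wave in enumerate(WAVES):
    print(index, wave)
```

Under these assumed edges, the four feature steps and phasing land in the same wave, which is the kind of independence that allows them to run in parallel.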
Step-by-step meaning¶
1) cluster¶
Builds or loads spot/domain grouping information used in downstream genotype and spatial inference.
2) bam_processing¶
Prepares BAM-level data for robust pileup and candidate detection (sorting/filtering/index-friendly preprocessing).
3) mpileup¶
Generates base-level evidence from aligned reads across the genome/chunks.
4) umi_combine¶
Aggregates read evidence at UMI level to reduce read-level technical noise and improve confidence.
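As an illustration of UMI-level aggregation, the sketch below collapses read-level base calls at one genomic position into a single consensus base per (spot, UMI) molecule by majority vote. The tuple layout, the tie-handling rule, and the function name `collapse_by_umi` are assumptions for illustration; the actual umi_combine step may use a different rule.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Return one consensus base per (spot_barcode, umi) molecule.

    `reads` is an iterable of (spot_barcode, umi, base) tuples observed at a
    single genomic position (an assumed, simplified input layout).
    """
    per_molecule = defaultdict(Counter)
    for spot, umi, base in reads:
        per_molecule[(spot, umi)][base] += 1

    consensus = {}
    for key, counts in per_molecule.items():
        (top_base, top_count), *rest = counts.most_common()
        # Assumed rule: discard molecules whose top two bases are tied.
        if rest and rest[0][1] == top_count:
            continue
        consensus[key] = top_base
    return consensus


reads = [
    ("SPOT1", "AAAC", "A"),
    ("SPOT1", "AAAC", "A"),
    ("SPOT1", "AAAC", "G"),  # minority call, outvoted within the molecule
    ("SPOT1", "CCGT", "G"),
    ("SPOT2", "TTAG", "A"),
    ("SPOT2", "TTAG", "G"),  # tie, so this molecule is dropped
]
CONSENSUS = collapse_by_umi(reads)
print(CONSENSUS)
```

Counting molecules instead of reads means that a single amplified error contributes at most one observation, which is the noise reduction this step is after.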
5) cell_num¶
Estimates spot/cell-level support statistics needed for later probabilistic genotyping.
6) prior¶
Builds prior information for mutation likelihood estimation.
7) genotyping¶
Combines evidence and priors to infer genotype-level mutation signals.
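One way to picture how evidence and priors combine is a two-hypothesis Bayesian update on UMI counts. The sketch below is a generic binomial model, not SpaceTracer's documented genotyper; the function name, `error_rate`, and `mutant_vaf` are illustrative assumptions.

```python
from math import comb


def genotype_posterior(alt_count, total_count, prior_mutant,
                       error_rate=0.01, mutant_vaf=0.5):
    """Posterior probability that a spot carries the variant.

    Assumed two-hypothesis model:
      reference: alternate-allele UMIs arise only from errors (error_rate)
      mutant:    alternate-allele UMIs arise at allele fraction mutant_vaf
    """
    def binomial(k, n, p):
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    like_ref = binomial(alt_count, total_count, error_rate)
    like_mut = binomial(alt_count, total_count, mutant_vaf)
    # Bayes' rule: weight each likelihood by its prior and normalize.
    evidence = like_mut * prior_mutant + like_ref * (1 - prior_mutant)
    return like_mut * prior_mutant / evidence


# Three of six UMIs support the alternate allele, with a small prior.
STRONG = genotype_posterior(alt_count=3, total_count=6, prior_mutant=1e-3)
# No alternate-allele UMIs at the same depth.
WEAK = genotype_posterior(alt_count=0, total_count=6, prior_mutant=1e-3)
print(STRONG, WEAK)
```

The prior matters most at low depth: with only one or two UMIs, the likelihoods are close and the prior largely determines the call.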
8) Feature branches¶
From genotyping outputs, SpaceTracer computes multiple complementary feature families:
- spatial_feature: neighborhood/tissue-structure signal
- mappability_feature: regional mappability/confounder signal
- read_feature: read-level quality/bias signal
- RNA_feature: RNA-level context (including expression-related cues)
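To give a feel for what a neighborhood signal can look like, the sketch below computes, for each mutant spot, the fraction of nearby spots that are also mutant. This particular statistic, the input layout, and the `radius` default are assumptions chosen for illustration; the features SpaceTracer actually computes are described in the step reference.

```python
from math import dist


def neighbor_mutant_fraction(spots, radius=1.5):
    """Fraction of neighboring spots that are mutant, for each mutant spot.

    `spots` maps spot id -> ((x, y), is_mutant). Neighbors are all other
    spots within `radius` in the same coordinate units (assumed layout).
    """
    result = {}
    for spot_id, (xy, is_mutant) in spots.items():
        if not is_mutant:
            continue
        neighbors = [
            other_mutant
            for other_id, (other_xy, other_mutant) in spots.items()
            if other_id != spot_id and dist(xy, other_xy) <= radius
        ]
        # Isolated spots with no neighbors get 0.0 by convention here.
        result[spot_id] = sum(neighbors) / len(neighbors) if neighbors else 0.0
    return result


spots = {
    "a": ((0, 0), True),
    "b": ((1, 0), True),
    "c": ((0, 1), False),
    "d": ((5, 5), True),  # far from every other spot
}
FRACTIONS = neighbor_mutant_fraction(spots)
print(FRACTIONS)
```

A statistic of this kind separates spatially coherent support from scattered, isolated calls, which is the intuition behind using tissue structure as a feature.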
9) phasing¶
Refines candidate evidence with RNA-informed phasing information and cluster-level event summaries.
10) merge_feature¶
Merges all feature families (plus phasing outputs) into an integrated feature representation for downstream prioritization.
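Conceptually, the merge is an outer join of per-family tables on a variant key. The sketch below shows that idea with plain dictionaries; the variant key format, the column names, and the helper `merge_feature_tables` are hypothetical and do not reflect SpaceTracer's actual file formats.

```python
def merge_feature_tables(**tables):
    """Outer-join feature tables keyed by variant id.

    Each keyword argument is one feature family: {variant_id: {name: value}}.
    Output columns are prefixed with the family name; missing values are None.
    """
    merged = {}
    for family, table in tables.items():
        for variant, features in table.items():
            row = merged.setdefault(variant, {})
            for name, value in features.items():
                row[f"{family}.{name}"] = value

    # Make the result rectangular so every variant has every column.
    columns = sorted({column for row in merged.values() for column in row})
    return {
        variant: {column: row.get(column) for column in columns}
        for variant, row in merged.items()
    }


MERGED = merge_feature_tables(
    spatial={"chr1:1000:A>G": {"neighbor_fraction": 0.5}},
    read={
        "chr1:1000:A>G": {"mean_baseq": 36.0},
        "chr2:500:C>T": {"mean_baseq": 21.0},
    },
)
print(MERGED)
```

Keeping variants that lack a value in some family (rather than dropping them) lets the downstream prediction step decide how to treat missing features.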
Step detail index¶
- cluster
- bam_processing
- mpileup
- umi_combine
- cell_num
- prior
- genotyping
- spatial_feature
- mappability_feature
- read_feature
- RNA_feature
- phasing
- merge_feature
- mutation_prediction
Why this structure works¶
SpaceTracer combines orthogonal information to suppress false positives:
- read/UMI evidence reduces sequencing artifacts
- prior modeling stabilizes genotype inference
- mappability and RNA-level features handle context-specific noise
- spatial features preserve in situ biological structure
This combination improves confidence in mosaic SNV discovery for lineage analysis.
Parallel execution and checkpoints¶
SpaceTracer supports:
- parallel execution for independent feature steps
- checkpoint-aware resume (skip completed steps)
- partial execution with --start-from and --stop-at
- explicit subset execution with --only-steps (no automatic dependency completion outside the listed subset)
These capabilities make iterative analysis and parameter tuning practical on real datasets.
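The selection semantics described above can be sketched as a small function. This is an illustrative model only: the linearized `STEP_ORDER`, the function `select_steps`, and the choice to apply checkpoint skipping in both modes are assumptions, not SpaceTracer's implementation.

```python
# Assumed linearization of the pipeline, using the step names from this page.
STEP_ORDER = [
    "cluster", "bam_processing", "mpileup", "umi_combine", "cell_num", "prior",
    "genotyping", "spatial_feature", "mappability_feature", "read_feature",
    "RNA_feature", "phasing", "merge_feature", "mutation_prediction",
]


def select_steps(start_from=None, stop_at=None, only_steps=None, completed=()):
    """Return the steps to run, in pipeline order.

    start_from / stop_at: inclusive bounds on a contiguous range of steps.
    only_steps: explicit subset; dependencies are NOT added automatically.
    completed: steps with an existing checkpoint, which are skipped.
    """
    if only_steps is not None:
        wanted = set(only_steps)
        unknown = wanted - set(STEP_ORDER)
        if unknown:
            raise ValueError(f"unknown steps: {sorted(unknown)}")
        selected = [step for step in STEP_ORDER if step in wanted]
    else:
        first = STEP_ORDER.index(start_from) if start_from else 0
        last = STEP_ORDER.index(stop_at) if stop_at else len(STEP_ORDER) - 1
        selected = STEP_ORDER[first:last + 1]
    done = set(completed)
    return [step for step in selected if step not in done]


# Range mode with one step already checkpointed.
RANGE_RUN = select_steps(start_from="genotyping", stop_at="phasing",
                         completed={"genotyping"})
# Subset mode: only the listed steps, reordered to pipeline order.
SUBSET_RUN = select_steps(only_steps=["merge_feature", "mpileup"])
print(RANGE_RUN)
print(SUBSET_RUN)
```

Because subset mode does not pull in dependencies, the listed steps only succeed if their inputs already exist from an earlier run.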
Practical interpretation¶
When reading results, think in three layers:
- Evidence layer: pileup + UMI + genotype calls
- Feature layer: spatial/read/mappability/RNA features
- Integration layer: merged features for final candidate prioritization
This helps diagnose whether a candidate variant is supported by strong multi-modal evidence or likely a technical artifact.