genotyping
Purpose
Performs cluster-level and spot-level genotype inference from UMI counts and priors.
Upstream
- cluster
- prior
- cell_num
Required inputs
- spot_count_file
- prior_file
- cluster information
- cell_num
Input interpretation
| Input key | Source step/config | Required | Interpretation |
|---|---|---|---|
| spot_count_file | umi_combine manifest output | Yes | Chunk manifest pointing to spot-level count parquet files used for genotype inference. |
| prior_file | prior output (or fixed/empty mode) | Yes | Prior-frequency table for genotype calculations; behavior differs if fixed/empty prior is used. |
| cluster | cluster output (cluster_file) | Yes | Spot-to-cluster mapping used for cluster-level aggregation before individual calls. |
| cell_num | steps.cell_number / context | Yes | Cell-number support used in spot-level genotype refinement logic. |
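To make the role of the cluster input more concrete, the sketch below sums spot-level UMI counts into cluster-level totals through a spot-to-cluster mapping. The record layout (spot, variant, reference UMIs, alternate UMIs) and the function name `aggregate_to_clusters` are assumptions made for illustration; the step's actual parquet schema and aggregation logic are not described on this page.

```python
from collections import defaultdict


def aggregate_to_clusters(spot_counts, spot_to_cluster):
    """Sum per-spot (ref, alt) UMI counts into per-cluster totals.

    spot_counts: iterable of (spot_id, variant_id, ref_umis, alt_umis).
        This layout is hypothetical, not the pipeline's real schema.
    spot_to_cluster: dict mapping spot_id -> cluster_id.
    Spots that are missing from the mapping are skipped.
    """
    totals = defaultdict(lambda: [0, 0])
    for spot_id, variant_id, ref_umis, alt_umis in spot_counts:
        cluster_id = spot_to_cluster.get(spot_id)
        if cluster_id is None:
            continue  # unassigned spot: contributes to no cluster
        totals[(cluster_id, variant_id)][0] += ref_umis
        totals[(cluster_id, variant_id)][1] += alt_umis
    # Freeze the running totals into plain tuples for the caller.
    return {key: tuple(value) for key, value in totals.items()}
```

Skipping unmapped spots is one possible design choice; a real implementation might instead raise an error or collect them in a separate bucket.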
Parameters (steps.genotyping)
| Parameter | Type | Typical/default | Interpretation |
|---|---|---|---|
| alpha | float | 0.05 | Statistical significance threshold used in allele-level filtering logic. |
| epsQ | int | 20 | Quality-to-error conversion scale for UMI/read evidence aggregation. |
| epsAF | float | 0.003 | Allele-frequency error floor used during cluster allele filtering. |
| mu | float | 1e-5 | Prior mutation-rate term used in individual genotype inference. |
| thr_dp | int | 1000 | Depth threshold for robust genotype calling/retention. |
| pop_vaf | float | 1e-5 | Population-AF threshold used in genotype filtering logic. |
| filter_oneallele | bool | true | If true, applies one-allele style filtering for stricter genotype selection. |
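One way to keep these defaults in a single place is a small typed container that reads overrides from the `steps.genotyping` section of a nested config dictionary. This is only a sketch: the class `GenotypingParams`, the helper `params_from_config`, and the sanity ranges in `__post_init__` are assumptions for illustration, and only the parameter names and default values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenotypingParams:
    # Defaults copied from the parameter table.
    alpha: float = 0.05
    epsQ: int = 20
    epsAF: float = 0.003
    mu: float = 1e-5
    thr_dp: int = 1000
    pop_vaf: float = 1e-5
    filter_oneallele: bool = True

    def __post_init__(self):
        # Assumed sanity ranges; the real step may accept other values.
        if not 0.0 < self.alpha < 1.0:
            raise ValueError("alpha must lie strictly between 0 and 1")
        for name in ("epsAF", "mu", "pop_vaf"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1]")
        if self.epsQ <= 0 or self.thr_dp < 0:
            raise ValueError("epsQ must be positive and thr_dp non-negative")


def params_from_config(config):
    """Build parameters from config['steps']['genotyping'], using defaults
    for anything not overridden. Unknown keys raise TypeError."""
    overrides = config.get("steps", {}).get("genotyping", {})
    return GenotypingParams(**overrides)
```

For example, `params_from_config({"steps": {"genotyping": {"thr_dp": 2000}}})` changes only `thr_dp` and leaves the other fields at their table defaults.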
Tuning notes
- alpha, epsAF, and mu jointly control strictness of candidate retention.
- Raise thr_dp for more conservative calls on noisy/high-depth data.
- Lower pop_vaf for stricter rare-variant emphasis.
- Keep filter_oneallele=true unless you explicitly want a more permissive candidate set.
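The tuning directions above can be captured as a derived parameter set instead of hand-edited values. The sketch below starts from the documented defaults and applies a stricter profile. The helper `conservative_profile` and its scaling factors (`depth_factor`, `vaf_factor`) are hypothetical, chosen only to show the direction of each adjustment, not recommended values.

```python
# Defaults as listed in the steps.genotyping parameter table.
DEFAULTS = {
    "alpha": 0.05,
    "epsQ": 20,
    "epsAF": 0.003,
    "mu": 1e-5,
    "thr_dp": 1000,
    "pop_vaf": 1e-5,
    "filter_oneallele": True,
}


def conservative_profile(base=DEFAULTS, depth_factor=2, vaf_factor=0.1):
    """Return a stricter copy of `base` without modifying it.

    depth_factor and vaf_factor are illustrative knobs, not pipeline options.
    """
    tuned = dict(base)
    # Raise thr_dp for more conservative calls on noisy/high-depth data.
    tuned["thr_dp"] = int(base["thr_dp"] * depth_factor)
    # Lower pop_vaf for stricter rare-variant emphasis.
    tuned["pop_vaf"] = base["pop_vaf"] * vaf_factor
    # Keep one-allele filtering on for the stricter candidate set.
    tuned["filter_oneallele"] = True
    return tuned
```

Returning a copy keeps the baseline intact, so a default run and a conservative run can be compared side by side.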
Outputs
Main outputs:
- ind_geno_filter_file
- ind_geno_filter_mutation_list
- germline_file
- cluster_vaf_file
- spot_geno_file
When run.keep_intermediates is true, extra intermediate count/genotype files are also emitted.
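Because later steps consume these outputs, a quick completeness check after the step can catch a partial run early. The sketch below assumes the step's results are available as a mapping from output key to file path; that shape, and the helper name `missing_outputs`, are assumptions rather than a documented interface.

```python
# Main output keys as listed for this step.
MAIN_OUTPUT_KEYS = (
    "ind_geno_filter_file",
    "ind_geno_filter_mutation_list",
    "germline_file",
    "cluster_vaf_file",
    "spot_geno_file",
)


def missing_outputs(outputs):
    """Return the main output keys that are absent or empty in `outputs`.

    outputs: dict mapping output key -> path string (assumed shape).
    Extra keys, such as intermediates, are ignored.
    """
    return [key for key in MAIN_OUTPUT_KEYS if not outputs.get(key)]
```

An empty return value means every main output key has a non-empty entry; the check does not verify that the files exist on disk.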
Notes
- This step is the key bridge from Step 3 count/prior construction to Step 5 multi-feature extraction.
- Downstream feature steps all depend directly on these genotype outputs.