phasing

This step refines genotype-supported candidate loci using RNA-informed phasing signals before feature merging.

Purpose

  • integrate haplotype-style evidence at candidate sites
  • reduce ambiguity in loci with mixed support
  • provide refined context for merge_feature

Upstream

  • bam_processing
  • genotyping

Required inputs

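  • filtered BAM from bam_processing (in_filter_bam)
  • upstream genotype outputs (for example ind_geno_filter_file and related context tables)
  • candidate-locus metadata generated by earlier steps
  • reference resources from resource_details (genome_fasta, gene_bed)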

Input interpretation

Input key                    Source step/config             Required  Interpretation
in_filter_bam                bam_processing output          Yes       Filtered BAM used to gather phase-supporting read evidence.
merged_germline_file         genotype aggregation output    Yes       Germline context for differentiating phased/non-phased candidate behavior.
merged_ind_geno_filter_file  genotyping output merge        Yes       Core candidate-locus table entering phasing refinement.
genome_fasta                 resource_details.genome_fasta  Yes       Reference sequence used in phasing context extraction.
gene_bed                     resource_details.gene_bed      Yes       Gene-region annotation used in phase-event summarization.
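
The required keys listed above lend themselves to a simple pre-flight check before the step runs. The following is a minimal sketch, assuming the resolved inputs are available as a plain dictionary keyed by the input names in the table. The helper name missing_phasing_inputs, the dictionary layout, and the placeholder paths are hypothetical and not part of the pipeline's own API.

```python
# Required input keys for the phasing step, copied from the input table.
REQUIRED_PHASING_INPUTS = (
    "in_filter_bam",
    "merged_germline_file",
    "merged_ind_geno_filter_file",
    "genome_fasta",
    "gene_bed",
)


def missing_phasing_inputs(inputs):
    """Return required keys that are absent or empty in `inputs`.

    Hypothetical helper: the real pipeline may resolve inputs differently.
    """
    return [key for key in REQUIRED_PHASING_INPUTS if not inputs.get(key)]


# Example with placeholder paths; gene_bed is deliberately left out.
example_inputs = {
    "in_filter_bam": "sample.filtered.bam",
    "merged_germline_file": "merged_germline.tsv",
    "merged_ind_geno_filter_file": "merged_ind_geno_filter.tsv",
    "genome_fasta": "genome.fa",
}
print(missing_phasing_inputs(example_inputs))  # ['gene_bed']
```

An empty list means every required key is present; the check only looks at the keys and does not verify that the files exist.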

Parameters (steps.phasing)

Parameter     Type   Typical/default  Interpretation
minprior      float  0.01             Minimum prior cutoff for candidate loci entering phasing refinement.
min_dp        int    20               Minimum depth per locus used in phasing evidence evaluation.
min_total_dp  int    50               Minimum total depth requirement for robust phasing decisions.
alpha         float  0.05             Statistical significance threshold in phasing-related tests.
phasing_pad   int    1000             Flanking window size used to collect nearby phasing support.
merge_gap     int    200              Max gap for merging nearby phase-related events.
max_target    int    200000           Upper cap on target-region/event processing size.
seed          int    42               Random seed for deterministic behavior where stochastic operations are used.
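
To make the defaults concrete, the parameter set can be pictured as a small typed record with user overrides layered on top. This is a sketch only: the class PhasingParams, the function params_from_config, and the nested dictionary shape config["steps"]["phasing"] are assumptions for illustration, and the range checks (for example treating minprior as a value in [0, 1]) are sanity checks rather than documented pipeline behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhasingParams:
    """Defaults copied from the steps.phasing table; the class is hypothetical."""

    minprior: float = 0.01
    min_dp: int = 20
    min_total_dp: int = 50
    alpha: float = 0.05
    phasing_pad: int = 1000
    merge_gap: int = 200
    max_target: int = 200000
    seed: int = 42

    def __post_init__(self):
        # Assumed sanity checks, not documented pipeline behavior.
        if not 0.0 <= self.minprior <= 1.0:
            raise ValueError("minprior must be in [0, 1]")
        if not 0.0 < self.alpha < 1.0:
            raise ValueError("alpha must be in (0, 1)")
        for name in ("min_dp", "min_total_dp", "phasing_pad",
                     "merge_gap", "max_target"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")


def params_from_config(config):
    """Overlay values from config['steps']['phasing'] on the defaults.

    Unknown keys raise TypeError, which surfaces typos in parameter names.
    """
    overrides = config.get("steps", {}).get("phasing", {})
    return PhasingParams(**overrides)


# Override one value and keep the remaining defaults.
params = params_from_config({"steps": {"phasing": {"min_dp": 30}}})
print(params.min_dp, params.min_total_dp, params.merge_gap)  # 30 50 200
```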

Tuning notes

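  • Increase min_dp / min_total_dp to require stronger read support; fewer low-coverage loci will pass.
  • Increase phasing_pad when phase-supporting evidence in your data tends to lie farther from the candidate locus than the current flanking window.
  • Decrease merge_gap if distinct events are being merged together; increase it if a single event is being split into fragments.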
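
The effect of phasing_pad and merge_gap can be illustrated with plain interval arithmetic. The sketch below assumes events are (start, end) pairs on a single chromosome and that two events merge when the distance between them is at most merge_gap; the function names pad_interval and merge_events are hypothetical, and the pipeline's actual merging rule may differ in detail.

```python
def pad_interval(start, end, phasing_pad=1000):
    """Extend an interval by phasing_pad on both sides, clamped at 0."""
    return max(0, start - phasing_pad), end + phasing_pad


def merge_events(events, merge_gap=200):
    """Merge (start, end) events whose separating gap is <= merge_gap.

    Assumes all events lie on the same chromosome.
    """
    merged = []
    for start, end in sorted(events):
        if merged and start - merged[-1][1] <= merge_gap:
            # Close enough to the previous event: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


events = [(100, 150), (300, 350), (1000, 1100)]
print(merge_events(events, merge_gap=100))   # [(100, 150), (300, 350), (1000, 1100)]
print(merge_events(events, merge_gap=200))   # [(100, 350), (1000, 1100)]
print(merge_events(events, merge_gap=1000))  # [(100, 1100)]
print(pad_interval(500, 600))                # (0, 1600)
```

With the smallest gap the three events stay fragmented, with the default they collapse to two, and with a very large gap everything is merged into one region, which is the over-merging case.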

Outputs

  • phasing summary tables under the configured output directory
  • refined per-locus phasing annotations consumed by merge_feature