A Review of Haplotype Assembly: Methods and Challenges

Authors

  • Muniu Niu

DOI:

https://doi.org/10.54097/zm3nwp41

Keywords:

Haplotype assembly, Reference-guided methods, De novo assembly

Abstract

Haplotype assembly partitions sequencing reads by chromosomal origin using variant information and reconstructs the allelic configuration along each individual chromosome. Its core value lies in revealing which variants actually occur together on the same chromosome, which makes genetic information more biologically interpretable. This paper systematically reviews haplotype assembly methods, focusing on their underlying models and algorithmic strategies. Existing approaches are grouped into reference-guided and de novo methods, and their characteristics are compared. Future research directions are then discussed, including the integration of multi-source sequencing data and the development of more robust and efficient models. Overall, the review summarizes recent advances in haplotype assembly and offers a basis for future studies.

Published

30-04-2026

Section

Articles