Z-Calling: a tool for A/Z (2,6-diaminopurine) base calling and dZ-DNA detection using PacBio HiFi reads

Data availability

PacBio CCS BAM files (containing kinetics signals) generated by our lab have been deposited in the Genome Sequence Archive of the China National Center for Bioinformation under project ID PRJCA031439, with GSA accessions CRA020168, CRA019888, and CRA020191. The human Sequel II (HG002) and Revio (HG00106) datasets were obtained from the study of Baid et al.42 and from the Human Pangenome Reference Consortium34, and are available at https://console.cloud.google.com/storage/browser/details/brain-genomics-public/research/deepconsensus/publication/sequencing/hg00215kb/m64008201124002822.subreads.bam?pageState=(%22StorageObjectListData%22:(%22f%22:%22%255B%255D%22))&walkthrough%20id=panels–storage–bucket and https://s3-us-west-2.amazonaws.com/human-pangenomics/working/HPRC/HG00106/raw_data/PacBio_HiFi/m84081_231112_034048_s4.hifi_reads.bc2070.bam, respectively. The plasmids pRS426-ApPurZ-ApdATPase and pRS425-ApDUF550 generated during the current study are available from the corresponding author on reasonable request under a standard Material Transfer Agreement. Source data for the graphs and charts in this study are available in the Figshare repository (https://doi.org/10.6084/m9.figshare.31281748)43.

Code availability

All code written and used in this study has been deposited in our GitHub repository (https://github.com/xiaochuanle/Z-Calling) and in Zenodo (https://doi.org/10.5281/zenodo.17840213)41. Selected command lines used in the data analysis are described in the Supplementary Notes.

References

  1. Watson, J. D. & Crick, F. H. Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature 171, 737–738 (1953).

  2. Kirnos, M. D., Khudyakov, I. Y., Alexandrushkina, N. I. & Vanyushin, B. F. 2-aminoadenine is an adenine substituting for a base in S-2L cyanophage DNA. Nature 270, 369–370 (1977).

  3. Cheong, C., Tinoco, I. Jr. & Chollet, A. Thermodynamic studies of base pairing involving 2,6-diaminopurine. Nucleic Acids Res. 16, 5115–5122 (1988).

  4. Cristofalo, M. et al. Nanomechanics of Diaminopurine-Substituted DNA. Biophys. J. 116, 760–771 (2019).

  5. Chollet, A. & Kawashima, E. DNA containing the base analogue 2-aminoadenine: preparation, use as hybridization probes and cleavage by restriction endonucleases. Nucleic Acids Res. 16, 305–317 (1988).

  6. Kang, S., Liu, Q., Zhang, J., Zhang, Y. & Qi, H. 2,6-diaminopurine (Z)-containing toehold probes improve genotyping sensitivity. Biotechnol. Bioeng. 121, 1383–1392 (2024).

  7. Haaima, G., Hansen, H. F., Christensen, L., Dahl, O. & Nielsen, P. E. Increased DNA binding and sequence discrimination of PNA oligomers containing 2,6-diaminopurine. Nucleic Acids Res. 25, 4639–4643 (1997).

  8. Bailly, C. & Waring, M. J. The use of diaminopurine to investigate structural properties of nucleic acids and molecular recognition between ligands and DNA. Nucleic Acids Res. 26, 4309–4314 (1998).

  9. Zhou, Y. et al. A widespread pathway for substitution of adenine by diaminopurine in phage genomes. Science 372, 512–516 (2021).

  10. Czernecki, D., Bonhomme, F., Kaminski, P.-A. & Delarue, M. Characterization of a triad of genes in cyanophage S-2L sufficient to replace adenine by 2-aminoadenine in bacterial DNA. Nat. Commun. 12, 4710 (2021).

  11. Sleiman, D. et al. A third purine biosynthetic pathway encoded by aminoadenine-based viral DNA genomes. Science 372, 516–520 (2021).

  12. Pezo, V. et al. Noncanonical DNA polymerization by aminoadenine-based siphoviruses. Science 372, 520–524 (2021).

  13. Gao, S. et al. Harnessing non-Watson–Crick’s base pairing to enhance CRISPR effectors cleavage activities and enable gene editing in mammalian cells. Proc. Natl. Acad. Sci. USA 121, e2308415120 (2024).

  14. Zhang, M., Singh, N., Ehmann, M. E., Zheng, L. & Zhao, H. Incorporation of noncanonical base Z yields modified mRNA with minimal immunogenicity and improved translational capacity in mammalian cells. iScience 26, 107739 (2023).

  15. Ceze, L., Nivala, J. & Strauss, K. Molecular digital data storage using DNA. Nat. Rev. Genet. 20, 456–466 (2019).

  16. Czernecki, D. et al. How cyanophage S-2L rejects adenine and incorporates 2-aminoadenine to saturate hydrogen bonding in its DNA. Nat. Commun. 12, 2420 (2021).

  17. Tong, Y. et al. Alternative Z-genome biosynthesis pathway shows evolutionary progression from Archaea to phage. Nat. Microbiol. 8, 1330–1338 (2023).

  18. Grome, M. W. & Isaacs, F. J. ZTCG: Viruses expand the genetic alphabet. Science 372, 460–461 (2021).

  19. Rhoads, A. & Au, K. F. PacBio sequencing and its applications. Genom. Proteom. Bioinform. 13, 278–289 (2015).

  20. Fuller, C. W. et al. The challenges of sequencing by synthesis. Nat. Biotechnol. 27, 1013–1023 (2009).

  21. Rand, A. C. et al. Mapping DNA methylation with high-throughput nanopore sequencing. Nat. Methods 14, 411–413 (2017).

  22. Wang, Y., Zhao, Y., Bollas, A., Wang, Y. & Au, K. F. Nanopore sequencing technology, bioinformatics and applications. Nat. Biotechnol. 39, 1348–1365 (2021).

  23. Flusberg, B. A. et al. Direct detection of DNA methylation during single-molecule, real-time sequencing. Nat. Methods 7, 461–465 (2010).

  24. Feng, Z. et al. Detecting DNA modifications from SMRT sequencing data by modeling sequence context dependence of polymerase kinetic. PLoS Comput. Biol. 9, e1002935 (2013).

  25. Tse, O. Y. O. et al. Genome-wide detection of cytosine methylation by single molecule real-time sequencing. Proc. Natl. Acad. Sci. USA 118, e2019768118 (2021).

  26. Ni, P. et al. DNA 5-methylcytosine detection and methylation phasing using PacBio circular consensus sequencing. Nat. Commun. 14, 4054 (2023).

  27. Zhang, J. et al. 6mA-Sniper: quantifying 6mA sites in eukaryotes at single-nucleotide resolution. Sci. Adv. 9, eadh7912 (2023).

  28. Kong, Y. et al. Critical assessment of DNA adenine methylation in eukaryotes using quantitative deconvolution. Science 375, 515–522 (2022).

  29. Jha, A. et al. DNA-m6A calling and integrated long-read epigenetic and genetic analysis with fibertools. Genome Res. 34, 1976–1986 (2024).

  30. Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245 (2020).

  31. Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems. 4768–4777 (Curran Associates Inc. 2017).

  32. Ehrlich, M. & Wang, R. Y. 5-Methylcytosine in eukaryotic DNA. Science 212, 1350–1357 (1981).

  33. Luo, G.-Z., Blanco, M. A., Greer, E. L., He, C. & Shi, Y. DNA N6-methyladenine: a new epigenetic mark in eukaryotes? Nat. Rev. Mol. Cell Biol. 16, 705–710 (2015).

  34. Wang, T. et al. The Human Pangenome Project: a global resource to map genomic diversity. Nature 604, 437–446 (2022).

  35. Chen, Y. et al. High accuracy methylation identification tools on single molecular level for PacBio HiFi data. Preprint at https://www.biorxiv.org/content/10.1101/2024.08.14.607879v1 (2024).

  36. Chen, H. X. et al. Accurate cross-species 5mC detection for Oxford Nanopore sequencing in plants with DeepPlant. Nat. Commun. 16, 3227 (2025).

  37. dos Santos, G. et al. FlyBase: introduction of the Drosophila melanogaster Release 6 reference genome assembly and large-scale migration of genome annotations. Nucleic Acids Res. 43, D690–D697 (2014).

  38. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc., B: Stat. Methodol. 57, 289–300 (1995).

  39. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  40. Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).

  41. Wu, B. Z-Calling Release v1.0.0, https://doi.org/10.5281/zenodo.17840213 (2025).

  42. Baid, G. et al. DeepConsensus improves the accuracy of sequences with a gap-aware sequence transformer. Nat. Biotechnol. 41, 232–238 (2023).

  43. Wu, B. Figure source data of Z-Calling manuscript, https://doi.org/10.6084/m9.figshare.31281748 (2026).

Acknowledgements

We acknowledge financial support from the National Key R&D Program of China (2022YFF1201900 to C.-L.X.); the National Natural Science Foundation of China (nos. 32270713 and 62350004 to C.-L.X.; nos. 32522004 and 32200051 to Y.Zhou); the Guangdong Basic and Applied Basic Research Foundation (2020B1515020057 to C.-L.X.); Distinguished Young Scholars of China (no. 32125002 to Y.Zhang); the New Cornerstone Science Foundation (NCI2002321 to Y.Zhang); the Natural Science Foundation of Jiangsu Province (BK20220591 to Y.Zhou); the Key Project Fund of the National Natural Science Foundation (no. 82230031 to W.C.); the Regional Innovation and Development Joint Fund of the National Natural Science Foundation of China (U24A20706 to W.C.); the Key Special Project of ‘Cutting-Edge Biotechnology’ in the National Key Research and Development Program of China (2024YFC3406200 to W.C.); the Sanming Project of Medicine in Shenzhen (no. SZSM202411007 to W.C.); and the Guangdong Basic and Applied Basic Research Foundation Regional Joint Fund Key Program (2023B1515120051).

Author information

Author notes

  1. These authors contributed equally: Bo Wu, Ying Chen, Yan Zhou, Longjian Niu, He-Xu Chen.

Authors and Affiliations

  1. Shenzhen Eye Hospital, Shenzhen Eye Medical Center, Southern Medical University, 18 Zetian Road, Futian District, Shenzhen, China

    Bo Wu, Ying Chen, Longjian Niu, Jia-Yong Zhong & Wei Chi

  2. State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, China

    Bo Wu & Chuan-Le Xiao

  3. Jiangsu Key Laboratory of Zoonosis, Yangzhou University, Yangzhou, China

    Yan Zhou & Yating Li

  4. School of Artificial Intelligence, Sun Yat-Sen University, Zhuhai, China

    He-Xu Chen

  5. iHuman Institute and School of Life Science and Technology, ShanghaiTech University, Shanghai, China

    Suwen Zhao

  6. Shanghai Key Laboratory of High-resolution Electron Microscopy, ShanghaiTech University, Shanghai, China

    Suwen Zhao

  7. Shanghai Clinical Research and Trial Center, Shanghai, China

    Suwen Zhao

  8. Frontiers Science Center for Synthetic Biology (Ministry of Education), Tianjin University, Tianjin, China

    Yan Zhang

  9. New Cornerstone Science Laboratory, School of Pharmaceutical Science and Technology, Tianjin University, Tianjin, China

    Yan Zhang

  10. Key Laboratory of Systems Bioengineering (Ministry of Education), School of Chemical Engineering and Technology, Tianjin University, Tianjin, China

    Yan Zhang

  11. State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China

    Yan Zhang

Authors

  1. Bo Wu
  2. Ying Chen
  3. Yan Zhou
  4. Longjian Niu
  5. He-Xu Chen
  6. Yating Li
  7. Jia-Yong Zhong
  8. Suwen Zhao
  9. Wei Chi
  10. Yan Zhang
  11. Chuan-Le Xiao

Contributions

C.-L.X., Y.Zhang, W.C., and S.Z. conceived the study. B.W., Y.C., C.-L.X., and H.-X.C. designed the algorithms of Z-Calling. H.-X.C., Y.C., and B.W. wrote the code of Z-Calling. L.N., Y.Zhou, and Y.L. carried out the experiments. B.W., Y.Zhou, and J.-Y.Z. carried out the data analysis. B.W., Y.Zhang, Y.C., Y.Zhou, L.N., and H.-X.C. wrote the manuscript. S.Z., W.C., C.-L.X., J.-Y.Z., and Y.L. revised and improved the manuscript. All authors read and approved the final version of the manuscript.

Corresponding authors

Correspondence to Suwen Zhao, Wei Chi, Yan Zhang or Chuan-Le Xiao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks Shanmuga Sozhamannan who co-reviewed with Rachael Sparklin; Osman Doluca and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Rosie Bunton-Stasyshyn. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, B., Chen, Y., Zhou, Y. et al. Z-Calling: a tool for A/Z (2,6-diaminopurine) base calling and dZ-DNA detection using PacBio HiFi reads. Commun Biol (2026). https://doi.org/10.1038/s42003-026-09849-8

  • DOI: https://doi.org/10.1038/s42003-026-09849-8