Genome assembly of two allotetraploid cotton germplasms reveals mechanisms of somatic embryogenesis and enables precise genome editing

  • Bhatia, S., Sharma, K., Dahiya, R. & Bera, T. (eds). Modern Applications of Plant Biotechnology in Pharmaceutical Sciences, pp. 209–230 (Academic Press, 2015).

  • Zheng, Q. & Perry, S. E. Alterations in the transcriptome of Soybean in response to enhanced somatic embryogenesis promoted by orthologs of AGAMOUS-like15 and AGAMOUS-like18. Plant Physiol. 164, 1365–1377 (2014).

  • Horstman, A., Bemer, M. & Boutilier, K. A transcriptional view on somatic embryogenesis. Regeneration (Oxf.) 4, 201–216 (2017).

  • Wang, K. et al. The gene TaWOX5 overcomes genotype dependency in wheat genetic transformation. Nat. Plants 8, 110–117 (2022).

  • Chen, Z., Debernardi, J. M., Dubcovsky, J. & Gallavotti, A. Recent advances in crop transformation technologies. Nat. Plants 8, 1343–1351 (2022).

  • Li, J. et al. Multi-omics analyses reveal epigenomics basis for cotton somatic embryogenesis through successive regeneration acclimation process. Plant Biotechnol. J. 17, 435–450 (2019).

  • Iwase, A. et al. WIND1-based acquisition of regeneration competency in Arabidopsis and rapeseed. J. Plant Res. 128, 389–397 (2015).

  • Lowe, K. et al. Morphogenic regulators Baby boom and Wuschel improve monocot transformation. Plant Cell 28, 1998 (2016).

  • Debernardi, J. M. et al. A GRF–GIF chimeric protein improves the regeneration efficiency of transgenic plants. Nat. Biotechnol. 38, 1274–1279 (2020).

  • Wang, M. et al. Reference genome sequences of two cultivated allotetraploid cottons, Gossypium hirsutum and Gossypium barbadense. Nat. Genet. 51, 224–229 (2019).

  • Yang, Z. et al. Extensive intraspecific gene order and gene structural variations in upland cotton cultivars. Nat. Commun. 10, 2989 (2019).

  • Conover, J. L. & Wendel, J. F. Deleterious mutations accumulate faster in allopolyploid than diploid cotton (Gossypium) and unequally between subgenomes. Mol. Biol. Evol. 39, msac024 (2022).

  • Sun, C. et al. Precise integration of large DNA sequences in plant genomes using PrimeRoot editors. Nat. Biotechnol. 42, 316–327 (2024).

  • Chen, K., Wang, Y., Zhang, R., Zhang, H. & Gao, C. CRISPR/Cas genome editing and precision plant breeding in agriculture. Annu. Rev. Plant Biol. 70, 667–697 (2019).

  • Wang, G. et al. Precise fine-turning of GhTFL1 by base editing tools defines ideal cotton plant architecture. Genome Biol. 25, 59 (2024).

  • Jiang, T., Zhang, X.-O., Weng, Z. & Xue, W. Deletion and replacement of long genomic sequences using prime editing. Nat. Biotechnol. 40, 227–234 (2022).

  • Fernie, A. R. & Yan, J. De novo domestication: an alternative route toward new crops for the future. Mol. Plant 12, 615–631 (2019).

  • Shoemaker, R., Couche, L. & Galbraith, D. Characterization of somatic embryogenesis and plant regeneration in cotton (Gossypium hirsutum L.). Plant Cell Rep. 5, 178–181 (1986).

  • Jin, S. et al. Identification of a novel elite genotype for in vitro culture and genetic transformation of cotton. Biol. Plant. 50, 519–524 (2006).

  • Wang, L. et al. The GhmiR157a–GhSPL10 regulatory module controls initial cellular dedifferentiation and callus proliferation in cotton by modulating ethylene-mediated flavonoid biosynthesis. J. Exp. Bot. 69, 1081–1093 (2017).

  • Xu, J. GhL1L1 affects cell fate specification by regulating GhPIN1-mediated auxin distribution. Plant Biotechnol. J. 17, 63–74 (2019).

  • Deng, J. et al. GhTCE1–GhTCEE1 dimers regulate transcriptional reprogramming during wound-induced callus formation in cotton. Plant Cell 34, 4554–4568 (2022).

  • Yuan, J. et al. GhRCD1 regulates cotton somatic embryogenesis by modulating the GhMYC3–GhMYB44–GhLBD18 transcriptional cascade. New Phytol. 240, 207–223 (2023).

  • Guo, H. et al. Somatic embryogenesis critical initiation stage-specific mCHH hypomethylation reveals epigenetic basis underlying embryogenic redifferentiation in cotton. Plant Biotechnol. J. 18, 1648–1650 (2020).

  • Ge, X. et al. Efficient genotype-independent cotton genetic transformation and genome editing. J. Integr. Plant Biol. 65, 907–917 (2023).

  • Liu, Y. et al. Cloning and preliminary verification of telomere-associated sequences in upland cotton. Comp. Cytogenet. 14, 183–195 (2020).

  • Wang, P. & Wang, F. A proposed metric set for evaluation of genome assembly quality. Trends Genet. 39, 175–186 (2023).

  • Nurk, S. et al. The complete sequence of a human genome. Science 376, 44–53 (2022).

  • Vollger, M. R. et al. Long-read sequence and assembly of segmental duplications. Nat. Methods 16, 88–94 (2019).

  • Luo, S. et al. The cotton centromere contains a Ty3-Gypsy-like LTR retroelement. PLoS ONE 7, e35261 (2012).

  • Gorinšek, B., Gubenšek, F. & Kordiš, D. A. Evolutionary genomics of chromoviruses in eukaryotes. Mol. Biol. Evol. 21, 781–798 (2004).

  • Naish, M. et al. The genetic and epigenetic landscape of the Arabidopsis centromeres. Science 374, eabi7489 (2021).

  • Schmitz, R. J., Grotewold, E. & Stam, M. Cis-regulatory sequences in plants: their importance, discovery, and future challenges. Plant Cell 34, 718–741 (2021).

  • Zhu, X. et al. Single-cell resolution analysis reveals the preparation for reprogramming the fate of stem cell niche in cotton lateral meristem. Genome Biol. 24, 194 (2023).

  • Braybrook, S. A. & Harada, J. J. LECs go crazy in embryo development. Trends Plant Sci. 13, 624–630 (2008).

  • Ji, J. et al. WOX4 promotes procambial development. Plant Physiol. 152, 1346–1356 (2009).

  • Wang, F. et al. Chromatin accessibility dynamics and a hierarchical transcriptional regulatory network structure for plant somatic embryogenesis. Dev. Cell 54, 742–757 (2020).

  • Izhaki, A. & Bowman, J. L. KANADI and Class III HD-Zip gene families regulate embryo patterning and modulate auxin flow during embryogenesis in Arabidopsis. Plant Cell 19, 495–508 (2007).

  • Wang, G. et al. Development of an efficient and precise adenine base editor (ABE) with expanded target range in allotetraploid cotton (Gossypium hirsutum). BMC Biol. 20, 45 (2022).

  • Li, C. et al. Targeted, random mutagenesis of plant genes with dual cytosine and adenine base editors. Nat. Biotechnol. 38, 875–882 (2020).

  • Xue, C. et al. Tuning plant phenotypes by precise, graded downregulation of gene expression. Nat. Biotechnol. 41, 1758–1764 (2023).

  • Xu, M., Du, Q., Tian, C., Wang, Y. & Jiao, Y. Stochastic gene expression drives mesophyll protoplast regeneration. Sci. Adv. 7, eabg8466 (2021).

  • Zhang, L. et al. A high-quality apple genome assembly reveals the association of a retrotransposon and red fruit colour. Nat. Commun. 10, 1494 (2019).

  • Qin, L. et al. High-efficient and precise base editing of C·G to T·A in the allotetraploid cotton (Gossypium hirsutum) genome using a modified CRISPR/Cas9 system. Plant Biotechnol. J. 18, 45–56 (2020).

  • Jin, S. et al. Cytosine, but not adenine, base editors induce genome-wide off-target mutations in rice. Science 364, 292–295 (2019).

  • Hirano, H. et al. Structure and engineering of Francisella novicida Cas9. Cell 164, 950–961 (2016).

  • Huang, G. et al. Genome sequence of Gossypium herbaceum and genome updates of Gossypium arboreum and Gossypium hirsutum provide insights into cotton A-genome evolution. Nat. Genet. 52, 516–524 (2020).

  • Chen, Z. J. et al. Genomic diversifications of five Gossypium allopolyploid species and their impact on cotton improvement. Nat. Genet. 52, 525–533 (2020).

  • Sreedasyam, A. et al. Genome resources for three modern cotton lines guide future breeding efforts. Nat. Plants 10, 1039–1051 (2024).

  • Han, J. et al. Rapid proliferation and nucleolar organizer targeting centromeric retrotransposons in cotton. Plant J. 88, 992–1005 (2016).

  • Wick, R. R., Judd, L. M. & Holt, K. E. Performance of neural network basecalling tools for Oxford nanopore sequencing. Genome Biol. 20, 129 (2019).

  • Hu, J. et al. NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads. Genome Biol. 25, 107 (2024).

  • Hu, J., Fan, J., Sun, Z. & Liu, S. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics 36, 2253–2255 (2019).

  • Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).

  • Kirov, I., Gilyok, M., Knyazev, A. & Fesenko, I. Pilot satellitome analysis of the model plant, Physcomitrella patens, revealed a transcribed and high-copy IGS related tandem repeat. Comp. Cytogenet. 12, 493–513 (2018).

  • Stovner, E. B. & Sætrom, P. epic2 efficiently finds diffuse domains in ChIP–seq data. Bioinformatics 35, 4392–4393 (2019).

  • Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764 (2011).

  • Liu, J. et al. Gapless assembly of maize chromosomes using long-read technologies. Genome Biol. 21, 121 (2020).

  • Ramírez, F., Dündar, F., Diehl, S., Grüning, B. A. & Manke, T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, W187–W191 (2014).

  • Quinlan, A. R. BEDTools: the Swiss‐army tool for genome feature analysis. Curr. Protoc. Bioinformatics 47, 11.12.1–11.12.34 (2014).

  • Mikheenko, A., Bzikadze, A. V., Gurevich, A., Miga, K. H. & Pevzner, P. A. TandemTools: mapping long reads and assessing/improving assembly quality in extra-long tandem repeats. Bioinformatics 36, i75–i83 (2020).

  • Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 20, 275–292 (2019).

  • Novák, P., Neumann, P., Pech, J., Steinhaisl, J. & Macas, J. RepeatExplorer: a Galaxy-based web server for genome-wide characterization of eukaryotic repetitive elements from next-generation sequence reads. Bioinformatics 29, 792–793 (2013).

  • Vollger, M. R., Kerpedjiev, P., Phillippy, A. M. & Eichler, E. E. StainedGlass: interactive visualization of massive tandem repeat structures with identity heatmaps. Bioinformatics 38, 2049–2051 (2022).

  • Peng, R. et al. Evolutionary divergence of duplicated genomes in newly described allotetraploid cottons. Proc. Natl Acad. Sci. USA 119, e2208496119 (2022).

  • Gel, B. & Serra, E. karyoploteR: an R/Bioconductor package to plot customizable genomes displaying arbitrary data. Bioinformatics 33, 3088–3090 (2017).

  • Hu, L. et al. The chromosome-scale reference genome of black pepper provides insight into piperine biosynthesis. Nat. Commun. 10, 4702 (2019).

  • Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc. Natl Acad. Sci. USA 117, 9451–9457 (2020).

  • Bao, W., Kojima, K. K. & Kohany, O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob. DNA 6, 11 (2015).

  • Paterson, A. H. et al. Repeated polyploidization of Gossypium genomes and the evolution of spinnable cotton fibres. Nature 492, 423–427 (2012).

  • Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  • Goel, M., Sun, H., Jiao, W.-B. & Schneeberger, K. SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies. Genome Biol. 20, 277 (2019).

  • Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 6, 80–92 (2012).

  • Giordano, F., Stammnitz, M. R., Murchison, E. P. & Ning, Z. scanPAV: a pipeline for extracting presence–absence variations in genome pairs. Bioinformatics 34, 3022–3024 (2018).

  • Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://arxiv.org/abs/1303.3997 (2013).

  • Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  • McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).

  • Birney, E., Clamp, M. & Durbin, R. GeneWise and genomewise. Genome Res. 14, 988–995 (2004).

  • Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

  • Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).

  • Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).

  • Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  • Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008).

  • Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).

  • Bu, D. et al. KOBAS-i: intelligent prioritization and exploratory visualization of biological functions for gene enrichment analysis. Nucleic Acids Res. 49, W317–W325 (2021).

  • Grandi, F. C., Modi, H., Kampman, L. & Corces, M. R. Chromatin accessibility profiling by ATAC–seq. Nat. Protoc. 17, 1518–1552 (2022).

  • Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).

  • Faust, G. G. & Hall, I. M. SAMBLASTER: fast duplicate marking and structural variant read extraction. Bioinformatics 30, 2503–2505 (2014).

  • Zhang, Y. et al. Model-based analysis of ChIP–seq (MACS). Genome Biol. 9, R137 (2008).

  • Li, Q., Brown, J. B., Huang, H. & Bickel, P. J. Measuring reproducibility of high-throughput experiments. Preprint at https://arxiv.org/abs/1110.4705 (2011).

  • Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).

  • Machlab, D. et al. monaLisa: an R/Bioconductor package for identifying regulatory motifs. Bioinformatics 38, 2624–2625 (2022).

  • Castro-Mondragon, J. A. et al. JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 50, D165–D173 (2021).

  • Tan, G. & Lenhard, B. TFBSTools: an R/bioconductor package for transcription factor binding site analysis. Bioinformatics 32, 1555–1556 (2016).

  • Concordet, J.-P. & Haeussler, M. CRISPOR: intuitive guide selection for CRISPR/Cas9 genome editing experiments and screens. Nucleic Acids Res. 46, W242–W245 (2018).

  • Xu, Z. Scripts used in ‘Genome assembly of two allotetraploid cotton germplasms reveals mechanisms of somatic embryogenesis and enables precise genome editing’. Zenodo https://doi.org/10.5281/zenodo.15035095 (2025).