A chromosome-level genome of Astilbe chinensis unveils the evolution of a terpene biosynthetic gene cluster


Introduction

The horticultural plant Astilbe chinensis (Saxifragaceae family) is renowned for its vibrant and diverse flower colors. In addition to its ornamental value, it is also recognized as a medicinal plant with various therapeutic properties, owing to its richness in secondary metabolites like astilbin, bergenin, flavonoids, triterpenes, and phytosterols1,2. The Saxifragaceae family, comprising approximately 640 species and 33 genera, exhibits remarkable ecological diversity—ranging from herbaceous plants to shrubs, trees, aquatic species, and even saxicolous plants3,4. Phylogenetically, this family represents a crucial evolutionary node between Dillenianae and Rosids, though the exact relationships remain unresolved5. Despite its significance, genomic studies of Saxifragaceae are limited, with only four species sequenced to date6,7,8,9. Therefore, the A. chinensis genome sequence will help clarify Saxifragales phylogeny and provide a crucial genomic resource for this understudied plant family.

As the number of sequenced plant genomes continues to grow, we are gaining substantial insights into the genetic blueprints of these organisms. Biosynthetic gene clusters (BGCs) are being increasingly identified within plant genomes, with many implicated in terpenoid biosynthetic pathways10. Terpenes and their oxygenated derivatives, terpenoids, represent one of the largest and most structurally diverse classes of plant metabolites, serving critical ecological functions11. In terpenoid biosynthesis, terpene synthases (TPS) exhibit remarkable catalytic versatility, enabling the formation of thousands of distinct compounds12,13. Although the protein structures and catalytic mechanisms of several plant TPS enzymes have been characterized, predicting their functions and products remains challenging due to extensive variation within their substrate-binding pockets14. Therefore, exploring genetic resources from understudied plants such as A. chinensis offers opportunities to discover TPS genes or gene clusters. Investigating TPS diversity across different plant lineages can further clarify their contributions to species-specific metabolic profiles and ecological adaptations.

Although terpene BGCs are commonly reported for triterpenoid and diterpenoid biosynthesis in plants, such as those for avenacins, cucurbitacins, momilactones, and casbene, functional BGCs involved in monoterpene and sesquiterpene biosynthesis remain relatively rare15. Genomic studies, however, reveal that many TPS genes are organized in tandem arrays, suggesting these regions may serve as evolutionary hotspots for metabolic diversification16,17. For example, 13 of the 32 TPS genes in Arabidopsis thaliana are arranged in tandem18, and three tandem TPS genes are located on rice chromosome 8: Os080 (Os08g07080), Os100 (Os08g07100), and Os120 (Os08g07120). While Os080 is non-functional, Os100 and Os120 encode sesquiterpene synthases with divergent activities16. Nevertheless, the evolutionary mechanisms underlying the formation of terpene BGCs, including gene duplication, sequence variation, and functional divergence, remain poorly understood.

In previous work, using transcriptome data from A. chinensis, we elucidated the biosynthetic pathways of the flavonoid compounds neodiosmin and salidroside19,20. In the present study, the complete genome sequence of A. chinensis provides a foundation for genetic and evolutionary research in the Saxifragaceae family. Furthermore, the identification of a terpene BGC in this genome has led to the discovery of a eudesma-5,7-diene synthase, and genomic collinearity analysis has unveiled the potential formation process of the terpene gene cluster during plant evolution.

Results

Genome sequencing, assembly, and annotation of A. chinensis

The genome of A. chinensis (2n = 2x = 14)21 was sequenced and assembled using a combination of Nanopore long reads, Illumina short reads, and Hi-C data (Fig. 1A, Table 1, Supplementary Data 1, and Supplementary Methods 1 and 2). An initial genome survey estimated the genome size to be 314.7 Mb, with a high heterozygosity rate of 3.9% (Supplementary Fig. 1A and Supplementary Table 1). The final genome assembly achieved a total length of 335.3 Mb, consisting of 7 chromosome-level scaffolds with an N50 size of 42.1 Mb (Fig. 1B, Supplementary Fig. 2, and Supplementary Table 2), which is consistent with the estimated genome size of 366.9 Mb obtained through flow cytometry (Supplementary Fig. 3). Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis revealed that 98.0% of universal single-copy genes were fully annotated using the eudicots_odb10 database (Supplementary Fig. 1B and Supplementary Table 2). The assembly exhibited excellent continuity, with only 10.5 Kb of total genomic gap (Supplementary Fig. 4). Furthermore, telomere integrity analysis identified 13 out of 14 telomeric structures (92.9%) (Supplementary Fig. 4), indicating high quality and completeness of the assembly. The LTR Assembly Index (LAI) score for A. chinensis was 20.5, comparable to those of Medicago sativa (22.3) and Echinochloa colona (22.5), supporting the qualification of this assembly as a reference genome22,23,24.
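For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch shows how it is conventionally computed. The scaffold lengths used here are hypothetical and are not the actual A. chinensis scaffold lengths; the function name `n50` is likewise an illustrative choice.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    The N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths in Mb (illustrative only).
example_lengths = [60, 50, 40, 30, 20]
print(n50(example_lengths))  # 50
```

In this toy example the two longest scaffolds (60 + 50 = 110 Mb) already cover more than half of the 200 Mb total, so the N50 is the length of the second scaffold.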

Fig. 1: Landscape of A. chinensis morphology, genome features, and synteny.

A Morphology of A. chinensis. B Distribution of A. chinensis genomic features. The linking lines in the circle represent synteny of paralogous sequences in the genome. Outermost to innermost tracks indicate the (1) pseudochromosomes, (2) GC content density, (3) gene density, (4) tandem or proximal duplicated (TD/PD) genes density, (5) TE density, (6) Copia LTR density, (7) Gypsy LTR density, (8) DNA TE density, and (9) LINE TE density.


Table 1 Overview of genome sequencing, assembly, and annotation statistics


Through a combination of de novo annotation, homology-based, and transcriptome-assisted gene identification, a total of 21,436 protein-coding genes were annotated (Supplementary Table 3). Functional annotation indicated that 97.69% of these genes had matches in at least one public database, including NR (94.37%), Swissprot (75.14%), PFAM (80.08%), KEGG (43.81%), TrEMBL (94.48%), and Interpro (96.20%) (Supplementary Fig. 5A and Supplementary Table 4). Additionally, we annotated 664 tRNAs, 401 rRNAs (including 211 8S, 94 18S, and 96 28S RNAs), and 589 other non-coding RNAs (114 miRNAs and 475 snRNAs) in the assembled A. chinensis genome (Supplementary Fig. 5B, Supplementary Table 5, and Supplementary Method 3).

Using de novo and homology-based approaches, we identified approximately 150.01 Mb transposable elements (TEs), accounting for 44.74% of the assembled A. chinensis genome (Supplementary Fig. 5C, Supplementary Data 2, and Supplementary Method 4). Long terminal repeat retrotransposons (LTR-RTs) constituted the largest proportion, covering 15.25% (approximately 51.13 Mb) of the total genome. Ty1/Copia and Ty3/Gypsy elements were the two main classes of LTR-RTs, accounting for 6.41% and 7.11% of the genome, respectively (Supplementary Fig. 6). We further compared TE content across other Saxifragales species and the closely related Vitaceae species, Vitis vinifera. TE proportions were 53.01% in V. vinifera, 37.70% in Kalanchoe fedtschenkoi, 40.04% in Kalanchoe laxiflora, and 51.47% in Rhodiola crenulata, suggesting relatively conserved TE proportions across these species without significant divergence (Supplementary Data 2).

Comparative genomic analysis revealed a whole-genome triplication (γ-WGT) event in A. chinensis

To identify whole-genome duplication (WGD) events in A. chinensis, we performed a genome-wide collinearity analysis using Amborella trichopoda and V. vinifera as references. The A. trichopoda genome serves as a unique reference, being the sister lineage to all other living angiosperms, while the V. vinifera genome represents the ancestral eudicot karyotype25,26. Comparative genomic analysis between A. chinensis and A. trichopoda or V. vinifera revealed syntenic depth ratios of 3:1 (A. chinensis: A. trichopoda) and 3:3 (A. chinensis: V. vinifera), respectively (Fig. 2A, B and Supplementary Fig. 7). Consistent with this, the A. trichopoda gene AmTrH2.05G047500.1 had three homologous genes in both A. chinensis and V. vinifera, confirming the presence of 1:3:3 orthologous regions (A. trichopoda: V. vinifera: A. chinensis) in the comparisons (Fig. 2C). The γ-WGT event is known to have occurred in V. vinifera, whereas no evidence supports lineage-specific polyploidy events in A. trichopoda. We therefore inferred that A. chinensis, like V. vinifera, underwent only the γ-WGT event, without additional whole-genome duplication events.

Fig. 2: Comparative genome analysis of A. chinensis with other species.

Syntenic dot plots between the A. chinensis genome and the A. trichopoda genome (A) and the V. vinifera genome (B). Each dot represents a homologous gene pair retained in a synteny block. C Macrosynteny patterns between A. chinensis, A. trichopoda, and V. vinifera. Matching gene pairs are displayed as connecting shades and highlighted by one syntenic set shown in color. D Chronogram shows divergence times and genome duplications in Superasterids and Superrosids with node age and the 95% confidence intervals labeled. Resolved polyploidization events are shown with blue (duplications) and red (triplications) translucent dots. Pie charts show the proportions of gene families that underwent expansion or contraction. E Ks age distributions for paralogues found in collinear regions (anchor pairs) of A. chinensis and V. vinifera and for orthologues between A. chinensis and V. vinifera. Source data are provided as a Source Data file.


To elucidate the phylogenetic relationship of A. chinensis among angiosperms, we constructed a phylogenetic tree of 291 low-copy ortholog sets from 14 species across Malvids, Fabids, Saxifragales, Vitales, and Lamiids (Supplementary Table 6 and Supplementary Method 5). Both merged and concatenated methods yielded an identical and highly supported topology, placing A. chinensis as a sister group to other Saxifraga plants within Saxifragales, with Saxifragales forming a sister clade to other Rosids (Vitales, Fabids, and Malvids) within the Superrosids (Supplementary Fig. 8). Predicted gene models for the 15 species clustered into 24,884 orthogroups, among which 756 were expanded and 5123 were contracted in A. chinensis (Fig. 2D).

To further investigate the evolutionary history of the Saxifragales, we estimated intragenomic and interspecific homolog Ks (synonymous substitutions per site) distributions. A. chinensis paralogues showed a signature peak Ks value at approximately 1.35, similar to V. vinifera at 1.25 (Fig. 2E). Analysis of Ks distribution across 14 representative plant species confirmed that all underwent a γ-WGT event around 122–164 million years ago, which aligns with previous reports27,28. In contrast to some plants, such as Gossypium hirsutum, A. thaliana, and other Saxifragales members, which experienced one or two additional WGD events after the γ-WGT event, A. chinensis exhibited no further WGD events (Fig. 2D and Supplementary Fig. 9). Molecular dating analysis suggested that A. chinensis diverged from the other Saxifragales species approximately 86.18–110.51 Mya, following the divergence between Saxifragales and Vitales around 105.17–120.05 Mya (Fig. 2D).

Gene duplication analysis identified a terpene biosynthetic gene cluster

Gene duplication, by generating redundant gene copies and creating genetic novelty in organisms, serves as a crucial evolutionary force driving species formation, adaptation, and diversification29. We thus focused on characterizing duplicated genes in A. chinensis. By identifying distinct duplication modes of gene pairs30, we detected a total of 16,062 duplicated genes, which were categorized into five types based on their duplication origin: 3894 from WGDs, 1963 from tandem duplications (TD), 892 from proximal duplications (PD), 6099 from transposed duplications (TRD), and 5097 from dispersed duplications (DSD) (Fig. 3A, Supplementary Table 7, and Supplementary Method 6). We further compared the Ka/Ks ratio (ratio of the non-synonymous to synonymous substitution) and Ks distribution across these duplication modes. Among these modes, TD and PD gene pairs exhibited higher Ka/Ks ratios and smaller Ks values, indicating an ongoing duplication process for TD and PD, alongside more rapid sequence divergence and stronger positive selection (Fig. 3B and Supplementary Table 8).
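The comparison of Ka/Ks ratios across duplication modes can be sketched as follows. This is an illustrative outline only: the function names and the toy Ka and Ks values are hypothetical, and the sketch assumes that per-pair Ka and Ks estimates have already been computed by a dedicated tool (the tool used in this study is described in Supplementary Method 6, not here).

```python
from statistics import median


def ka_ks_ratio(ka, ks):
    """Return Ka/Ks for one gene pair, or None when Ks is not positive."""
    if ks <= 0:
        return None
    return ka / ks


def median_ratio_by_mode(pairs):
    """Group (mode, ka, ks) records by duplication mode and return
    the median Ka/Ks ratio for each mode."""
    ratios = {}
    for mode, ka, ks in pairs:
        ratio = ka_ks_ratio(ka, ks)
        if ratio is None:
            continue  # skip pairs with no synonymous divergence
        ratios.setdefault(mode, []).append(ratio)
    return {mode: median(values) for mode, values in ratios.items()}


# Hypothetical gene pairs: (duplication mode, Ka, Ks).
toy_pairs = [
    ("TD", 0.25, 0.5),
    ("WGD", 0.25, 1.0),
    ("WGD", 0.125, 1.0),
    ("WGD", 0.5, 1.0),
    ("PD", 0.1, 0.0),  # dropped: Ks is zero
]
print(median_ratio_by_mode(toy_pairs))  # {'TD': 0.5, 'WGD': 0.25}
```

By convention, Ka/Ks below 1 is read as purifying selection, values near 1 as neutral evolution, and values above 1 as positive selection, so a higher median for one mode indicates relaxed constraint or stronger positive selection relative to the others.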

Fig. 3: Gene duplication analysis identified a terpene synthase gene cluster in A. chinensis.

A Gene upset plot of gene duplication types. WGD whole-genome duplications, TRD transposed duplications, TD tandem duplications, PD proximal duplications, DSD dispersed duplications. B The Ka/Ks ratio distributions and the Ks distributions of gene pairs derived from different modes of duplication. Gaussian kernel estimates of Ka/Ks and Ks for different duplicated groups are shown as violins. The box center line represents the median, the box edges indicate the first and third quartiles, and the whiskers extend to 1.5× the interquartile range. Data were analyzed by one-way ANOVA with two-tailed Tukey’s honestly significant difference (HSD) multiple comparison test (sample sizes: DSD = 6917, PD = 440, TD = 1127, TRD = 4464, WGD = 2212). Statistically significant differences (P < 0.05) are indicated by different lowercase letters. Exact P-values are available in Supplementary Table 8. C Venn diagram illustrates the potential logical relations between members of expanded gene families and duplication modes. EGs, expansion genes. D Terpene synthase gene cluster in A. chinensis genome. Genes are represented with arrows. The function of each gene product is indicated by colors: red, terpene synthase (TPS); blue, cytochrome P450; green, truncated cytochrome P450; orange, truncated terpene synthase; purple, cis-prenyltransferase (cis-PT); magenta, methyltransferase (MT); grey, protein of other types. Source data are provided as a Source Data file.


We identified 3097 expanded genes in 756 orthogroups (Fig. 2D). Of these expanded genes, 617 and 316 overlapped with TD or PD genes, respectively (Fig. 3C and Supplementary Table 7). We performed Gene Ontology (GO) analysis on these overlapping genes, which showed enrichment in key GO terms related to “terpene synthase activity”, “enzyme activity”, and “binding” (Supplementary Fig. 10A). For Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, the genes exhibited enrichment in pathways including “plant self-defense”, “plant adaptation”, “cytochrome P450”, and “sesquiterpenoid and triterpenoid biosynthesis” (Supplementary Fig. 10B). In summary, newly formed tandem and proximal duplications have significantly contributed to gene family expansion in A. chinensis, playing crucial roles in plant metabolic pathways, particularly the biosynthesis of terpenoids.

To systematically explore the expanded and duplicated genes associated with secondary metabolism in the A. chinensis genome, we employed PlantiSMASH for analysis31. A total of 46 biosynthetic gene clusters were identified, encompassing those involved in the biosynthesis of saccharides, terpenes, alkaloids, polyketides, and lignans (Supplementary Data 3). One significant gene cluster spans approximately 469.8 Kb and comprises multiple genes encoding TPS, cytochrome P450, cis-prenyltransferase (cis-PT), and methyltransferase (MT) (Fig. 3D). It is worth mentioning that this gene cluster contains nine TPS genes, eight of which are expansion genes and categorized as either TD or PD genes, with the exception of AcTPS1 (Supplementary Table 9). TPS enzymes are vital for terpenoid skeleton biosynthesis in plants and are present in almost all plant species, including lower plants32.

Identification of a eudesma-5,7-diene synthase from the terpene biosynthetic gene cluster

Terpenes and their derived terpenoids represent the largest class of specialized metabolites in plants, and many terpene biosynthesis pathways are associated with biosynthetic gene clusters33. To identify the TPS genes from A. chinensis, we screened the assembled gene models for those containing both the PF01397 and PF03936 motifs, corresponding to the N- and C-terminal domains of TPS enzymes. A total of 38 genes were identified, with nine TPS genes within this cluster belonging to the TPS-a subfamily and forming three subclades (Fig. 4A and Supplementary Method 7). By analysing the selection pressure on TPS genes during evolution, we found that the eight TPS genes within the TPS gene cluster showed signs of positive selection among the TPS-a branch (Supplementary Table 10). These findings suggest that this TPS gene cluster may play a significant role in plant adaptive evolution, as the TPS genes have likely undergone neofunctionalization.
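The two-domain screen described above can be sketched as a simple filter. This is a minimal illustration that assumes Pfam domain hits are already available as (gene, accession) pairs; the domain search tool, the gene identifiers, and the function names below are hypothetical and are not taken from the study.

```python
# Pfam accessions for the N- and C-terminal TPS domains named in the text.
REQUIRED_DOMAINS = frozenset({"PF01397", "PF03936"})


def collect_domains(hits):
    """Build a gene -> set of Pfam accessions mapping from (gene, accession) hits."""
    domains = {}
    for gene, accession in hits:
        # Drop a version suffix such as ".24" if the accession carries one.
        domains.setdefault(gene, set()).add(accession.split(".")[0])
    return domains


def candidate_tps_genes(hits, required=REQUIRED_DOMAINS):
    """Return the sorted gene IDs that carry every required domain."""
    domains = collect_domains(hits)
    return sorted(gene for gene, found in domains.items() if required <= found)


# Hypothetical domain hits (gene IDs are placeholders).
toy_hits = [
    ("geneA", "PF01397.24"),
    ("geneA", "PF03936.19"),
    ("geneB", "PF01397"),   # N-terminal domain only
    ("geneC", "PF00067"),   # unrelated domain
]
print(candidate_tps_genes(toy_hits))  # ['geneA']
```

Requiring both domains excludes truncated models that retain only one half of the enzyme, which is one plausible reason for using a two-domain criterion.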

Fig. 4: Identification of a eudesma-5,7-diene synthase.

A Phylogenetic analysis showcasing the classification and relationship of terpene synthase (TPS) genes in A. chinensis. Genes marked in red are those found within the TPS biosynthesis gene cluster identified in this study. B Expression levels of nine TPS genes in seven tissues. C GC chromatograms of extracts from the yeast cultures expressing AcTPS2 and AcTPS5. Genes were co-expressed with ERG20 (yeast FPPS, NP_012368). Products are identified as: F1, germacrene C; F2, β-elemene; F4, α-selinene; F5, trans-nerolidol. “Empty vector” indicates a negative control. D Mass spectra comparison of products and the authentic standards. E The relative configuration, 1H–1H COSY, the key HMBC and NOESY correlations of eudesma-5,7-diene (F3). F LC chromatograms and mass spectra at a retention time of 9.37 min of the eudesma-5,7-diene standard and extracts from different tissues of A. chinensis. Source data are provided as a Source Data file.


Transcriptome analysis of the TPS gene family revealed distinct expression patterns for different TPS genes, four of which exhibited high expression levels in the rhizomes and roots (Fig. 4B). Sequence analysis revealed these TPS genes share limited similarity with previously characterized TPSs, with the highest degree of similarity to LfTPS02 (AIO10965.1) from Liquidambar formosana in the NCBI database (58.08% identity). To characterize their catalytic activities, we successfully amplified and cloned four TPS genes—AcTPS1 (Asch_Chr1_01883.1), AcTPS2 (Asch_Chr1_01886.1), AcTPS5 (Asch_Chr1_01889.1), and AcTPS6 (Asch_Chr1_01911.1). Using a terpene precursor-supplied yeast JCR27 strain34,35, we co-expressed these genes with the yeast farnesyl diphosphate synthase gene (ERG20) and analyzed their sesquiterpene production.

AcTPS2 catalyzed the formation of five sesquiterpenes (F1–F5), while AcTPS5 generated four sesquiterpene compounds (F1–F4) (Fig. 4C). AcTPS6 was exclusively responsible for F5 biosynthesis (Supplementary Fig. 11). Additionally, AcTPS1 catalyzed the biosynthesis of three sesquiterpenes, specifically F2, F6, and F7 (Supplementary Fig. 11). Among them, F1, F2, F4, F5, F6, and F7 were identified as germacrene C (F1), β-elemene (F2), α-selinene (F4), trans-nerolidol (F5), β-caryophyllene (F6), and α-humulene (F7) through comparisons with standards, and F3 was initially identified as an unknown compound (Fig. 4D and Supplementary Fig. 12). These sesquiterpene products were further validated via transient expression of the corresponding TPS genes in Nicotiana benthamiana leaves (Supplementary Fig. 13 and Supplementary Method 8).

Following large-scale fermentation and purification from the yeast strain, we obtained 2.3 mg of F3, whose chemical structure was further elucidated by nuclear magnetic resonance (NMR) spectroscopy (1H NMR and 13C NMR). Detailed comparison of the spectra with those of δ-selinene, also named eudesma-4,6-diene (P2), revealed that they share the same structure (Supplementary Fig. 14)36. Unexpectedly, we observed that the gas chromatography–mass spectrometry (GC-MS) spectra acquired before and after NMR were completely different, indicating that the product may be unstable in CDCl3 (Supplementary Fig. 15). We then found that the compound remained stable during NMR when acetone-d6 and CH3OD were used as solvents (Supplementary Fig. 15). Ultimately, F3 was isolated as a pale yellow oil. Its molecular formula was determined to be C15H24 via high-resolution electrospray ionization mass spectrometry (HR ESI–MS). Through extensive analysis of NMR spectra (1H NMR, 13C NMR, 1H–1H COSY, HMBC, and NOESY) and comparison with previously reported literature, F3 was eventually identified as eudesma-5,7-diene (Fig. 4E, Supplementary Figs. 16–18, and Supplementary Tables 11 and 12)37,38.

Eudesma-5,7-diene has previously been detected in only a few plants, including Vetiveria zizanioides, Croton eluteria, and Preissia quadrata38,39,40. To investigate the tissue-specific distribution of eudesma-5,7-diene in A. chinensis, extracts from rhizomes, leaves, stems, and roots were analyzed using high-performance liquid chromatography–mass spectrometry (HPLC–MS). Although no discernible absorption peaks were detected in root extracts at the retention time (RT) of 9.37 min, the other tissues showed targeted MS/MS fragmentation with characteristic fragment ions that matched authentic standards. These results indicate that eudesma-5,7-diene is specifically localized in the rhizomes, leaves, and stems (Fig. 4F).

We then purified the AcTPS2 protein after heterologous expression in Escherichia coli and determined its optimal catalytic conditions in vitro. Assays with multiple substrates, including geranyl diphosphate (GPP), farnesyl diphosphate (FPP), and geranylgeranyl diphosphate (GGPP), demonstrated that AcTPS2 exhibits strict specificity for FPP (Supplementary Figs. 19–21). Enzyme kinetic analyses revealed a substrate concentration-dependent activity profile, with maximal velocity (Vmax) of 0.323 nmol·h−1·μg−1 observed at 200 μM substrate (Supplementary Fig. 22). Michaelis–Menten parameters were quantified as follows: Km = 23.84 μM, kcat = 0.323 min−1, and catalytic efficiency kcat/Km = 0.0135 min−1·μM−1.
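As a worked check of the kinetic parameters reported above, the sketch below recomputes the catalytic efficiency from the turnover number and Km, and evaluates the standard Michaelis–Menten rate law. The function names are illustrative, and the saturation calculation assumes ideal single-substrate Michaelis–Menten behaviour, which is an assumption of this sketch rather than a statement about the fitting procedure used in the study.

```python
def michaelis_menten_rate(substrate_um, vmax, km_um):
    """Michaelis-Menten rate law: v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate_um / (km_um + substrate_um)


def catalytic_efficiency(kcat_per_min, km_um):
    """Catalytic efficiency kcat/Km in min^-1 uM^-1."""
    return kcat_per_min / km_um


# Values reported in the text for AcTPS2 with FPP.
KM_UM = 23.84
KCAT_PER_MIN = 0.323

print(round(catalytic_efficiency(KCAT_PER_MIN, KM_UM), 4))  # 0.0135

# Fraction of Vmax reached at 200 uM substrate under the ideal model
# (Vmax is set to 1.0 so the result is a dimensionless fraction).
fraction_at_200 = michaelis_menten_rate(200.0, 1.0, KM_UM)
print(f"{fraction_at_200:.1%} of Vmax at 200 uM")
```

The recomputed ratio agrees with the reported catalytic efficiency of 0.0135 min−1·μM−1.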

Gene duplication of terpene synthase genes in the biosynthetic gene cluster occurred subsequent to speciation

In a synteny analysis, this terpene gene cluster in A. chinensis demonstrated collinearity with that of various plant species (Supplementary Table 13). Both the upstream and downstream regions of the gene cluster showed sequence conservation, despite variations in TPS gene copy number. Specifically, within the corresponding genomic segments, Coffea canephora, Aquilaria sinensis, and Ipomoea triloba each contained two TPS genes, V. vinifera contained four TPS genes, and Sesamum indicum had six TPS genes, while A. thaliana completely lacked the corresponding region (Fig. 5A). Phylogenetic analysis of these TPS sequences revealed lineage-specific conservation, with TPS genes from the same species clustering together to form distinct subclades, except AcTPS1 (Fig. 5B).

Fig. 5: Gene duplication of terpene synthase genes in the biosynthetic gene cluster occurred subsequent to speciation.

A Comparison of the TPS biosynthetic gene clusters from six plant species with A. chinensis. Synteny between each species is shown with grey lines and TPS genes are marked with a red block. The genes marked with an asterisk have been functionally identified in vitro. B Phylogenetic tree of TPS genes found within the TPS biosynthetic gene clusters identified in this study. C GC chromatograms of extracts from the yeast cultures expressing CcTPS2, ItTPS1, and SiTPS2. Genes were co-expressed with ERG20 (yeast FPPS, NP_012368). “Empty vector” indicates negative control. D Chemical structure of sesquiterpene compounds determined by GC chromatograms. Products were identified by comparison to standards or the NIST17 library. E A proposed model for the evolutionary trajectory of the TPS gene cluster in plants. Genes annotated with an asterisk have undergone functional identification. Source data are provided as a Source Data file.


The functions of TPS from V. vinifera and Aquilaria sinensis have been characterized previously, with VvTPS2 mainly producing cubebol (F13) and δ-cadinene (F18), and AsTPS1 producing α-humulene (F7)41,42. To investigate functional variability among other TPS, we synthesized cDNAs of CcTPS2, ItTPS1, and SiTPS2 and performed functional characterization studies (Fig. 5C, D and Supplementary Fig. 11). CcTPS2 catalyzed the formation of six sesquiterpene products, including germacrene-D-4-ol (F15), germacradien-6-ol (F17), α-maaliene (F14), β-elemene (F2), β-selinene (F12), and one unknown compound. ItTPS1 produced seven sesquiterpene products, with cyperene (F8) as the major component. SiTPS2 exclusively produced a single product, pogostol (F16) (Supplementary Data 4). Overall, TPS genes in this gene cluster not only exhibit significant sequence divergence across different species but also produce markedly different arrays of products, suggesting substantial functional diversity.

Therefore, we propose that the initial divergence of plant TPS sequences likely took place between 89 and 125 Mya, during the process of angiosperm genome differentiation. Following this, the duplication of TPS genes happened subsequent to speciation, which ultimately led to the formation of the TPS biosynthesis gene cluster (Fig. 5E). These duplication events likely facilitated the expansion and further functional specialization of TPS genes, allowing plants to explore distinct ecological niches and adapt to environmental changes through the development of diverse secondary metabolites. This sequence of evolutionary events underscores the complexity and dynamism of plant secondary metabolism.

Discussion

Here, we generated a chromosome-level genome assembly of A. chinensis, an ornamental plant belonging to the Saxifragaceae family. This genomic resource not only sheds light on the plant’s evolutionary history and reveals its genetic diversity, but also uncovers genes involved in secondary metabolite biosynthesis. Advances in sequencing technology have revealed an increasing amount of information on plant genomes. For non-model plants, however, current genomic research primarily focuses on gene evolution and natural variation, with exploration and application of these genomic resources remaining very limited. The advancement of synthetic biology provides a good opportunity for the application of plant genomic resources, without being constrained by the unpredictability and complexity of plant growth and genetic transformation. Given the extensive plant genomic data already available in public databases, adopting this approach can accelerate the discovery of more valuable plant secondary metabolites.

Moreover, through the analysis of the A. chinensis genome, a terpene synthase gene cluster was discovered, and a eudesma-5,7-diene synthase was identified using a yeast chassis for heterologous expression. Eudesma-5,7-diene belongs to the eudesmane-type sesquiterpenoids, which are a class of natural compounds with a wide range of biological activities, especially prevalent in plants of the Asteraceae family, and are also important components of agarwood essential oil43. Eudesmane-type sesquiterpenoids exhibit diverse chemical structures and pharmacological effects, including anti-inflammatory, anti-tumor, neuroprotective, hepatoprotective, antibacterial, and antiviral activities44. Eudesma-5,7-diene is relatively rare in nature; hence, research on its biological activity is very limited. Here, we utilize synthetic biology methods for heterologous expression, enabling large-scale fermentation and extraction from yeast, thus paving the way for determining its biological effects and pharmacological properties.

Despite the presence of genes encoding modification enzymes (e.g., AcCYPs, AcMT, and AcPT) within the cluster, our co-expression of AcTPS2 with these genes in both tobacco (Nicotiana benthamiana) and yeast failed to yield any detectable modified derivatives of its primary product, eudesma-5,7-diene (Supplementary Figs. 23–25). This unexpected result suggests that this locus may not constitute a complete, autonomous biosynthetic pathway. The maturation of the terpene skeleton into a final natural product may require the assistance of auxiliary enzymes encoded elsewhere in the genome. Alternatively, the cluster’s modification enzymes might target alternative products of AcTPS2. These possibilities necessitate further experimental validation to determine the precise functional context of this putative biosynthetic cluster.

Previous studies have demonstrated that TPS gene sequences are mostly lineage-specific in angiosperms. Our evolutionary model (Fig. 5E) traces their origin to a common ancestor of seven analyzed species, followed by lineage-specific duplications and functional divergence. While A. thaliana lost all TPS genes in this syntenic region through pseudogenization events, the six other species retained expanded TPS gene families via repeated duplication. Terpenoid compounds are crucial for the environmental adaptability of plants and their interactions with other organisms. Functional characterization demonstrated that cluster-encoded TPS enzymes exhibit distinct catalytic specificities, with detectable signatures of positive selection, suggesting their metabolic diversification contributed to ecological adaptation.

The plant kingdom offers a wealth of genomic resources, with a multitude of TPS genes exhibiting substantial functional diversity. However, the intricate links between the protein sequences, structural conformations, and specific catalytic products of these TPS genes remain largely enigmatic. This complexity indicates a rich potential for further research into the functions of these genes and the biochemical processes they mediate. In summary, we underscore the feasibility of integrating genomic data with evolutionary gene functional analysis and synthetic biology approaches. This integration can unlock the medicinal and ecological potential of plant secondary metabolites, ultimately contributing to a deeper understanding and application of plant metabolic pathways across various fields.

Methods

Genome sequencing and assembly

A. chinensis was purchased from Tianjin Lanxiu Gardening Co., Ltd (Tianjin, China), and fresh leaves were collected for subsequent experiments. For genomic sequencing, high molecular weight genomic DNA was extracted from fresh leaf tissue and subjected to long-read sequencing on the Oxford Nanopore PromethION platform (Oxford Nanopore Technologies, Oxford, UK), short-read sequencing on the Illumina HiSeq 2000 (Illumina, San Diego, CA, USA), and Hi-C sequencing on the BGI MGISEQ platform (MGI Tech, Shenzhen, China). Separately, flow cytometry analysis was performed on fresh leaf samples to estimate the genome size.

The A. chinensis genome was assembled de novo using Canu (v2.1.1)45 based on clean Oxford Nanopore reads. To improve assembly accuracy, the contigs were refined sequentially using Racon (v1.4.17)46 for initial polishing, NextPolish (v1.4.1)47 for short-read-based error correction, and HaploMerger2 (Release 20180603)48 for haplotype merging, all with default parameters. The scaffolds were further anchored into chromosome-level assemblies using Hi-C data via Juicer (v1.6)49 and 3D-DNA (v180922)50. Detailed methodology is provided in the Supplementary Methods 1 and 2.

Functional characterization of sesquiterpene synthase

The open reading frames of AcTPS1, AcTPS2, AcTPS5, and AcTPS6 were cloned from rhizome cDNA and introduced into a yeast expression vector, as described previously34,35. Primers used for PCR amplification were synthesized by GeneCreate Biological Engineering Co., Ltd. (Wuhan, China), and their sequences are listed in Supplementary Table 14. A comprehensive list of the plasmids and strains used is provided in Supplementary Table 15. The codon-optimized sequences of CcTPS2, ItTPS1, and SiTPS2 were synthesized by GenScript Biotech Corporation (Nanjing, China). Expression plasmids were individually transformed into the JCR27 strain. Yeast clones were precultured in SC medium lacking uracil and supplemented with 1% glucose at 28 °C for 48 h at 220 × g. The preculture was then inoculated into YPD medium containing 1% glucose and 1% galactose, overlaid with isopropyl myristate (IPM), and cultivated at 28 °C for 72 h at 250 × g. The organic phase of the biphasic culture was harvested and diluted with hexane for GC-MS analysis.

Sesquiterpene profiles of the samples were analyzed by GC-MS using a GCMS-TQ8040 mass spectrometer (Shimadzu, Kyoto, Japan) fitted with a TR-5MS column (30 m × 0.25 mm × 0.25 μm). The GC oven temperature was held at 80 °C for 1 min, ramped to 280 °C at a rate of 10 °C min−1, and held at 280 °C for a further 7 min. Mass spectra were acquired over a mass-to-charge ratio (m/z) range of 45–500. Compounds were identified by comparison with our local library of standards, GroupLiu 6.0 (ref. 51).
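The oven program described above can be written out as a small calculation. The following is a minimal sketch, not part of the published method: it encodes only the stated values (80 °C hold for 1 min, 10 °C min−1 ramp to 280 °C, 7 min final hold), and the function and constant names are illustrative assumptions.

```python
# Oven program values taken from the text; names are illustrative only.
START_C = 80.0          # initial temperature, °C
INITIAL_HOLD_MIN = 1.0  # hold at START_C, min
RAMP_C_PER_MIN = 10.0   # linear ramp rate, °C per min
FINAL_C = 280.0         # final temperature, °C
FINAL_HOLD_MIN = 7.0    # hold at FINAL_C, min


def ramp_minutes() -> float:
    """Duration of the linear ramp from START_C to FINAL_C."""
    return (FINAL_C - START_C) / RAMP_C_PER_MIN


def total_run_minutes() -> float:
    """Total length of the oven program: initial hold + ramp + final hold."""
    return INITIAL_HOLD_MIN + ramp_minutes() + FINAL_HOLD_MIN


def oven_temperature(t_min: float) -> float:
    """Oven set point (°C) at t_min minutes after the program starts."""
    if t_min <= INITIAL_HOLD_MIN:
        return START_C
    ramp_end = INITIAL_HOLD_MIN + ramp_minutes()
    if t_min >= ramp_end:
        return FINAL_C
    return START_C + RAMP_C_PER_MIN * (t_min - INITIAL_HOLD_MIN)


if __name__ == "__main__":
    print(f"ramp: {ramp_minutes()} min, total: {total_run_minutes()} min")
```

With the stated values, the ramp takes 20 min, so a single run lasts 28 min in total.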

Isolation and structural identification of eudesma-5,7-diene

After fermentation, the IPM phase was collected from 14 L of culture by centrifugation at 8000 × g for 50 min. The product was then separated by vacuum distillation, and the residual IPM was removed by silica gel column chromatography (CC, 500–800 mesh) with petroleum ether as the eluent, yielding 2.3 mg of eudesma-5,7-diene. The structure and purity of the product were determined by nuclear magnetic resonance (NMR) and GC-MS analysis35,52. NMR spectra were recorded on a Bruker AVANCE NEO 600 spectrometer (151 MHz or 600 MHz) at 298 K.

Extraction and liquid chromatography–mass spectrometry (LC–MS) analysis of the extracts from A. chinensis tissues

A total of 5 g of tissue was frozen in liquid nitrogen for 5 min and then ground into powder. An adequate amount of ethyl acetate was added, and the sample was extracted by ultrasonication for 40 min. The extract was centrifuged to obtain the supernatant, and the excess solvent was removed using a freeze-dryer. The residue was then re-dissolved in methanol for LC-MS analysis. LC-MS analysis was performed on an LTQ-Orbitrap-XL mass spectrometer (Thermo Fisher Scientific, USA) coupled with an Accela ultra-high-pressure liquid chromatograph and a TSQ Quantum Ultra triple-quadrupole mass spectrometer equipped with an ESI source. Solvent A (0.1% formic acid in water) and solvent B (0.1% formic acid in acetonitrile) served as mobile phases. The flow rate was 0.3 mL/min and the injection volume was 1 μL. The gradient elution procedure was as follows: 10% B for 1 min; 10–100% B for 10 min; 100% B for 5 min. The column temperature was maintained at 25 °C. Mass spectrometry was performed in positive ion mode with the following settings: vaporizer temperature, 400 °C; source voltage, 3 kV; sheath gas, 60 au; auxiliary gas, 20 au; capillary temperature, 380 °C; capillary voltage, 6 V; tube lens, 45 V; scan range, 100–800 Da. The compounds were analyzed using the QualBrowser feature of Xcalibur software (version 2.1.0.1140).
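The gradient elution program can be expressed as a piecewise-linear function of time. The sketch below is illustrative only and rests on one assumption: each stated duration is read as the length of its segment (1 min hold at 10% B, a 10 min linear ramp from 10% to 100% B, then a 5 min hold at 100% B). The names `GRADIENT`, `percent_b`, and `total_solvent_ml` are hypothetical and do not come from the instrument software.

```python
# (time in min, % solvent B) breakpoints, assuming each stated duration
# is the length of its segment: 1 min hold, 10 min ramp, 5 min hold.
GRADIENT = [(0.0, 10.0), (1.0, 10.0), (11.0, 100.0), (16.0, 100.0)]
FLOW_ML_PER_MIN = 0.3  # flow rate stated in the text


def percent_b(t_min: float) -> float:
    """Percentage of solvent B at t_min, by linear interpolation."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return GRADIENT[-1][1]  # after the program ends, stay at the last value


def total_solvent_ml() -> float:
    """Mobile phase consumed by one run at the stated flow rate."""
    return FLOW_ML_PER_MIN * GRADIENT[-1][0]


if __name__ == "__main__":
    for t in (0.0, 1.0, 6.0, 11.0, 16.0):
        print(f"t = {t:4.1f} min -> {percent_b(t):5.1f}% B")
```

Under this reading, one injection runs for 16 min, and the composition halfway through the ramp (6 min) is 55% B.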

Recombinant protein expression in E. coli and purification

The target TPS gene was cloned into the pET28a(+) vector with a C-terminal 6 × His tag by homologous recombination (Supplementary Tables 14 and 15). The recombinant plasmid was transformed into E. coli Rosetta 2 (DE3) cells for heterologous expression. The cell suspension was sonicated to release soluble cellular components. The supernatant was loaded onto a Ni-NTA affinity column (GenScript Biotech Corporation, Nanjing, China), and bound protein was eluted with an imidazole gradient. The eluted protein was concentrated using a 30 kDa Millipore ultrafiltration centrifugal filter (Merck KGaA, Darmstadt, Germany). Protein concentration was determined using the BCA Protein Assay Kit (Beyotime Biotechnology Co., Ltd., Shanghai, China).

Enzymatic assays

A total of 15 µg of purified protein was incubated in 50 mM Tris-HCl (pH 7.5) containing 1 mM MgCl₂, 2 mM DTT, 12.5% glycerol, 0.1% Tween 20, 1 mM sodium ascorbate, and GPP, FPP, or GGPP as substrate (Sigma-Aldrich Chemical Co., St. Louis, MO) at 30 °C for 1 h. An equal volume of ethyl acetate was then added to the reaction mixture for extraction. All reactions were performed in triplicate. Enzymatic kinetic parameters were calculated using GraphPad Prism software (version 6.0).
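The text does not state which kinetic model was fitted in GraphPad Prism. Assuming a standard Michaelis–Menten model, v = Vmax·[S] / (Km + [S]), the sketch below shows how Vmax and Km can be estimated from rate measurements. It uses a double-reciprocal (Lineweaver–Burk) linear fit in plain Python as a simple stand-in for nonlinear regression; it is not the authors' procedure. The substrate concentrations and rates are synthetic, noise-free values generated for illustration and are not data from this study.

```python
def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def fit_lineweaver_burk(substrate, rate):
    """Estimate (vmax, km) from the line 1/v = (km/vmax) * (1/s) + 1/vmax."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rate]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # equals km / vmax
    intercept = mean_y - slope * mean_x  # equals 1 / vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Hypothetical example: synthetic rates generated from known parameters.
concentrations = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]  # arbitrary units
rates = [michaelis_menten(s, vmax=2.0, km=8.0) for s in concentrations]
vmax_fit, km_fit = fit_lineweaver_burk(concentrations, rates)

if __name__ == "__main__":
    print(f"Vmax = {vmax_fit:.3f}, Km = {km_fit:.3f}")
```

The double-reciprocal transform weights low-concentration points heavily when measurements are noisy, which is one reason direct nonlinear regression is generally preferred for real data.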

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All raw sequence and genome assembly data have been deposited in the National Genomics Data Center (https://ngdc.cncb.ac.cn) with BioProject accession PRJCA027049. Functionally characterized TPS sequences have been deposited in GenBase; the accession numbers are C_AA107105.1 (AcTPS1) [https://ngdc.cncb.ac.cn/genbase/search/gb/C_AA107105.1]; C_AA107106.1 (AcTPS2) [https://ngdc.cncb.ac.cn/genbase/search/gb/C_AA107106.1]; C_AA107107.1 (AcTPS5) [https://ngdc.cncb.ac.cn/genbase/search/gb/C_AA107107.1]; C_AA107108.1 (AcTPS6) [https://ngdc.cncb.ac.cn/genbase/search/gb/C_AA107108.1]; C_AA107109.1 (CcTPS2) [https://ngdc.cncb.ac.cn/genbase/search/gb/C_AA107109.1]; C_AA107110.1 (ItTPS1) [https://ngdc.cncb.ac.cn/genbase/search/gb/C_AA107110.1]; C_AA107111.1 (SiTPS2) [https://ngdc.cncb.ac.cn/genbase/search/gb/C_AA107111.1]. Additional data, including genome assemblies and annotations, GC-MS raw data, LC-MS raw data, and NMR raw data, can be found in the Figshare database [https://doi.org/10.6084/m9.figshare.28748501]. Source data are provided with this paper.

References

  1. Sun, H. X., Ye, Y. P. & Yang, K. Studies on the chemical constituents in radix Astilbes chinensis. Zhongguo Zhong Yao Za Zhi 27, 751–754 (2002).

  2. Xue, Y., Xu, X. M., Yan, J. F., Deng, W. L. & Liao, X. Chemical constituents from Astilbe chinensis. J. Asian Nat. Prod. Res. 13, 188–191 (2011).

  3. Deng, J.-B. et al. Phylogeny, divergence times, and historical biogeography of the angiosperm family Saxifragaceae. Mol. Phylogenet. Evol. 83, 86–98 (2015).

  4. Tkach, N. et al. Molecular phylogenetics, morphology and a revised classification of the complex genus Saxifraga (Saxifragaceae). Taxon 64, 1159–1187 (2015).

  5. Zeng, L. et al. Resolution of deep eudicot phylogeny and their temporal diversification using nuclear genes from transcriptomic and genomic datasets. N. Phytol. 214, 1338–1354 (2017).

  6. Liu, X.-D. et al. A Chromosome-level genome assembly of the alpine medicinal plant Bergenia purpurascens (Saxifragaceae). Sci. Data 12, 121 (2025).

  7. Yang, Y.-X. et al. The chromosome-level genome assembly of an endangered herb Bergenia scopulosa provides insights into local adaptation and genomic vulnerability under climate change. GigaScience 13, giae091 (2024).

  8. Liu, S. et al. The Chrysosplenium sinicum genome provides insights into adaptive evolution of shade plants. Commun. Biol. 7, 1004 (2024).

  9. Liu, L. et al. Phylogenomic and syntenic data demonstrate complex evolutionary processes in early radiation of the rosids. Mol. Ecol. Resour. 23, 1673–1688 (2023).

  10. Bryson, A. E. et al. Uncovering a miltiradiene biosynthetic gene cluster in the Lamiaceae reveals a dynamic evolutionary trajectory. Nat. Commun. 14, 343 (2023).

  11. Pichersky, E. & Raguso, R. A. Why do plants produce so many terpenoid compounds? N. Phytol. 220, 692–702 (2018).

  12. Nagegowda, D. A. & Gupta, P. Advances in biosynthesis, regulation, and metabolic engineering of plant specialized terpenoids. Plant Sci. 294, 110457 (2020).

  13. Gao, Y., Honzatko, R. B. & Peters, R. J. Terpenoid synthase structures: a so far incomplete view of complex catalysis. Nat. Prod. Rep. 29, 1153–1175 (2012).

  14. Zhou, F. & Pichersky, E. More is better: the diversity of terpene metabolism in plants. Curr. Opin. Plant Biol. 55, 1–10 (2020).

  15. Zhan, C. et al. Plant metabolic gene clusters in the multi-omics era. Trends Plant Sci. 27, 981–1001 (2022).

  16. Chen, H. et al. Combinatorial evolution of a terpene synthase gene cluster explains terpene variations in Oryza. Plant Physiol. 182, 480–492 (2020).

  17. Qiao, D. et al. A monoterpene synthase gene cluster of tea plant (Camellia sinensis) potentially involved in constitutive and herbivore-induced terpene formation. Plant Physiol. Biochem. 184, 1–13 (2022).

  18. Aubourg, S., Lecharny, A. & Bohlmann, J. Genomic analysis of the terpenoid synthase (AtTPS) gene family of Arabidopsis thaliana. Mol. Genet. Genomics 267, 730–745 (2002).

  19. Chang, X. et al. Identification and characterization of glycosyltransferases involved in the biosynthesis of neodiosmin. J. Agric. Food Chem. 72, 4348–4357 (2024).

  20. Yao, Y. et al. Structure-based virtual screening aids the identification of glycosyltransferases in the biosynthesis of salidroside. Plant Biotechnol. J. 23, 1725–1735 (2025).

  21. Rice, A. et al. The Chromosome Counts Database (CCDB) – a community resource of plant chromosome numbers. N. Phytol. 206, 19–26 (2015).

  22. Ou, S., Chen, J. & Jiang, N. Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res. 46, e126 (2018).

  23. Shen, C. et al. The chromosome-level genome sequence of the autotetraploid alfalfa and resequencing of core germplasms provide genomic resources for alfalfa research. Mol. Plant. 13, 1250–1261 (2020).

  24. Wu, D. et al. Genomic insights into the evolution of Echinochloa species as weed and orphan crop. Nat. Commun. 13, 689 (2022).

  25. Amborella Genome Project. The Amborella genome and the evolution of flowering plants. Science 342, 1241089 (2013).

  26. Zhou, Y., Massonnet, M., Sanjak, J. S., Cantu, D. & Gaut, B. S. Evolutionary genomics of grape (Vitis vinifera ssp. vinifera) domestication. Proc. Natl. Acad. Sci. USA 114, 11715–11720 (2017).

  27. Badouin, H. et al. The sunflower genome provides insights into oil metabolism, flowering and Asterid evolution. Nature 546, 148–152 (2017).

  28. Jiao, Y. et al. Ancestral polyploidy in seed plants and angiosperms. Nature 473, 97–100 (2011).

  29. Panchy, N., Lehti-Shiu, M. & Shiu, S. H. Evolution of gene duplication in plants. Plant Physiol. 171, 2294–2316 (2016).

  30. Yang, F. S. et al. Chromosome-level genome assembly of a parent species of widely cultivated azaleas. Nat. Commun. 11, 5269 (2020).

  31. Kautsar, S. A., Suarez Duran, H. G., Blin, K., Osbourn, A. & Medema, M. H. plantiSMASH: automated identification, annotation and expression analysis of plant biosynthetic gene clusters. Nucleic Acids Res. 45, W55–W63 (2017).

  32. Chen, F., Tholl, D., Bohlmann, J. & Pichersky, E. The family of terpene synthases in plants: a mid-size family of genes for specialized metabolism that is highly diversified throughout the kingdom. Plant J. 66, 212–229 (2011).

  33. Tholl, D. Terpene synthases and the regulation, diversity and biological roles of terpene metabolism. Curr. Opin. Plant Biol. 9, 297–304 (2006).

  34. Deng, X. et al. Systematic identification of Ocimum sanctum sesquiterpenoid synthases and (−)-eremophilene overproduction in engineered yeast. Metab. Eng. 69, 122–133 (2022).

  35. Deng, X. et al. Complete pathway elucidation and heterologous reconstitution of (+)-nootkatone biosynthesis from Alpinia oxyphylla. N. Phytol. 241, 779–792 (2024).

  36. Païs, M., Fontaine, C., Lauren, D., La Barre, S. & Guittet, E. Stylotelline, a new sesquiterpene isocyanide from the sponge Stylotella sp. Application of 2D-NMR in structure determination. Tetrahedron Lett. 28, 1409–1412 (1987).

  37. Lago, J. H., Brochini, C. B. & Roque, N. F. Terpenoids from Guarea guidonia. Phytochemistry 60, 333–338 (2002).

  38. Weyerstahl, P., Marschall, H., Splittgerber, U., Wolf, D. & Surburg, H. Constituents of Haitian vetiver oil. Flavour Frag. J. 15, 395–412 (2000).

  39. Hagedorn, M. L. & Brown, S. M. The constituents of cascarilla oil (Croton eluteria Bennett). Flavour Frag. J. 6, 193–204 (1991).

  40. König, W. A. et al. The sesquiterpene constituents of the liverwort Preissia quadrata. Phytochemistry 43, 629–633 (1996).

  41. Martin, D. M. et al. Functional annotation, genome organization and phylogeny of the grapevine (Vitis vinifera) terpene synthase gene family based on genome assembly, FLcDNA cloning, and enzyme assays. BMC Plant Biol. 10, 226 (2010).

  42. Ran, J. et al. Identification of sesquiterpene synthase genes in the genome of Aquilaria sinensis and characterization of an α-humulene synthase. J. For. Res. 34, 1117–1131 (2022).

  43. Chen, X. et al. Chemical composition and potential properties in mental illness (anxiety, depression and insomnia) of agarwood essential oil: a review. Molecules 27, 4528 (2022).

  44. Wu, Q. X., Shi, Y. P. & Jia, Z. J. Eudesmane sesquiterpenoids from the Asteraceae family. Nat. Prod. Rep. 23, 699–734 (2006).

  45. Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).

  46. Vaser, R., Sovic, I., Nagarajan, N. & Sikic, M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 27, 737–746 (2017).

  47. Hu, J., Fan, J., Sun, Z. & Liu, S. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics 36, 2253–2255 (2020).

  48. Huang, S., Kang, M. & Xu, A. HaploMerger2: rebuilding both haploid sub-assemblies from high-heterozygosity diploid genome assembly. Bioinformatics 33, 2577–2579 (2017).

  49. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).

  50. Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).

  51. Zhi, Y. et al. Gene-directed in vitro mining uncovers the insect-repellent constituent from mugwort (Artemisia argyi). J. Am. Chem. Soc. 146, 30883–30892 (2024).

  52. Ye, Z. et al. Coupling cell growth and biochemical pathway induction in Saccharomyces cerevisiae for production of (+)-valencene and its chemical conversion to (+)-nootkatone. Metab. Eng. 72, 107–115 (2022).

Acknowledgements

This work was supported by grants from the National Key Research and Development Program (2022YFA0912100) to Li Lu. The numerical calculations in this paper were performed on the supercomputing system in the Supercomputing Center of Wuhan University. We extend our sincere gratitude to Professor Ying-Xiong Qiu from the Wuhan Botanical Garden for his valuable suggestions.

Author information

Author notes

  1. These authors contributed equally: Fangfang Chen, Yan Yao.

Authors and Affiliations

  1. Department of Urology, Zhongnan Hospital of Wuhan University, Hubei Provincial Research Center for Basic Biological Science, School of Pharmaceutical Sciences, Wuhan University, Wuhan, Hubei, China

    Fangfang Chen, Yan Yao, Hangzhi Zhu, Jie Hu, Aohan Geng, Zhenni Xu, Xueting Fang, Zixin Deng & Li Lu

  2. State Key Laboratory of Hybrid Rice, Hubei Hongshan Laboratory, School of Pharmaceutical Sciences, Wuhan University, Wuhan, Hubei, China

    Fangfang Chen, Yan Yao, Hangzhi Zhu, Jie Hu, Aohan Geng & Li Lu

  3. Department of Pharmacy, Renmin Hospital of Wuhan University, Wuhan, Hubei, China

    Weijia Cheng

  4. Wuhan Hesheng Technology Co., Ltd., Wuhan, Hubei, China

    Yao Zhi

  5. Laboratory of Medicinal Plant, Hubei University of Medicine, Shiyan, Hubei, China

    Yonghong Zhang

  6. School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China

    Zixin Deng & Tiangang Liu

Authors

  1. Fangfang Chen
  2. Yan Yao
  3. Hangzhi Zhu
  4. Jie Hu
  5. Aohan Geng
  6. Zhenni Xu
  7. Xueting Fang
  8. Weijia Cheng
  9. Yao Zhi
  10. Yonghong Zhang
  11. Zixin Deng
  12. Tiangang Liu
  13. Li Lu

Contributions

F.C. analyzed the data. Y.Y., H.Z., J.H., A.G., Z.X., X.F., W.C., Y. Zhi, and Y. Zhang performed the experiments. F.C. and L.L. conceived the project, designed the experiments, and prepared the manuscript. Z.D. and T.L. discussed the results and contributed to the final manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Li Lu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Jianquan Liu, Philipp Zerbe and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, F., Yao, Y., Zhu, H. et al. A chromosome-level genome of Astilbe chinensis unveils the evolution of a terpene biosynthetic gene cluster. Nat Commun 16, 9869 (2025). https://doi.org/10.1038/s41467-025-64842-9

  • DOI: https://doi.org/10.1038/s41467-025-64842-9