Gene-set analysis seeks to identify the biological mechanisms underlying groups of genes with shared functions. Large language models (LLMs) have shown promise in generating functional descriptions for input gene sets, but they may produce factually incorrect statements, commonly referred to as hallucinations. LLMs are also prone to circular reasoning, in which generated results are fact-checked against the model’s own internal knowledge, reinforcing confidence in false outputs.
In a new study published in Nature Methods titled “GeneAgent: self-verification language agent for gene-set analysis using domain databases,” researchers at the National Institutes of Health (NIH) introduced GeneAgent, an LLM-based model for gene-set analysis that reduces hallucinations by autonomously interacting with biological databases to verify output. GeneAgent cross-checks its own initial predictions for accuracy against information from established, expert-curated databases and returns a verification report detailing its successes and failures.
Gene-set enrichment analysis (GSEA) is a cornerstone of functional genomics. Building on mRNA expression experiments and proteomics studies, it measures how strongly biological functions are represented in a set of genes or proteins.
GSEA typically compares clusters against predefined categories in manually curated databases, such as Gene Ontology (GO) and the Molecular Signatures Database (MSigDB). As gene sets exhibiting strong enrichment in the existing databases have often been well analyzed, an increasing number of recent studies have focused on gene sets that only marginally overlap with known functions.
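The database comparison described above is often quantified with an over-representation test: how unlikely is the observed overlap between a query gene set and a curated pathway under a random draw from the genome? The sketch below is illustrative only, not the method used by GeneAgent or any specific tool; the gene names and the 20,000-gene background are assumptions for the example.

```python
# Minimal sketch of a GSEA-style over-representation test: score the
# overlap between a query gene set and a curated pathway with a
# hypergeometric p-value. Gene names and background size are illustrative.
from math import comb

def overlap_pvalue(query, pathway, background_size):
    """P(overlap >= observed) under the hypergeometric null."""
    query, pathway = set(query), set(pathway)
    k = len(query & pathway)           # observed overlap
    n, K = len(query), len(pathway)
    total = comb(background_size, n)   # all ways to draw the query set
    # Sum tail probabilities for overlaps of size k or larger
    return sum(
        comb(K, i) * comb(background_size - K, n - i)
        for i in range(k, min(n, K) + 1)
    ) / total

# Hypothetical apoptosis-related query against a toy curated pathway
query   = {"TP53", "BAX", "CASP3", "MDM2"}
pathway = {"TP53", "BAX", "CASP3", "CASP9", "BCL2"}
p = overlap_pvalue(query, pathway, background_size=20000)
```

A small p-value indicates enrichment; the marginally overlapping gene sets mentioned above are exactly those where this test gives weak or no signal, motivating LLM-based annotation.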
GeneAgent mitigates the hallucination issue by independently comparing its claims to established knowledge compiled in external, expert-curated databases. The research team first tested GeneAgent on 1,106 gene sets sourced from existing databases with known functions and process names. For each gene set, GeneAgent generated an initial list of functional claims and independently used a self-verification agent module to cross-check against the curated databases.
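The generate-then-verify loop described above can be sketched in miniature. The real system queries live, expert-curated Web resources; the in-memory dictionary and the (gene, function) claim format below are toy stand-ins invented for illustration, not the authors' implementation.

```python
# Illustrative sketch (not GeneAgent's actual code) of self-verification:
# each functional claim an LLM makes about a gene is cross-checked against
# an expert-curated annotation database, and a verification report labels
# the claim supported or refuted. CURATED_DB is a toy stand-in for
# resources such as Gene Ontology.

CURATED_DB = {
    "NDUFA10": {"mitochondrial respiratory chain complex I"},
    "COX5A":   {"mitochondrial respiratory chain complex IV"},
    "ATP5F1A": {"mitochondrial respiratory chain complex V"},
}

def verify_claims(claims):
    """Cross-check (gene, function) claims against the curated database."""
    report = []
    for gene, function in claims:
        known = CURATED_DB.get(gene, set())
        verdict = "supported" if function in known else "refuted"
        report.append({"gene": gene, "claim": function, "verdict": verdict})
    return report

claims = [
    ("NDUFA10", "mitochondrial respiratory chain complex I"),
    ("COX5A",   "nuclear DNA repair"),  # a hallucinated claim
]
report = verify_claims(claims)
```

Refuted claims can then be dropped or revised before the final answer is produced, which is how grounding against external databases curbs the circular-reasoning failure mode.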
To determine accuracy, the researchers asked two human experts to manually judge whether GeneAgent’s self-verification reports were correct, partially correct, or incorrect for 10 randomly selected gene sets comprising 132 claims. The experts found 92% of GeneAgent’s decisions to be correct, indicating that the model’s self-verification is highly reliable compared with GPT-4.
As a real-world use case, the authors applied GeneAgent to seven novel gene sets derived from mouse B2905 melanoma cell lines. Two gene sets, mmu04015 (HA-S) and mmu05100 (HA-S), were assigned process names that aligned perfectly with the ground truth established by domain experts. Additionally, GeneAgent revealed novel biological insights for specific genes in the gene sets.
In one example, mmu05022 (LA-S), GeneAgent suggested gene functions related to subunits of complexes I, IV, and V of the mitochondrial respiratory chain and further summarized these genes under “respiratory chain complex.” In contrast, GPT-4 could only categorize the same genes under “oxidative phosphorylation,” a higher-level biological process built on the mitochondrial respiratory chain complexes, and omitted the gene Ndufa10, which represents an NADH subunit, from this process.
The results suggest that GeneAgent is applicable to nonhuman genes and more robust than GPT-4 for novel gene sets. These functional insights could fuel knowledge discovery for potential new drug targets for diseases, such as cancer.
The post GeneAgent Reduces AI Hallucinations in Gene Function Prediction appeared first on GEN – Genetic Engineering and Biotechnology News.