A research team co-led by scientists at the Netherlands Cancer Institute (NKI) and Oncode Institute has developed a deep learning model, PARM (promoter activity regulatory model), that offers new insight into the regulation of human promoters by transcription factors, and thus into how genes know when to switch on or off.
The researchers say scientists can now start to use the tool for reading these genetic instructions, creating leads for new cancer diagnostics, patient stratification, and future therapies. They also suggest that gene regulation is far more predictable than previously believed.
“We can now actually read the language of the gene control system,” said Bas van Steensel, PhD, group leader at the Netherlands Cancer Institute (NKI) and Oncode Institute, who is co-senior and co-corresponding author of the team’s published paper in Nature. “Our PARM model allows us to uncover these rules at scale, so we can now understand, and even predict, how regulatory DNA controls gene activity.” In their paper, titled “Regulatory grammar in human promoters uncovered by MPRA-based deep learning,” the authors concluded, “Our approach provides a highly economic strategy towards a deeper understanding of the dynamic regulation of human promoters by transcription factors.”
Promoters are the core regulatory elements of all genes, the investigators wrote. “Their activity ensures the correct transcription level of each individual gene, which is essential for cellular homeostasis and responses to a wide range of signals.” Van Steensel further explained, “The classical genetic code explains how genes in our DNA encode proteins, but for most genes, we honestly didn’t understand how they are regulated. We know that the DNA between our genes contains regulatory elements such as promoters. However, the language of this control system that decides whether a gene turns on or off, in which cell, and how strongly was largely unknown.”
At the same time, most cancer-related mutations are located in the non-coding part of our genome, and until now, interpreting such mutations has been extremely difficult. “The construction of computational models that can predict promoter activity from a DNA sequence is challenging,” the scientists pointed out.
The development of PARM grew out of a bold mission to decode the genome’s operating system, with seven research groups joining forces in Oncode Institute’s PERICODE project. The work combined lab experimentation and computation.
![Hatice Yücel and Max Trauernicht from Bas van Steensel's research group at the Netherlands Cancer Institute, where the technology underlying the new AI model PARM was developed. [©Netherlands Cancer Institute / Sanne Hijlkema]](https://www.genengnews.com/wp-content/uploads/2026/02/Low-Res_Pericode-Hatice-Yucel-Max-Trauernicht-0269-300x211.jpg)
The researchers describe the PARM model as “… a cell-type-specific deep-learning model trained on specially designed massively parallel reporter assays (MPRAs) that query human promoter sequences.” The MPRA technology, developed in the van Steensel lab at the NKI, allowed researchers to measure gene regulation at an unprecedented scale. But data alone doesn’t necessarily provide insight, so scientists in the lab of Jeroen de Ridder, PhD, from UMC Utrecht and Oncode Institute, then entered the picture.
The volume of data specifically targeted to gene regulation was used to train AI models that truly captured the biological rules underlying gene activation. “… we present a platform that combines an optimized MPRA with deep learning to efficiently construct sequence-to-activity models of all human promoters,” the authors noted. De Ridder added, “Most AI models learn from whatever data happens to exist. Here, the measurements and the AI were designed together. This allowed us to make super-efficient models for specific cell types that could be applied at a scale previously unthinkable.”
The new model enabled the team to predict how gene regulation differs between cell types and how it changes when cells are exposed to stimuli such as specific drugs. Moreover, the model revealed in extreme detail the architecture of each gene’s ‘on and off buttons.’ “We leveraged PARM to systematically identify binding sites of transcription factors that probably contribute to the activity of each natural human promoter and to detect the rewiring of these regulatory interactions after various stimuli to the cells,” the investigators explained. Crucially, the team did not stop at prediction. Every model output was subjected to rigorous experimental testing to confirm that the predictions were correct.
![This screenshot of the PARM model shows one of the genes described in the Nature paper (APOC2). Several DNA letters standing upright next to each other usually mean that a transcription factor binds there and activates the gene. [©Netherlands Cancer Institute]](https://www.genengnews.com/wp-content/uploads/2026/02/Low-Res_Screenshot_PARM_viewer-300x170.jpg)
Despite notable progress in the field, existing AI models may be either too computationally heavy to apply to the vast numbers of mutations that exist or too generic to adequately capture cell type variability. The PARM model changes that. It allows researchers to predict the functional impact of regulatory mutations in specific cell types and under specific conditions, such as drug treatments, opening new paths for cancer diagnostics, patient stratification, and future therapies.
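For readers curious how a sequence-to-activity model can be used to score mutations, the general idea can be sketched in a few lines of Python. This is not PARM's code or interface: `toy_activity` is a hypothetical stand-in for a trained model (a single made-up 4-bp motif detector), and the promoter string is invented. The sketch only illustrates in silico mutagenesis, in which every possible single-base substitution is scored against the unmutated sequence.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) one-hot array."""
    idx = [BASES.index(b) for b in seq]
    return np.eye(4)[idx]

# Hypothetical stand-in for a trained model: one motif detector
# that rewards matches to the made-up motif "TGAC".
TOY_MOTIF = one_hot("TGAC")

def toy_activity(seq):
    """Return the best motif match count over all windows (0 to 4)."""
    x = one_hot(seq)
    k = len(TOY_MOTIF)
    scores = [np.sum(x[i:i + k] * TOY_MOTIF) for i in range(len(seq) - k + 1)]
    return float(max(scores))

def mutation_effects(seq, predict):
    """Predicted activity change for every single-base substitution."""
    ref = predict(seq)
    effects = np.zeros((len(seq), 4))  # rows: positions, columns: A, C, G, T
    for pos, ref_base in enumerate(seq):
        for j, alt in enumerate(BASES):
            if alt == ref_base:
                continue  # the reference base is not a mutation; leave 0
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects[pos, j] = predict(mutant) - ref
    return effects

promoter = "AATGACGG"  # invented example sequence
effects = mutation_effects(promoter, toy_activity)
# First position where some substitution lowers the predicted activity most
most_damaging = int(np.argmin(effects.min(axis=1)))
```

In this toy case, substitutions inside the motif (positions 2 to 5, counting from 0) lower the score by one, while substitutions elsewhere leave it unchanged. A real model would replace `toy_activity` with cell-type-specific predictions.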
In their newly published paper, van Steensel, de Ridder, and colleagues stated, “With this platform, named PARM, both data generation and computational modeling are highly economical. This development enabled us to construct sequence-to-activity models for all human promoters in ten different cell types and after exposure of cells to several stimuli.”
Van Steensel referred to Google DeepMind, which recently published details in Nature about its AlphaGenome model, aimed at understanding gene regulation. “This is a great model,” van Steensel noted. “However, PARM is more flexible and it is experimentally and computationally lightweight. The tool requires around 1000 times less computing power than AlphaGenome, making it far more feasible for academic researchers around the world. With this model you only need one petri dish of cells and one day of computing to see in detail how a particular cell type, such as a tumor cell, uses its DNA code to respond to a signal such as a hormone, nutrient or drug.”
In their newly reported paper, van Steensel and colleagues concluded, “PARM complements other deep-learning approaches to model the grammar of enhancer elements or to design artificial promoters … and demonstrates that lightweight models trained on small functional genomic datasets are a viable and powerful alternative to massive modelling efforts.”
The post Deep Learning Model Developed to Help Predict Functional Impact of Regulatory Mutations appeared first on GEN – Genetic Engineering and Biotechnology News.
