Simultaneous epigenomic profiling and regulatory activity measurement using e2MPRA

Data availability

The e2MPRA sequencing data generated in this study, including association barcode sequencing data and barcode sequencing data for lentiMPRA, as well as ATAC-seq and CUT&Tag-seq data, have been deposited in the DDBJ database under accession code PRJDB39977 and at the Zenodo repository (refs. 46, 47). Publicly available H3K27ac ChIP-seq data (ENCFF084DIM, ENCFF515WSE, ENCFF759SNY) and ATAC-seq data (ENCFF622FRD, ENCFF024GLW, ENCFF240VVR, ENCFF782GKX, ENCFF029XKY) for HepG2 cells were downloaded from the ENCODE portal. Source data are provided with this paper.

Code availability

The code used for data processing and analysis in this study is available at https://github.com/ziczhang/e2MPRA_analysis and at the Zenodo repository (ref. 48).

References

  1. Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).

  2. Boyle, A. P. et al. High-resolution mapping and characterization of open chromatin across the genome. Cell 132, 311–322 (2008).

  3. Johnson, D. S., Mortazavi, A., Myers, R. M. & Wold, B. Genome-wide mapping of in vivo protein-DNA interactions. Science 316, 1497–1502 (2007).

  4. Skene, P. J. & Henikoff, S. An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. eLife 6, https://doi.org/10.7554/eLife.21856 (2017).

  5. Skene, P. J., Henikoff, J. G. & Henikoff, S. Targeted in situ genome-wide profiling with high efficiency for low cell numbers. Nat. Protoc. 13, 1006–1019 (2018).

  6. Kaya-Okur, H. S. et al. CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat. Commun. 10, 1930 (2019).

  7. Inoue, F. & Ahituv, N. Decoding enhancers using massively parallel reporter assays. Genomics 106, 159–164 (2015).

  8. Inoue, F. et al. A systematic comparison reveals substantial differences in chromosomal versus episomal encoding of enhancer activity. Genome Res. 27, 38–52 (2017).

  9. Kreimer, A., Yan, Z., Ahituv, N. & Yosef, N. Meta-analysis of massively parallel reporter assays enables prediction of regulatory function across cell types. Hum. Mutat. https://doi.org/10.1002/humu.23820 (2019).

  10. Agarwal, V. et al. Massively parallel characterization of transcriptional regulatory elements. Nature https://doi.org/10.1038/s41586-024-08430-9 (2025).

  11. Smith, R. P. et al. Massively parallel decoding of mammalian regulatory sequences supports a flexible organizational model. Nat. Genet. 45, 1021–1028 (2013).

  12. Gordon, M. G. et al. lentiMPRA and MPRAflow for high-throughput functional characterization of gene regulatory elements. Nat. Protoc. 15, 2387–2412 (2020).

  13. Bernstein, B. E. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

  14. Georgakopoulos-Soares, I. et al. Transcription factor binding site orientation and order are major drivers of gene regulatory activity. Nat. Commun. 14, 2333 (2023).

  15. Beucher, A. et al. The HASTER lncRNA promoter is a cis-acting transcriptional stabilizer of HNF1A. Nat. Cell Biol. 24, 1528–1540 (2022).

  16. Heller, S. et al. Transcriptional changes and the role of ONECUT1 in hPSC pancreatic differentiation. Commun. Biol. 4, 1298 (2021).

  17. Viswakarma, N. et al. Coactivators in PPAR-regulated gene expression. PPAR Res. 2010, 1–21 (2010).

  18. Lutz, M. Transcriptional repression by the insulator protein CTCF involves histone deacetylases. Nucleic Acids Res. 28, 1707–1713 (2000).

  19. Chong, J. A. et al. REST: A mammalian silencer protein that restricts sodium channel gene expression to neurons. Cell 80, 949–957 (1995).

  20. Schoenherr, C. J. & Anderson, D. J. The neuron-restrictive silencer factor (NRSF): a coordinate repressor of multiple neuron-specific genes. Science 267, 1360–1363 (1995).

  21. Chew, J.-L. et al. Reciprocal transcriptional regulation of Pou5f1 and Sox2 via the Oct4/Sox2 complex in embryonic stem cells. Mol. Cell Biol. 25, 6031–6046 (2005).

  22. Zaret, K. S., Lerner, J. & Iwafuchi-Doi, M. Chromatin scanning by dynamic binding of pioneer factors. Mol. Cell 62, 665–667 (2016).

  23. Pop, R. T. et al. Identification of mammalian transcription factors that bind to inaccessible chromatin. Nucleic Acids Res. 51, 8480–8495 (2023).

  24. Qureshi, I. A., Gokhan, S. & Mehler, M. F. REST and CoREST are transcriptional and epigenetic regulators of seminal neural fate decisions. Cell Cycle 9, 4477–4486 (2010).

  25. Griffith, E. C., Cowan, C. W. & Greenberg, M. E. REST acts through multiple deacetylase complexes. Neuron 31, 339–340 (2001).

  26. Yoo, W. et al. Molecular basis for SOX2-dependent regulation of super-enhancer activity. Nucleic Acids Res. 51, 11999–12019 (2023).

  27. Rodda, D. J. et al. Transcriptional regulation of nanog by OCT4 and SOX2. J. Biol. Chem. 280, 24731–24737 (2005).

  28. Wang, J. et al. YY1 positively regulates transcription by targeting promoters and super-enhancers through the BAF complex in embryonic stem cells. Stem Cell Rep. 10, 1324–1339 (2018).

  29. Zou, Z., Ohta, T., Miura, F. & Oki, S. ChIP-Atlas 2021 update: a data-mining suite for exploring epigenomic landscapes by fully integrating ChIP-seq, ATAC-seq and Bisulfite-seq data. Nucleic Acids Res. 50, W175–W182 (2022).

  30. Grant, C. E., Bailey, T. L. & Noble, W. S. FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017–1018 (2011).

  31. Corces, M. R. et al. An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat. Methods 14, 959–962 (2017).

  32. Chen, S. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. iMeta 2, https://doi.org/10.1002/imt2.107 (2023).

  33. Aronesty, E. Comparison of sequencing utility programs. Open Bioinforma. J. 7, 1–8 (2013).

  34. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  35. Smith, T., Heger, A. & Sudbery, I. UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy. Genome Res. 27, 491–499 (2017).

  36. Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, https://doi.org/10.1093/gigascience/giab008 (2021).

  37. Robinson, M. D. & Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 11, R25 (2010).

  38. Wang, Q. S. et al. Statistically and functionally fine-mapped blood eQTLs and pQTLs from 1,405 humans reveal distinct regulation patterns and disease relevance. Nat. Genet. https://doi.org/10.1038/s41588-024-01896-3 (2024).

  39. Meuleman, W. et al. Index and biological spectrum of human DNase I hypersensitive sites. Nature 584, 244–251 (2020).

  40. Seabold, S. & Perktold, J. Statsmodels: Econometric and Statistical Modeling with Python. In, pp. 92–96 (2010).

  41. Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).

  42. Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 57, 289–300 (1995).

  43. Kircher, M. et al. Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nat. Commun. 10, 3583 (2019).

  44. Ashuach, T. et al. MPRAnalyze: statistical framework for massively parallel reporter assays. Genome Biol. 20, 183 (2019).

  45. Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8, 679–698 (1986).

  46. Zhang, Z. e2MPRA reads for HepG2 cell line [Data set]. Zenodo https://doi.org/10.5281/zenodo.15428846 (2025).

  47. Zhang, Z. e2MPRA reads for WTC11 cell line [Data set]. Zenodo https://doi.org/10.5281/zenodo.15469962 (2025).

  48. Zhang, Z. Code archive for generating figures for the e2MPRA paper. Zenodo https://doi.org/10.5281/zenodo.18052569 (2025).

  49. Andrews, G. et al. Mammalian evolution of human cis-regulatory elements and transcription factor binding sites. Science 380, eabn7930 (2023).

Acknowledgements

This work was supported by the World Premier International Research Center Initiative (WPI), MEXT Japan, MEXT KAKENHI Grant Numbers JP24K02004 (F.I.), JP24K18101 (Z.Z.), and AMED under Grant Number JP24gm7010002 (F.I.). This work was funded in part by the National Human Genome Research Institute grant numbers 1R21HG010683 (N.A.), 1UM1HG009408 (N.A.) and 1UM1HG011966 (N.A.). We thank the Single-Cell Genome Information Analysis Core (SignAC) at WPI-ASHBi, Kyoto University, for their support. The WTC11 cell line was kindly provided by Dr. Bruce R. Conklin (The Gladstone Institutes and UCSF).

Author information

Authors and Affiliations

  1. Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan

    Zicong Zhang, Guillaume Bourque & Fumitaka Inoue

  2. Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA

    Ilias Georgakopoulos-Soares

  3. Department of Human Genetics, McGill University, Montréal, QC, Canada

    Guillaume Bourque

  4. Victor Phillip Dahdaleh Institute of Genomic Medicine at McGill University, Montréal, QC, Canada

    Guillaume Bourque

  5. Canadian Center for Computational Genomics, McGill University, Montréal, QC, Canada

    Guillaume Bourque

  6. Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA

    Nadav Ahituv

  7. Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA

    Nadav Ahituv

Authors

  1. Zicong Zhang
  2. Ilias Georgakopoulos-Soares
  3. Guillaume Bourque
  4. Nadav Ahituv
  5. Fumitaka Inoue

Contributions

F.I. and N.A. conceived the study. Z.Z., I.G. and F.I. designed the e2MPRA library. Z.Z. and F.I. performed experiments. Z.Z. analyzed data. Z.Z., F.I. and N.A. wrote the paper. I.G. and G.B. assisted with manuscript writing and editing.

Corresponding authors

Correspondence to Nadav Ahituv or Fumitaka Inoue.

Ethics declarations

Competing interests

F.I. receives funding from Relation Therapeutics. N.A. is a cofounder of, and serves on the scientific advisory board of, Regel Therapeutics Inc. N.A. received funding from BioMarin Pharmaceutical Inc. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Nicolae Radu Zabet and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, Z., Georgakopoulos-Soares, I., Bourque, G. et al. Simultaneous epigenomic profiling and regulatory activity measurement using e2MPRA. Nat Commun (2026). https://doi.org/10.1038/s41467-026-68422-3

  • DOI: https://doi.org/10.1038/s41467-026-68422-3