Active learning-guided optimization of cell-free biosensors for lead testing in drinking water

References

  1. Libis, V., Delépine, B. & Faulon, J.-L. Sensing new chemicals with bacterial transcription factors. Curr. Opin. Microbiol. 33, 105–112 (2016).

  2. Ding, N., Zhou, S. & Deng, Y. Transcription-factor-based biosensor engineering for applications in synthetic biology. ACS Synth. Biol. 10, 911–922 (2021).

  3. De Paepe, B., Peters, G., Coussement, P., Maertens, J. & De Mey, M. Tailor-made transcriptional biosensors for optimizing microbial cell factories. J. Ind. Microbiol. Biotechnol. 44, 623–645 (2017).

  4. Englund, E. et al. Biosensor guided polyketide synthases engineering for optimization of domain exchange boundaries. Nat. Commun. 14, 4871 (2023).

  5. Thavarajah, W. et al. A primer on emerging field-deployable synthetic biology tools for global water quality monitoring. npj Clean. Water 3, 18 (2020).

  6. Silverman, A. D., Akova, U., Alam, K. K., Jewett, M. C. & Lucks, J. B. Design and optimization of a cell-free atrazine biosensor. ACS Synth. Biol. 9, 671–677 (2020).

  7. Corbisier, P. et al. Whole cell- and protein-based biosensors for the detection of bioavailable heavy metals in environmental samples. Anal. Chim. Acta 387, 235–244 (1999).

  8. Thavarajah, W. et al. Point-of-use detection of environmental fluoride via a cell-free riboswitch-based biosensor. ACS Synth. Biol. 9, 10–18 (2019).

  9. Jung, J. K. et al. Cell-free biosensors for rapid detection of water contaminants. Nat. Biotechnol. 38, 1451–1459 (2020).

  10. Cao, J. et al. Harnessing a previously unidentified capability of bacterial allosteric transcription factors for sensing diverse small molecules in vitro. Sci. Adv. 4, eaau4602 (2018).

  11. Grazon, C. et al. A progesterone biosensor derived from microbial screening. Nat. Commun. 11, 1276 (2020).

  12. Taylor, N. D. et al. Engineering an allosteric transcription factor to respond to new ligands. Nat. Methods 13, 177–183 (2015).

  13. Voyvodic, P. L. et al. Plug-and-play metabolic transducers expand the chemical detection space of cell-free biosensors. Nat. Commun. 10, 1697 (2019).

  14. Pardee, K. et al. Paper-based synthetic gene networks. Cell 159, 940–954 (2014).

  15. Pardee, K. et al. Rapid, low-cost detection of Zika virus using programmable biomolecular components. Cell 165, 1255–1266 (2016).

  16. Nguyen, P. Q. et al. Wearable materials with embedded synthetic biology sensors for biomolecule detection. Nat. Biotechnol. 39, 1366–1374 (2021).

  17. Hossain, G. S., Saini, M., Miyake, R., Ling, H. & Chang, M. W. Genetic biosensor design for natural product biosynthesis in microorganisms. Trends Biotechnol. 38, 797–810 (2020).

  18. Landry, B. P., Palanki, R., Dyulgyarov, N., Hartsough, L. A. & Tabor, J. J. Phosphatase activity tunes two-component system sensor detection threshold. Nat. Commun. 9, 1433 (2018).

  19. Meyer, A. J., Segall-Shapiro, T. H., Glassey, E., Zhang, J. & Voigt, C. A. Escherichia coli “Marionette” strains with 12 highly optimized small-molecule sensors. Nat. Chem. Biol. 15, 196–204 (2018).

  20. Chemla, Y. et al. Hyperspectral reporters for long-distance and wide-area detection of gene expression in living bacteria. Nat. Biotechnol. https://doi.org/10.1038/s41587-025-02622-y (2025).

  21. Wen, K. Y. et al. A cell-free biosensor for detecting quorum sensing molecules in P. aeruginosa-infected respiratory samples. ACS Synth. Biol. 6, 2293–2301 (2017).

  22. Boyd, M. A., Thavarajah, W., Lucks, J. B. & Kamat, N. P. Robust and tunable performance of a cell-free biosensor encapsulated in lipid vesicles. Sci. Adv. 9, eadd6605 (2023).

  23. Gambill, L., Staubus, A., Mo, K. W., Ameruoso, A. & Chappell, J. A split ribozyme that links detection of a native RNA to orthogonal protein outputs. Nat. Commun. 14, 543 (2023).

  24. McSweeney, M. A. et al. A modular cell-free protein biosensor platform using split T7 RNA polymerase. Sci. Adv. 11, eado6280 (2025).

  25. Lubkowicz, D. et al. Reprogramming probiotic Lactobacillus reuteri as a biosensor for Staphylococcus aureus derived AIP-I detection. ACS Synth. Biol. 7, 1229–1237 (2018).

  26. Leander, M., Yuan, Y., Meger, A., Cui, Q. & Raman, S. Functional plasticity and evolutionary adaptation of allosteric regulation. Proc. Natl. Acad. Sci. USA 117, 25445–25454 (2020).

  27. Süel, G. M. et al. Evolutionarily conserved networks of residues mediate allosteric communication in proteins. Nat. Struct. Biol. 10, 59–69 (2002).

  28. Nishikawa, K. K. et al. Highly multiplexed design of an allosteric transcription factor to sense new ligands. Nat. Commun. 15, 10001 (2024).

  29. d’Oelsnitz, S. et al. Using fungible biosensors to evolve improved alkaloid biosyntheses. Nat. Chem. Biol. 18, 981–989 (2022).

  30. Machado, L. F. M., Currin, A. & Dixon, N. Directed evolution of the PcaV allosteric transcription factor to generate a biosensor for aromatic aldehydes. J. Biol. Eng. 13, 91 (2019).

  31. Yang, J. et al. Active learning-assisted directed evolution. Nat. Commun. 16, 714 (2025).

  32. Vidal, L. S., Isalan, M., Heap, J. T. & Ledesma-Amaro, R. A primer to directed evolution: current methodologies and future directions. RSC Chem. Biol. 4, 271–291 (2023).

  33. Wu, Z., Kan, S. B. J., Lewis, R. D., Wittmann, B. J. & Arnold, F. H. Machine learning-assisted directed protein evolution with combinatorial libraries. Proc. Natl. Acad. Sci. USA 116, 8852–8858 (2019).

  34. Biswas, S., Khimulya, G., Alley, E. C., Esvelt, K. M. & Church, G. M. Low-N protein engineering with data-efficient deep learning. Nat. Methods 18, 389–396 (2021).

  35. Qiu, Y., Hu, J. & Wei, G.-W. Cluster learning-assisted directed evolution. Nat. Comput. Sci. 1, 809–818 (2021).

  36. Zhang, Q. et al. Integrating protein language models and automatic biofoundry for enhanced protein evolution. Nat. Commun. 16, 1553 (2025).

  37. Huang, C. et al. Application of directed evolution and machine learning to enhance the diastereoselectivity of ketoreductase for dihydrotetrabenazine synthesis. JACS Au 4, 2547–2556 (2024).

  38. Lobzaev, E., Herrera, M. A., Kasprzyk, M. & Stracquadanio, G. Protein engineering using variational free energy approximation. Nat. Commun. 15, 10447 (2024).

  39. Hie, B., Bryson, B. D. & Berger, B. Leveraging uncertainty in machine learning accelerates biological discovery and design. Cell Syst. 11, 461–477 (2020).

  40. Hayes, T. et al. Simulating 500 million years of evolution with a language model. Science 387, 850–858 (2025).

  41. Chen, B. et al. xTrimoPGLM: unified 100-billion-parameter pretrained transformer for deciphering the language of proteins. Nat. Methods 22, 1028–1039 (2025).

  42. Ferruz, N., Schmidt, S. & Höcker, B. ProtGPT2 is a deep unsupervised language model for protein design. Nat. Commun. 13, 4348 (2022).

  43. Nijkamp, E., Ruffolo, J. A., Weinstein, E. N., Naik, N. & Madani, A. ProGen2: Exploring the boundaries of protein language models. Cell Syst. 14, 968–978 (2023).

  44. Nguyen, E. et al. Sequence modeling and design from molecular to genome scale with Evo. Science 386, eado9336 (2024).

  45. Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

  46. Hie, B. L. et al. Efficient evolution of human antibodies from general protein language models. Nat. Biotechnol. 42, 275–283 (2023).

  47. Yang, J., Li, F.-Z. & Arnold, F. H. Opportunities and challenges for machine learning-assisted enzyme engineering. ACS Cent. Sci. 10, 226–241 (2024).

  48. Saito, Y. et al. Machine-learning-guided library design cycle for directed evolution of enzymes: the effects of training data composition on sequence space exploration. ACS Catal. 11, 14615–14624 (2021).

  49. Landwehr, G. M. et al. Accelerated enzyme engineering by machine-learning guided cell-free expression. Nat. Commun. 16, 865 (2025).

  50. Ding, K. et al. Machine learning-guided co-optimization of fitness and diversity facilitates combinatorial library design in enzyme engineering. Nat. Commun. 15, 6392 (2024).

  51. Kim, G. B., Gao, Y., Palsson, B. O. & Lee, S. Y. DeepTFactor: A deep learning-based tool for the prediction of transcription factors. Proc. Natl. Acad. Sci. USA 118, e2021171118 (2021).

  52. Zeng, W., Dou, Y., Pan, L., Xu, L. & Peng, S. Improving prediction performance of general protein language model by domain-adaptive pretraining on DNA-binding protein. Nat. Commun. 15, 7838 (2024).

  53. Yang, K. K., Wu, Z. & Arnold, F. H. Machine-learning-guided directed evolution for protein engineering. Nat. Methods 16, 687–694 (2019).

  54. Watson, J. L. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089–1100 (2023).

  55. Sumida, K. H. et al. Improving protein expression, stability, and function with proteinMPNN. J. Am. Chem. Soc. 146, 2054–2061 (2024).

  56. Khersonsky, O. et al. Automated design of efficient and functionally diverse enzyme repertoires. Mol. Cell 72, 178–186 (2018).

  57. Widatalla, T., Rafailov, R. & Hie, B. Aligning protein generative models with experimental fitness via Direct Preference Optimization. Preprint at https://doi.org/10.1101/2024.05.20.595026 (2024).

  58. Padmakumar, V., Pang, R. Y., He, H. & Parikh, A. P. Extrapolative Controlled Sequence Generation via Iterative Refinement. Preprint at https://doi.org/10.48550/arXiv.2303.04562 (2023).

  59. Yang, Z. et al. Does negative sampling matter? a review with insights into its theory and applications. IEEE Trans. Pattern Anal. Mach. Intell. 46, 5692–5711 (2024).

  60. Carlson, E. D., Gan, R., Hodgman, C. E. & Jewett, M. C. Cell-free protein synthesis: Applications come of age. Biotechnol. Adv. 30, 1185–1194 (2012).

  61. Silverman, A. D., Karim, A. S. & Jewett, M. C. Cell-free gene expression: an expanded repertoire of applications. Nat. Rev. Genet. 21, 151–170 (2019).

  62. Hunt, A. C. et al. Cell-free gene expression: methods and applications. Chem. Rev. 125, 91–149 (2024).

  63. Ekas, H. M. et al. An automated cell-free workflow for transcription factor engineering. ACS Synth. Biol. 13, 3389–3399 (2024).

  64. Ekas, H. M. et al. Engineering a PbrR-based biosensor for cell-free detection of lead at the legal limit. ACS Synth. Biol. 13, 3003–3012 (2024).

  65. Monchy, S. B. et al. Plasmids pMOL28 and pMOL30 of Cupriavidus metallidurans are specialized in the maximal viable response to heavy metals. J. Bacteriol. 189, 7417–7425 (2007).

  66. Jarvis, P. & Fawell, J. Lead in drinking water – An ongoing public health concern? Curr. Opin. Environ. Sci. Health 20, 100239 (2021).

  67. Zietz, B. P., Laß, J., Suchenwirth, R. & Dunkelberg, H. Lead in drinking water as a public health challenge. Environ. Health Perspect. 118, a154–a155 (2010).

  68. WHO. Lead poisoning, https://www.who.int/news-room/fact-sheets/detail/lead-poisoning-and-health (2024).

  69. EPA. Lead Service Lines, https://www.epa.gov/ground-water-and-drinking-water/lead-service-lines (2025).

  70. Borremans, B., Hobman, J. L., Provoost, A., Brown, N. L. & van der Lelie, D. Cloning and functional analysis of the pbr lead resistance determinant of Ralstonia metallidurans CH34. J. Bacteriol. 183, 5651–5658 (2001).

  71. EPA. Basic Information about Lead in Drinking Water, https://www.epa.gov/ground-water-and-drinking-water/basic-information-about-lead-drinking-water (2025).

  72. Jia, X., Ma, Y., Bu, R., Zhao, T. & Wu, K. Directed evolution of a transcription factor PbrR to improve lead selectivity and reduce zinc interference through dual selection. AMB Express 10, 67 (2020).

  73. EPA. Drinking Water Regulations and Contaminants, https://www.epa.gov/sdwa/drinking-water-regulations-and-contaminants (2025).

  74. Liu, X. et al. Design of a transcriptional biosensor for the portable, on-demand detection of cyanuric acid. ACS Synth. Biol. 9, 84–94 (2019).

  75. Thavarajah, W. et al. The accuracy and usability of point-of-use fluoride biosensors in rural Kenya. npj Clean. Water 6, 5 (2023).

  76. Maret, W. & Li, Y. Coordination dynamics of Zinc in Proteins. Chem. Rev. 109, 4682–4707 (2009).

  77. Cangelosi, V., Ruckthong, L. & Pecoraro, V. L. Lead: Its Effects on Environment and Health Ch.10 (De Gruyter, Berlin, 2017).

  78. Aravind, L., Anantharaman, V., Balaji, S., Babu, M. M. & Iyer, L. M. The many faces of the helix-turn-helix domain: Transcription regulation and beyond. FEMS Microbiol. Rev. 29, 231–262 (2005).

  79. Hastings, R., Aditham, A. K., DelRosso, N., Suzuki, P. H. & Fordyce, P. M. Mutations to transcription factor MAX allosterically increase DNA selectivity by altering folding and binding pathways. Nat. Commun. 16, 636 (2025).

  80. Warfel, K. F. et al. A low-cost, thermostable, cell-free protein synthesis platform for on-demand production of conjugate vaccines. ACS Synth. Biol. 12, 95–107 (2022).

  81. Stark, J. C. et al. On-demand biomanufacturing of protective conjugate vaccines. Sci. Adv. 7, eabe9444 (2021).

  82. Pardee, K. et al. Portable, on-demand biomolecular manufacturing. Cell 167, 248–259.e212 (2016).

  83. Collins, M. et al. A frugal CRISPR kit for equitable and accessible education in gene editing and synthetic biology. Nat. Commun. 15, 6563 (2024).

  84. Jung, J. K. et al. At-home, cell-free synthetic biology education modules for transcriptional regulation and environmental water quality monitoring. ACS Synth. Biol. 12, 2909–2921 (2023).

  85. Stark, J. C. et al. BioBits™ Bright: A fluorescent synthetic biology education kit. Sci. Adv. 4, eaat5107 (2018).

  86. Huang, A. et al. BioBits™ Explorer: A modular synthetic biology education kit. Sci. Adv. 4, eaat5105 (2018).

  87. Elnaggar, A. et al. ProtTrans: toward understanding the language of life through self-supervised learning. IEEE Trans. Pattern Anal. Mach. Intell. 44, 7112–7127 (2022).

  88. Post, M. A Call for Clarity in Reporting BLEU Scores. Proc. Third Conf. Mach. Transl. 1, 186–191 (2018).

  89. Hu, E. J. et al. LoRA: Low-Rank Adaptation of Large Language Models. Preprint at https://doi.org/10.48550/arXiv.2106.09685 (2021).

  90. Silverman, A. D., Kelley-Loughnane, N., Lucks, J. B. & Jewett, M. C. Deconstructing cell-free extract preparation for in vitro activation of transcriptional genetic circuitry. ACS Synth. Biol. 8, 403–414 (2018).

  91. Stark, J. C. et al. Rapid biosynthesis of glycoprotein therapeutics and vaccines from freeze-dried bacterial cell lysates. Nat. Protoc. 18, 2374–2398 (2023).

  92. Kwon, Y.-C. & Jewett, M. C. High-throughput preparation methods of crude extract for robust cell-free protein synthesis. Sci. Rep. 5, 8663 (2015).

  93. Jewett, M. C. & Swartz, J. R. Mimicking the Escherichia coli cytoplasmic environment activates long-lived and efficient cell-free protein synthesis. Biotechnol. Bioeng. 86, 19–26 (2004).

  94. Jewett, M. C. & Swartz, J. R. Substrate replenishment extends protein synthesis with an in vitro translation system designed to mimic the cytoplasm. Biotechnol. Bioeng. 87, 465–471 (2004).

  95. Jewett, M. C., Calhoun, K. A., Voloshin, A., Wuu, J. J. & Swartz, J. R. An integrated cell-free metabolic platform for protein production and synthetic biology. Mol. Syst. Biol. 4, 51–59 (2008).

  96. EPA. Method 200.8: Determination of Trace Elements in Waters and Wastes by Inductively Coupled Plasma-Mass Spectrometry (Cincinnati, OH, 1994).

  97. Yeghicheyan, D. et al. Collaborative determination of trace element mass fractions and isotope ratios in AQUA-1 drinking water certified reference material. Anal. Bioanal. Chem. 413, 4959–4978 (2021).

  98. Meng, E. C. et al. UCSF ChimeraX: Tools for structure building and analysis. Protein Sci. 32, e4792 (2023).

  99. Chai Discovery et al. Chai-1: Decoding the molecular interactions of life. Preprint at https://doi.org/10.1101/2024.10.10.615955 (2024).

  100. Wang, B. M. et al. Active learning-guided optimization of cell-free biosensors for lead testing in drinking water. Zenodo. https://doi.org/10.5281/zenodo.17351710 (2025).
