References
-
Kreier, F. The myriad ways sewage surveillance is helping fight COVID around the world. Nature https://doi.org/10.1038/d41586-021-01234-1 (2021).
-
Collins, F. S. & Varmus, H. A New Initiative on Precision Medicine. N. Engl. J. Med. 372, 793–795 (2015).
-
Vargas, A. J. & Harris, C. C. Biomarker development in the precision medicine era: lung cancer as a case study. Nat. Rev. Cancer 16, 525–537 (2016).
-
Tarazona, S., Arzalluz-Luque, A. & Conesa, A. Undisclosed, unmet and neglected challenges in multi-omics studies. Nat. Comput. Sci. 1, 395–402 (2021).
-
Lee, S. B. et al. Assessing a novel room temperature DNA storage medium for forensic biological samples. Forensic Sci. Int. Genet. 6, 31–40 (2012).
-
Ryder, O. A., McLaren, A., Brenner, S., Zhang, Y.-P. & Benirschke, K. DNA Banks for endangered animal species. Science 288, 275–277 (2000).
-
Brandies, P., Peel, E., Hogg, C. J. & Belov, K. The value of reference genomes in the conservation of threatened species. Genes 10, 846 (2019).
-
Kieffer, C., Genot, A. J., Rondelez, Y. & Gines, G. Molecular computation for molecular classification. Adv. Biol. 7, 2200203 (2023).
-
Zhang, D. Y. & Seelig, G. Dynamic DNA nanotechnology using strand-displacement reactions. Nat. Chem. 3, 103–113 (2011).
-
Lopez, R., Wang, R. & Seelig, G. A molecular multi-gene classifier for disease diagnostics. Nat. Chem. 10, 746–754 (2018).
-
Zhang, C. et al. Cancer diagnosis with DNA molecular computation. Nat. Nanotechnol. 15, 709–715 (2020).
-
Yin, F. et al. DNA-framework-based multidimensional molecular classifiers for cancer diagnosis. Nat. Nanotechnol. 18, 677–686 (2023).
-
Roundtree, I. A. & He, C. RNA epigenetics—chemical messages for posttranscriptional gene regulation. Curr. Opin. Chem. Biol. 30, 46–51 (2016).
-
Kan, R. L., Chen, J. & Sallam, T. Crosstalk between epitranscriptomic and epigenetic mechanisms in gene regulation. Trends Genet. 38, 182–193 (2022).
-
Helm, M. & Motorin, Y. Detecting RNA modifications in the epitranscriptome: predict and validate. Nat. Rev. Genet. 18, 275–291 (2017).
-
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
-
Elliott, P., Peakman, T. C. & Biobank, U. K. The UK Biobank sample handling and storage protocol for the collection, processing and archiving of human blood and urine. Int. J. Epidemiol. 37, 234–244 (2008).
-
Bull, R. A. et al. Analytical validity of nanopore sequencing for rapid SARS-CoV-2 genome analysis. Nat. Commun. 11, 6272 (2020).
-
Minogue, T. D., Koehler, J. W., Stefan, C. P. & Conrad, T. A. Next-generation sequencing for biodefense: biothreat detection, forensics, and the clinic. Clin. Chem. 65, 383–392 (2019).
-
Whitmore, L. et al. Inadvertent human genomic bycatch and intentional capture raise beneficial applications and ethical concerns with environmental DNA. Nat. Ecol. Evol. 7, 873–888 (2023).
-
Opitz, L. et al. Impact of RNA degradation on gene expression profiling. BMC Med. Genomics 3, 36 (2010).
-
Gallego Romero, I., Pai, A. A., Tung, J. & Gilad, Y. RNA-seq: impact of RNA degradation on transcript quantification. BMC Biol. 12, 42 (2014).
-
Mendy, M. et al. Biospecimens and Biobanking in Global Health. Glob. Health Pathol. 38, 183–207 (2018).
-
Ziyatdinov, A. et al. Genotyping, sequencing and analysis of 140,000 adults from Mexico City. Nature 622, 784–793 (2023).
-
Wall, J. D. et al. The GenomeAsia 100K project enables genetic discoveries across Asia. Nature 576, 106–111 (2019).
-
Naslavsky, M. S. et al. Whole-genome sequencing of 1,171 elderly admixed individuals from Brazil. Nat. Commun. 13, 1004 (2022).
-
Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience 10, giab008 (2021).
-
Bick, A. G. et al. Genomic data in the All of Us Research Program. Nature 627, 340–346 (2024).
-
Organick, L. et al. Random access in large-scale DNA data storage. Nat. Biotechnol. 36, 242–248 (2018).
-
Tomek, K. J. et al. Driving the scalability of DNA-based information storage systems. ACS Synth. Biol. 8, 1241–1248 (2019).
-
Banal, J. L. & Bathe, M. Scalable nucleic acid storage and retrieval using barcoded microcapsules. ACS Appl. Mater. Interfaces 13, 49729–49736 (2021).
-
Banal, J. L. et al. Random access DNA memory using Boolean search in an archival file storage system. Nat. Mater. 20, 1272–1280 (2021).
-
Organick, L. et al. Probing the physical limits of reliable DNA data retrieval. Nat. Commun. 11, 616 (2020).
-
Xu, Q., Schlabach, M. R., Hannon, G. J. & Elledge, S. J. Design of 240,000 orthogonal 25mer DNA barcode probes. Proc. Natl. Acad. Sci. USA 106, 2289–2294 (2009).
-
Porichis, F. et al. High-throughput detection of miRNAs and gene-specific mRNA at the single-cell level by flow cytometry. Nat. Commun. 5, 5641 (2014).
-
Goldstein, E., Lipsitch, M. & Cevik, M. On the effect of age on the transmission of SARS-CoV-2 in households, schools, and the community. J. Infect. Dis. 223, 362–369 (2021).
-
Fauver, J. R. et al. Coast-to-coast spread of SARS-CoV-2 during the Early Epidemic in the United States. Cell 181, 990–996 (2020).
-
Kishi, J. Y. et al. SABER amplifies FISH: enhanced multiplexed imaging of RNA and DNA in cells and tissues. Nat. Methods 16, 533–544 (2019).
-
Player, A. N., Shen, L.-P., Kenny, D., Antao, V. P. & Kolberg, J. A. Single-copy gene detection using branched DNA (bDNA) in situ hybridization. J. Histochem. Cytochem. 49, 603–611 (2001).
-
Tao, K. et al. The biological and clinical significance of emerging SARS-CoV-2 variants. Nat. Rev. Genet. 22, 757–773 (2021).
-
Bei, Y. et al. Overcoming variant mutation-related impacts on viral sequencing and detection methodologies. Front. Med. 9, 989913 (2022).
-
Karthikeyan, S. et al. Wastewater sequencing reveals early cryptic SARS-CoV-2 variant transmission. Nature 609, 101–108 (2022).
-
Lagerborg, K. A. et al. Synthetic DNA spike-ins (SDSIs) enable sample tracking and detection of inter-sample contamination in SARS-CoV-2 sequencing workflows. Nat. Microbiol. 7, 108–119 (2022).
-
Kubik, S. et al. Recommendations for accurate genotyping of SARS-CoV-2 using amplicon-based sequencing of clinical samples. Clin. Microbiol. Infect. 27, 1036.e1–1036.e8 (2021).
-
Rosenthal, S. H. et al. Development and validation of a high throughput SARS-CoV-2 whole genome sequencing workflow in a clinical laboratory. Sci. Rep. 12, 2054 (2022).
-
BigQuery public datasets. Google Cloud https://cloud.google.com/bigquery/public-data.
-
Open Datasets Documentation – Tutorials, API reference – Azure – Azure Open Datasets. https://learn.microsoft.com/en-us/azure/open-datasets/.
-
Open Data on AWS. https://aws.amazon.com/opendata/.
-
The Nucleic Acid Observatory Consortium. A global nucleic acid observatory for biodefense and planetary health. Preprint at arXiv:2108.02678 (2021).
-
Azenta Life Sciences. Cryogenic Storage Solutions in Life Sciences. https://www.azenta.com/learning-center/resources/cryogenic-storage-solutions-life-sciences-comprehensive-guide-decision-making (2024).
-
Bee, C. et al. Molecular-level similarity search brings computing to DNA data storage. Nat. Commun. 12, 4764 (2021).
-
Eldjarn, G. H. et al. Large-scale plasma proteomics comparisons through genetics and disease associations. Nature 622, 348–358 (2023).
-
Zhao, T. et al. Spatial genomics enables multi-modal study of clonal heterogeneity in tissues. Nature 601, 85–91 (2022).
-
Hunter, J. D. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 9, 90–95 (2007).
-
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
-
Knuth, D. E. The Art of Computer Programming, Volume 4, Fascicle 2: Generating All Tuples and Permutations. (Addison-Wesley, 2005).
-
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
-
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
-
Wilm, A. et al. LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets. Nucleic Acids Res. 40, 11189–11201 (2012).
-
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
-
McKenna, A. et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
-
Aksamentov, I., Roemer, C., Hodcroft, E. & Neher, R. Nextclade: clade assignment, mutation calling and quality control for viral genomes. J. Open Source Softw. 6, 3773 (2021).
-
Berleant, J. D., Banal, J. L., Rao, D. K. & Bathe, M. Enabling global-scale nucleic acid repositories through versatile, scalable biochemical selection from room-temperature archives. Zenodo https://doi.org/10.5281/ZENODO.10501347 (2025).
-
Berleant, J. D., Banal, J. L., Rao, D. K. & Bathe, M. Full datasets from: Enabling global-scale nucleic acid repositories through versatile, scalable biochemical selection from room-temperature archives. Zenodo https://doi.org/10.5281/ZENODO.17516191 (2025).
-
Berleant, J. D., Banal, J. L., Rao, D. K. & Bathe, M. lcbb/BiosampleSQL: Publication release. Zenodo https://doi.org/10.5281/ZENODO.17402438 (2025).
-
NIAID Visual & Medical Arts. Eppendorf Tube. NIAID NIH BIOART Source. bioart.niaid.nih.gov/bioart/143 (2024).
-
NIAID Visual & Medical Arts. 96 Well Plate. NIAID NIH BIOART source. bioart.niaid.nih.gov/bioart/7 (2024).
-
NIAID Visual & Medical Arts. Next gen sequencer. NIAID NIH BIOART source. bioart.niaid.nih.gov/bioart/386 (2024).
