Data availability
The raw and processed RNA-seq data analysed in this study are available from the Gene Expression Omnibus (GEO; accession number PRJNA1364186).
References
-
Thind, A. S. et al. Demystifying emerging bulk RNA-Seq applications: the application and utility of bioinformatic methodology. Brief. Bioinform. 22, bbab259 (2021).
-
Mendelsohn, S. C., Verhage, S., Mulenga, H., Scriba, T. J. & Hatherill, M. Systematic review of diagnostic and prognostic host blood transcriptomic signatures of tuberculosis disease in people living with HIV. Gates Open. Res. 7, 27 (2023).
-
Gupta, R. K. et al. Concise whole blood transcriptional signatures for incipient tuberculosis: a systematic review and patient-level pooled meta-analysis. Lancet Respir Med. 8, 395–406 (2020).
-
Mulenga, H. et al. Performance of diagnostic and predictive host blood transcriptomic signatures for tuberculosis disease: A systematic review and meta-analysis. PLOS ONE. 15, e0237574 (2020).
-
Kaforou, M. et al. Transcriptomics for child and adolescent tuberculosis*. Immunol. Rev. 309, 97–122 (2022).
-
Darboe, F. et al. Detection of tuberculosis recurrence, diagnosis and treatment response by a blood transcriptomic risk signature in HIV-infected persons on antiretroviral therapy. Front. Microbiol. 10, 1441 (2019).
-
Gebremicael, G. et al. Gene expression profiles classifying clinical stages of tuberculosis and monitoring treatment responses in Ethiopian HIV-negative and HIV-positive cohorts. PLOS ONE. 14, e0226137 (2019).
-
Tornheim, J. A. et al. Transcriptomic profiles of confirmed pediatric tuberculosis patients and household contacts identifies active tuberculosis, infection, and treatment response among Indian children. J. Infect. Dis. 221, 1647–1658 (2020).
-
Thompson, E. G. et al. Host blood RNA signatures predict the outcome of tuberculosis treatment. Tuberculosis 107, 48–58 (2017).
-
Vargas, R. et al. Gene signature discovery and systematic validation across diverse clinical cohorts for TB prognosis and response to treatment. PLOS Comput. Biol. 19, e1010770 (2023).
-
O’Neil, D., Glowatz, H. & Schlumpberger, M. Ribosomal RNA depletion for efficient use of RNA-Seq capacity. Curr Protoc. Mol. Biol. 103, (2013).
-
Harrington, C. A. et al. RNA-Seq of human whole blood: evaluation of globin RNA depletion on Ribo-Zero library method. Sci. Rep. 10, 6271 (2020).
-
Dahlgren, A. R. et al. Comparison of Poly-A + Selection and rRNA depletion in detection of LncRNA in two equine tissues using RNA-seq. Non-Coding RNA. 6, 32 (2020).
-
Zhao, W. et al. Comparison of RNA-Seq by Poly (A) capture, ribosomal RNA depletion, and DNA microarray for expression profiling. BMC Genom. 15, 419 (2014).
-
Kumar, A. et al. The impact of RNA sequence library construction protocols on transcriptomic profiling of leukemia. BMC Genom. 18, 629 (2017).
-
Schuierer, S. et al. A comprehensive assessment of RNA-seq protocols for degraded and low-quantity samples. BMC Genom. 18, 442 (2017).
-
Zhao, S., Zhang, Y., Gamini, R., Zhang, B. & von Schack, D. Evaluation of two main RNA-seq approaches for gene quantification in clinical RNA sequencing: polyA + selection versus rRNA depletion. Sci. Rep. 8, 4781 (2018).
-
Sultan, M. et al. Influence of RNA extraction methods and library selection schemes on RNA-seq data. BMC Genom. 15, 675 (2014).
-
Cui, P. et al. A comparison between ribo-minus RNA-sequencing and polyA-selected RNA-sequencing. Genomics 96, 259–265 (2010).
-
Chen, L. et al. Paired rRNA-depleted and polyA-selected RNA sequencing data and supporting multi-omics data from human T cells. Sci. Data. 7, 376 (2020).
-
Jaksik, R., Drobna-Śledzińska, M. & Dawidowska, M. RNA-seq library preparation for comprehensive transcriptome analysis in cancer cells: the impact of insert size. Genomics 113, 4149–4162 (2021).
-
Guo, Y. et al. RNAseq by Total RNA Library Identifies Additional RNAs Compared to Poly(A) RNA Library. BioMed Res. Int. 1–9 (2015).
-
Chao, H. P. et al. Systematic evaluation of RNA-Seq preparation protocol performance. BMC Genom. 20, 571 (2019).
-
Gunda, R. et al. Cohort profile: the Vukuzazi (‘Wake up and know yourself’ in isiZulu) population science programme. Int. J. Epidemiol. 51, e131–e142 (2022).
-
Mendelsohn, S. C. et al. Prospective multicentre head-to-head validation of host blood transcriptomic biomarkers for pulmonary tuberculosis by real-time PCR. Commun. Med. 2, 26 (2022).
-
Wang, X. M. et al. Global transcriptomic characterization of T cells in individuals with chronic HIV-1 infection. Cell. Discov. 8, 29 (2022).
-
Herberg, J. A. et al. Diagnostic test accuracy of a 2-transcript host RNA signature for discriminating bacterial vs viral infection in febrile children. JAMA 316, 835 (2016).
-
Penn-Nicholson, A. et al. RISK6, a 6-gene transcriptomic signature of TB disease risk, diagnosis and treatment response. Sci. Rep. 10, 8629 (2020).
-
Roe, J. et al. Blood transcriptomic stratification of short-term risk in contacts of tuberculosis. Clin. Infect. Dis. ciz252 https://doi.org/10.1093/cid/ciz252 (2019).
-
Duffy, F. J. et al. Use of a contained Mycobacterium tuberculosis mouse infection model to predict active disease and containment in humans. J. Infect. Dis. 225, 1832–1840 (2022).
-
Yaari, G., Bolen, C. R., Thakar, J. & Kleinstein, S. H. Quantitative set analysis for gene expression: a method to quantify gene set differential expression including gene-gene correlations. Nucleic Acids Res. 41, e170–e170 (2013).
-
Liberzon, A. et al. The molecular signatures database hallmark gene set collection. Cell. Syst. 1, 417–425 (2015).
-
Wong, E. B. et al. Convergence of infectious and non-communicable disease epidemics in rural South Africa: a cross-sectional, population-based multimorbidity study. Lancet Glob Health. 9, e967–e976 (2021).
-
Andrews, S. FastQC: A Quality Control Tool for High Throughput Sequence Data [Online]. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (2010).
-
Ewels, P., Magnusson, M., Lundin, S. & Käller, M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics 32, 3047–3048 (2016).
-
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10 (2011).
-
Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods. 12, 357–360 (2015).
-
DeLuca, D. S. et al. RNA-SeQC: RNA-seq metrics for quality control and process optimization. Bioinformatics 28, 1530–1532 (2012).
-
Liao, Y., Smyth, G. K. & Shi, W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930 (2014).
-
Law, C. W., Chen, Y., Shi, W. & Smyth, G. K. voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 15, R29 (2014).
Acknowledgements
We thank the University of Cape Town’s high performance computing (HPC) centre, and the South African Centre for High Performance Computing (CHPC) for providing the computing resources for data transfer and analyses. We acknowledge the contributions of the Vukuzazi Study Team, the Vukuzazi participants, the AHRI Community Advisory Board, the KwaZulu-Natal Department of Health and the National Health Laboratory Service (NHLS).
Funding
This study is part of the Immune Mechanisms of Protection against Mycobacterium Tuberculosis Center (IMPAc-TB) (NIAID Contract 75N93019C00070). The baseline survey was supported by Wellcome Strategic Core award: [227167/A/23/Z]. EBW is supported by the Burroughs Wellcome Fund Pathogenesis of Infectious Diseases Award (1022002). For the purpose of open access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funders.
Ethics declarations
Competing interests
TJS is co-inventor of two patents of host-blood transcriptomic signatures of TB risk, RISK6 (Penn-Nicholson6) and RISK4 (Suliman4). The other authors disclose no competing interests.
Ethics approval and consent to participate
Ethical approval for the baseline and follow-up studies was obtained from the Ethics Committees of the University of KwaZulu-Natal, London School of Hygiene & Tropical Medicine, the Partners Institutional Review Board, the University of Alabama at Birmingham, and University College London. All study activities were conducted in accordance with the Declaration of Helsinki. All participants provided informed consent.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Awany, D., Claassen, H., Carstens, N. et al. Comparison of automated and manual mRNA enrichment to automated rRNA depletion for whole-blood RNA-sequencing. Sci Rep (2025). https://doi.org/10.1038/s41598-025-32961-4
-
DOI: https://doi.org/10.1038/s41598-025-32961-4
