Kleio-Maria Verrou, Nikolaos I Vlachogiannis, Argyrios N Theofilopoulos, Maria Tektonidou, Georgios Kollias, Christoforos Nikolaou, Petros P Sfikakis
Med, article 100840. Published 2025-08-29. DOI: 10.1016/j.medj.2025.100840 (https://doi.org/10.1016/j.medj.2025.100840)
Machine learning-based identification of a transcriptomic blood signature discriminating between systemic autoimmunity and infection.
Background: Pathogenic responses against self and foreign antigens in systemic autoimmunity and infection, respectively, engage similar immunologic components, so diagnostic biomarkers that distinguish the two are lacking. Herein, we tested whether whole-blood transcriptome analysis discriminates autoimmune from infectious diseases.
Methods: We applied nested cross-validation methodology to tune and validate random forests, k-nearest neighbors, and support vector machines, using a new preprocessing method on 22 publicly available datasets, including 594 patients with a broad spectrum of systemic autoimmune diseases and 615 patients with diverse viral, bacterial, and parasitic infections.
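The nested cross-validation idea named in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses only a k-nearest-neighbors classifier (one of the three model families mentioned), the Python standard library, and hypothetical choices for the fold counts, the grid of k values, and the toy two-cluster data. The inner loop picks the hyperparameter using outer-training samples only, and the outer loop scores the tuned model on samples that played no part in tuning.

```python
import random

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)

def accuracy(train_idx, test_idx, X, y, k):
    """Fraction of test samples classified correctly by k-NN fit on train_idx."""
    tX = [X[i] for i in train_idx]
    ty = [y[i] for i in train_idx]
    hits = sum(knn_predict(tX, ty, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)

def k_folds(indices, n_folds):
    """Yield (train, test) index lists; fold f takes every n_folds-th index."""
    for f in range(n_folds):
        test = indices[f::n_folds]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test

def nested_cv(X, y, k_grid=(1, 3, 5), outer=5, inner=3, seed=0):
    """Return the mean outer-fold accuracy; k_grid, outer, inner are illustrative."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    outer_scores = []
    for outer_train, outer_test in k_folds(indices, outer):
        # Inner loop: choose k using the outer-training samples only.
        def inner_score(k):
            scores = [accuracy(tr, va, X, y, k)
                      for tr, va in k_folds(outer_train, inner)]
            return sum(scores) / len(scores)
        best_k = max(k_grid, key=inner_score)
        # Outer loop: score the tuned model on samples never used for tuning.
        outer_scores.append(accuracy(outer_train, outer_test, X, y, best_k))
    return sum(outer_scores) / len(outer_scores)

# Hypothetical toy data: two well-separated classes of 30 samples each.
rng = random.Random(1)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)]
X += [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(30)]
y = [0] * 30 + [1] * 30
score = nested_cv(X, y)
```

Because tuning never sees the outer test fold, the returned score is a less optimistic estimate than one obtained by tuning and scoring on the same folds.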
Findings: Our preprocessing method tackled RNA sequencing batch effects by sorting the genes within each sample according to individual relative expression values and discriminated between the corresponding pathologies with 98% accuracy versus 63% when using raw values. This model was further tested in external datasets comprising various autoimmune diseases and infections new to its training process, yielding accuracies ranging between 80% and 96%. Enrichment analyses of 457 of the most informative genes identified SAP1, ELF1/4, and FLI1 transcription factors among the significant upstream regulators and revealed several key processes and pathways, such as autophagy, DNA damage response, and NOTCH signaling. A subset of 24 genes, including the inflammation-related genes RPL7, TLK2, and ANK2, distinguished between autoimmune and infectious diseases with 89% accuracy.
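The abstract describes the preprocessing only as sorting genes within each sample by relative expression, so the following is a generic within-sample rank transform offered as an assumption, not the authors' algorithm. The function name, the scaling to [0, 1], the average-rank handling of ties, and the two example "batches" are all hypothetical. The sketch shows why ranks can blunt batch effects: any monotone distortion of a sample's values leaves its ranks unchanged.

```python
def rank_transform(sample):
    """Replace each expression value with its within-sample rank scaled to [0, 1].

    Tied values receive the average of the ranks they span.
    """
    n = len(sample)
    order = sorted(range(n), key=lambda i: sample[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < n and sample[order[j + 1]] == sample[order[i]]:
            j += 1
        avg = (i + j) / 2.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    denom = max(n - 1, 1)  # avoid division by zero for one-gene samples
    return [r / denom for r in ranks]

# Hypothetical example: the same four genes measured in two batches, where
# batch B is a monotone (scale-and-shift) distortion of batch A.
batch_a = [5.0, 120.0, 33.0, 0.5]
batch_b = [2 * x + 10 for x in batch_a]

ranks_a = rank_transform(batch_a)
ranks_b = rank_transform(batch_b)
```

Here `ranks_a` and `ranks_b` are identical even though the raw values differ, which is the property a rank-based representation relies on when datasets come from different sequencing runs.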
Conclusions: Using a novel batch-correction algorithm, this analysis may provide a new mechanistic understanding of the pathogenic autoimmune response, as well as biomarkers for differential diagnoses of the corresponding pathologies in patients presenting with inflammatory disorders.
Funding: This work was funded by the European Regional Development Fund NSRF 2014-2020, no. MIS5002802.
About the journal:
Med is a flagship medical journal published monthly by Cell Press, the global publisher of trusted and authoritative science journals including Cell, Cancer Cell, and Cell Reports Medicine. Our mission is to advance clinical research and practice by providing a communication forum for the publication of clinical trial results, innovative observations from longitudinal cohorts, and pioneering discoveries about disease mechanisms. The journal also encourages thought-leadership discussions among biomedical researchers, physicians, and other health scientists and stakeholders. Our goal is to improve health worldwide sustainably and ethically.
Med publishes rigorously vetted original research and cutting-edge review and perspective articles on critical health issues globally and regionally. Our research section covers clinical case reports, first-in-human studies, large-scale clinical trials, population-based studies, as well as translational research work with the potential to change the course of medical research and improve clinical practice.