Elmira Shajari, David Gagné, Francis Bourassa, Mandy Malick, Patricia Roy, Jean-François Noël, Hugo Gagnon, Maxime Delisle, François-Michel Boisvert, Marie Brunet, Jean-François Beaulieu
Clinical and Translational Gastroenterology. Journal Article. Publication date: 2025-10-02. DOI: 10.14309/ctg.0000000000000925 (https://doi.org/10.14309/ctg.0000000000000925)
Stool-Based Proteomic Signature for the Non-Invasive Classification of Crohn's Disease and Ulcerative Colitis Using Machine Learning.
Introduction: Crohn's disease and ulcerative colitis have overlapping symptoms, but they differ in pathology and treatment. Currently, distinguishing between these diseases involves invasive procedures such as colonoscopy and histopathology. Fecal proteins, stable and in direct contact with inflammation, offer a non-invasive alternative. This study focuses on using high-throughput data-independent acquisition mass spectrometry and machine learning to develop an accurate biomarker signature from complex stool samples.
Methods: Stool samples obtained from 69 active patients were analyzed. Analysis of the stool proteome led to the identification and quantification of approximately 1,250 proteins. The samples were divided into training and testing groups. After data processing, various feature selection algorithms were applied on the training group to determine proteins that were significantly different between the Crohn's disease and ulcerative colitis groups. Additionally, six machine learning algorithms were evaluated to identify the best-performing classifiers.
Results: Sixteen proteins were selected based on several feature selection algorithms and six models were trained based on them. According to the performance metrics of each algorithm on the training dataset, the Naïve Bayes model was selected. For performance validation, the final predictive model was applied to 16 blind prospective samples as the test dataset. Notably, the model achieved an AUC of 0.96 on both the training and test datasets, highlighting its robustness and stability.
Discussion: This study demonstrates the potential of combining multiple stool protein biomarkers via high-throughput data-independent acquisition mass spectrometry and machine learning tools to develop a predictive model for efficiently distinguishing Crohn's disease from ulcerative colitis.
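The classification and validation steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state which Naïve Bayes variant, software, or preprocessing was used, so a Gaussian Naïve Bayes written in plain Python is assumed here. The data are synthetic random numbers, the sample counts and class balance are arbitrary placeholders, the feature selection step is omitted, and all function names (fit_gaussian_nb, predict_score, roc_auc, make_synthetic) are hypothetical. Only the count of 16 features echoes the abstract.

```python
import math
import random


def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Population variance with a small floor to avoid division by zero.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def log_joint(model, x, label):
    """log P(label) + sum of log Gaussian densities (features assumed independent)."""
    prior, means, variances = model[label]
    total = math.log(prior)
    for v, m, s2 in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
    return total


def predict_score(model, x):
    """Log-odds of class 1 versus class 0; higher means more like class 1."""
    return log_joint(model, x, 1) - log_joint(model, x, 0)


def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic(n_per_class, n_features, shift, rng):
    """Synthetic stand-in for protein intensities: class 1 is shifted by `shift`."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift * label, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


rng = random.Random(0)
# Placeholder sizes; the 16 features mirror the sixteen selected proteins.
X_train, y_train = make_synthetic(26, 16, 0.8, rng)
X_test, y_test = make_synthetic(8, 16, 0.8, rng)

model = fit_gaussian_nb(X_train, y_train)
train_auc = roc_auc([predict_score(model, x) for x in X_train], y_train)
test_auc = roc_auc([predict_score(model, x) for x in X_test], y_test)
print(f"train AUC = {train_auc:.2f}, test AUC = {test_auc:.2f}")
```

Fitting on the training split only and scoring a separate held-out split mirrors the blind validation described in the Results. The printed values reflect the synthetic data and say nothing about the study's reported AUC.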
About the journal:
Clinical and Translational Gastroenterology (CTG), published on behalf of the American College of Gastroenterology (ACG), is a peer-reviewed open access online journal dedicated to innovative clinical work in the field of gastroenterology and hepatology. CTG hopes to fulfill an unmet need for clinicians and scientists by welcoming novel cohort studies, early-phase clinical trials, qualitative and quantitative epidemiologic research, hypothesis-generating research, studies of novel mechanisms and methodologies including public health interventions, and integration of approaches across organs and disciplines. CTG also welcomes hypothesis-generating small studies, methods papers, and translational research with clear applications to human physiology or disease.
Colon and small bowel
Endoscopy and novel diagnostics
Esophagus
Functional GI disorders
Immunology of the GI tract
Microbiology of the GI tract
Inflammatory bowel disease
Pancreas and biliary tract
Liver
Pathology
Pediatrics
Preventative medicine
Nutrition/obesity
Stomach.