Houcong Liu, Loren Hansen, Changpu Song, Haijiu Lin, Dan Chen, Zhufang Chen, Hekai Zhou, Xiao Yang, Wenying Pan, Jihui Du
Scientific Reports, volume 15, issue 1, article 29397. Published 2025-08-11.
DOI: https://doi.org/10.1038/s41598-025-13074-4
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12339692/pdf/
Bioinformatic screen with clinical validation for the identification of novel stool based mRNA biomarkers for the detection of colorectal lesions including advanced adenoma.
Stool-based messenger RNA (mRNA) biomarkers are a promising approach for the diagnosis of colorectal cancer (CRC) and advanced adenoma (AA), but it is unclear which mRNA biomarkers have the most clinical utility. This study aims to partially fill this gap by first ranking genes according to their expression profiles in publicly available RNA-seq tissue datasets. Each gene was ranked on two criteria: differential expression observed across the majority of tumors, and the level of expression in tumor tissue. Genes that were both strongly differentially expressed across the majority of tumors and highly expressed received a higher ranking. The top 20 genes from this bioinformatic analysis of tumor and normal colon tissue were then tested on 114 clinical stool samples (CRC N = 33, AA N = 28, controls N = 53). Fourteen of the genes showed significant differential expression in the stool of CRC patients compared with controls (false discovery rate, FDR < 0.05). The Pearson correlation coefficient between tissue and stool expression was 0.57 (p-value = 0.007). In clinical stool samples, the 20 genes combined achieved an area under the receiver operating characteristic curve (AUC) of 0.94 for CRC detection (sensitivity 75.5%, specificity 95%) and an AUC of 0.83 for AA detection (sensitivity 55.8%, specificity 92.6%). Using existing public transcriptomic datasets to identify promising candidate genes can substantially reduce the cost and effort required to screen for clinically useful mRNA biomarkers.
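The abstract names two ranking criteria (differential expression across the majority of tumors, and expression level in tumor tissue) but does not give the scoring formula. The sketch below is one possible way to combine them, not the authors' method. The function name `rank_genes`, the score (fraction of tumors above a fold-change threshold multiplied by log2 median tumor expression), the threshold, the pseudocount, the assumption of paired tumor/normal samples, and the toy gene names and values are all illustrative assumptions.

```python
from math import log2
from statistics import median


def rank_genes(tumor, normal, min_log2_fc=1.0, pseudocount=1.0):
    """Rank genes from best to worst candidate.

    tumor, normal: dict mapping gene name -> list of expression values
    (e.g. TPM). Assumption: index i in both lists is the same patient.
    The scoring formula is hypothetical; the abstract does not specify one.
    """
    scores = {}
    for gene, tumor_values in tumor.items():
        normal_values = normal[gene]
        # Per-patient log2 fold change, tumor vs. matched normal tissue.
        log_fold_changes = [
            log2((t + pseudocount) / (n + pseudocount))
            for t, n in zip(tumor_values, normal_values)
        ]
        # Criterion 1: share of tumors showing strong up-regulation.
        fraction_up = sum(
            lfc >= min_log2_fc for lfc in log_fold_changes
        ) / len(log_fold_changes)
        # Criterion 2: typical expression level in tumor tissue.
        tumor_level = log2(median(tumor_values) + pseudocount)
        scores[gene] = fraction_up * tumor_level
    # Highest score first.
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (made up for illustration only).
tumor = {
    "GENE_A": [100, 120, 90, 110],  # consistently up, highly expressed
    "GENE_B": [10, 200, 8, 9],      # up in only one tumor
    "GENE_C": [5, 6, 5, 7],         # consistently up, lowly expressed
}
normal = {
    "GENE_A": [10, 12, 11, 9],
    "GENE_B": [9, 10, 9, 10],
    "GENE_C": [1, 1, 1, 1],
}
ranking = rank_genes(tumor, normal)
print(ranking)
```

Multiplying the two criteria means a gene must do well on both to rank highly: a gene that is strongly up-regulated in only a few tumors, or consistently up-regulated but barely expressed, falls down the list. Other combinations, such as a weighted sum of ranks, would be equally defensible under the abstract's description.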