Daniel Rawlinson, Chenxi Zhou, Myrsini Kaforou, Kim-Anh Lê Cao, Lachlan J M Coin
PLOS Digital Health 4(3): e0000780. Published 2025-03-26 (eCollection). DOI: https://doi.org/10.1371/journal.pdig.0000780. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11942414/pdf/
A flexible framework for minimal biomarker signature discovery from clinical omics studies without library size normalisation.
Application of transcriptomics, proteomics and metabolomics technologies to clinical cohorts has uncovered a variety of signatures for predicting disease. Many of these signatures require the full omics data for evaluation on unseen samples, either explicitly or implicitly through library size normalisation. Translation to low-cost point-of-care tests requires development of signatures which measure as few analytes as possible without relying on direct measurement of library size. To achieve this, we have developed a feature selection method (Forward Selection-Partial Least Squares, FS-PLS) which generates minimal disease signatures from high-dimensional omics datasets with applicability to continuous, binary or multi-class outcomes. Through extensive benchmarking, we show that FS-PLS has comparable performance to commonly used signature discovery methods while delivering signatures which are an order of magnitude smaller. We show that FS-PLS can be used to select features predictive of library size, and that these features can be used to normalise unseen samples, meaning that the features in the complete model can be measured in isolation for making new predictions. By enabling discovery of small, high-performance signatures, FS-PLS addresses an important impediment to the further development of precision medical care.
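The abstract does not spell out the FS-PLS algorithm, so the following is only a generic illustration of the greedy forward-selection idea behind small signatures, not the authors' implementation. It assumes a continuous outcome and a simple stagewise rule: at each step, pick the feature whose univariate fit to the current residual removes the most variance, then stop once the gain falls below a threshold. The function names `forward_select` and `make_toy_data`, the `min_gain` threshold and the synthetic data are all hypothetical.

```python
import random


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def forward_select(X, y, max_features=5, min_gain=0.01):
    """Greedy forward selection on residuals (illustrative, not FS-PLS).

    X is a list of samples, each a list of feature values; y is a list of
    continuous outcomes. Returns the selected feature indices in order.
    """
    n_features = len(X[0])
    cols = [center([row[j] for row in X]) for j in range(n_features)]
    resid = center(y)
    total_ss = sum(r * r for r in resid)
    selected = []
    if total_ss == 0:
        return selected  # constant outcome: nothing to explain
    while len(selected) < max_features:
        best = None  # (drop in residual sum of squares, index, slope)
        for j in range(n_features):
            if j in selected:
                continue
            sxx = sum(v * v for v in cols[j])
            if sxx == 0:
                continue  # constant feature carries no information
            sxy = sum(v * r for v, r in zip(cols[j], resid))
            drop = sxy * sxy / sxx  # variance removed by a univariate fit
            if best is None or drop > best[0]:
                best = (drop, j, sxy / sxx)
        # Assumed stopping rule: require at least min_gain of total variance.
        if best is None or best[0] / total_ss < min_gain:
            break
        _, j, slope = best
        selected.append(j)
        # Remove the chosen feature's contribution from the residual.
        resid = [r - slope * v for r, v in zip(resid, cols[j])]
    return selected


def make_toy_data(n_samples=300, n_features=20, seed=0):
    """Synthetic data in which only columns 3 and 7 drive the outcome."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    y = [2.0 * row[3] - 1.5 * row[7] + rng.gauss(0, 0.1) for row in X]
    return X, y


X, y = make_toy_data()
print(forward_select(X, y))
```

On this toy data the selection halts after the two informative columns, which mirrors the goal of a signature far smaller than the full feature set. The same routine could in principle be pointed at library size as the outcome, in the spirit of the normalisation step the abstract describes, though how FS-PLS does this in practice is not stated here.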