A standardized framework for robust fragmentomic feature extraction from cell-free DNA sequencing data
Haichao Wang, Paulius D. Mennea, Yu Kiu Elkie Chan, Zhao Cheng, Maria C. Neofytou, Arif Anwer Surani, Aadhitthya Vijayaraghavan, Emma-Jane Ditter, Richard Bowers, Matthew D. Eldridge, Dmitry S. Shcherbo, Christopher G. Smith, Florian Markowetz, Wendy N. Cooper, Tommy Kaplan, Nitzan Rosenfeld, Hui Zhao
Genome Biology, published 2025-05-23
DOI: 10.1186/s13059-025-03607-5 (https://doi.org/10.1186/s13059-025-03607-5)
Citation count: 0
Abstract
Fragmentomics features of cell-free DNA are promising non-invasive biomarkers for cancer diagnosis, but a lack of systematic evaluation of biases in feature quantification hinders their adoption. We compare features derived from whole-genome sequencing of ten healthy donors using nine library kits and ten data-processing routes, and validate the findings in 1182 plasma samples from published studies. Our results clarify the variation introduced by library preparation and by feature quantification methods. We design the Trim Align Pipeline and the cfDNAPro R package as unified interfaces for data pre-processing, feature extraction, and visualization, to standardize multi-modal feature engineering and integration for machine learning.
Genome Biology
Subject area: Biochemistry, Genetics and Molecular Biology (Genetics)
CiteScore: 21.00
Self-citation rate: 3.30%
Articles published: 241
Review time: 2 months
About the journal:
Genome Biology stands as a premier platform for exceptional research across all domains of biology and biomedicine, explored through a genomic and post-genomic lens.
With an impact factor of 12.3 (2022), the journal is ranked by Thomson Reuters as the 3rd research journal in the Genetics and Heredity category and the 2nd research journal in the Biotechnology and Applied Microbiology category. Genome Biology is also the highest-ranked open-access journal in the latter category.
Our dedicated team of highly trained in-house Editors collaborates closely with our Editorial Board of international experts, ensuring the journal remains at the forefront of scientific advances and community standards. Regular engagement with researchers at conferences and institute visits underscores our commitment to staying abreast of the latest developments in the field.