Wanxin Cui , Junjie You , Wenlong Jie , Zihao Li , Xiaoqing Peng
Methods, Volume 241, Pages 163-172. Published 2025-06-03. Journal Article. DOI: 10.1016/j.ymeth.2025.05.013. Impact factor: 4.3 (JCR Q1, Biochemical Research Methods). https://www.sciencedirect.com/science/article/pii/S1046202325001343
cfDNAFE: Comprehensively extracting multi-omics features of cell-free DNA for noninvasive diagnosis
Inferring the tissues of origin of circulating cell-free DNA (cfDNA) holds great promise for non-invasive cancer diagnosis, monitoring of allograft rejection, and prenatal testing. Many features for inferring cfDNA tissues of origin have been identified from different perspectives, including genetics, epigenetics, and fragmentomics, using whole-genome sequencing (WGS) and whole-genome bisulfite sequencing (WGBS) data of cfDNA. However, integrative toolkits for automatically extracting these features from the WGS and WGBS data of cfDNA samples are lacking. Here, we propose cfDNAFE, a comprehensive and easy-to-use Python package for extracting multi-omics features from aligned cfDNA sequencing data. It covers three categories of features (genetic, methylation, and fragmentation), comprising 13 types of feature profiles. The genetic features include substitution mutations, mutation signatures, and copy number variations. The methylation features are the proportions of methylated, unmethylated, and mixed-methylation fragments at cell-type-specific markers. The fragmentation features cover fragment sizes, end/breakpoint motifs, and nucleosome positions. To verify the functions of cfDNAFE, we analyze WGS/WGBS data of cfDNA samples using the feature profiles it extracts. A comparison between cfDNA samples from hepatocellular carcinoma (HCC) patients and healthy controls suggests that HCC cfDNA samples differ significantly in fragment-size-related features and breakpoint/end motif patterns, and show significantly higher orientation-aware cfDNA fragmentation (OCF) values in liver-specific open chromatin regions. In conclusion, cfDNAFE is the most comprehensive toolkit to date, covering the largest number of features reported in existing studies for inferring the tissues of origin of cfDNA. It will help researchers build machine learning models for auxiliary diagnosis based on these features.
Availability and implementation: https://github.com/Cuiwanxin1998/cfDNAFE.
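
To make two of the feature types named in the abstract more concrete, the following is a minimal Python sketch of how 5' end-motif frequencies and the proportions of methylated, unmethylated, and mixed fragments could be computed from already-parsed fragments. It is not cfDNAFE's API: the function names, the 4-mer motif length, the minimum CpG count, and the 0.25/0.75 thresholds are all illustrative assumptions, and real input would come from aligned WGS/WGBS reads rather than in-memory lists.

```python
from collections import Counter


def end_motif_frequencies(fragment_seqs, k=4):
    """Relative frequency of the first k bases at the 5' end of each fragment.

    fragment_seqs: iterable of fragment sequences (strings).
    k: motif length; 4 is an assumption made for this sketch.
    Fragments shorter than k or with non-ACGT bases in the motif are skipped.
    """
    counts = Counter()
    for seq in fragment_seqs:
        motif = seq[:k].upper()
        if len(motif) == k and set(motif) <= set("ACGT"):
            counts[motif] += 1
    total = sum(counts.values())
    return {m: n / total for m, n in counts.items()} if total else {}


def methylation_fragment_proportions(fragments, min_cpgs=3, low=0.25, high=0.75):
    """Proportions of unmethylated (U), mixed (X), and methylated (M) fragments.

    fragments: iterable of per-fragment CpG calls, each a list of booleans
               (True = methylated CpG) within one marker region.
    min_cpgs, low, high: hypothetical thresholds chosen for illustration.
    """
    labels = Counter()
    for states in fragments:
        if len(states) < min_cpgs:
            continue  # too few CpGs to classify this fragment
        frac = sum(states) / len(states)
        if frac >= high:
            labels["M"] += 1
        elif frac <= low:
            labels["U"] += 1
        else:
            labels["X"] += 1
    total = sum(labels.values())
    return {key: labels[key] / total for key in "UXM"} if total else {}


if __name__ == "__main__":
    motifs = end_motif_frequencies(["CCCAGT", "CCCATT", "ACGTAA", "NNNNAA", "AC"])
    print(motifs)
    props = methylation_fragment_proportions(
        [[True] * 4, [False] * 3, [True, False, True, False], [True]]
    )
    print(props)
```

Per-sample dictionaries like these could then be stacked into a feature matrix for a downstream classifier, which is the kind of use the abstract anticipates.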
About the journal:
Methods focuses on rapidly developing techniques in the experimental biological and medical sciences.
Each topical issue, organized by a guest editor who is an expert in the area covered, consists solely of invited quality articles by specialist authors, many of them reviews. Issues are devoted to specific technical approaches with emphasis on clear detailed descriptions of protocols that allow them to be reproduced easily. The background information provided enables researchers to understand the principles underlying the methods; other helpful sections include comparisons of alternative methods giving the advantages and disadvantages of particular methods, guidance on avoiding potential pitfalls, and suggestions for troubleshooting.