{"title":"INDELpred: Improving the prediction and interpretation of indel pathogenicity within the clinical genome.","authors":"Yilin Wei, Tongda Zhang, Bangyao Wang, Xiaosen Jiang, Fei Ling, Mingyan Fang, Xin Jin, Yong Bai","doi":"10.1016/j.xhgg.2024.100325","DOIUrl":null,"url":null,"abstract":"<p><p>Small insertions and deletions (indels) are critical yet challenging genetic variations with significant clinical implications. However, the identification of pathogenic indels from neutral variants in clinical contexts remains an understudied problem. Here, we developed INDELpred, a machine-learning-based predictive model for discerning pathogenic from benign indels. INDELpred was established based on key features, including allele frequency, indel length, function-based features, and gene-based features. A set of comprehensive evaluation analyses demonstrated that INDELpred exhibited superior performance over competing methods in terms of computational efficiency and prediction accuracy. Importantly, INDELpred highlighted the crucial role of function-based features in identifying pathogenic indels, with a clear interpretability of the features in understanding the disease-causing variants. We envisage INDELpred as a desirable tool for the detection of pathogenic indels within large-scale genomic datasets, thereby enhancing the precision of genetic diagnoses in clinical settings.</p>","PeriodicalId":34530,"journal":{"name":"HGG Advances","volume":null,"pages":null},"PeriodicalIF":3.3000,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11321314/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"HGG Advances","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1016/j.xhgg.2024.100325","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/7/10 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"GENETICS & HEREDITY","Score":null,"Total":0}
引用次数: 0
Abstract
Small insertions and deletions (indels) are critical yet challenging genetic variations with significant clinical implications. However, the identification of pathogenic indels from neutral variants in clinical contexts remains an understudied problem. Here, we developed INDELpred, a machine-learning-based predictive model for discerning pathogenic from benign indels. INDELpred was established based on key features, including allele frequency, indel length, function-based features, and gene-based features. A set of comprehensive evaluation analyses demonstrated that INDELpred exhibited superior performance over competing methods in terms of computational efficiency and prediction accuracy. Importantly, INDELpred highlighted the crucial role of function-based features in identifying pathogenic indels, with a clear interpretability of the features in understanding the disease-causing variants. We envisage INDELpred as a desirable tool for the detection of pathogenic indels within large-scale genomic datasets, thereby enhancing the precision of genetic diagnoses in clinical settings.