Methods: Latest Articles

dsRNAPredictor-II: An improved predictor of identifying dsRNA and its silencing efficiency for Tribolium castaneum based on sequence length distribution
IF 4.2, Zone 3 (Biology)
Methods, Pub Date: 2024-11-09, DOI: 10.1016/j.ymeth.2024.11.007
Liping Xu, Jia Zheng, Yetong Zhou, Cangzhi Jia
Abstract: RNA interference (RNAi) has been widely used to investigate gene function and has significant potential for the control of pest insects. However, recent studies have revealed that the target insect species, dsRNA molecule length, target genes, and other experimental factors can affect the efficiency of RNAi-mediated control, restricting the further development and application of this technology. Therefore, the aim of this study was to establish a deep learning model using bioinformatics to help researchers identify dsRNA fragments with the highest RNAi efficiency. In this study, we optimized an existing model, dsRNAPredictor, by designing sub-models based on different sequence lengths. Accordingly, the data were divided into two groups: 130–399 bp and 400–616 bp long sequences. Then, one-hot encoding was employed to extract sequence information. A convolutional neural network comprising three convolutional layers, three average pooling layers, a flatten layer, and three dense layers was employed as the classifier. By adjusting the parameters, we established two sub-models for the two sequence length distributions. Using multiple independent test datasets and hypothesis testing, we demonstrated that our model exhibits superior performance and stronger robustness compared with dsRNAPredictor. Therefore, our model may help design and pre-screen dsRNAs and facilitate further research and applications.
Methods, vol. 232, pp. 129-138.
Citations: 0
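The preprocessing named in the dsRNAPredictor-II abstract above (grouping sequences into 130–399 bp and 400–616 bp, then one-hot encoding them) can be illustrated with a minimal Python sketch. The function names, the A/C/G/T column order, the treatment of U as T, and the all-zero row for ambiguous bases are assumptions made for illustration; the abstract does not state these details.

```python
from typing import List, Optional

# Assumed column order; the abstract does not state the encoding order.
BASES = "ACGT"


def one_hot(seq: str) -> List[List[int]]:
    """Encode a nucleotide sequence as one 4-element row per base."""
    rows = []
    for base in seq.upper().replace("U", "T"):
        row = [0, 0, 0, 0]
        if base in BASES:  # ambiguous bases such as N stay all-zero (assumption)
            row[BASES.index(base)] = 1
        rows.append(row)
    return rows


def pick_submodel(length_bp: int) -> Optional[str]:
    """Route a sequence to a length-specific sub-model (hypothetical labels)."""
    if 130 <= length_bp <= 399:
        return "short"
    if 400 <= length_bp <= 616:
        return "long"
    return None  # outside the two length ranges named in the abstract
```

In a full pipeline the resulting matrix would presumably be padded to a fixed length before being passed to the convolutional layers; that step is omitted here because the abstract does not describe it.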
Prediction of YY1 loop anchor based on multi-omics features
IF 4.2, Zone 3 (Biology)
Methods, Pub Date: 2024-11-07, DOI: 10.1016/j.ymeth.2024.11.004
Jun Ren, Zhiling Guo, Yixuan Qi, Zheng Zhang, Li Liu
Abstract: The three-dimensional structure of chromatin is crucial for the regulation of gene expression. YY1 promotes enhancer-promoter interactions in a manner analogous to CTCF-mediated chromatin interactions. However, little is known about which YY1 binding sites can form loop anchors. In this study, the LightGBM model was used to predict YY1-loop anchors by integrating multi-omics data. Because of the large imbalance between the numbers of positive and negative samples, we used the area under the precision-recall curve (AUPRC) to assess classifier quality. The results show that the LightGBM model exhibits strong predictive performance (AUPRC ≥ 0.93). To verify the robustness of the model, the dataset was divided into training and test sets at a 4:1 ratio. The results show that the model performs well for YY1-loop anchor prediction on both the training and independent test sets. Additionally, we ranked the importance of the features and found that the formation of YY1-loop anchors is primarily influenced by the co-binding of transcription factors CTCF, SMC3, and RAD21, as well as histone modifications and sequence context.
Methods, vol. 232, pp. 96-106.
Citations: 0
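The YY1 abstract above chooses AUPRC because the classes are heavily imbalanced. A small sketch, assuming average precision as the estimator of AUPRC (the abstract does not say which estimator the authors used), shows the calculation and the baseline a random ranking would approach. Function names are illustrative.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values measured at the rank of each positive sample."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive sample is required")
    # Sort by score, highest first; tied scores keep input order (a simplification).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` predictions
    return total / n_pos


def prevalence(labels: Sequence[int]) -> float:
    """Fraction of positives: roughly what a random ranking scores on AUPRC."""
    return sum(labels) / len(labels)
```

Comparing a reported AUPRC against the prevalence of positives, rather than against 0.5 as one would for AUROC, is what makes the metric informative when negatives dominate.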
HistoSPACE: Histology-inspired spatial transcriptome prediction and characterization engine
IF 4.2, Zone 3 (Biology)
Methods, Pub Date: 2024-11-07, DOI: 10.1016/j.ymeth.2024.11.002
Shivam Kumar, Samrat Chatterjee
Abstract: Spatial transcriptomics (ST) enables the visualization of gene expression within the context of tissue morphology. This emerging discipline has the potential to serve as a foundation for developing tools to design precision medicines. However, because of the higher costs and expertise required for such experiments, its translation into regular clinical practice might be challenging. Although modern deep learning has been applied to enhance the information obtained from histological images, these efforts have been constrained by limited diversity of information. In this paper, we developed a model, HistoSPACE, that explores the diversity of histological images available with ST data to extract molecular insights from tissue images. Further, our approach allows us to link the predicted expression with disease pathology. We built an image encoder derived from a universal image autoencoder. This image encoder was connected to convolution blocks to build the final model, which was then fine-tuned with ST data. The model has a small number of parameters and requires less system memory and less training time, making it lightweight in comparison to traditional histological models. Our model demonstrates significant efficiency compared to contemporary algorithms, reaching a correlation of 0.56 in leave-one-out cross-validation. Finally, its robustness was validated on an independent dataset, where its predictions were consistent with predefined disease pathology. Our code is available at https://github.com/samrat-lab/HistoSPACE.
Methods, vol. 232, pp. 107-114.
Citations: 0
Development and validation of a machine learning model for predicting drug-drug interactions with oral diabetes medications
IF 4.2, Zone 3 (Biology)
Methods, Pub Date: 2024-11-01, DOI: 10.1016/j.ymeth.2024.10.012
Quang-Hien Kha, Ngan Thi Kim Nguyen, Nguyen Quoc Khanh Le, Jiunn-Horng Kang
Abstract: Diabetes management is often complicated by comorbidities, requiring complex medication regimens that increase the risk of drug-drug interactions (DDIs), potentially compromising treatment outcomes or causing toxicity. Although machine learning (ML) models have made strides in DDI prediction, existing approaches lack specificity for oral diabetes medications and face challenges in interpretability. To address these limitations, we propose a novel ML-based framework utilizing the Simplified Molecular Input Line Entry System (SMILES) to encode structural information of oral diabetes drugs. Using this representation, we developed an XGBoost model, selecting molecular features through LASSO. Our dataset, sourced from DrugBank, included 42 oral diabetes drugs and 1,884 interacting drugs, divided into training, validation, and testing sets. The model identified 606 optimal features, achieving an F1-score of 0.8182. SHAP analysis was employed for feature interpretation, enhancing model transparency and clinical relevance. By predicting adverse DDIs, our model offers a valuable tool for clinical decision-making, aiding safer prescription practices. The 606 critical features provide insights into atomic-level interactions, linking computational predictions with biological experiments. We present a classification model specifically designed for predicting DDIs associated with oral diabetes medications, with an openly accessible web application to support diabetes management in multi-drug regimens and comorbidity settings.
Methods, vol. 232, pp. 81-88.
Citations: 0
Development of novel digital PCR assays for the rapid quantification of Gram-negative bacteria biomarkers using RUCS algorithm
IF 4.2, Zone 3 (Biology)
Methods, Pub Date: 2024-10-30, DOI: 10.1016/j.ymeth.2024.10.011
Alexandra Bogožalec Košir, Špela Alič, Viktorija Tomič, Dane Lužnik, Tanja Dreo, Mojca Milavec
Abstract: Rapid and accurate identification of bacterial pathogens is crucial for effective treatment and infection control, particularly in hospital settings. Conventional methods like culture techniques and MALDI-TOF mass spectrometry are often time-consuming and less sensitive. This study addresses the need for faster and more precise diagnostic methods by developing novel digital PCR (dPCR) assays for the rapid quantification of biomarkers from three Gram-negative bacteria: Acinetobacter baumannii, Klebsiella pneumoniae, and Pseudomonas aeruginosa.
Utilizing publicly available genomes and the "rapid identification of PCR primers for unique core sequences" (RUCS) algorithm, we designed highly specific dPCR assays. These assays were validated using synthetic DNA, bacterial genomic DNA, and DNA extracted from clinical samples. The developed dPCR methods demonstrated wide linearity, a low limit of detection (∼30 copies per reaction), and robust analytical performance with measurement uncertainty below 25%. The assays showed high repeatability and intermediate precision, with no cross-reactivity observed. Comparison with MALDI-TOF mass spectrometry revealed substantial concordance, highlighting the methods' suitability for clinical diagnostics.
This study underscores the potential of dPCR for rapid and precise quantification of Gram-negative bacterial biomarkers. The developed methods offer significant improvements over existing techniques, providing faster, more accurate, and SI-traceable measurements. These advancements could enhance clinical diagnostics and infection control practices.
Methods, vol. 232, pp. 72-80.
Citations: 0
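The dPCR abstract above reports results in copies per reaction but does not describe how counts of positive partitions are converted into copies. Digital PCR conventionally applies a Poisson correction for partitions that received more than one target molecule; the sketch below shows that standard calculation. The function name and any partition counts passed to it are hypothetical, not values from the study.

```python
import math


def copies_per_reaction(positive_partitions: int, total_partitions: int) -> float:
    """Poisson-corrected estimate of target copies across all partitions."""
    if total_partitions <= 0:
        raise ValueError("total_partitions must be positive")
    if not 0 <= positive_partitions < total_partitions:
        # A fully saturated reaction (all partitions positive) cannot be quantified.
        raise ValueError("positive_partitions must be in [0, total_partitions)")
    fraction_positive = positive_partitions / total_partitions
    # Mean copies per partition under a Poisson model: lambda = -ln(1 - p)
    mean_copies_per_partition = -math.log(1.0 - fraction_positive)
    return mean_copies_per_partition * total_partitions
```

At low occupancy the corrected estimate stays close to the raw count of positive partitions; the correction matters more as the fraction of positive partitions grows.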
MLFA-UNet: A multi-level feature assembly UNet for medical image segmentation
IF 4.2, Zone 3 (Biology)
Methods, Pub Date: 2024-10-29, DOI: 10.1016/j.ymeth.2024.10.010
Anass Garbaz, Yassine Oukdach, Said Charfi, Mohamed El Ansari, Lahcen Koutti, Mouna Salihoun
Abstract: Medical image segmentation is crucial for accurate diagnosis and treatment in medical image analysis. Among the various methods employed, fully convolutional networks (FCNs) have emerged as a prominent approach for segmenting medical images. Notably, the U-Net architecture and its variants have gained widespread adoption in this domain. This paper introduces MLFA-UNet, an architectural framework aimed at advancing medical image segmentation. MLFA-UNet adopts a U-shaped architecture and integrates two pivotal modules: multi-level feature assembly (MLFA) and multi-scale information attention (MSIA), complemented by a pixel-vanishing (PV) attention mechanism. Together, these modules improve both the robustness and the precision of segmentation. MLFA operates within both the network encoder and decoder, facilitating the extraction of local information crucial for accurately segmenting lesions. Furthermore, the bottleneck MSIA module replaces stacking modules, thereby expanding the receptive field and augmenting feature diversity, fortified by the PV attention mechanism. These integrated mechanisms boost segmentation performance by capturing both detailed local features and a broader range of contextual information, enhancing both accuracy and resilience in identifying lesions. To assess the versatility of the network, we evaluated MLFA-UNet across a range of medical image segmentation datasets, encompassing diverse imaging modalities such as wireless capsule endoscopy (WCE), colonoscopy, and dermoscopic images. Our results consistently demonstrate that MLFA-UNet outperforms state-of-the-art algorithms, achieving Dice coefficients of 91.42%, 82.43%, 90.8%, and 88.68% for the MICCAI 2017 (Red Lesion), ISIC 2017, PH2, and CVC-ClinicalDB datasets, respectively.
Methods, vol. 232, pp. 52-64.
Citations: 0
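The Dice coefficient reported in the MLFA-UNet abstract above measures the overlap between a predicted mask and the ground-truth mask: twice the intersection divided by the sum of the two mask sizes. A minimal sketch for binary masks stored as nested lists follows; the function name and the convention of returning 1.0 when both masks are empty are assumptions, not details from the paper.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks."""
    intersection = 0
    pred_sum = 0
    truth_sum = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            pred_sum += p
            truth_sum += t
            intersection += p * t  # 1 only where both masks mark the pixel
    denominator = pred_sum + truth_sum
    if denominator == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed convention)
    return 2.0 * intersection / denominator
```

A benchmark figure such as 91.42% would typically be this value averaged over the test images and multiplied by 100, although the abstract does not spell out the averaging scheme.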
Enhancing Arabidopsis thaliana ubiquitination site prediction through knowledge distillation and natural language processing
IF 4.2, Zone 3 (Biology)
Methods, Pub Date: 2024-10-22, DOI: 10.1016/j.ymeth.2024.10.006
Van-Nui Nguyen, Thi-Xuan Tran, Thi-Tuyen Nguyen, Nguyen Quoc Khanh Le
Abstract: Protein ubiquitination is a critical post-translational modification (PTM) involved in diverse biological processes and plays a pivotal role in regulating physiological mechanisms and disease states. Despite various efforts to develop ubiquitination site prediction tools across species, these tools mainly rely on predefined sequence features and machine learning algorithms, with species-specific variations in ubiquitination patterns remaining poorly understood. This study introduces a novel approach for predicting Arabidopsis thaliana ubiquitination sites using a neural network model based on knowledge distillation and natural language processing (NLP) of protein sequences. Our framework employs a multi-species "Teacher model" to guide a more compact, species-specific "Student model", with the "Teacher" generating pseudo-labels that enhance the "Student" learning and prediction robustness. Cross-validation results demonstrate that our model achieves superior performance, with an accuracy of 86.3% and an area under the curve (AUC) of 0.926, while independent testing confirmed these results with an accuracy of 86.3% and an AUC of 0.923. Comparative analysis with established predictors further highlights the model's superiority, emphasizing the effectiveness of integrating knowledge distillation and NLP in ubiquitination prediction tasks. This study presents a promising and efficient approach for ubiquitination site prediction, offering valuable insights for researchers in related fields. The code and resources are available on GitHub: https://github.com/nuinvtnu/KD_ArapUbi.
Methods, vol. 232, pp. 65-71.
Citations: 0
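The ubiquitination abstract above says a multi-species Teacher model generates pseudo-labels for a species-specific Student model, without giving the rule. One common way to do this is to keep only the sites for which the teacher is confident; the sketch below illustrates that idea. The confidence threshold, the function name, and the thresholding rule itself are assumptions for illustration and may differ from what the authors implemented.

```python
from typing import List, Sequence, Tuple


def make_pseudo_labels(
    teacher_probs: Sequence[float], threshold: float = 0.9
) -> List[Tuple[int, int]]:
    """Return (sample_index, label) pairs where the teacher is confident.

    teacher_probs holds the teacher's predicted probability that each
    unlabeled site is ubiquitinated. The 0.9 default is a hypothetical value.
    """
    if not 0.5 < threshold <= 1.0:
        raise ValueError("threshold must be in (0.5, 1.0]")
    pseudo = []
    for index, prob in enumerate(teacher_probs):
        if prob >= threshold:
            pseudo.append((index, 1))  # confident positive
        elif prob <= 1.0 - threshold:
            pseudo.append((index, 0))  # confident negative
        # otherwise the sample is left unlabeled
    return pseudo
```

The student would then be trained on the original labeled sites together with these pseudo-labeled ones; an alternative design passes the teacher's soft probabilities to the student directly instead of thresholding them.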
Development and validation of a new and rapid molecular diagnostic tool based on RT-LAMP for Hepatitis C virus detection at point-of-care
IF 4.2, Zone 3 (Biology)
Methods, Pub Date: 2024-10-22, DOI: 10.1016/j.ymeth.2024.10.008
Sonia Arca-Lafuente, Cristina Yépez-Notario, Pablo Cea-Callejo, Violeta Lara-Aguilar, Celia Crespo-Bermejo, Luz Martín-Carbonero, Ignacio de los Santos, Verónica Briz, Ricardo Madrid
Abstract:
Purpose: Globally, it is estimated that 1.0 million individuals are newly infected by Hepatitis C virus (HCV) every year, and nearly 50 million people live with a chronic infection, according to the World Health Organization. To overcome underdiagnosis of HCV infection among hard-to-reach populations, it is essential to develop new rapid and easy-to-use molecular diagnostic systems. In this work, we have developed a pangenotypic diagnostic tool based on Loop-Mediated Isothermal Amplification (LAMP), coupled to a direct sample lysis procedure for molecular detection of HCV at point-of-care (POC).
Methods: Procedure validation was performed using 129 different samples from HCV-infected patients (116 serum samples and 13 fresh blood samples), 27 individuals who tested negative for HCV but positive for HIV, and 11 healthy donors. Serum was collected, lysed for 10 min at room temperature, and assayed by RT-LAMP. To achieve this, a set of 9 LAMP primers was used for the first time. Parallel RT-qPCR assays were conducted for HCV to both validate the procedure and quantify viral loads.
Results: HCV was detected by RT-LAMP in 109/116 HCV-positive serum samples and in 11/13 positive blood samples in less than 40 min. Compared to RT-qPCR results, our RT-LAMP procedure showed a sensitivity of 94%, 100% specificity, and a limit of detection of 3.26 log10 IU/mL (10–20 copies per reaction).
Conclusions: We have developed an accurate system that is more affordable than the currently available rapid tests for HCV. Since no prior RNA purification step from capillary blood is required, we strongly recommend our RT-LAMP system as a valuable and rapid tool for the molecular detection of HCV at POC.
Methods, vol. 232, pp. 43-51.
Citations: 0
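The headline figures in the RT-LAMP abstract above can be checked with simple arithmetic. This sketch assumes that the 94% sensitivity refers to the 116 serum samples (109 detected), and that all 38 HCV-negative samples (27 HIV-positive individuals plus 11 healthy donors) tested negative, which is what 100% specificity implies. Neither breakdown is stated explicitly in the abstract.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of truly positive samples that the assay detects."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of truly negative samples that the assay reports as negative."""
    return true_negatives / (true_negatives + false_positives)


# Serum samples: 109 of 116 RT-qPCR-positive samples were detected by RT-LAMP.
serum_sensitivity = sensitivity(true_positives=109, false_negatives=116 - 109)

# Assumed: none of the 27 + 11 HCV-negative samples gave a positive signal.
assumed_specificity = specificity(true_negatives=27 + 11, false_positives=0)

# The limit of detection is reported on a log10 scale; convert it to IU/mL.
lod_iu_per_ml = 10 ** 3.26

print(f"sensitivity (serum): {serum_sensitivity:.1%}")
print(f"specificity (assumed counts): {assumed_specificity:.0%}")
print(f"limit of detection: {lod_iu_per_ml:.0f} IU/mL")
```

On these assumptions the serum counts round to the reported 94%, and a limit of detection of 3.26 log10 IU/mL corresponds to roughly 1,800 IU/mL.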
HLA-DR4Pred2: An improved method for predicting HLA-DRB1*04:01 binders
IF 4.2, Zone 3 (Biology)
Methods, Pub Date: 2024-10-19, DOI: 10.1016/j.ymeth.2024.10.007
Sumeet Patiyal, Anjali Dhall, Nishant Kumar, Gajendra P.S. Raghava
Abstract: HLA-DRB1*04:01 is associated with numerous diseases, including sclerosis, arthritis, diabetes, and COVID-19, emphasizing the need to scan for binders in antigens to develop immunotherapies and vaccines. Current prediction methods are often limited by their reliance on small datasets. This study presents HLA-DR4Pred2, developed on a large dataset containing 12,676 binders and an equal number of non-binders. It is an improved version of HLA-DR4Pred, which was trained on a small dataset containing 576 binders and an equal number of non-binders. All models were trained, optimized, and tested on 80% of the data using five-fold cross-validation and evaluated on the remaining 20%. A range of machine learning techniques was employed, achieving maximum AUROC values of 0.90 and 0.87 using composition and binary profile features, respectively. The AUROC of the composition-based model increased to 0.93 when combined with BLAST search. Additionally, models developed on the realistic dataset containing 12,676 binders and 86,300 non-binders achieved a maximum AUROC of 0.99. On the independent dataset, our best model outperformed existing methods. Finally, we developed a standalone tool and a webserver for HLA-DR4Pred2, enabling the prediction, design, and virtual scanning of HLA-DRB1*04:01 binding peptides, and we also released a Python package available on the Python Package Index (https://webs.iiitd.edu.in/raghava/hladr4pred2/; https://github.com/raghavagps/hladr4pred2; https://pypi.org/project/hladr4pred2/).
Methods, vol. 232, pp. 18-28.
Citations: 0
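The "composition" features mentioned in the HLA-DR4Pred2 abstract above are usually understood as amino acid composition: the fraction of each of the 20 standard residues in a peptide. The sketch below computes such a vector. The function name, the alphabetical residue order, and the handling of non-standard residues are assumptions; this is not the API of the hladr4pred2 package.

```python
from typing import List

# The 20 standard amino acids in alphabetical one-letter order (assumed column order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide: str) -> List[float]:
    """Return a 20-element vector of residue fractions for one peptide."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    # Non-standard letters (for example X) count toward the length but get no column.
    return [peptide.count(residue) / len(peptide) for residue in AMINO_ACIDS]
```

Unlike a binary (one-hot) profile, this representation has a fixed length regardless of peptide length, which lets peptides of different sizes share one feature space.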
A heterogeneous graph transformer framework for accurate cancer driver gene prediction and downstream analysis
IF 4.2, Zone 3 (Biology)
Methods, Pub Date: 2024-10-18, DOI: 10.1016/j.ymeth.2024.09.018
Shuwen Xiong, Junming Zhang, Hong Luo, Yongqing Zhang, Qinyin Xiao
Abstract: Accurately predicting cancer driver genes remains a formidable challenge given the growing volume and complexity of cancer genomic data. In this study, we propose HGTDG, a heterogeneous graph transformer framework for predicting cancer driver genes and exploring downstream tasks. Central to the framework is a heterogeneous graph construction module, which assembles a gene-protein heterogeneous network from Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and protein-protein interactions sourced from the STRING (search tool for recurring instances of neighboring genes) database. Moreover, our framework introduces a heterogeneous graph transformer module that uses multi-head attention mechanisms for node embedding. This module captures distinct representations for both nodes and edges, thereby enriching the model's predictive capacity. Subsequently, the generated node embeddings are passed to a classification module that discriminates between driver and non-driver genes. Our experimental findings show that HGTDG outperforms existing methods on metrics including the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). Furthermore, the downstream analysis of the newly identified cancer driver genes underscores the efficacy and versatility of the proposed framework.
Methods, vol. 232, pp. 9-17.
Citations: 0