Interdisciplinary Sciences: Computational Life Sciences: Latest Articles

PLMC: Language Model of Protein Sequences Enhances Protein Crystallization Prediction.
IF 3.9, Zone 2, Biology
Interdisciplinary Sciences: Computational Life Sciences Pub Date: 2024-12-01 Epub Date: 2024-08-19 DOI: 10.1007/s12539-024-00639-6 Pages: 802-813
Dapeng Xiong, Kaicheng U, Jianfeng Sun, Adam P Cribbs
Abstract: X-ray diffraction crystallography is the most widely used method for determining protein three-dimensional (3D) structures, and it requires that the protein be crystallizable. Protein crystallization involves several successive procedures, including protein material production, purification, and crystal production, each of which affects the crystallization outcome. Because this multi-stage process is expensive and laborious, various computational tools have been developed to predict protein crystallization propensity, which is then used to guide the experimental determination. In this study, we presented a novel deep learning framework, PLMC, to improve multi-stage protein crystallization propensity prediction by leveraging a pre-trained protein language model. To train PLMC effectively, two groups of features for each protein were integrated into a more comprehensive representation: protein language embeddings from the large-scale protein sequence database, and a handcrafted feature set consisting of physicochemical, sequence-based, and disorder-related information. These features were embedded separately for refinement and then concatenated for the final prediction. Notably, our extensive benchmarking tests demonstrate that PLMC greatly outperforms other state-of-the-art methods by achieving AUC scores of 0.773, 0.893, and 0.913, respectively, at the aforementioned individual stages, and 0.982 at the final crystallization stage. Furthermore, PLMC is shown to be superior for predicting the crystallization of both globular and membrane proteins, as demonstrated by an AUC score of 0.991 for the latter. These results suggest the significant potential of PLMC in assisting researchers with the experimental design of crystallizable protein variants.
Citations: 0
Predicting Disease-Metabolite Associations Based on the Metapath Aggregation of Tripartite Heterogeneous Networks.
IF 3.9, Zone 2, Biology
Interdisciplinary Sciences: Computational Life Sciences Pub Date: 2024-12-01 Epub Date: 2024-08-07 DOI: 10.1007/s12539-024-00645-8 Pages: 829-843
Wenzhi Liu, Pengli Lu
Abstract: The exploration of the interactions between diseases and metabolites holds significant implications for the diagnosis and treatment of diseases. However, traditional experimental methods are time-consuming and costly, and current computational methods often overlook the influence of other biological entities on both. In light of these limitations, we proposed a novel deep learning model based on metapath aggregation of tripartite heterogeneous networks (MAHN) to explore disease-related metabolites. Specifically, we introduced microbes to construct a tripartite heterogeneous network and employed a graph convolutional network and an enhanced GraphSAGE to learn node features with metapath length 3. Additionally, we utilized node-level and semantic-level attention mechanisms, a more granular approach, to aggregate node features with metapath length 2. Finally, the reconstructed association probability is obtained by fusing features from different metapaths into the bilinear decoder. The experiments demonstrate that the proposed MAHN model achieved superior performance in five-fold cross-validation with Acc (91.85%), Pre (90.48%), Recall (93.53%), F1 (91.94%), AUC (97.39%), and AUPR (97.47%), outperforming four state-of-the-art algorithms. Case studies on two complex diseases, irritable bowel syndrome and obesity, further validate the predictions and indicate that the MAHN model is a trustworthy prediction tool for discovering potential metabolites. Moreover, deep learning models integrating multi-omics data represent the future mainstream direction for predicting disease-related biological entities.
Citations: 0
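The abbreviations in the MAHN abstract (Acc, Pre, Recall, F1) are standard binary-classification metrics. As a minimal sketch, assuming a binary association / no-association decision, they can be computed from confusion-matrix counts as shown below. The function name and the counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"acc": accuracy, "pre": precision, "recall": recall, "f1": f1}

# Hypothetical counts, for illustration only (not results from the paper).
m = binary_metrics(tp=6, fp=2, tn=10, fn=2)
```

AUC and AUPR are left out of this sketch because they depend on ranked prediction scores rather than on a single set of counts.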
Viral Rebound After Antiviral Treatment: A Mathematical Modeling Study of the Role of Antiviral Mechanism of Action.
IF 3.9, Zone 2, Biology
Interdisciplinary Sciences: Computational Life Sciences Pub Date: 2024-12-01 Epub Date: 2024-07-21 DOI: 10.1007/s12539-024-00643-w Pages: 844-853
Aubrey Chiarelli, Hana Dobrovolny
Abstract: The development of antiviral treatments for SARS-CoV-2 was an important turning point for the pandemic. Availability of safe and effective antivirals has allowed people to return to normal life. While SARS-CoV-2 antivirals are highly effective at preventing severe disease, there have been concerning reports of viral rebound in some patients after cessation of antiviral treatment. In this study, we use a mathematical model of viral infection to study the potential of different antivirals to prevent viral rebound. We find that antivirals that block production are most likely to result in viral rebound if the treatment time course is not sufficiently long. Since these antivirals do not prevent infection of cells, cells continue to be infected during treatment. When treatment is stopped, the infected cells will begin producing virus at the usual rate. Antivirals that prevent infection of cells are less likely to result in viral rebound since cells are not being infected during treatment. This study highlights the role of antiviral mechanism of action in increasing or reducing the probability of viral rebound.
Citations: 0
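The mechanism described in the abstract can be illustrated with a standard target-cell-limited model of viral kinetics. This is a sketch under assumptions: the abstract does not give the authors' equations or parameter values, so the model form, the `simulate` function, and every number below are hypothetical.

```python
def simulate(mode, eps, t_on, t_off, days=30.0, dt=0.001,
             beta=1e-5, delta=1.0, p=10.0, c=5.0,
             T0=1e6, I0=0.0, V0=10.0):
    """Forward-Euler integration of a target-cell-limited model.

    T: uninfected target cells, I: infected cells, V: free virus.
    mode="infection" scales the infection rate beta by (1 - eps) during
    treatment; mode="production" scales the production rate p instead.
    All parameter values are illustrative assumptions.
    """
    T, I, V = T0, I0, V0
    viral_load = [V]
    for step in range(int(round(days / dt))):
        t = step * dt
        treating = t_on <= t < t_off
        beta_eff = beta * (1 - eps) if treating and mode == "infection" else beta
        p_eff = p * (1 - eps) if treating and mode == "production" else p
        dT = -beta_eff * T * V               # target cells lost to infection
        dI = beta_eff * T * V - delta * I    # infected cells gained, then dying
        dV = p_eff * I - c * V               # virus produced, then cleared
        T, I, V = T + dt * dT, I + dt * dI, V + dt * dV
        viral_load.append(V)
    return viral_load

# Same efficacy and treatment window, different mechanism of action.
blocked_production = simulate("production", eps=0.99, t_on=2.0, t_off=7.0)
blocked_infection = simulate("infection", eps=0.99, t_on=2.0, t_off=7.0)
```

Comparing the two trajectories after `t_off` is one way to look for the rebound pattern the abstract describes; whether it appears depends on the chosen parameters and treatment window.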
FPJA-Net: A Lightweight End-to-End Network for Sleep Stage Prediction Based on Feature Pyramid and Joint Attention.
IF 3.9, Zone 2, Biology
Interdisciplinary Sciences: Computational Life Sciences Pub Date: 2024-12-01 Epub Date: 2024-08-19 DOI: 10.1007/s12539-024-00636-9 Pages: 769-780
Zhi Liu, Qinhan Zhang, Sixin Luo, Meiqiao Qin
Abstract: Sleep staging is the most crucial work before diagnosing and treating sleep disorders. Traditional manual sleep staging is time-consuming and depends on the skill of experts. Automatic sleep staging based on deep learning now attracts growing research interest. The salient waves in sleep signals contain the most important information for automatic sleep staging. However, this key information is not fully utilized in existing deep learning methods, since most of them use only a CNN or an RNN, which cannot capture multi-scale features in salient waves effectively. To tackle this limitation, we propose a lightweight end-to-end network for sleep stage prediction based on feature pyramid and joint attention. The feature pyramid module is designed to effectively extract multi-scale features in salient waves, and these features are then fed to the joint attention module to closely attend to the channel and location information of the salient waves. The proposed network has far fewer parameters and outperforms state-of-the-art methods. The overall accuracy and macro F1 score on the public datasets Sleep-EDF39, Sleep-EDF153 and SHHS are 90.1%, 87.8%, 87.4%, 84.4% and 86.9%, 83.9%, respectively. Ablation experiments confirm the effectiveness of each module.
Citations: 0
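The FPJA-Net abstract reports overall accuracy and macro F1. Macro F1 averages the per-class F1 scores with equal weight, regardless of how many epochs fall into each sleep stage. A minimal sketch follows, assuming the stages are given as label strings; the function name and the toy labels are hypothetical and are not data from the paper.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Return (overall accuracy, macro-averaged F1) for multi-class labels."""
    assert len(y_true) == len(y_pred) and y_true
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    f1_scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        # Per-class F1 written directly in terms of counts.
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    # Unweighted mean over classes is the "macro" average.
    return accuracy, sum(f1_scores) / len(f1_scores)

# Toy labels, for illustration only.
true_stages = ["W", "W", "N1", "N1"]
pred_stages = ["W", "N1", "N1", "N1"]
acc, macro_f1 = accuracy_and_macro_f1(true_stages, pred_stages)
```

Because every class counts equally, a rare stage that is classified poorly lowers macro F1 more than it lowers overall accuracy.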
Function-Genes and Disease-Genes Prediction Based on Network Embedding and One-Class Classification.
IF 3.9, Zone 2, Biology
Interdisciplinary Sciences: Computational Life Sciences Pub Date: 2024-12-01 Epub Date: 2024-09-04 DOI: 10.1007/s12539-024-00638-7 Pages: 781-801
Weiyu Shi, Yan Zhang, Yeqing Sun, Zhengkui Lin
Abstract: Genes that have been experimentally validated for diseases (functions) can be used to develop machine learning methods that predict new disease/function-genes. However, the prediction of both function-genes and disease-genes faces the same problem: there are only confirmed positive examples and no negative examples. To solve this problem, we proposed a function/disease-genes prediction algorithm based on network embedding (Variational Graph Auto-Encoders, VGAE) and one-class classification (Fast Minimum Covariance Determinant, Fast-MCD): VGAEMCD. First, we constructed a protein-protein interaction (PPI) network centered on experimentally validated genes; then VGAE was used to obtain the embeddings of nodes (genes) in the network; finally, the embeddings were input into the improved deep learning one-class classifier based on Fast-MCD to predict function/disease-genes. VGAEMCD can predict function-genes and disease-genes in a unified way, and it requires only the experimentally verified genes as input (no expression profile is needed). VGAEMCD outperforms classical one-class classification algorithms in Recall, Precision, F-measure, Specificity, and Accuracy. Further experiments show that seven metrics of VGAEMCD are higher than those of state-of-the-art function/disease-genes prediction algorithms. The above results indicate that VGAEMCD can learn the distribution characteristics of positive examples well and accurately identify function/disease-genes.
Citations: 0
Adap-BDCM: Adaptive Bilinear Dynamic Cascade Model for Classification Tasks on CNV Datasets.
IF 3.9, Zone 2, Biology
Interdisciplinary Sciences: Computational Life Sciences Pub Date: 2024-12-01 Epub Date: 2024-05-17 DOI: 10.1007/s12539-024-00635-w Pages: 1019-1037
Liancheng Jiang, Liye Jia, Yizhen Wang, Yongfei Wu, Junhong Yue
Abstract: Copy number variation (CNV) is an essential genetic driving factor of cancer formation and progression, making intelligent classification based on CNV feasible. However, current machine learning and deep learning methods face several challenges, such as the design of base classifier combination schemes in ensemble methods and the selection of layers of neural networks, which often result in low accuracy. Therefore, an adaptive bilinear dynamic cascade model (Adap-BDCM) is developed to further enhance the accuracy and applicability of these methods for intelligent classification on CNV datasets. In this model, a feature selection module is introduced to mitigate the interference of redundant information, and a bilinear model based on the gated attention mechanism is proposed to extract more beneficial deep fusion features. Furthermore, an adaptive base classifier selection scheme is designed to overcome the difficulty of manually designing base classifier combinations and enhance the applicability of the model. Lastly, a novel feature fusion scheme with an attribute recall submodule is constructed, effectively avoiding getting stuck in local solutions and missing some valuable information. Numerous experiments have demonstrated that our Adap-BDCM model exhibits optimal performance in cancer classification, stage prediction, and recurrence on CNV datasets. This study can assist physicians in making diagnoses faster and better.
Citations: 0
CVGAE: A Self-Supervised Generative Method for Gene Regulatory Network Inference Using Single-Cell RNA Sequencing Data.
IF 3.9, Zone 2, Biology
Interdisciplinary Sciences: Computational Life Sciences Pub Date: 2024-12-01 Epub Date: 2024-05-23 DOI: 10.1007/s12539-024-00633-y Pages: 990-1004
Wei Liu, Zhijie Teng, Zejun Li, Jing Chen
Abstract: Gene regulatory network (GRN) inference based on single-cell RNA sequencing (scRNAseq) data plays a crucial role in understanding the regulatory mechanisms between genes. Various computational methods have been employed for GRN inference, but their network accuracy and model generalization are unsatisfactory because of high-dimensional data and network sparsity. In this paper, we propose a self-supervised method for gene regulatory network inference using single-cell RNA sequencing data (CVGAE). CVGAE uses a graph neural network for inductive representation learning, which merges gene expression data and observed topology into a low-dimensional vector space. The trained vectors are used to calculate distances between genes and, in turn, to predict interactions between genes. Within the overall framework, FastICA is used to relieve the computational complexity caused by high-dimensional data, and CVGAE adopts multi-stacked GraphSAGE layers as an encoder and an improved decoder to overcome network sparsity. CVGAE is evaluated on several single-cell datasets containing four related ground-truth networks, and the results show that CVGAE achieves better performance than comparative methods. To validate its learning and generalization capabilities, CVGAE is applied in a few-shot setting by changing the ratio of the training set to the test set. Under few-shot conditions, CVGAE obtains comparable or superior performance.
Citations: 0
A Pragmatic Approach to Fetal Monitoring via Cardiotocography Using Feature Elimination and Hyperparameter Optimization.
IF 3.9, Zone 2, Biology
Interdisciplinary Sciences: Computational Life Sciences Pub Date: 2024-12-01 Epub Date: 2024-10-05 DOI: 10.1007/s12539-024-00647-6 Pages: 882-906
Fırat Hardalaç, Haad Akmal, Kubilay Ayturan, U Rajendra Acharya, Ru-San Tan
Abstract: Cardiotocography (CTG) is used to assess the health of the fetus during birth or antenatally in the third trimester. It concurrently detects the maternal uterine contractions (UC) and fetal heart rate (FHR). Fetal distress, which may require therapeutic intervention, can be diagnosed using baseline FHR and its reaction to uterine contractions. In this study, a pragmatic machine learning strategy based on feature reduction and hyperparameter optimization was proposed to classify the fetal states (Normal, Suspect, Pathological) from CTG. Such a strategy can serve as a decision support tool for managing pregnancies. On a public dataset of 2126 CTG recordings, the model was assessed using various standard classifiers relevant to CTG datasets. The proposed method improved the classifiers' accuracy, which rose to 97.20% with Random Forest (the best classifier). Practically speaking, the model was able to correctly predict 100% of all pathological cases and 98.8% of all normal cases in the dataset. The proposed model was also implemented on another public CTG dataset having 552 CTG signals, resulting in a 97.34% accuracy. If integrated with telemedicine, this proposed model could also be used for long-distance "stay at home" fetal monitoring in high-risk pregnancies.
Citations: 0
Integrating HRMAS-NMR Data and Machine Learning-Assisted Profiling of Metabolite Fluxes to Classify Low- and High-Grade Gliomas.
IF 3.9, Zone 2, Biology
Interdisciplinary Sciences: Computational Life Sciences Pub Date: 2024-12-01 Epub Date: 2024-09-27 DOI: 10.1007/s12539-024-00642-x Pages: 854-871
Safia Firdous, Zubair Nawaz, Rizwan Abid, Leo L Cheng, Syed Ghulam Musharraf, Saima Sadaf
Abstract: Diagnosing and classifying central nervous system tumors such as gliomas or glioblastomas poses a significant challenge due to their aggressive and infiltrative nature. However, recent advancements in metabolomics and magnetic resonance spectroscopy (MRS) offer promising avenues for differentiating tumor grades both in vivo and ex vivo. This study aimed to explore tissue-based metabolic signatures to distinguish between low- and high-grade gliomas. Forty-six histologically confirmed, intact solid tumor samples from glioma patients were analyzed using high-resolution magic angle spinning nuclear magnetic resonance (HRMAS-NMR) spectroscopy. By integrating machine learning (ML) algorithms, spectral regions with the most discriminative potential were identified. Validation was performed through univariate and multivariate statistical analyses, along with HRMAS-NMR analyses of 46 paired plasma samples. Amongst the various ML models applied, logistic regression identified 46 spectral regions capable of sub-classifying gliomas with an accuracy of 87% (F1-measure 0.87, Precision 0.82, Recall 0.93), whereas the extra-tree classifier identified three spectral regions with a predictive accuracy of 91% (F1-measure 0.91, Precision 0.85, Recall 0.97). The Wilcoxon test identified 51 spectral regions that significantly differentiate low- and high-grade glioma groups (p < 0.05). Based on sensitivity and area under the curve values, 40 spectral regions corresponding to 18 metabolites were considered as potential biomarkers for tissue-based glioma classification, and amongst these N-acetyl aspartate, glutamate, and glutamine emerged as the most important markers. These markers were validated in paired plasma samples, and their absolute concentrations were computed. Our results demonstrate that the metabolic markers identified through the HRMAS-NMR-ML analysis framework, and their associated metabolic networks, hold promise for targeted treatment planning and clinical interventions in the future.
Citations: 0
Predicting Promoters in Multiple Prokaryotes with Prompt.
IF 3.9, Zone 2, Biology
Interdisciplinary Sciences: Computational Life Sciences Pub Date: 2024-12-01 Epub Date: 2024-08-07 DOI: 10.1007/s12539-024-00637-8 Pages: 814-828
Qimeng Du, Yixue Guo, Junpeng Zhang, Fuping Lu, Chong Peng, Chichun Zhou
Abstract: Promoters are important cis-regulatory elements for the regulation of gene expression, and their accurate prediction is crucial for elucidating the biological functions and potential mechanisms of genes. Many previous prokaryotic promoter prediction methods show encouraging prediction performance, but most of them focus on the recognition of promoters in only one or a few bacterial species. Moreover, because they ignore promoter sequence motifs, the interpretability of predictions from existing methods is limited. In this work, we present a generalized method, Prompt (Promoters in multiple prokaryotes), to predict promoters in 16 prokaryotes and improve the interpretability of prediction results. Prompt integrates three methods, RSK (Regression based on Selected k-mer), CL (Contrastive Learning), and MLP (Multilayer Perceptron), and employs a voting strategy to divide the datasets into high-confidence and low-confidence categories. Results on the promoter prediction tasks show that the accuracy (Accuracy, Matthews correlation coefficient) of Prompt is greater than 80% on the high-confidence datasets of all 16 prokaryotes and greater than 90% in 12 prokaryotes, and that Prompt performs best compared with other existing methods. Moreover, by identifying promoter sequence motifs, Prompt can improve the interpretability of the predictions. Prompt is freely available at https://github.com/duqimeng/PromptPrompt , and will contribute to research on promoters in prokaryotes.
Citations: 0
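The Prompt abstract says that three predictors (RSK, CL, MLP) are combined and that a voting strategy separates high-confidence from low-confidence cases, but it does not spell out the rule. The sketch below assumes one plausible rule, unanimity among the three binary predictions. The function and variable names are hypothetical and are not taken from the Prompt repository.

```python
def split_by_vote(predictions):
    """Split samples by agreement among three binary classifiers.

    predictions: list of (rsk, cl, mlp) tuples of 0/1 labels.
    Returns (high, low): lists of (index, majority_label) pairs.
    Assumption: a sample is "high-confidence" only when all three agree.
    """
    high, low = [], []
    for idx, votes in enumerate(predictions):
        majority = 1 if sum(votes) >= 2 else 0
        if len(set(votes)) == 1:
            high.append((idx, majority))   # unanimous vote
        else:
            low.append((idx, majority))    # 2-to-1 split
    return high, low

# Toy votes, for illustration only.
votes = [(1, 1, 1), (1, 0, 1), (0, 0, 0), (0, 1, 0)]
high_conf, low_conf = split_by_vote(votes)
```

Under this assumed rule, samples 0 and 2 land in the high-confidence group and samples 1 and 3 in the low-confidence group, each labeled by majority vote.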