Osphanie Mentari, Muhammad Shujaat, Hilal Tayara, Kil To Chong
{"title":"Toxicity Prediction for Immune Thrombocytopenia Caused by Drugs Based on Logistic Regression with Feature Importance","authors":"Osphanie Mentari, Muhammad Shujaat, Hilal Tayara, Kil To Chong","doi":"10.2174/0115748936269606231001140647","DOIUrl":"https://doi.org/10.2174/0115748936269606231001140647","url":null,"abstract":"Background: One of the problems in drug discovery that can be solved by artificial intelligence is toxicity prediction. In drug-induced immune thrombocytopenia, toxicity can arise in patients after five to ten days, with significant bleeding caused by drug-dependent antibodies. In clinical trials, when this condition occurs, all the drugs consumed by patients should be stopped, although sometimes this is not possible, especially for older patients who are dependent on their medication. Therefore, being able to predict toxicity in drug-induced immune thrombocytopenia is very important. Computational technologies, such as machine learning, can help predict toxicity better than empirical techniques owing to the lower cost and faster processing. Objective: Previous studies used the KNN method. However, the performance of these approaches needs to be enhanced. This study proposes a logistic regression model to improve accuracy. Methods: In this study, we present a new model for drug-induced immune thrombocytopenia using a machine learning method. Our model extracts several features from the Simplified Molecular Input Line Entry System (SMILES). These features were fused and cleaned, and the important features were selected using the SelectKBest method. The model uses a logistic regression that is optimized and tuned by grid search cross-validation. Results: The highest accuracy, 80%, was obtained with a combination of features from PADEL, CDK, RDKIT, MORDRED, and BLUEDESC. Conclusion: Our proposed model outperforms previous studies in accuracy. 
The information and source code is accessible online at Github: https://github.com/Osphanie/Thrombocytopenia.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135889328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Network propagation-based identification of oligometastatic biomarkers in metastatic colorectal cancer","authors":"Qing Jin, Kexin Yu, Xianze Zhang, Diwei Huo, Denan Zhang, Lei Liu, Hongbo Xie, Binhua Liang, Xiujie Chen","doi":"10.2174/1574893618666230913110025","DOIUrl":"https://doi.org/10.2174/1574893618666230913110025","url":null,"abstract":"Background: Oligometastatic disease has been proposed as an intermediate state between primary tumor and systemically metastatic disease, and it has great potential to be cured with locoregional therapies. However, since no biomarker for the identification of patients with true oligometastatic disease is clinically available, the diagnosis of oligometastatic disease remains controversial. Objective: We aim to identify potential biomarkers of colorectal cancer patients with true oligometastatic states, who will benefit most from local therapy. Methods: This study retrospectively analyzed the transcriptome profiles and clinical parameters of 307 metastatic colorectal cancer patients. A novel network propagation method and network-based strategy were combined to identify oligometastatic biomarkers to predict the prognoses of metastatic colorectal cancer patients. Results: We defined two metastatic risk groups according to twelve oligometastatic biomarkers, which exhibit distinct prognoses, clinicopathological features, immunological characteristics, and biological mechanisms. The metastatic risk assessment model exhibited a more powerful capacity for survival prediction compared to traditional clinicopathological features. The low-MRS group was most consistent with an oligometastatic state, while the high-MRS group might represent a potential polymetastatic state, which leads to the divergence of their prognostic outcomes and response to treatments. We also identified 22 significant immune check genes between the high-MRS and low-MRS groups. 
The difference in molecular mechanism between the two metastatic risk groups was associated with focal adhesion, nucleocytoplasmic transport, Hippo, PI3K-Akt, TGF-β, and ECM-receptor interaction signaling pathways. Conclusion: Our study provided a molecular definition of the oligometastatic state in colorectal cancer, which contributes to precise treatment decision-making for advanced patients.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136078556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jie Gao, Qiming Fu, Jiacheng Sun, Yunzhe Wang, Youbing Xia, You Lu, Hongjie Wu, Jianping Chen
{"title":"QLDTI: A Novel Reinforcement Learning-based Prediction Model for Drug-Target Interaction","authors":"Jie Gao, Qiming Fu, Jiacheng Sun, Yunzhe Wang, Youbing Xia, You Lu, Hongjie Wu, Jianping Chen","doi":"10.2174/0115748936264731230928112936","DOIUrl":"https://doi.org/10.2174/0115748936264731230928112936","url":null,"abstract":"Background: Predicting drug-target interaction (DTI) plays a crucial role in drug research and development. More and more researchers pay attention to the problem of developing more powerful prediction methods. Traditional DTI prediction methods are basically realized by biochemical experiments, which are time-consuming, risky, and costly. Nowadays, DTI prediction is often solved by using a single information source and a single model, or by combining some models, but the prediction results are still not accurate enough. Objective: The study aimed to utilize existing data and machine learning models to integrate heterogeneous data sources and different models, further improving the accuracy of DTI prediction. Methods: This paper has proposed a novel prediction method based on reinforcement learning, called QLDTI (predicting drug-target interaction based on Q-learning), which can be mainly divided into two parts: data fusion and model fusion. Firstly, it fuses the drug and target similarity matrices calculated by different calculation methods through Q-learning. Secondly, the new similarity matrix is inputted into five models, NRLMF, CMF, BLM-NII, NetLapRLS, and WNN-GIP, for further training. Then, all sub-model weights are continuously optimized again by Q-learning, which can be used to linearly weight all sub-model prediction results to output the final prediction result. Results: QLDTI achieved AUC accuracy of 99.04%, 99.12%, 98.28%, and 98.35% on E, NR, IC, and GPCR datasets, respectively. 
Compared to the existing five models NRLMF, CMF, BLM-NII, NetLapRLS, and WNN-GIP, the QLDTI method has achieved better results on four benchmark datasets of E, NR, IC, and GPCR. Conclusion: Data fusion and model fusion have been proven effective for DTI prediction, further improving the prediction accuracy of DTI.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136182037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bioinformatics Perspective of Drug Repurposing","authors":"Binita Patel, Brijesh Gelat, Mehul Soni, Pooja Rathaur, Kaid Johar SR","doi":"10.2174/0115748936264692230921071504","DOIUrl":"https://doi.org/10.2174/0115748936264692230921071504","url":null,"abstract":"Abstract: Different diseases can be treated with various therapeutic agents. Drug discovery aims to find potential molecules for existing and emerging diseases. However, factors, such as increasing development cost, generic competition due to the patent expiry of several drugs, increase in conservative regulatory policies, and insufficient breakthrough innovations impairs the development of new drugs and the learning productivity of pharmaceutical industries. Drug repurposing is the process of finding new therapeutic applications for already approved, withdrawn from use, abandoned, and experimental drugs. Drug repurposing is another method that may partially overcome the hurdles related to drug discovery and hence appears to be a wise attempt. However, drug repurposing being not a standard regulatory process, leads to administrative concerns and problems. The drug repurposing also requires expensive, high-risk clinical trials to establish the safety and efficacy of the repurposed drug. Recent innovations in the field of bioinformatics can accelerate the new drug repurposing studies by identifying new targets of the existing drugs along with drug candidate screening and refinement. Recent advancements in the field of comprehensive high throughput data in genomics, epigenetics, chromosome architecture, transcriptomic, proteomics, and metabolomics may also contribute to the understanding of molecular mechanisms involved in drug-target interaction. 
The present review describes the current scenario in the field of drug repurposing along with the application of various bioinformatic tools for the identification of new targets for the existing drug.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136358655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Systematic Review of Medical Expert Systems for Cardiac Arrest Prediction","authors":"Ishleen Kaur, Tanvir Ahmad, M.N. Doja","doi":"10.2174/0115748936251658231002043812","DOIUrl":"https://doi.org/10.2174/0115748936251658231002043812","url":null,"abstract":"Background:: Predicting cardiac arrest is crucial for timely intervention and improved patient outcomes. Machine learning has yielded astounding results by offering tailored prediction analyses on complex data. Despite advancements in medical expert systems, there remains a need for a comprehensive analysis of their effectiveness and limitations in cardiac arrest prediction. This need arises because there are not enough existing studies that thoroughly cover the topic. Objective:: The systematic review aims to analyze the existing literature on medical expert systems for cardiac arrest prediction, filling the gaps in knowledge and identifying key challenges. Methods:: This paper adopts the PRISMA methodology to conduct a systematic review of 37 publications obtained from PubMed, Springer, ScienceDirect, and IEEE, published within the last decade. Careful inclusion and exclusion criteria were applied during the selection process, resulting in a comprehensive analysis that utilizes five integrated layers- research objectives, data collection, feature set generation, model training and validation employing various machine learning techniques. Results and Conclusion:: The findings indicate that current studies frequently use ensemble and deep learning methods to improve machine learning predictions’ accuracy. However, they lack adequate implementation of proper pre-processing techniques. Further research is needed to address challenges related to external validation, implementation, and adoption of machine learning models in real clinical settings, as well as integrating machine learning with AI technologies like NLP. 
This review aims to be a valuable resource for both novice and experienced researchers, offering insights into current methods and potential future recommendations.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136358104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaoqing Peng, Wanxin Cui, Xiangyan Kong, Yuannan Huang, Ji Li
{"title":"DMR_Kmeans: Identifying Differentially Methylated Regions Based on kmeans Clustering and Read Methylation Haplotype Filtering","authors":"Xiaoqing Peng, Wanxin Cui, Xiangyan Kong, Yuannan Huang, Ji Li","doi":"10.2174/0115748936245495230925112419","DOIUrl":"https://doi.org/10.2174/0115748936245495230925112419","url":null,"abstract":"Introduction: Differentially methylated regions (DMRs), including tissue-specific DMRs and disease-specific DMRs, can be used in revealing the mechanisms of gene regulation and screening diseases. Up until now, many methods have been proposed to detect DMRs from bisulfite sequencing data. In these methods, differentially methylated CpG sites and DMRs are usually identified based on statistical tests or distribution models, which neglect the joint methylation statuses provided in each read and result in inaccurate boundaries of DMRs. Objective: To make use of the joint methylation statuses provided in each read and predict accurate boundaries of DMRs. Methods: In this paper, a method named DMR_Kmeans is proposed to detect DMRs based on k-means clustering and read methylation haplotype filtering. In DMR_Kmeans, for each CpG site, the k-means algorithm is used to cluster the methylation levels from two groups, and the methylation difference of the CpG is measured based on the different distributions in clusters. Methylation haplotypes of reads are employed to extract the methylation patterns in a candidate region. 
Finally, DMRs are identified based on the methylation differences and the methylation patterns in candidate regions. Result: A comparison of DMR_Kmeans with eight DMR detection methods on whole-genome bisulfite sequencing data from six pairs of tissues shows that DMR_Kmeans achieves higher Qn and Ql and more overlapped promoters than the other methods when the methylation-difference threshold is greater than 0.4. This indicates that the DMRs predicted by DMR_Kmeans have accurate boundaries and contain fewer CpGs with small methylation differences than those predicted by other methods. 
Conclusion:: Furthermore, it suggests that DMR_Kmeans can provide a DMR set with high quality for downstream analysis since th","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134945067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Meng-Meng Wei, Chang-Qing Yu, Li-Ping Li, Zhu-Hong You, Lei Wang
{"title":"A Deep Neural Network Model with Attribute Network Representation for lncRNA-Protein Interaction Prediction","authors":"Meng-Meng Wei, Chang-Qing Yu, Li-Ping Li, Zhu-Hong You, Lei Wang","doi":"10.2174/0115748936267109230919104630","DOIUrl":"https://doi.org/10.2174/0115748936267109230919104630","url":null,"abstract":"Background: LncRNA is not only involved in the regulation of the biological functions of protein-coding genes, but its dysfunction is also associated with the occurrence and progression of various diseases. More and more studies have shown that an in-depth understanding of the mechanism of action of lncRNA is of great significance for disease treatment. However, traditional wet-lab testing is time-consuming, laborious, and expensive, and it involves many subjective factors that may affect the accuracy of the experiment. Objective: Most methods for predicting lncRNA-protein interaction (LPI) rely on only a single feature, or their features contain noise. To solve this problem, we propose a computational model, CSALPI, based on a deep neural network. Method: First, the model utilizes cosine similarity to extract similarity features for lncRNA-lncRNA and protein-protein pairs, and the similarity features are denoised using a sparse autoencoder. Second, a neighbor enhancement autoencoder is employed to enforce neighboring nodes to be represented in a similar way by reconstructing the denoised features. Finally, a Light Gradient Boosting Machine classifier is used to predict potential LPIs. Result: To demonstrate the reliability of CSALPI, multiple evaluation metrics were used under a 5-fold cross-validation experiment, and excellent results were achieved. In the case study, the model successfully predicted 7 out of 10 disease-associated lncRNA and protein pairs. 
conclusion: The CSALPI can be used as an effective complementary method for predicting potential LPIs from biological experiments.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134944940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiajie Xing, Xu Song, Meiju Yu, Juan Wang, Jing Yu
{"title":"MSSD: An Efficient Method for Constructing Accurate and Stable Phylogenetic Networks by Merging Subtrees of Equal Depth","authors":"Jiajie Xing, Xu Song, Meiju Yu, Juan Wang, Jing Yu","doi":"10.2174/0115748936256923230927081102","DOIUrl":"https://doi.org/10.2174/0115748936256923230927081102","url":null,"abstract":"Background:: Systematic phylogenetic networks are essential for studying the evolutionary relationships and diversity among species. These networks are particularly important for capturing non-tree-like processes resulting from reticulate evolutionary events. However, existing methods for constructing phylogenetic networks are influenced by the order of inputs. The different orders can lead to inconsistent experimental results. Moreover, constructing a network for large datasets is time-consuming and the network often does not include all of the input tree nodes. Aims: This paper aims to propose a novel method, called as MSSD, which can construct a phylogenetic network from gene trees by Merging Subtrees with the Same Depth in a bottom-up way. background: Phylogenetic trees can represent the evolutionary history of genes vertically. There is a difference between phylogenetic trees of different genes due to the reticulate evolution events of species. Phylogenetic networks can represent reticulate evolutionary processes and show the difference between rooted gene trees. Methods:: The MSSD first decomposes trees into subtrees based on depth. Then it merges subtrees with the same depth from 0 to the maximum depth. For all subtrees of one depth, it inserts each subtree into the current networks by means of identical subtrees. Results:: We test the MSSD on the simulated data and real data. The experimental results show that the networks constructed by the MSSD can represent all input trees and the MSSD is more stable than other methods. 
The MSSD can construct networks faster, and the constructed networks share more information with the input trees than those of other methods. Conclusion: MSSD is a powerful tool for studying the evolutionary relationships among species in biology and is freely available at https://github.com/xingjiajie2023/MSSD.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135647357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"iPSI(2L)-EDL: a Two-layer Predictor for Identifying Promoters and their Types based on Ensemble Deep Learning","authors":"Xuan Xiao, Zaihao Hu, ZhenTao Luo, Zhaochun Xu","doi":"10.2174/0115748936264316230926073231","DOIUrl":"https://doi.org/10.2174/0115748936264316230926073231","url":null,"abstract":"Abstract: Promoters are DNA fragments located near the transcription initiation site; they can be divided into strong and weak promoter types according to transcriptional activation and expression level. Identifying promoters and their strengths in DNA sequences is essential for understanding gene expression regulation. Therefore, it is crucial to further improve the predictive quality of predictors for real-world application requirements. Here, we constructed the latest training dataset based on the RegulonDB website, where all the promoters in this dataset have been experimentally validated, and their sequence similarity is less than 85%. We used one-hot and nucleotide chemical property and density (NCPD) to represent DNA sequence samples. Additionally, we proposed an ensemble deep learning framework containing a multi-head attention module, long short-term memory present, and a convolutional neural network module. The results showed that iPSI(2L)-EDL outperformed other existing methods for both promoter prediction and identification of strong and weak promoter types. The AUC and MCC of iPSI(2L)-EDL in identifying promoters were improved by 2.23% and 2.96%, respectively, compared to those of PseDNC-DL on independent testing data, while the AUC and MCC of iPSI(2L)-EDL were increased by 3.74% and 5.86%, respectively, in predicting promoter strength type. The results of ablation experiments indicate that CNN plays a crucial role in recognizing promoters, and that the importance of different input positions and long-range dependency relationships among features are helpful for recognizing promoters. 
Furthermore, to make it easier for most experimental scientists to get the results they need, a user-friendly web server has been established and can be accessed at http://47.94.248.117/IPSW(2L)-EDL.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135901216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yaping Lv, Yanfeng Wang, Yumeng Zhang, Shuzhen Chen, Yuhua Yao
{"title":"Predicting the risk of breast cancer recurrence and metastasis based on miRNA expression","authors":"Yaping Lv, Yanfeng Wang, Yumeng Zhang, Shuzhen Chen, Yuhua Yao","doi":"10.2174/1574893618666230914105741","DOIUrl":"https://doi.org/10.2174/1574893618666230914105741","url":null,"abstract":"Background: Even after surgery, breast cancer patients still suffer from recurrence and metastasis. Thus, it is critical to predict accurately the risk of recurrence and metastasis for individual patients, which can help determine the appropriate adjuvant therapy. Methods: The purpose of this study is to investigate and compare the performance of several categories of molecular biomarkers, i.e., microRNA (miRNA), long non-coding RNA (lncRNA), messenger RNA (mRNA), and copy number variation (CNV), in predicting the risk of breast cancer recurrence and metastasis. First, the molecular data (miRNA, lncRNA, mRNA, and CNV) of 483 breast cancer patients were downloaded from the Cancer Genome Atlas, which were then randomly divided into the training and test sets with a ratio of 7:3. Second, the feature selection process was applied by univariate Cox and multivariate Cox variance analysis on the training set (e.g., 15 miRNAs). According to the selected features (e.g., 15 miRNAs), a random forest classifier and several other classification methods were established according to the label of recurrence and metastasis. Finally, the performances of the classification models were compared and evaluated on the test set. Results: The area under the ROC curve was 0.70 for miRNA, better than those using other biomarkers. 
Conclusion: These results indicated that miRNA has important guiding significance in predicting recurrence and metastasis of breast cancer.","PeriodicalId":10801,"journal":{"name":"Current Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134969906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}