{"title":"RiRPSSP: A unified deep learning method for prediction of regular and irregular protein secondary structures.","authors":"Mukhtar Ahmad Sofi, M Arif Wani","doi":"10.1142/S0219720023500014","DOIUrl":"https://doi.org/10.1142/S0219720023500014","url":null,"abstract":"<p><p>Protein secondary structure prediction (PSSP) is an important and challenging task in protein bioinformatics. Protein secondary structures (SSs) are categorized into regular and irregular structure classes. Regular SSs, representing nearly 50% of amino acids, consist of helices and sheets, whereas the remaining amino acids represent irregular SSs. β-turns and γ-turns are the most abundant irregular SSs present in proteins. Existing methods are well developed for separate prediction of regular and irregular SSs. However, for more comprehensive PSSP, it is essential to develop a uniform model to predict all types of SSs simultaneously. In this work, using a novel dataset comprising dictionary of secondary structure of protein (DSSP)-based SSs and PROMOTIF-based β-turns and γ-turns, we propose a unified deep learning model consisting of convolutional neural networks (CNNs) and long short-term memory networks (LSTMs) for simultaneous prediction of regular and irregular SSs. To the best of our knowledge, this is the first study in PSSP covering both regular and irregular structures. The protein sequences in our constructed datasets, RiR6069 and RiR513, have been borrowed from benchmark CB6133 and CB513 datasets, respectively. 
The results are indicative of increased PSSP accuracy.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 1","pages":"2350001"},"PeriodicalIF":1.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9474486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"<i>In silico de novo</i> drug design of a therapeutic peptide inhibitor against UBE2C in breast cancer.","authors":"Andrea Mae Añonuevo, Marineil Gomez, Lemmuel L Tayo","doi":"10.1142/S0219720022500299","DOIUrl":"https://doi.org/10.1142/S0219720022500299","url":null,"abstract":"<p><p>The World Health Organization (WHO) declared breast cancer (BC) as the most prevalent cancer in the world. With its prevalence and severity, there have been several breakthroughs in developing treatments for the disease. Targeted therapy treatments limit the damage done to healthy tissues. These targeted therapies are especially potent for luminal and HER-2 positive type breast cancer. However, for triple negative breast cancer (TNBC), the lack of defining biomarkers makes it hard to approach with targeted therapy methods. Protein-protein interactions (PPIs) have been studied as possible targets for drug action. However, small molecule drugs are not able to cover the entirety of the PPI binding interface. Peptides were found to be more suited to the large or flat PPI surfaces, in addition to their better pharmacokinetic properties. In this study, computational methods were used to verify whether peptide drug inhibitors are good drug candidates against the ubiquitin protein, UBE2C, by conducting docking, MD and MMPBSA analyses. Results show that while the lead peptide, T20-M, shows good potential as a peptide drug, its binding affinity towards UBE2C is not enough to overcome the natural UBE2C-ANAPC2 interaction. 
Further studies on modification of T20-M and the analysis of other peptide leads are recommended.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 1","pages":"2250029"},"PeriodicalIF":1.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9465490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ThermalProGAN: A sequence-based thermally stable protein generator trained using unpaired data.","authors":"Hui-Ling Huang, Chong-Heng Weng, Torbjörn E M Nordling, Yi-Fan Liou","doi":"10.1142/S0219720023500087","DOIUrl":"https://doi.org/10.1142/S0219720023500087","url":null,"abstract":"<p><strong>Motivation: </strong>The synthesis of proteins with novel desired properties is challenging but sought after by the industry and academia. The dominating approach is based on trial-and-error inducing point mutations, assisted by structural information or predictive models built with paired data that are difficult to collect. This study proposes a sequence-based unpaired-sample of novel protein inventor (SUNI) to build ThermalProGAN for generating thermally stable proteins based on sequence information.</p><p><strong>Results: </strong>The ThermalProGAN can strongly mutate the input sequence with a median number of 32 residues. A known normal protein, 1RG0, was used to generate a thermally stable form by mutating 51 residues. After superimposing the two structures, high similarity is shown, indicating that the basic function would be conserved. Eighty-four molecular dynamics simulation results of 1RG0 and the COVID-19 vaccine candidates with a total simulation time of 840 ns indicate that the thermal stability increased.</p><p><strong>Conclusion: </strong>This proof of concept demonstrated that transfer of a desired protein property from one set of proteins is feasible. <b>Availability and implementation:</b> The source code of ThermalProGAN can be freely accessed at https://github.com/markliou/ThermalProGAN/ with an MIT license. The website is https://thermalprogan.markliou.tw:433. 
<b>Supplementary information:</b> Supplementary data are available on Github.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 1","pages":"2350008"},"PeriodicalIF":1.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9466541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating network-based missing protein prediction using <i>p</i>-values, Bayes Factors, and probabilities.","authors":"Wilson Wen Bin Goh, Weijia Kong, Limsoon Wong","doi":"10.1142/S0219720023500051","DOIUrl":"https://doi.org/10.1142/S0219720023500051","url":null,"abstract":"<p><p>Some prediction methods use probability to rank their predictions, while some other prediction methods do not rank their predictions and instead use <i>p</i>-values to support their predictions. This disparity renders direct cross-comparison of these two kinds of methods difficult. In particular, approaches such as the Bayes Factor upper Bound (BFB) for <i>p</i>-value conversion may not make correct assumptions for this kind of cross-comparison. Here, using a well-established case study on renal cancer proteomics and in the context of missing protein prediction, we demonstrate how to compare these two kinds of prediction methods using two different strategies. The first strategy is based on false discovery rate (FDR) estimation, which does not make the same naïve assumptions as BFB conversions. The second strategy is a powerful approach which we colloquially call \"home ground testing\". Both strategies perform better than BFB conversions. Thus, we recommend comparing prediction methods by standardization to a common performance benchmark such as a global FDR. 
And where this is not possible, we recommend reciprocal \"home ground testing\".</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 1","pages":"2350005"},"PeriodicalIF":1.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9474482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A network-based dynamic criterion for identifying prediction and early diagnosis biomarkers of complex diseases.","authors":"Xin Huang, Benzhe Su, Xingyu Wang, Yang Zhou, Xinyu He, Bing Liu","doi":"10.1142/S0219720022500275","DOIUrl":"https://doi.org/10.1142/S0219720022500275","url":null,"abstract":"<p><p>Lung adenocarcinoma (LUAD) seriously threatens human health and generally results from dysfunction of relevant module molecules, which dynamically change with time and conditions, rather than that of an individual molecule. In this study, a novel network construction algorithm for identifying early warning network signals (IEWNS) is proposed for improving the performance of LUAD early diagnosis. To this end, we theoretically derived a dynamic criterion, namely, the relationship of variation (RV), to construct dynamic networks. RV infers correlation [Formula: see text] statistics to measure dynamic changes in molecular relationships during the process of disease development. Based on the dynamic networks constructed by IEWNS, network warning signals used to represent the occurrence of LUAD deterioration can be defined without human intervention. IEWNS was employed to perform a comprehensive analysis of gene expression profiles of LUAD from The Cancer Genome Atlas (TCGA) database and the Gene Expression Omnibus (GEO) database. The experimental results suggest that the potential biomarkers selected by IEWNS can facilitate a better understanding of pathogenetic mechanisms and help to achieve effective early diagnosis of LUAD. 
In conclusion, IEWNS provides novel insight into the initiation and progression of LUAD and helps to define prospective biomarkers for assessing disease deterioration.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"20 6","pages":"2250027"},"PeriodicalIF":1.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9471022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Author Index Volume 20 (2022).","authors":"","doi":"10.1142/S0219720022990013","DOIUrl":"https://doi.org/10.1142/S0219720022990013","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"20 6","pages":"2299001"},"PeriodicalIF":1.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10505287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantification of the presence of enzymes in gelatin zymography using the Gini index.","authors":"Adriana Laura López Lobato, Martha Lorena Avendaño Garrido, Héctor Gabriel Acosta Mesa, Clara Luz Sampieri, Víctor Hugo Sandoval Lozano","doi":"10.1142/S0219720022500251","DOIUrl":"https://doi.org/10.1142/S0219720022500251","url":null,"abstract":"<p><p>Gel zymography quantifies the activity of certain enzymes in tumor processes. These enzymes are widely used in medical diagnosis. In order to analyze them, experts classify the zymography spots into various classes according to their tonalities. This classification is done by visual analysis, which makes it a subjective process. This work proposes a methodology to carry out this classification with a process that involves an unsupervised learning algorithm on the images, denoted as the GI algorithm. With the experiments shown in this paper, this methodology could constitute a tool that bioinformatics scientists can trust to perform the desired classification since it is a quantitative indicator to order the enzymatic activity of the spots in a zymography.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"20 6","pages":"2250025"},"PeriodicalIF":1.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9118622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Author Index Volume 20 (2022).","authors":"","doi":"10.1142/s0219749922990015","DOIUrl":"https://doi.org/10.1142/s0219749922990015","url":null,"abstract":"","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"20 6 1","pages":"2299001"},"PeriodicalIF":1.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"63928439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feedback-AVPGAN: Feedback-guided generative adversarial network for generating antiviral peptides.","authors":"Kano Hasegawa, Yoshitaka Moriwaki, Tohru Terada, Cao Wei, Kentaro Shimizu","doi":"10.1142/S0219720022500263","DOIUrl":"https://doi.org/10.1142/S0219720022500263","url":null,"abstract":"<p><p>In this study, we propose <i>Feedback-AVPGAN</i>, a system that aims to computationally generate novel antiviral peptides (AVPs). This system relies on the key premise of the Generative Adversarial Network (GAN) model and the Feedback method. GAN, a generative modeling approach that uses deep learning methods, comprises a generator and a discriminator. The generator is used to generate peptides; the generated proteins are fed to the discriminator to distinguish between the AVPs and non-AVPs. The original GAN design uses actual data to train the discriminator. However, not many AVPs have been experimentally obtained. To solve this problem, we used the Feedback method to allow the discriminator to learn from the existing as well as generated synthetic data. We implemented this method using a classifier module that classifies each peptide sequence generated by the GAN generator as AVP or non-AVP. The classifier uses the transformer network and achieves high classification accuracy. This mechanism enables the efficient generation of peptides with a high probability of exhibiting antiviral activity. Using the Feedback method, we evaluated various algorithms and their performance. 
Moreover, we modeled the structures of the generated peptides using AlphaFold2 and identified peptides whose physicochemical properties and structures are similar to those of known AVPs, although their sequences differ.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"20 6","pages":"2250026"},"PeriodicalIF":1.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9118189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accounting for treatment during the development or validation of prediction models.","authors":"Wei Xin Chan, Limsoon Wong","doi":"10.1142/S0219720022710019","DOIUrl":"https://doi.org/10.1142/S0219720022710019","url":null,"abstract":"Clinical prediction models are widely used to predict adverse outcomes in patients, and are often employed to guide clinical decision-making. Clinical data typically consist of patients who received different treatments. Many prediction modeling studies fail to account for differences in patient treatment appropriately, which results in the development of prediction models that show poor accuracy and generalizability. In this paper, we list the most common methods used to handle patient treatments and discuss certain caveats associated with each method. We believe that proper handling of differences in patient treatment is crucial for the development of accurate and generalizable models. As different treatment strategies are employed for different diseases, the best approach to properly handle differences in patient treatment is specific to each individual situation. We use the Ma-Spore acute lymphoblastic leukemia data set as a case study to demonstrate the complexities associated with differences in patient treatment, and offer suggestions on incorporating treatment information during evaluation of prediction models. In clinical data, patients are typically treated on a case by case basis, with unique cases occurring more frequently than expected. 
Hence, there are many subtleties to consider during the analysis and evaluation of clinical prediction models.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"20 6","pages":"2271001"},"PeriodicalIF":1.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10523629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}