Klairton Lima Brito, Andre Rodrigues Oliveira, Alexsandro Oliveira Alexandrino, Ulisses Dias, Zanoni Dias
{"title":"Rearrangement distance with reversals, indels, and moves in intergenic regions on signed and unsigned permutations.","authors":"Klairton Lima Brito, Andre Rodrigues Oliveira, Alexsandro Oliveira Alexandrino, Ulisses Dias, Zanoni Dias","doi":"10.1142/S0219720023500099","DOIUrl":"https://doi.org/10.1142/S0219720023500099","url":null,"abstract":"<p><p>Genome rearrangement events are widely used to estimate a minimum-size sequence of mutations capable of transforming a genome into another. The length of this sequence is called distance, and determining it is the main goal in genome rearrangement distance problems. Problems in the genome rearrangement field differ regarding the set of rearrangement events allowed and the genome representation. In this work, we consider the scenario where the genomes share the same set of genes, gene orientation is known or unknown, and intergenic regions (structures between a pair of genes and at the extremities of the genome) are taken into account. We use two models, the first model allows only conservative events (reversals and moves), and the second model includes non-conservative events (insertions and deletions) in the intergenic regions. We show that both models result in NP-hard problems no matter if gene orientation is known or unknown. When the information regarding the orientation of genes is available, we present for both models an approximation algorithm with a factor of 2. 
For the scenario where this information is unavailable, we propose a 4-approximation algorithm for both models.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2350009"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9528408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Obstacles to effective model deployment in healthcare.","authors":"Wei Xin Chan, Limsoon Wong","doi":"10.1142/S0219720023710014","DOIUrl":"https://doi.org/10.1142/S0219720023710014","url":null,"abstract":"<p><p>Despite an exponential increase in publications on clinical prediction models over recent years, the number of models deployed in clinical practice remains fairly limited. In this paper, we identify common obstacles that impede effective deployment of prediction models in healthcare, and investigate their underlying causes. We observe a key underlying cause behind most obstacles - the improper development and evaluation of prediction models. Inherent heterogeneities in clinical data complicate the development and evaluation of clinical prediction models. Many of these heterogeneities in clinical data are unreported because they are deemed to be irrelevant, or due to privacy concerns. We provide real-life examples where failure to handle heterogeneities in clinical data, or sources of biases, led to the development of erroneous models. The purpose of this paper is to familiarize modeling practitioners with common sources of biases and heterogeneities in clinical data, both of which have to be dealt with to ensure proper development and evaluation of clinical prediction models. 
Proper model development and evaluation, together with complete and thorough reporting, are important prerequisites for a prediction model to be effectively deployed in healthcare.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2371001"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9554623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated <i>in silico</i>-<i>in vitro</i> rational design of oncogenic EGFR-derived specific monoclonal antibody-binding peptide mimotopes.","authors":"Ke Chen, Lili Ge, Guorui Liu","doi":"10.1142/S0219720023500075","DOIUrl":"https://doi.org/10.1142/S0219720023500075","url":null,"abstract":"Human epidermal growth factor receptor (EGFR) is strongly associated with malignant proliferation and has been established as an attractive therapeutic target of diverse cancers and used as a significant biomarker for tumor diagnosis. Over the past decades, a variety of monoclonal antibodies (mAbs) have been successfully developed to specifically recognize the third subdomain (TSD) of the EGFR extracellular domain. Here, the complex crystal structures of the EGFR TSD subdomain with its cognate mAbs were examined and compared systematically, revealing a consistent binding mode shared by these mAbs. The recognition site is located on the [Formula: see text]-sheet surface of the TSD ladder architecture, from which several hotspot residues that significantly confer both stability and specificity to the recognition were identified, responsible for about half of the total binding potency of mAbs to the TSD subdomain. A number of linear peptide mimotopes were rationally designed to mimic these TSD hotspot residues in different orientations and/or in different head-to-tail manners by using an orthogonal threading-through-strand (OTTS) strategy; these peptides, however, are intrinsically disordered in the free state and thus cannot maintain a native hotspot-like conformation. A chemical stapling strategy was employed to constrain the free peptides into a double-stranded conformation by introducing a disulfide bond across two strand arms of the peptide mimotopes. 
Both empirical scoring and [Formula: see text] fluorescence assay agreed that the stapling can effectively improve the interaction potency of OTTS-designed peptide mimotopes to different mAbs, with binding affinity increased by [Formula: see text]-fold. Conformational analysis revealed that the stapled cyclic peptide mimotopes can spontaneously fold into a double-stranded conformation that threads well through all the hotspot residues on the TSD [Formula: see text]-sheet surface and exhibits a binding mode to mAbs consistent with that of the TSD hotspot site.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2350007"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9828071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michael Khristichenko, Yuri Nechepurenko, Dmitry Grebennikov, Gennady Bocharov
{"title":"Numerical study of chronic hepatitis B infection using Marchuk-Petrov model.","authors":"Michael Khristichenko, Yuri Nechepurenko, Dmitry Grebennikov, Gennady Bocharov","doi":"10.1142/S0219720023400012","DOIUrl":"https://doi.org/10.1142/S0219720023400012","url":null,"abstract":"<p><p>In this work, we briefly describe our technology developed for computing periodic solutions of time-delay systems and discuss the results of computing periodic solutions for the Marchuk-Petrov model with parameter values corresponding to hepatitis <i>B</i> infection. We identified the regions in the model parameter space in which oscillatory dynamics in the form of periodic solutions exist. The respective solutions can be interpreted as active forms of chronic hepatitis <i>B</i>. The period and amplitude of oscillatory solutions were traced along the parameter determining the efficacy of antigen presentation by macrophages for T- and <i>B</i>-lymphocytes in the model. The oscillatory regimes are characterized by enhanced destruction of hepatocytes as a consequence of immunopathology and temporal reduction of viral load to values which can be a prerequisite of spontaneous recovery observed in chronic HBV infection. Our study presents a first step in a systematic analysis of chronic HBV infection using the Marchuk-Petrov model of antiviral immune response.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2340001"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9477723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Standa Na, Dhammika Leshan Wannigama, Thammakorn Saethang
{"title":"Antimicrobial peptides recognition using weighted physicochemical property encoding.","authors":"Standa Na, Dhammika Leshan Wannigama, Thammakorn Saethang","doi":"10.1142/S0219720023500063","DOIUrl":"https://doi.org/10.1142/S0219720023500063","url":null,"abstract":"<p><p>Antimicrobial resistance is a major public health concern. Antimicrobial peptides (AMPs) are one of the host defense mechanisms responding efficiently against multidrug-resistant microbes. Since the process of screening AMPs from a large number of peptides is still expensive and time-consuming, the development of a precise and rapid computer-aided tool is essential for preliminary AMP selection ahead of laboratory experiments. In this study, we proposed AMP recognition models using a new peptide encoding method called amino acid index weight (AAIW). Four AMP recognition models, covering antimicrobial, antibacterial, antiviral, and antifungal peptides, were trained on datasets combined from DRAMP and other published databases. These models achieved high performance compared to preceding AMP recognition models when evaluated on two independent test sets. All four models yielded over 93% in accuracy and 0.87 in Matthews correlation coefficient (MCC). An online AMP recognition server is accessible at https://amppred-aaiw.com.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2350006"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9528874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anna Dotsenko, Jury Denisenko, Dmitrii Osipov, Aleksandra Rozhkova, Ivan Zorov, Arkady Sinitsyn
{"title":"Testing and improving the performance of protein thermostability predictors for the engineering of cellulases.","authors":"Anna Dotsenko, Jury Denisenko, Dmitrii Osipov, Aleksandra Rozhkova, Ivan Zorov, Arkady Sinitsyn","doi":"10.1142/S0219720023300010","DOIUrl":"https://doi.org/10.1142/S0219720023300010","url":null,"abstract":"Thermostability of cellulases can be increased through amino acid substitutions and by protein engineering with predictors of protein thermostability. We have carried out a systematic analysis of the performance of 18 predictors for the engineering of cellulases. The predictors were PoPMuSiC, HoTMuSiC, I-Mutant 2.0, I-Mutant Suite, PremPS, Hotspot, Maestroweb, DynaMut, ENCoM ([Formula: see text] and [Formula: see text]), mCSM, SDM, DUET, RosettaDesign, Cupsat (thermal and denaturant approaches), ConSurf, and Voronoia. The highest values of accuracy, F-measure, and MCC were obtained for DynaMut, SDM, RosettaDesign, and PremPS. A combination of the predictors provided an improvement in the performance. F-measure and MCC were improved by 14% and 28%, respectively. Accuracy and sensitivity were also improved by 9% and 20%, respectively, compared to the maximal values of single predictors. 
The reported values of the performance of the predictors and their combination may aid research in the engineering of thermostable cellulases as well as the further development of thermostability predictors.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2330001"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9473268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A pharmacokinetic model based on the SSA-1DCNN-Attention method.","authors":"Zi-Yi He, Jie-Yu Yang, Yong Li","doi":"10.1142/S021972002350004X","DOIUrl":"https://doi.org/10.1142/S021972002350004X","url":null,"abstract":"<p><p>To solve the problem of the lack of representativeness of the training set and the poor prediction accuracy due to the limited number of training samples when the machine learning method is used for the classification and prediction of pharmacokinetic indicators, this paper proposes a 1DCNN-Attention concentration prediction model optimized by the sparrow search algorithm (SSA). First, the SMOTE method is used to expand the small sample experimental data to make the data diverse and representative. Then a one-dimensional convolutional neural network (1DCNN) model is established, and the attention mechanism is introduced to calculate the weight of each variable for dividing the importance of each pharmacokinetic indicator by the output drug concentration. The SSA algorithm was used to optimize the parameters in the model to improve the prediction accuracy after data expansion. Taking the pharmacokinetic model of phenobarbital (PHB) combined with <i>Cynanchum otophyllum saponins</i> to treat epilepsy as an example, the concentration changes of PHB were predicted and the effectiveness of the method was verified. 
The results show that the proposed model has a better prediction effect than other methods.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 1","pages":"2350004"},"PeriodicalIF":1.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9473265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PTGAC Model: A machine learning approach for constructing phylogenetic tree to compare protein sequences.","authors":"Jayanta Pal, Sourav Saha, Bansibadan Maji, Dilip Kumar Bhattacharya","doi":"10.1142/S0219720022500287","DOIUrl":"https://doi.org/10.1142/S0219720022500287","url":null,"abstract":"<p><p>This work proposes a machine learning-based phylogenetic tree generation model based on agglomerative clustering (PTGAC) that compares protein sequences considering all known chemical properties of amino acids. The proposed model can serve as a suitable alternative to the Unweighted Pair Group Method with Arithmetic Mean (UPGMA), which is inherently time-consuming. Initially, principal component analysis (PCA) is used in the proposed scheme to reduce the dimensions of 20 amino acids using seven known chemical characteristics, yielding 20 TP (Total Points) values for each amino acid. The approach of cumulative summing is then used to give a non-degenerate numeric representation of the sequences based on these 20 TP values. A special kind of three-component vector is proposed as a descriptor, which consists of a new type of non-central moment of orders one, two, and three. Subsequently, the proposed model uses Euclidean distance measures among the descriptors to create a distance matrix. Finally, a phylogenetic tree is constructed using hierarchical agglomerative clustering based on the distance matrix. The results are compared with the UPGMA and other existing methods in terms of the quality and time of constructing the phylogenetic tree. Both qualitative and quantitative analyses are performed as key assessment criteria for analyzing the performance of the proposed model. The qualitative analysis of the phylogenetic tree is performed by considering rationalized perception, while the quantitative analysis is performed based on symmetric distance (SD). 
On both criteria, the results obtained by the proposed model are more satisfactory than those produced earlier on the same species by other methods. Notably, this method is found to be efficient in terms of both time and space requirements and is capable of dealing with protein sequences of varying lengths.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 1","pages":"2250028"},"PeriodicalIF":1.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9472273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yonglin Zhang, Mei Hu, Qi Mo, Wenli Gan, Jiesi Luo
{"title":"A novel method for predicting DNA N<sup>4</sup>-methylcytosine sites based on deep forest algorithm.","authors":"Yonglin Zhang, Mei Hu, Qi Mo, Wenli Gan, Jiesi Luo","doi":"10.1142/S0219720023500038","DOIUrl":"https://doi.org/10.1142/S0219720023500038","url":null,"abstract":"<p><p>N<sup>4</sup>-methylcytosine (4mC) methylation is an essential epigenetic modification of deoxyribonucleic acid (DNA) that plays a key role in many biological processes such as gene expression, gene replication and transcriptional regulation. Genome-wide identification and analysis of the 4mC sites can better reveal the epigenetic mechanisms that regulate various biological processes. Although some high-throughput genomic experimental methods can effectively facilitate the identification on a genome-wide scale, they are still too expensive and laborious for routine use. Computational methods can compensate for these disadvantages, but they still leave much room for performance improvement. In this study, we develop a non-NN-style deep learning-based approach for accurately predicting 4mC sites from genomic DNA sequence. We generate various informative features representing sequence fragments around 4mC sites, and subsequently feed them into a deep forest (DF) model. After training the deep model using 10-fold cross-validation, overall accuracies of 85.0%, 90.0%, and 87.8% were achieved for three representative model organisms, <i>A. thaliana, C. elegans</i>, and <i>D. melanogaster</i>, respectively. In addition, extensive experimental results show that our proposed approach outperforms other existing state-of-the-art predictors in 4mC identification. 
Our approach stands for the first DF-based algorithm for the prediction of 4mC sites, providing a novel idea in this field.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 1","pages":"2350003"},"PeriodicalIF":1.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9474484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NuKit: A deep learning platform for fast nucleus segmentation of histopathological images.","authors":"Ching-Nung Lin, Christine H Chung, Aik Choon Tan","doi":"10.1142/S0219720023500026","DOIUrl":"https://doi.org/10.1142/S0219720023500026","url":null,"abstract":"<p><p>Nucleus segmentation represents the initial step for histopathological image analysis pipelines, and it remains a challenge in many quantitative analysis methods in terms of accuracy and speed. Recently, deep learning nucleus segmentation methods have been demonstrated to outperform previous intensity- or pattern-based methods. However, the heavy computation of deep learning gives the impression of a lagging response in real time and has hampered the adoption of these models in routine research. We developed and implemented NuKit, a deep learning platform that accelerates nucleus segmentation and provides prompt results to the users. The NuKit platform consists of two deep learning models coupled with an interactive graphical user interface (GUI) to provide fast and automatic nucleus segmentation \"on the fly\". The two deep learning models perform complementary tasks in nucleus segmentation. The whole image segmentation model performs whole image nucleus segmentation, whereas the click segmentation model supplements the nucleus segmentation with user-driven input to edit the segmented nuclei. We trained the NuKit whole image segmentation model on a large public training data set and tested its performance on seven independent public image data sets. The whole image segmentation model achieves average [Formula: see text] and [Formula: see text]. The outputs can be exported into different file formats, and the platform provides seamless integration with other image analysis tools such as QuPath. 
NuKit can be executed on Windows, Mac, and Linux using personal computers.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 1","pages":"2350002"},"PeriodicalIF":1.0,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/68/f9/nihms-1915365.PMC10362904.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9852066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}