{"title":"Overlapping group screening for binary cancer classification with TCGA high-dimensional genomic data.","authors":"Jie-Huei Wang, Yi-Hau Chen","doi":"10.1142/S0219720023500130","DOIUrl":"https://doi.org/10.1142/S0219720023500130","url":null,"abstract":"<p><p>Precision medicine has become a global trend in medical development, in which cancer diagnosis plays an important role. With accurate cancer diagnosis, we can provide patients with appropriate treatments that improve their survival. Since disease development involves a complex interplay among multiple factors such as gene-gene interactions, cancer classification based on microarray gene expression profiling data is expected to be effective and has attracted extensive attention in computational biology and medicine. However, building a diagnostic model from genomic data poses several problems, including the high-dimensional feature space and feature contamination. In this paper, we propose using the overlapping group screening (OGS) approach to build an accurate cancer diagnosis model and to predict, within the logistic regression framework, the probability that a patient falls into a given disease classification category. The proposal integrates gene pathway information into the procedure for identifying genes and gene-gene interactions associated with the classification of cancer outcome groups. We conduct a series of simulation studies comparing the predictive accuracy of the proposed method for cancer diagnosis with that of some existing machine learning methods, and find that the proposed method performs better. 
We apply the proposed method to the genomic data of The Cancer Genome Atlas related to lung adenocarcinoma (LUAD), liver hepatocellular carcinoma (LHC), and thyroid carcinoma (THCA), to establish accurate cancer diagnosis models.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 3","pages":"2350013"},"PeriodicalIF":1.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9750378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of a seven autophagy-related gene pairs signature for the diagnosis of colorectal cancer using the RankComp algorithm.","authors":"Qi-Shi Song, Hai-Jun Wu, Qian Lin, Yu-Kai Tang","doi":"10.1142/S0219720023500129","DOIUrl":"https://doi.org/10.1142/S0219720023500129","url":null,"abstract":"<p><p>Using the colorectal cancer microarray gene expression data series (GSE) GSE10972 and GSE74602 and 222 autophagy-related genes, the differential signature between colorectal cancer and paracancerous tissues was analyzed with the RankComp algorithm, and a signature consisting of seven autophagy-related reversal gene pairs with stable relative expression orderings (REOs) was obtained. Scoring based on these gene pairs could significantly distinguish colorectal cancer samples from adjacent noncancerous samples, with an average accuracy of 97.5% in the two training sets and 90.25% in four independent validation sets (GSE21510, GSE37182, GSE33126, and GSE18105). Scoring based on these gene pairs also accurately identified 99.85% of colorectal cancer samples in seven other independent datasets containing a total of 1406 colorectal cancer samples.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 3","pages":"2350012"},"PeriodicalIF":1.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9752623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The mechanism accounting for DNA damage strength modulation of p53 dynamical properties.","authors":"Aiqing Ma, Xianhua Dai","doi":"10.1142/S0219720023500117","DOIUrl":"https://doi.org/10.1142/S0219720023500117","url":null,"abstract":"<p><p>p53 protein levels exhibit a series of pulses in response to DNA double-stranded breaks (DSBs). However, the mechanism by which damage strength regulates the physical parameters of p53 pulses remains to be elucidated. This paper established two mathematical models describing the mechanism of p53 dynamics in response to DSBs; the two models reproduce many results observed in experiments. Based on the models, numerical analysis suggested that the interval between pulses increases as the damage strength decreases, and we proposed that the p53 dynamical system in response to DSBs is modulated by frequency. Next, we found that ATM positive self-feedback can realize the system characteristic that the pulse amplitude is independent of the damage strength. In addition, the pulse interval is negatively correlated with apoptosis: the greater the damage strength, the smaller the pulse interval, the faster the p53 accumulation rate, and the more susceptible the cells are to apoptosis. 
These findings advance our understanding of the mechanism of p53 dynamical response and give new insights for experiments to probe the dynamics of p53 signaling.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 3","pages":"2350011"},"PeriodicalIF":1.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9752621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Expansin gene family database: A comprehensive bioinformatics resource for plant expansin multigene family.","authors":"Büşra Özkan Kök, Yasemin Celik Altunoglu, Ali Burak Öncül, Abdulkadir Karaci, Mehmet Cengiz Baloglu","doi":"10.1142/S0219720023500154","DOIUrl":"https://doi.org/10.1142/S0219720023500154","url":null,"abstract":"<p><p>Expansins, which are plant cell wall loosening proteins associated with cell growth, have been identified as a multigene family. Plant expansin proteins are an important family that functions in cell growth and many developmental processes, including wall relaxation, fruit softening, abscission, seed germination, mycorrhiza and root nodule formation, biotic and abiotic stress resistance, pollen tube invasion of the stigma and organogenesis. In addition, it is thought that increasing the efficiency of plant expansin genes in plants plays a significant role, especially in the production of secondary bioethanol. Studies of the expansin genes show that they are a significant gene family in the cell wall expansion mechanism. Therefore, understanding the efficacy of expansin genes is of great importance. Considering the importance of this multigene family, we aimed to create a comprehensive database of plant expansin proteins and their properties. The expansin gene family database provides comprehensive online data for the expansin gene family members in plants. We have designed a new publicly accessible website covering expansin gene family members in 70 plants and their features, including gene, coding and peptide sequences, chromosomal location, amino acid length, molecular weight, stability, conserved motif and domain structure, and predicted three-dimensional architecture. Furthermore, a deep learning system was developed to detect unknown genes belonging to the expansin gene family. 
In addition, we provide BLAST searches within the website through a connection to the NCBI BLAST site in the tools section. Thus, the expansin gene family database is a useful resource for researchers, enabling access to all datasets simultaneously through its user-friendly interface. Our server is freely available at the following link (http://www.expansingenefamily.com/).</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 3","pages":"2350015"},"PeriodicalIF":1.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10109656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rearrangement distance with reversals, indels, and moves in intergenic regions on signed and unsigned permutations.","authors":"Klairton Lima Brito, Andre Rodrigues Oliveira, Alexsandro Oliveira Alexandrino, Ulisses Dias, Zanoni Dias","doi":"10.1142/S0219720023500099","DOIUrl":"https://doi.org/10.1142/S0219720023500099","url":null,"abstract":"<p><p>Genome rearrangement events are widely used to estimate a minimum-size sequence of mutations capable of transforming one genome into another. The length of this sequence is called the distance, and determining it is the main goal in genome rearrangement distance problems. Problems in the genome rearrangement field differ in the set of rearrangement events allowed and in the genome representation. In this work, we consider the scenario where the genomes share the same set of genes, gene orientation is known or unknown, and intergenic regions (structures between a pair of genes and at the extremities of the genome) are taken into account. We use two models: the first allows only conservative events (reversals and moves), and the second also includes non-conservative events (insertions and deletions) in the intergenic regions. We show that both models result in NP-hard problems regardless of whether gene orientation is known. When the orientation of genes is available, we present a 2-approximation algorithm for both models. 
For the scenario where this information is unavailable, we propose a 4-approximation algorithm for both models.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2350009"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9528408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Obstacles to effective model deployment in healthcare.","authors":"Wei Xin Chan, Limsoon Wong","doi":"10.1142/S0219720023710014","DOIUrl":"https://doi.org/10.1142/S0219720023710014","url":null,"abstract":"<p><p>Despite an exponential increase in publications on clinical prediction models over recent years, the number of models deployed in clinical practice remains fairly limited. In this paper, we identify common obstacles that impede effective deployment of prediction models in healthcare, and investigate their underlying causes. We observe a key underlying cause behind most obstacles - the improper development and evaluation of prediction models. Inherent heterogeneities in clinical data complicate the development and evaluation of clinical prediction models. Many of these heterogeneities in clinical data are unreported because they are deemed to be irrelevant, or due to privacy concerns. We provide real-life examples where failure to handle heterogeneities in clinical data, or sources of biases, led to the development of erroneous models. The purpose of this paper is to familiarize modeling practitioners with common sources of biases and heterogeneities in clinical data, both of which have to be dealt with to ensure proper development and evaluation of clinical prediction models. 
Proper model development and evaluation, together with complete and thorough reporting, are important prerequisites for a prediction model to be effectively deployed in healthcare.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2371001"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9554623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated <i>in silico</i>-<i>in vitro</i> rational design of oncogenic EGFR-derived specific monoclonal antibody-binding peptide mimotopes.","authors":"Ke Chen, Lili Ge, Guorui Liu","doi":"10.1142/S0219720023500075","DOIUrl":"https://doi.org/10.1142/S0219720023500075","url":null,"abstract":"Human epidermal growth factor receptor (EGFR) is strongly associated with malignant proliferation and has been established as an attractive therapeutic target for diverse cancers and used as a significant biomarker for tumor diagnosis. Over the past decades, a variety of monoclonal antibodies (mAbs) have been successfully developed to specifically recognize the third subdomain (TSD) of the EGFR extracellular domain. Here, the complex crystal structures of the EGFR TSD with its cognate mAbs were examined and compared systematically, revealing a consistent binding mode shared by these mAbs. The recognition site is located on the [Formula: see text]-sheet surface of the TSD ladder architecture, from which several hotspot residues that significantly confer both stability and specificity to the recognition were identified, responsible for about half of the total binding potency of mAbs to the TSD. A number of linear peptide mimotopes were rationally designed to mimic these TSD hotspot residues in different orientations and/or in different head-to-tail manners by using an orthogonal threading-through-strand (OTTS) strategy; these peptides, however, are intrinsically disordered in the free state and thus cannot be maintained in a native hotspot-like conformation. A chemical stapling strategy was employed to constrain the free peptides into a double-stranded conformation by introducing a disulfide bond across the two strand arms of the peptide mimotopes. 
Both empirical scoring and [Formula: see text]fluorescence assay agreed that the stapling can effectively improve the interaction potency of OTTS-designed peptide mimotopes to different mAbs, with binding affinity increased by [Formula: see text]-fold. Conformational analysis revealed that the stapled cyclic peptide mimotopes can spontaneously fold into a double-stranded conformation that threads well through all the hotspot residues on the TSD [Formula: see text]-sheet surface and exhibits a binding mode to mAbs consistent with that of the TSD hotspot site.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2350007"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9828071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Numerical study of chronic hepatitis B infection using Marchuk-Petrov model.","authors":"Michael Khristichenko, Yuri Nechepurenko, Dmitry Grebennikov, Gennady Bocharov","doi":"10.1142/S0219720023400012","DOIUrl":"https://doi.org/10.1142/S0219720023400012","url":null,"abstract":"<p><p>In this work, we briefly describe our technology developed for computing periodic solutions of time-delay systems and discuss the results of computing periodic solutions for the Marchuk-Petrov model with parameter values corresponding to hepatitis B infection. We identified the regions in the model parameter space in which oscillatory dynamics in the form of periodic solutions exist. The respective solutions can be interpreted as active forms of chronic hepatitis B. The period and amplitude of oscillatory solutions were traced along the parameter determining the efficacy of antigen presentation by macrophages for T- and B-lymphocytes in the model. The oscillatory regimes are characterized by enhanced destruction of hepatocytes as a consequence of immunopathology and a temporal reduction of viral load to values which can be a prerequisite of the spontaneous recovery observed in chronic HBV infection. Our study presents a first step in a systematic analysis of chronic HBV infection using the Marchuk-Petrov model of antiviral immune response.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2340001"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9477723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Antimicrobial peptides recognition using weighted physicochemical property encoding.","authors":"Standa Na, Dhammika Leshan Wannigama, Thammakorn Saethang","doi":"10.1142/S0219720023500063","DOIUrl":"https://doi.org/10.1142/S0219720023500063","url":null,"abstract":"<p><p>Antimicrobial resistance is a major public health concern. Antimicrobial peptides (AMPs) are one of the host defense mechanisms that respond efficiently against multidrug-resistant microbes. Since screening AMPs from a large number of peptides is still expensive and time-consuming, the development of a precise and rapid computer-aided tool is essential for preliminary AMP selection ahead of laboratory experiments. In this study, we propose AMP recognition models using a new peptide encoding method called amino acid index weight (AAIW). Four AMP recognition models (antimicrobial, antibacterial, antiviral, and antifungal) were trained on datasets combined from DRAMP and other published databases. These models achieved high performance compared to preceding AMP recognition models when evaluated on two independent test sets. All four models yielded over 93% in accuracy and 0.87 in Matthews correlation coefficient (MCC). An online AMP recognition server is accessible at https://amppred-aaiw.com.</p>","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2350006"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9528874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Testing and improving the performance of protein thermostability predictors for the engineering of cellulases.","authors":"Anna Dotsenko, Jury Denisenko, Dmitrii Osipov, Aleksandra Rozhkova, Ivan Zorov, Arkady Sinitsyn","doi":"10.1142/S0219720023300010","DOIUrl":"https://doi.org/10.1142/S0219720023300010","url":null,"abstract":"Thermostability of cellulases can be increased through amino acid substitutions and by protein engineering with predictors of protein thermostability. We have carried out a systematic analysis of the performance of 18 predictors for the engineering of cellulases. The predictors were PoPMuSiC, HoTMuSiC, I-Mutant 2.0, I-Mutant Suite, PremPS, Hotspot, Maestroweb, DynaMut, ENCoM ([Formula: see text] and [Formula: see text]), mCSM, SDM, DUET, RosettaDesign, Cupsat (thermal and denaturant approaches), ConSurf, and Voronoia. The highest values of accuracy, F-measure, and MCC were obtained for DynaMut, SDM, RosettaDesign, and PremPS. A combination of the predictors provided an improvement in the performance. F-measure and MCC were improved by 14% and 28%, respectively. Accuracy and sensitivity were also improved by 9% and 20%, respectively, compared to the maximal values of single predictors. 
The reported values of the performance of the predictors and their combination may aid research in the engineering of thermostable cellulases as well as the further development of thermostability predictors.","PeriodicalId":48910,"journal":{"name":"Journal of Bioinformatics and Computational Biology","volume":"21 2","pages":"2330001"},"PeriodicalIF":1.0,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9473268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}