{"title":"SAMNA: accurate alignment of multiple biological networks based on simulated annealing.","authors":"Jing Chen, Zixiang Wang, Jia Huang","doi":"10.1515/jib-2023-0006","DOIUrl":"10.1515/jib-2023-0006","url":null,"abstract":"<p><p>Proteins are important parts of the biological structures and encode a lot of biological information. Protein-protein interaction network alignment is a model for analyzing proteins that helps discover conserved functions between organisms and predict unknown functions. In particular, multi-network alignment aims at finding the mapping relationship among multiple network nodes, so as to transfer the knowledge across species. However, with the increasing complexity of PPI networks, how to perform network alignment more accurately and efficiently is a new challenge. This paper proposes a new global network alignment algorithm called Simulated Annealing Multiple Network Alignment (SAMNA), using both network topology and sequence homology information. To generate the alignment, SAMNA first generates cross-network candidate clusters by a clustering algorithm on a <i>k</i>-partite similarity graph constructed with sequence similarity information, and then selects candidate cluster nodes as alignment results and optimizes them using an improved simulated annealing algorithm. 
Finally, the SAMNA algorithm was experimented on synthetic and real-world network datasets, and the results showed that SAMNA outperformed the state-of-the-art algorithm in biological performance.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10777366/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138805553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of Artificial Intelligence or machine learning in risk sharing agreements for pharmacotherapy risk management.","authors":"Grigory A Oborotov, Konstantin A Koshechkin, Yuriy L Orlov","doi":"10.1515/jib-2023-0014","DOIUrl":"10.1515/jib-2023-0014","url":null,"abstract":"<p><p>Applications of Artificial Intelligence in medical informatics solutions risk sharing have social value. At a time of ever-increasing cost for the provision of medicines to citizens, there is a need to restrain the growth of health care costs. The search for computer technologies to stop or slow down the growth of costs acquires a new very important and significant meaning. We discussed the two information technologies in pharmacotherapy and the possibility of combining and sharing them, namely the combination of risk-sharing agreements and Machine Learning, which was made possible by the development of Artificial Intelligence (AI). Neural networks could be used to predict the outcome to reduce the risk factors for treatment. AI-based data processing automation technologies could be also used for risk-sharing agreements automation.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10757074/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138805521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TidyGEO: preparing analysis-ready datasets from Gene Expression Omnibus.","authors":"Avery Mecham, Ashlie Stephenson, Badi I Quinteros, Grace S Brown, Stephen R Piccolo","doi":"10.1515/jib-2023-0021","DOIUrl":"10.1515/jib-2023-0021","url":null,"abstract":"<p><p>TidyGEO is a Web-based tool for downloading, tidying, and reformatting data series from Gene Expression Omnibus (GEO). As a freely accessible repository with data from over 6 million biological samples across more than 4000 organisms, GEO provides diverse opportunities for secondary research. Although scientists may find assay data relevant to a given research question, most analyses require sample-level annotations. In GEO, such annotations are stored alongside assay data in delimited, text-based files. However, the structure and semantics of the annotations vary widely from one series to another, and many annotations are not useful for analysis purposes. Thus, every GEO series must be tidied before it is analyzed. Manual approaches may be used, but these are error prone and take time away from other research tasks. Custom computer scripts can be written, but many scientists lack the computational expertise to create such scripts. To address these challenges, we created TidyGEO, which supports essential data-cleaning tasks for sample-level annotations, such as selecting informative columns, renaming columns, splitting or merging columns, standardizing data values, and filtering samples. 
Additionally, users can integrate annotations with assay data, restructure assay data, and generate code that enables others to reproduce these steps.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11294518/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138479290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data literacy in genome research.","authors":"Katharina Wolff, Ronja Friedhoff, Friderieke Schwarzer, Boas Pucker","doi":"10.1515/jib-2023-0033","DOIUrl":"10.1515/jib-2023-0033","url":null,"abstract":"<p><p>With an ever increasing amount of research data available, it becomes constantly more important to possess data literacy skills to benefit from this valuable resource. An integrative course was developed to teach students the fundamentals of data literacy through an engaging genome sequencing project. Each cohort of students performed planning of the experiment, DNA extraction, nanopore sequencing, genome sequence assembly, prediction of genes in the assembled sequence, and assignment of functional annotation terms to predicted genes. Students learned how to communicate science through writing a protocol in the form of a scientific paper, providing comments during a peer-review process, and presenting their findings as part of an international symposium. Many students enjoyed the opportunity to own a project and to work towards a meaningful objective.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10777367/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138479289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accurate noise-robust classification of Bacillus species from MALDI-TOF MS spectra using a denoising autoencoder.","authors":"Yulia E Uvarova, Pavel S Demenkov, Irina N Kuzmicheva, Artur S Venzel, Elena L Mischenko, Timofey V Ivanisenko, Vadim M Efimov, Svetlana V Bannikova, Asya R Vasilieva, Vladimir A Ivanisenko, Sergey E Peltek","doi":"10.1515/jib-2023-0017","DOIUrl":"10.1515/jib-2023-0017","url":null,"abstract":"<p><p>Bacillus strains are ubiquitous in the environment and are widely used in the microbiological industry as valuable enzyme sources, as well as in agriculture to stimulate plant growth. The Bacillus genus comprises several closely related groups of species. The rapid classification of these remains challenging using existing methods. Techniques based on MALDI-TOF MS data analysis hold significant promise for fast and precise microbial strains classification at both the genus and species levels. In previous work, we proposed a geometric approach to Bacillus strain classification based on mass spectra analysis via the centroid method (CM). One limitation of such methods is the noise in MS spectra. In this study, we used a denoising autoencoder (DAE) to improve bacteria classification accuracy under noisy MS spectra conditions. We employed a denoising autoencoder approach to convert noisy MS spectra into latent variables representing molecular patterns in the original MS data, and the Random Forest method to classify bacterial strains by latent variables. Comparison of the DAE-RF with the CM method using the artificially noisy test samples showed that DAE-RF offers higher noise robustness. 
Hence, the DAE-RF method could be utilized for noise-robust, fast, and neat classification of Bacillus species according to MALDI-TOF MS data.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10757077/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136400294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconstruction of the regulatory hypermethylation network controlling hepatocellular carcinoma development during hepatitis C viral infection.","authors":"Evgeniya A Antropova, Tamara M Khlebodarova, Pavel S Demenkov, Anastasiia R Volianskaia, Artur S Venzel, Nikita V Ivanisenko, Alexandr D Gavrilenko, Timofey V Ivanisenko, Anna V Adamovskaya, Polina M Revva, Nikolay A Kolchanov, Inna N Lavrik, Vladimir A Ivanisenko","doi":"10.1515/jib-2023-0013","DOIUrl":"10.1515/jib-2023-0013","url":null,"abstract":"<p><p>Hepatocellular carcinoma (HCC) has been associated with hepatitis C viral (HCV) infection as a potential risk factor. Nonetheless, the precise genetic regulatory mechanisms triggered by the virus, leading to virus-induced hepatocarcinogenesis, remain unclear. We hypothesized that HCV proteins might modulate the activity of aberrantly methylated HCC genes through regulatory pathways. Virus-host regulatory pathways, interactions between proteins, gene expression, transport, and stability regulation, were reconstructed using the ANDSystem. Gene expression regulation was statistically significant. Gene network analysis identified four out of 70 HCC marker genes whose expression regulation by viral proteins may be associated with HCC: <i>DNA-binding protein inhibitor ID - 1 (ID1)</i>, <i>flap endonuclease 1 (FEN1)</i>, <i>cyclin-dependent kinase inhibitor 2A (CDKN2A)</i>, and <i>telomerase reverse transcriptase (TERT)</i>. It suggested the following viral protein effects in HCV/human protein heterocomplexes: HCV NS3(p70) protein activates human STAT3 and NOTC1; NS2-3(p23), NS5B(p68), NS1(E2), and core(p21) activate SETD2; NS5A inhibits SMYD3; and NS3 inhibits CCN2. Interestingly, NS3 and E1(gp32) activate c-Jun when it positively regulates <i>CDKN2A</i> and inhibit it when it represses <i>TERT</i>. 
The discovered regulatory mechanisms might be key areas of focus for creating medications and preventative therapies to decrease the likelihood of HCC development during HCV infection.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10757076/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136400296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BGRS: bioinformatics of genome regulation and data integration.","authors":"Yuriy L Orlov, Ming Chen, Nikolay A Kolchanov, Ralf Hofestädt","doi":"10.1515/jib-2023-0032","DOIUrl":"10.1515/jib-2023-0032","url":null,"abstract":"","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10757072/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136400295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"STARGATE-X: a Python package for statistical analysis on the REACTOME network.","authors":"Andrea Marino, Blerina Sinaimeri, Enrico Tronci, Tiziana Calamoneri","doi":"10.1515/jib-2022-0029","DOIUrl":"10.1515/jib-2022-0029","url":null,"abstract":"<p><p>Many important aspects of biological knowledge at the molecular level can be represented by <i>pathways</i>. Through their analysis, we gain mechanistic insights and interpret lists of interesting genes from experiments (usually omics and functional genomic experiments). As a result, pathways play a central role in the development of bioinformatics methods and tools for computing predictions from known molecular-level mechanisms. Qualitative as well as quantitative knowledge about pathways can be effectively represented through <i>biochemical networks</i> linking the <i>biochemical reactions</i> and the compounds (<i>e.g.</i>, proteins) occurring in the considered pathways. So, repositories providing biochemical networks for known pathways play a central role in bioinformatics and in <i>systems biology</i>. Here we focus on Reactome, a free, comprehensive, and widely used repository for biochemical networks and pathways. In this paper, we: (1) introduce a tool StARGate-X (<i>STatistical Analysis of the</i> Reactome <i>multi-GrAph Through</i> nEtworkX) to carry out an automated analysis of the connectivity properties of Reactome biochemical reaction network and of its biological hierarchy (<i>i.e.</i>, cell compartments, namely, the closed parts within the cytosol, usually surrounded by a membrane); the code is freely available at https://github.com/marinoandrea/stargate-x; (2) show the effectiveness of our tool by providing an analysis of the Reactome network, in terms of centrality measures, with respect to in- and out-degree. As an example of usage of StARGate-X, we provide a detailed automated analysis of the Reactome network, in terms of centrality measures. 
We focus both on the subgraphs induced by single compartments and on the graph whose nodes are the strongly connected components. To the best of our knowledge, this is the first freely available tool that enables automatic analysis of the large biochemical network within Reactome through easy-to-use APIs (<i>Application Programming Interfaces</i>).</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10757075/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41168952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RNAcode_Web - Convenient identification of evolutionary conserved protein coding regions.","authors":"John Anders, Peter F Stadler","doi":"10.1515/jib-2022-0046","DOIUrl":"10.1515/jib-2022-0046","url":null,"abstract":"<p><p>The differentiation of regions with coding potential from non-coding regions remains a key task in computational biology. Methods such as RNAcode that exploit patterns of sequence conservation for this task have a substantial advantage in classification accuracy in particular for short coding sequences, compared to methods that rely on a single input sequence. However, they require sequence alignments as input. Frequently, suitable multiple sequence alignments are not readily available and are tedious, and sometimes difficult to construct. We therefore introduce here a new web service that provides access to the well-known coding sequence detector RNAcode with minimal user overhead. It requires as input only a single target nucleotide sequence. The service automates the collection, selection, and preparation of homologous sequences from the NCBI database, as well as the construction of the multiple sequence alignment that are needed as input for RNAcode. The service automatizes the entire pre- and postprocessing and thus makes the investigation of specific genomic regions for previously unannotated coding regions, such as small peptides or additional introns, a simple task that is easily accessible to non-expert users. 
RNAcode_Web is accessible online at rnacode.bioinf.uni-leipzig.de.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10757073/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10057634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SnakeLines: integrated set of computational pipelines for sequencing reads.","authors":"Jaroslav Budiš, Werner Krampl, Marcel Kucharík, Rastislav Hekel, Adrián Goga, Jozef Sitarčík, Michal Lichvár, Dávid Smol'ak, Miroslav Böhmer, Andrej Baláž, František Ďuriš, Juraj Gazdarica, Katarína Šoltys, Ján Turňa, Ján Radvánszky, Tomáš Szemes","doi":"10.1515/jib-2022-0059","DOIUrl":"10.1515/jib-2022-0059","url":null,"abstract":"<p><p>With the rapid growth of massively parallel sequencing technologies, still more laboratories are utilising sequenced DNA fragments for genomic analyses. Interpretation of sequencing data is, however, strongly dependent on bioinformatics processing, which is often too demanding for clinicians and researchers without a computational background. Another problem represents the reproducibility of computational analyses across separated computational centres with inconsistent versions of installed libraries and bioinformatics tools. We propose an easily extensible set of computational pipelines, called SnakeLines, for processing sequencing reads; including mapping, assembly, variant calling, viral identification, transcriptomics, and metagenomics analysis. Individual steps of an analysis, along with methods and their parameters can be readily modified in a single configuration file. Provided pipelines are embedded in virtual environments that ensure isolation of required resources from the host operating system, rapid deployment, and reproducibility of analysis across different Unix-based platforms. SnakeLines is a powerful framework for the automation of bioinformatics analyses, with emphasis on a simple set-up, modifications, extensibility, and reproducibility. 
The framework is already routinely used in various research projects and their applications, especially in the Slovak national surveillance of SARS-CoV-2.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10757078/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10089530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}