{"title":"Editorial: Women in bioinformatics.","authors":"Irma Martínez-Flores, Constanza Cárdenas Carvajal, Viviana Monje-Galvan","doi":"10.3389/fbinf.2024.1499514","DOIUrl":"https://doi.org/10.3389/fbinf.2024.1499514","url":null,"abstract":"","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1499514"},"PeriodicalIF":2.8,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11494442/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142514118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"<i>In silico</i> PCR analysis: a comprehensive bioinformatics tool for enhancing nucleic acid amplification assays.","authors":"Ruslan Kalendar, Alexandr Shevtsov, Zhenis Otarbay, Aisulu Ismailova","doi":"10.3389/fbinf.2024.1464197","DOIUrl":"10.3389/fbinf.2024.1464197","url":null,"abstract":"<p><p>Nucleic acid amplification assays represent a pivotal category of methodologies for targeted sequence detection within contemporary biological research, boasting diverse utility in diagnostics, identification, and DNA sequencing. The foundational principles of these assays have been extrapolated to various simple and intricate nucleic acid amplification technologies. Concurrently, a burgeoning trend toward computational or virtual methodologies is exemplified by <i>in silico</i> PCR analysis. <i>In silico</i> PCR analysis is a valuable and productive adjunctive approach for ensuring primer or probe specificity across a broad spectrum of PCR applications encompassing gene discovery through homology analysis, molecular diagnostics, DNA profiling, and repeat sequence identification. The prediction of primer and probe sensitivity and specificity necessitates thorough database searches, accounting for an optimal balance of mismatch tolerance, sequence similarity, and thermal stability. This software facilitates <i>in silico</i> PCR analyses of both linear and circular DNA templates, including bisulfite-treated DNA, enabling multiple primer or probe searches within databases of varying scales alongside advanced search functionalities. 
This tool is suitable for processing batch files and is essential for automation when working with large amounts of data.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1464197"},"PeriodicalIF":2.8,"publicationDate":"2024-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11491563/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142485996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discovery of plasma biomarkers related to blood-brain barrier dysregulation in Alzheimer's disease.","authors":"Yuet Ruh Dan, Keng-Hwee Chiam","doi":"10.3389/fbinf.2024.1463001","DOIUrl":"10.3389/fbinf.2024.1463001","url":null,"abstract":"<p><strong>Introduction: </strong>Blood-based biomarkers are quantitative, non-invasive diagnostic tools. This study aimed to identify candidate biomarkers for Alzheimer's disease (AD) using publicly available omics datasets, under the hypothesis that with blood-brain barrier dysfunction in AD, brain-synthesized proteins can leak into plasma for detection.</p><p><strong>Methods: </strong>Differential abundance results of plasma and brain proteomic datasets were integrated to obtain a list of potential biomarkers. Biological validity was investigated with intercellular communication and gene regulatory analyses on brain single-cell transcriptomics data.</p><p><strong>Results: </strong>Five proteins (APOD, B2M, CFH, CLU, and C3) fit biomarker criteria. Four corresponding transcripts (APOD, B2M, CLU, and C3) were overexpressed in AD astrocytes, mediated by AD-related dysregulations in transcription factors regulating neuroinflammation. Additionally, CLU specifically induced downstream expression of neuronal death genes.</p><p><strong>Discussion: </strong>In conclusion, a 5-protein panel is shown to effectively identify AD patients, with evidence of disease specificity and biological validity. 
Future research should investigate the mechanism of protein leakage through the blood-brain barrier.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1463001"},"PeriodicalIF":2.8,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11487119/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A time-calibrated phylogeny of the diversification of Holoadeninae frogs.","authors":"Júlio C M Chaves, Fábio Hepp, Carlos G Schrago, Beatriz Mello","doi":"10.3389/fbinf.2024.1441373","DOIUrl":"https://doi.org/10.3389/fbinf.2024.1441373","url":null,"abstract":"<p><p>The phylogeny of the major lineages of Amphibia has received significant attention in recent years, although evolutionary relationships within families remain largely neglected. One such overlooked group is the subfamily Holoadeninae, comprising 73 species across nine genera and characterized by a disjunct geographical distribution. The lack of a fossil record for this subfamily hampers the formulation of a comprehensive evolutionary hypothesis for their diversification. Aiming to fill this gap, we inferred the phylogenetic relationships and divergence times for Holoadeninae using molecular data and calibration information derived from the fossil record of Neobatrachia. Our inferred phylogeny confirmed most genus-level associations, and molecular dating analysis placed the origin of Holoadeninae in the Eocene, with subsequent splits also occurring during this period. The climatic and geological events that occurred during the Oligocene-Miocene transition were crucial to the dynamic biogeographical history of the subfamily. 
However, the wide highest posterior density intervals in our divergence time estimates are primarily attributed to the absence of Holoadeninae fossil information and, secondarily, to the limited number of sampled nucleotide sites.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1441373"},"PeriodicalIF":2.8,"publicationDate":"2024-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11480671/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SciJava Ops: an improved algorithms framework for Fiji and beyond.","authors":"Gabriel J Selzer, Curtis T Rueden, Mark C Hiner, Edward L Evans, David Kolb, Marcel Wiedenmann, Christian Birkhold, Tim-Oliver Buchholz, Stefan Helfrich, Brian Northan, Alison Walter, Johannes Schindelin, Tobias Pietzsch, Stephan Saalfeld, Michael R Berthold, Kevin W Eliceiri","doi":"10.3389/fbinf.2024.1435733","DOIUrl":"https://doi.org/10.3389/fbinf.2024.1435733","url":null,"abstract":"<p><p>Decades of iteration on scientific imaging hardware and software have yielded an explosion not only in the size, complexity, and heterogeneity of image datasets but also in the tooling used to analyze this data. This wealth of image analysis tools, spanning different programming languages, frameworks, and data structures, is itself a problem for data analysts who must adapt to new technologies and integrate established routines to solve increasingly complex problems. While many \"bridge\" layers exist to unify pairs of popular tools, there exists a need for a general solution to unify new and existing toolkits. The SciJava Ops library presented here addresses this need through two novel principles. Algorithm implementations are declared as plugins called Ops, providing a uniform interface regardless of the toolkit they came from. Users express their needs declaratively to the Op environment, which can then find and adapt available Ops on demand. By using these principles instead of direct function calls, users can write streamlined workflows while avoiding the translation boilerplate of bridge layers. Developers can easily extend SciJava Ops to introduce new libraries and more efficient, specialized algorithm implementations, even immediately benefitting existing workflows. We provide several use cases showing both user and developer benefits, as well as benchmarking data to quantify the negligible impact on overall analysis performance. 
We have initially deployed SciJava Ops on the Fiji platform; however, it would be suitable for integration with additional analysis platforms in the future.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1435733"},"PeriodicalIF":2.8,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11466933/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142482321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pangenome comparison via ED strings.","authors":"Esteban Gabory, Moses Njagi Mwaniki, Nadia Pisanti, Solon P Pissis, Jakub Radoszewski, Michelle Sweering, Wiktor Zuba","doi":"10.3389/fbinf.2024.1397036","DOIUrl":"10.3389/fbinf.2024.1397036","url":null,"abstract":"<p><strong>Introduction: </strong>An elastic-degenerate (ED) string is a sequence of sets of strings. It can also be seen as a directed acyclic graph whose edges are labeled by strings. The notion of ED strings was introduced as a simple alternative to variation and sequence graphs for representing a pangenome, that is, a collection of genomic sequences to be analyzed jointly or to be used as a reference.</p><p><strong>Methods: </strong>In this study, we define notions of <i>matching statistics</i> of two ED strings as similarity measures between pangenomes and, consequently, infer a corresponding distance measure. We then show that both measures can be computed efficiently, in both theory and practice, by employing the <i>intersection graph</i> of two ED strings.</p><p><strong>Results: </strong>We also implemented our methods as a software tool for pangenome comparison and evaluated their efficiency and effectiveness using both synthetic and real datasets.</p><p><strong>Discussion: </strong>As for efficiency, we compare the runtime of the intersection graph method against the classic product automaton construction, showing that the intersection graph is faster by up to one order of magnitude. 
For showing effectiveness, we used real SARS-CoV-2 datasets and our matching statistics similarity measure to reproduce a well-established clade classification of SARS-CoV-2, thus demonstrating that the classification obtained by our method is in accordance with the existing one.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1397036"},"PeriodicalIF":2.8,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11464492/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142402117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QSPRmodeler - An open source application for molecular predictive analytics.","authors":"Rafał A Bachorz, Damian Nowak, Marcin Ratajewski","doi":"10.3389/fbinf.2024.1441024","DOIUrl":"10.3389/fbinf.2024.1441024","url":null,"abstract":"<p><p>The drug design process can be successfully supported using a variety of <i>in silico</i> methods. Some of these are oriented toward molecular property prediction, which is a key step in the early drug discovery stage. Before experimental validation, drug candidates are usually compared with known experimental data. Technically, this can be achieved using machine learning approaches, in which selected experimental data are used to train the predictive models. The proposed Python software is designed for this purpose. It supports the entire workflow of molecular data processing, starting from raw data preparation followed by molecular descriptor creation and machine learning model training. The predictive capabilities of the resulting models were carefully validated internally and externally. These models can be easily applied to new compounds, including within more complex workflows involving generative approaches.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1441024"},"PeriodicalIF":2.8,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11464749/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142402118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The quantum hypercube as a k-mer graph.","authors":"Gustavo Becerra-Gavino, Liliana Ibeth Barbosa-Santillan","doi":"10.3389/fbinf.2024.1401223","DOIUrl":"https://doi.org/10.3389/fbinf.2024.1401223","url":null,"abstract":"<p><p>The application of quantum principles in computing has garnered interest since the 1980s. Today, this concept is not only theoretical, but we have the means to design and execute techniques that leverage the quantum principles to perform calculations. The emergence of the quantum walk search technique exemplifies the practical application of quantum concepts and their potential to revolutionize information technologies. It promises to be versatile and may be applied to various problems. For example, the coined quantum walk search allows for identifying a marked item in a combinatorial search space, such as the quantum hypercube. The quantum hypercube organizes the qubits such that the qubit states represent the vertices and the edges represent the transitions to the states differing by one qubit state. It offers a novel framework to represent k-mer graphs in the quantum realm. Thus, the quantum hypercube facilitates the exploitation of parallelism, which is made possible through superposition and entanglement to search for a marked k-mer. However, as found in the analysis of the results, the search is only sometimes successful in hitting the target. 
Thus, through a meticulous examination of the quantum walk search circuit outcomes, evaluating what input-target combinations are useful, and a visionary exploration of DNA k-mer search, this paper opens the door to innovative possibilities, laying down the groundwork for further research to bridge the gap between theoretical conjecture in quantum computing and a tangible impact in bioinformatics.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1401223"},"PeriodicalIF":2.8,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11425167/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142333667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual analysis of multi-omics data.","authors":"Austin Swart, Ron Caspi, Suzanne Paley, Peter D Karp","doi":"10.3389/fbinf.2024.1395981","DOIUrl":"https://doi.org/10.3389/fbinf.2024.1395981","url":null,"abstract":"<p><p>We present a tool for multi-omics data analysis that enables simultaneous visualization of up to four types of omics data on organism-scale metabolic network diagrams. The tool's interactive web-based metabolic charts depict the metabolic reactions, pathways, and metabolites of a single organism as described in a metabolic pathway database for that organism; the charts are constructed using automated graphical layout algorithms. The multi-omics visualization facility paints each individual omics dataset onto a different \"visual channel\" of the metabolic-network diagram. For example, a transcriptomics dataset might be displayed by coloring the reaction arrows within the metabolic chart, while a companion proteomics dataset is displayed as reaction arrow thicknesses, and a complementary metabolomics dataset is displayed as metabolite node colors. Once the network diagrams are painted with omics data, semantic zooming provides more details within the diagram as the user zooms in. Datasets containing multiple time points can be displayed in an animated fashion. The tool will also graph data values for individual reactions or metabolites designated by the user. 
The user can interactively adjust the mapping from data value ranges to the displayed colors and thicknesses to provide more informative diagrams.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1395981"},"PeriodicalIF":2.8,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11420163/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142333668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A review of model evaluation metrics for machine learning in genetics and genomics.","authors":"Catriona Miller, Theo Portlock, Denis M Nyaga, Justin M O'Sullivan","doi":"10.3389/fbinf.2024.1457619","DOIUrl":"https://doi.org/10.3389/fbinf.2024.1457619","url":null,"abstract":"<p><p>Machine learning (ML) has shown great promise in genetics and genomics where large and complex datasets have the potential to provide insight into many aspects of disease risk, pathogenesis of genetic disorders, and prediction of health and wellbeing. However, with this possibility there is a responsibility to exercise caution against biases and inflation of results that can have harmful unintended impacts. Therefore, researchers must understand the metrics used to evaluate ML models which can influence the critical interpretation of results. In this review we provide an overview of ML metrics for clustering, classification, and regression and highlight the advantages and disadvantages of each. We also detail common pitfalls that occur during model evaluation. Finally, we provide examples of how researchers can assess and utilise the results of ML models, specifically from a genomics perspective.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1457619"},"PeriodicalIF":2.8,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11420621/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142333666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}