{"title":"Expression of Concern: Cleavage-Stage Embryo Segmentation Using SAM-Based Dual Branch Pipeline: Development and Evaluation with the CleavageEmbryo Dataset.","authors":"","doi":"10.1093/bioinformatics/btaf001","DOIUrl":"10.1093/bioinformatics/btaf001","url":null,"abstract":"","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11724708/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142973811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Denis Beslic, Martin Kucklick, Susanne Engelmann, Stephan Fuchs, Bernhard Y Renard, Nils Körber
{"title":"End-to-end simulation of nanopore sequencing signals with feed-forward transformers.","authors":"Denis Beslic, Martin Kucklick, Susanne Engelmann, Stephan Fuchs, Bernhard Y Renard, Nils Körber","doi":"10.1093/bioinformatics/btae744","DOIUrl":"10.1093/bioinformatics/btae744","url":null,"abstract":"<p><strong>Motivation: </strong>Nanopore sequencing represents a significant advancement in genomics, enabling direct long-read DNA sequencing at the single-molecule level. Accurate simulation of nanopore sequencing signals from nucleotide sequences is crucial for method development and for complementing experimental data. Most existing approaches rely on predefined statistical models, which may not adequately capture the properties of experimental signal data. Furthermore, these simulators were developed for earlier versions of nanopore chemistry, which limits their applicability and adaptability to the latest flow cell data.</p><p><strong>Results: </strong>To enhance the quality of artificial signals, we introduce seq2squiggle, a novel transformer-based, non-autoregressive model designed to generate nanopore sequencing signals from nucleotide sequences. Unlike existing simulators that rely on static k-mer models, our approach learns sequential contextual information from segmented signal data. We benchmark seq2squiggle against state-of-the-art simulators on real experimental R9.4.1 and R10.4.1 data, evaluating signal similarity, basecalling accuracy, and variant detection rates. 
Seq2squiggle consistently outperforms existing tools across multiple datasets, demonstrating superior similarity to real data and offering a robust solution for simulating nanopore sequencing signals with the latest flow cell generation.</p><p><strong>Availability and implementation: </strong>seq2squiggle is freely available on GitHub at: github.com/ZKI-PH-ImageAnalysis/seq2squiggle.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11729726/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142878935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Miles D Woodcock-Girard, Eric C Bretz, Holly M Robertson, Karolis Ramanauskas, Jarrad T Hampton-Marcell, Joseph F Walker
{"title":"Semblans: automated assembly and processing of RNA-seq data.","authors":"Miles D Woodcock-Girard, Eric C Bretz, Holly M Robertson, Karolis Ramanauskas, Jarrad T Hampton-Marcell, Joseph F Walker","doi":"10.1093/bioinformatics/btaf003","DOIUrl":"10.1093/bioinformatics/btaf003","url":null,"abstract":"<p><strong>Motivation: </strong>Recent advancements in parallel sequencing methods have precipitated a surge in publicly available short-read sequence data. This has encouraged the development of novel computational tools for the de novo assembly of transcriptomes from RNA-seq data. Despite the availability of these tools, performing an end-to-end transcriptome assembly remains a programmatically involved task necessitating familiarity with best practices. Quality control steps, including error correction, adapter trimming, and chimera filtration, must be applied correctly, and moving data between programs often requires manual reformatting or restructuring, which can further impede throughput. Here, we introduce Semblans, a tool for streamlining the assembly process that efficiently and consistently produces high-quality transcriptome assemblies.</p><p><strong>Results: </strong>Semblans abstracts the key quality control, reconstitution, and postprocessing steps of transcriptome assembly from raw short-read sequences to annotated coding sequences. Evaluating its performance against previously assembled transcriptomes on the basis of assembly quality, we find that Semblans produced higher quality assemblies for 98 of the 101 short-read runs tested.</p><p><strong>Availability and implementation: </strong>Semblans is written in C++ and runs on Unix-compliant operating systems. 
Source code, documentation, and compiled binaries are hosted under the GNU General Public License at https://github.com/gladshire/Semblans.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11748423/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142960041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Open Chrono-Morph Viewer: visualize big bioimage time series containing heterogeneous volumes.","authors":"Andre C Faubert, Shang Wang","doi":"10.1093/bioinformatics/btae761","DOIUrl":"10.1093/bioinformatics/btae761","url":null,"abstract":"<p><strong>Summary: </strong>Time-lapse 3D imaging is fundamental for studying biological processes but requires software able to handle terabytes of voxel data. Although many multidimensional viewing applications exist, they mostly lack support for heterogeneous voxel counts, datatypes, and modalities in a single timeline. Open Chrono-Morph Viewer provides a straightforward graphical user interface to quickly investigate multi-timescale datasets represented as separate volume files in the common NRRD format for compatibility between toolchains. It features dynamic clipping surfaces for rapid investigation of 3D morphology and a scriptable animation API for quantitative, repeatable, publication-quality visualization. It is implemented in pure Python using common libraries to facilitate community-driven development.</p><p><strong>Availability and implementation: </strong>OCMV is available at https://github.com/ShangWangLab/OpenChronoMorphViewer for Windows, Linux, and macOS. Supporting tutorials, documentation, and installation instructions can be found in the supplementary information. 
Our modified Fiji I/O plugin for up to 5D NRRD file conversion is available at https://github.com/afaubert/IO.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":"41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11751631/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143018191","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jacek Karolczak, Anna Przybyłowska, Konrad Szewczyk, Witold Taisner, John M Heumann, Michael H B Stowell, Michał Nowicki, Dariusz Brzezinski
{"title":"Ligand identification in CryoEM and X-ray maps using deep learning.","authors":"Jacek Karolczak, Anna Przybyłowska, Konrad Szewczyk, Witold Taisner, John M Heumann, Michael H B Stowell, Michał Nowicki, Dariusz Brzezinski","doi":"10.1093/bioinformatics/btae749","DOIUrl":"10.1093/bioinformatics/btae749","url":null,"abstract":"<p><strong>Motivation: </strong>Accurately identifying ligands plays a crucial role in the process of structure-guided drug design. Based on density maps from X-ray diffraction or cryogenic-sample electron microscopy (cryoEM), scientists verify whether small-molecule ligands bind to active sites of interest. However, the interpretation of density maps is challenging, and cognitive bias can sometimes mislead investigators into modeling fictitious compounds. Ligand identification can be aided by automatic methods, but existing approaches are available only for X-ray diffraction and are based on iterative fitting or feature-engineered machine learning rather than end-to-end deep learning.</p><p><strong>Results: </strong>Here, we propose to identify ligands using a deep-learning approach that treats density maps as 3D point clouds. We show that the proposed model is on par with existing machine learning methods for X-ray crystallography while also being applicable to cryoEM density maps. Our study demonstrates that electron density map fragments can aid the training of models that can later be applied to cryoEM structures but also highlights challenges associated with the standardization of electron microscopy maps and the quality assessment of cryoEM ligands.</p><p><strong>Availability and implementation: </strong>Code and model weights are available on GitHub at https://github.com/jkarolczak/ligands-classification. 
An accompanying ChimeraX bundle is available at https://github.com/wtaisner/chimerax-ligand-recognizer.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11709248/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142866571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wannes Mores, Satyajeet S Bhonsale, Filip Logist, Jan F M Van Impe
{"title":"Accelerated enumeration of extreme rays through a positive-definite elementarity test.","authors":"Wannes Mores, Satyajeet S Bhonsale, Filip Logist, Jan F M Van Impe","doi":"10.1093/bioinformatics/btae723","DOIUrl":"10.1093/bioinformatics/btae723","url":null,"abstract":"<p><strong>Motivation: </strong>Analysis of metabolic networks through extreme rays such as extreme pathways and elementary flux modes has been shown to be effective for many applications. However, due to the combinatorial explosion of candidate vectors, their enumeration is currently limited to small- and medium-scale networks (typically <200 reactions). Partial enumeration of the extreme rays is shown to be possible, but relies either on generating them one by one or on implementing a sampling step in the enumeration algorithms. Sampling-based enumeration can be achieved through the canonical basis approach (CBA) or the nullspace approach (NSA). Both algorithms are very efficient in medium-scale networks, but struggle with elementarity testing in sampling-based enumeration of larger networks.</p><p><strong>Results: </strong>In this paper, a novel elementarity test is defined and exploited, resulting in significant speedup of the enumeration. Even though NSA is currently considered more effective, the novel elementarity test allows CBA to significantly outpace NSA. This is shown through two case studies, ranging from a medium-scale network to a genome-scale metabolic network with over 600 reactions. In this study, extreme pathways are chosen as the extreme rays, but the novel elementarity test and CBA are equally applicable to the other types. With the increasing complexity of metabolic networks in recent years, CBA with the novel elementarity test shows even more promise as its advantage grows with increased network complexity. 
Given this scaling aspect, CBA is now the faster method for enumerating extreme rays in genome-scale metabolic networks.</p><p><strong>Availability and implementation: </strong>All case studies are implemented in Python. The codebase used to generate extreme pathways using the different approaches is available at https://gitlab.kuleuven.be/biotec-plus/pos-def-ep.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11724715/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142869798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stijn Wittouck, Tom Eilers, Vera van Noort, Sarah Lebeer
{"title":"SCARAP: scalable cross-species comparative genomics of prokaryotes.","authors":"Stijn Wittouck, Tom Eilers, Vera van Noort, Sarah Lebeer","doi":"10.1093/bioinformatics/btae735","DOIUrl":"10.1093/bioinformatics/btae735","url":null,"abstract":"<p><strong>Motivation: </strong>Much of prokaryotic comparative genomics currently relies on two critical computational tasks: pangenome inference and core genome inference. Pangenome inference involves clustering genes from a set of genomes into gene families, enabling genome-wide association studies and evolutionary history analysis. The core genome represents gene families present in nearly all genomes and is required to infer a high-quality phylogeny. For species-level datasets, fast pangenome inference tools have been developed. However, tools applicable to more diverse datasets are currently slow and scale poorly.</p><p><strong>Results: </strong>Here, we introduce SCARAP, a program containing three modules for comparative genomics analyses: a fast and scalable pangenome inference module, a direct core genome inference module, and a module for subsampling representative genomes. When benchmarked against existing tools, the SCARAP pan module proved up to an order of magnitude faster with comparable accuracy. The core module was validated by comparing its result against a core genome extracted from a full pangenome. The sample module demonstrated the rapid sampling of genomes with decreasing novelty. Applied to a dataset of over 31 000 Lactobacillales genomes, SCARAP showcased its ability to derive a representative pangenome. 
Finally, we applied the novel concept of gene fixation frequency to this pangenome, showing that Lactobacillales genes that are prevalent but rarely fixate in species often encode bacteriophage functions.</p><p><strong>Availability and implementation: </strong>The SCARAP toolkit is publicly available at https://github.com/swittouck/scarap.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11681940/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142815257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fabienne Thelen, Jannis Hochmuth, Sven Griep, Benedikt Schwab, Alexander Goesmann, Frank Förster
{"title":"Crypt4GH-JS: securely storing sensitive data online with client-side encryption.","authors":"Fabienne Thelen, Jannis Hochmuth, Sven Griep, Benedikt Schwab, Alexander Goesmann, Frank Förster","doi":"10.1093/bioinformatics/btae763","DOIUrl":"10.1093/bioinformatics/btae763","url":null,"abstract":"<p><strong>Motivation and results: </strong>Crypt4GH-JS is a browser-ready implementation of the Crypt4GH file encryption standard written in JavaScript. While having minimal to no impact on data upload and download throughput, this library enables on-the-fly encryption of arbitrary data in web applications, on either the client or the server side. As development moves more and more toward cloud-native applications, this library represents a significant step forward for flexible data security in the context of opaque cloud storage systems.</p><p><strong>Availability and implementation: </strong>Crypt4GH-JS can be installed via Node Package Manager (https://www.npmjs.com/package/crypt4gh_js) or through its public GitHub repository (https://github.com/fathelen/crypt4ghJS), where the source code is available. Crypt4GH-JS can be tested in the browser using our demonstration website, which can be found at: https://fathelen.github.io/crypt4ghJS/.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11771768/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142933805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NLSDeconv: an efficient cell-type deconvolution method for spatial transcriptomics data.","authors":"Yunlu Chen, Feng Ruan, Ji-Ping Wang","doi":"10.1093/bioinformatics/btae747","DOIUrl":"10.1093/bioinformatics/btae747","url":null,"abstract":"<p><strong>Summary: </strong>Spatial transcriptomics (ST) allows gene expression profiling within intact tissue samples but lacks single-cell resolution. This necessitates computational deconvolution methods to estimate the contributions of distinct cell types. This article introduces NLSDeconv, a novel cell-type deconvolution method based on non-negative least squares, along with an accompanying Python package. Benchmarking against 18 existing deconvolution methods on various ST datasets demonstrates NLSDeconv's competitive statistical performance and superior computational efficiency.</p><p><strong>Availability and implementation: </strong>NLSDeconv is freely available at https://github.com/tinachentc/NLSDeconv as a Python package.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11696698/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142869804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Manci Li, Damani N Bryant, Sarah Gresch, Marissa S Milstein, Peter R Christenson, Stuart S Lichtenberg, Peter A Larsen, Sang-Hyun Oh
{"title":"QuICSeedR: an R package for analyzing fluorophore-assisted seed amplification assay data.","authors":"Manci Li, Damani N Bryant, Sarah Gresch, Marissa S Milstein, Peter R Christenson, Stuart S Lichtenberg, Peter A Larsen, Sang-Hyun Oh","doi":"10.1093/bioinformatics/btae752","DOIUrl":"10.1093/bioinformatics/btae752","url":null,"abstract":"<p><strong>Motivation: </strong>Fluorophore-assisted seed amplification assays (F-SAAs), such as real-time quaking-induced conversion (RT-QuIC) and fluorophore-assisted protein misfolding cyclic amplification (F-PMCA), have become indispensable tools for studying protein misfolding in neurodegenerative diseases. However, analyzing data generated by these techniques often requires complex and time-consuming manual processes. In addition, the lack of standardization in F-SAA data analysis presents a significant challenge to the interpretation and reproducibility of F-SAA results across different laboratories and studies. There is a need for automated, standardized analysis tools that can efficiently process F-SAA data while ensuring consistency and reliability across different research settings.</p><p><strong>Results: </strong>Here, we present QuICSeedR (pronounced as \"quick seeder\"), an R package that addresses these challenges by providing a comprehensive toolkit for the automated processing, analysis, and visualization of F-SAA data. Importantly, QuICSeedR also establishes the foundation for building an F-SAA data management and analysis framework, enabling more consistent and comparable results across different research groups.</p><p><strong>Availability and implementation: </strong>QuICSeedR is freely available at: https://CRAN.R-project.org/package=QuICSeedR. 
Data and code used in this manuscript are provided in Supplementary Materials.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11742141/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142883903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}