{"title":"Galaxy QCxMS for straightforward semi-empirical quantum mechanical EI-MS prediction.","authors":"Wudmir Y Rojas, Zargham Ahmad, Julia Jakiela, Helge Hecht, Jana Klánová, Elliott J Price","doi":"10.46471/gigabyte.160","DOIUrl":"10.46471/gigabyte.160","url":null,"abstract":"<p><p>High-performance computing (HPC) environments are crucial for computational research, including quantum chemistry (QC), but pose challenges for non-expert users. Researchers with limited computational knowledge struggle to utilise domain-specific software and access mass spectra prediction for <i>in silico</i> annotation. Here, we provide a robust workflow that leverages interoperable file formats for molecular structures to ensure integration across various QC tools. The quantum chemistry package for mass spectral predictions after electron ionization or collision-induced dissociation has been integrated into the Galaxy platform, enabling automated analysis of fragmentation mechanisms. The extended tight binding quantum chemistry package, chosen for its balance between accuracy and computational efficiency, provides molecular geometry optimisation. A Docker image encapsulates the necessary software stack. We demonstrated the workflow for four molecules, highlighting the scalability and efficiency of our solution via runtime performance analysis. 
This work shows how non-HPC users can make these predictions effortlessly, using advanced computational tools without needing in-depth expertise.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2025 ","pages":"gigabyte160"},"PeriodicalIF":0.0,"publicationDate":"2025-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12257954/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144638787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A draft genome assembly for the dart-poison frog <i>Phyllobates terribilis</i>.","authors":"Roberto Márquez, Denis Jacob Machado, Reyhaneh Nouri, Kerry L Gendreau, Daniel Janies, Ralph A Saporito, Marcus R Kronforst, Taran Grant","doi":"10.46471/gigabyte.157","DOIUrl":"10.46471/gigabyte.157","url":null,"abstract":"<p><p>Dendrobatid poison frogs have become well established as model systems in several fields of biology. Nevertheless, the development of molecular and genetic resources for these frogs has been hindered by their large, highly repetitive genomes, which have proven difficult to assemble. Here we present a draft assembly for <i>Phyllobates terribilis</i> (12.6 Gb), generated using a combination of sequencing platforms and bioinformatic approaches. Similar to other poison frog sequencing efforts, we recovered a highly fragmented assembly, likely due to the genome's large size and very high repeat content, which we estimated to be ≈88%. Despite the assembly's low contiguity, we were able to annotate multiple members of three gene sets of interest (voltage-gated sodium channels and <i>Notch</i> and <i>Wnt</i> signaling pathways), demonstrating the usefulness of our assembly to the amphibian research community.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2025 ","pages":"gigabyte157"},"PeriodicalIF":0.0,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12208295/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144531342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chromosome-level genome assembly of the lemon sole, <i>Microstomus kitt</i> (Pleuronectiformes: Pleuronectidae).","authors":"Marcel Nebenführ, David Prochotta, Maria A Nilsson, Menno J de Jong, Tunca D Yazici, Fabienne Langefeld, Malambo Muloongo, Helena Woköck, Jakob Jilg, Sina C Bender, Marvin M Zangl, Juan-Manuel Ortega Guatame, Kimberley Williams, Moritz Sonnewald, Axel Janke","doi":"10.46471/gigabyte.156","DOIUrl":"10.46471/gigabyte.156","url":null,"abstract":"<p><strong>Background: </strong>The lemon sole (<i>Microstomus kitt</i>) is a culinary fish from the family of righteye flounders (Pleuronectidae), inhabiting sandy, shallow offshore grounds of the North Sea, western Baltic Sea, English Channel, Great Britain and Ireland, Bay of Biscay, and coastal waters of Norway.</p><p><strong>Findings: </strong>Here, we present a chromosome-level genome assembly of the lemon sole. We applied PacBio HiFi sequencing on the PacBio Revio system to generate a highly complete and contiguous reference genome. The resulting assembly has a contig N50 of 17.2 Mbp and a scaffold N50 of 27.2 Mbp. The total assembly length is 628 Mbp, comprising 24 chromosome-length scaffolds. 
The identification of 99.7% complete BUSCO genes indicates a high level of assembly completeness.</p><p><strong>Conclusions: </strong>The chromosome-level genome assembly of the lemon sole provides a high-quality reference genome for future population-level genomic analyses of this commercially valuable, edible fish.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2025 ","pages":"gigabyte156"},"PeriodicalIF":0.0,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12135936/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144227869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chromosome-level genome assemblies of five <i>Sinocyclocheilus</i> species.","authors":"Chao Bian, Ruihan Li, Yuqian Ouyang, Junxing Yang, Xidong Mu, Qiong Shi","doi":"10.46471/gigabyte.155","DOIUrl":"10.46471/gigabyte.155","url":null,"abstract":"<p><p><i>Sinocyclocheilus</i>, a genus of tetraploid fishes endemic to Southwest China's karst regions, are classified as second-class nationally protected species due to their fragile habitat. Limited high-quality genomic resources have hampered studies on their phylogenetic relationships and the origin of their polyploidy. Here, we present a high-quality genome assembly of the most abundant <i>Sinocyclocheilus</i> species, the golden-line barbel (<i>Sinocyclocheilus grahami</i>), by integrating PacBio long-read and Hi-C sequencing. The resulting scaffold-level genome-assembly is 1.6 Gb long, with a scaffold N50 of up to 30.7 Mb. We annotated 42,806 protein-coding genes. Also, 93.1% of the assembled genome sequences (about 1.5 Gb) and 93.8% of the total predicted genes were successfully anchored onto 48 chromosomes. Furthermore, we obtained chromosome-level genome assemblies for four other <i>Sinocyclocheilus</i> species (<i>S. anophthalmus</i>, <i>S. maitianheensis</i>, <i>S. anshuiensis</i>, and <i>S. rhinocerous</i>) based on homologous comparisons. 
These genomic resources will enable in-depth investigations on cave adaptation, improvement of economic values, and conservation of diverse <i>Sinocyclocheilus</i> fishes.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2025 ","pages":"gigabyte155"},"PeriodicalIF":0.0,"publicationDate":"2025-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12089701/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144113018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficiently constructing complete genomes with CycloneSEQ to fill gaps in bacterial draft assemblies.","authors":"Hewei Liang, Yuanqiang Zou, Mengmeng Wang, Tongyuan Hu, Haoyu Wang, Wenxin He, Yanmei Ju, Ruijin Guo, Junyi Chen, Fei Guo, Tao Zeng, Yuliang Dong, Yuning Zhang, Bo Wang, Chuanyu Liu, Xin Jin, Wenwei Zhang, Xun Xu, Liang Xiao","doi":"10.46471/gigabyte.154","DOIUrl":"https://doi.org/10.46471/gigabyte.154","url":null,"abstract":"<p><p>Current microbial sequencing relies on short-read platforms like Illumina and DNBSEQ, which are cost-effective and accurate but often produce fragmented draft genomes. Here, we used CycloneSEQ for long-read sequencing of ATCC BAA-835, producing long-reads with an average length of 11.6 kbp and an average quality score of 14.4. Hybrid assembly with short-read data resulted in an error rate of only 0.04 mismatches and 0.08 indels per 100 kbp compared to the reference genome. This method, validated across nine species, successfully assembled complete circular genomes. Hybrid assembly significantly enhances genome completeness by using long-reads to fill gaps and accurately assembling multi-copy rRNA genes, unlike short-reads alone. Data subsampling showed that combining over 500 Mbp of short-read data with 100 Mbp of long-read data yields high-quality circular assemblies. CycloneSEQ long-reads improve the assembly of circular complete genomes from mixed microbial communities; however, their base quality needs improving. 
Integrating DNBSEQ short-reads improved accuracy, resulting in complete and accurate assemblies.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2025 ","pages":"gigabyte154"},"PeriodicalIF":0.0,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12051259/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144044131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Genome assembly and annotation of <i>Acropora pulchra</i> from Mo'orea French Polynesia.","authors":"Trinity Conn, Jill Ashey, Ross Cunning, Hollie M Putnam","doi":"10.46471/gigabyte.153","DOIUrl":"https://doi.org/10.46471/gigabyte.153","url":null,"abstract":"<p><p>Reef-building corals are integral ecosystem engineers of tropical reefs but face threats from climate change. Investigating genetic, epigenetic, and environmental factors influencing their adaptation is critical. Genomic resources are essential for understanding coral biology and guiding conservation efforts. However, genomes of the coral genus <i>Acropora</i> are limited to highly-studied species. Here, we present the assembly and annotation of the genome and DNA methylome of <i>Acropora pulchra</i> from Mo'orea, French Polynesia. Using long-read PacBio HiFi and Illumina RNASeq, we generated the most complete <i>Acropora</i> genome to date (BUSCO completeness of 96.7% metazoan genes). The assembly size is 518 Mbp, with 174 scaffolds, and a scaffold N50 of 17 Mbp. We predicted 40,518 protein-coding genes and 16.74% of the genome in repeats. DNA methylation in the CpG context is 14.6%. This assembly of the <i>A. 
pulchra</i> genome and DNA methylome will support studies of coastal corals in French Polynesia, aiding conservation and comparative studies of <i>Acropora</i> and cnidarians.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2025 ","pages":"gigabyte153"},"PeriodicalIF":0.0,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11985253/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144060361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CompactTree: a lightweight header-only C++ library and Python wrapper for ultra-large phylogenetics.","authors":"Niema Moshiri","doi":"10.46471/gigabyte.152","DOIUrl":"10.46471/gigabyte.152","url":null,"abstract":"<p><p>The study of viral and bacterial species requires the ability to load and traverse ultra-large phylogenies with tens of millions of tips, but existing tree libraries struggle to scale to these sizes. We introduce CompactTree, a lightweight header-only C++ library with a user-friendly Python wrapper for traversing ultra-large trees that can be easily incorporated into other tools. We show that CompactTree is orders of magnitude faster and requires orders of magnitude less memory than existing tree packages. CompactTree is freely accessible as an open source project: https://github.com/niemasd/CompactTree.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2025 ","pages":"gigabyte152"},"PeriodicalIF":0.0,"publicationDate":"2025-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11921128/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143665474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Portable-CELLxGENE: standalone executables of CELLxGENE for easy installation.","authors":"George T Hall","doi":"10.46471/gigabyte.151","DOIUrl":"10.46471/gigabyte.151","url":null,"abstract":"<p><p>Biologists who want to analyse their single-cell transcriptomics dataset must install and use specialist software via the command line. This is often impractical for non-bioinformaticians. Whilst the popular CELLxGENE software provides an intuitive graphical interface to facilitate analysis outside the command line, its server-side installation and execution remain complex. A version that is easier to install and run would allow non-bioinformaticians to take advantage of this valuable tool without needing to use the command line. This work introduces Portable-CELLxGENE, a standalone distribution of CELLxGENE that can be installed via a graphical interface. It contains an easy-to-use extension of the CELLxGENE-Gateway Python package to allow the analysis of multiple datasets. This tool enables non-bioinformaticians to carry out simple analyses independently.</p><p><strong>Availability and implementation: </strong>Versions of Portable-CELLxGENE for Windows and MacOS, along with source code, are available at https://george-hall-ucl.github.io/Portable-CELLxGENE-Docs. 
It is licensed under the GNU General Public License v3.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2025 ","pages":"gigabyte151"},"PeriodicalIF":0.0,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11894539/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143607446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The assembly and annotation of two teinturier grapevine varieties, Dakapo and Rubired.","authors":"Eleanore J Ritter, Noé Cochetel, Andrea Minio, Peter Cousins, Dario Cantu, Chad Niederhuth","doi":"10.46471/gigabyte.149","DOIUrl":"10.46471/gigabyte.149","url":null,"abstract":"<p><p>Teinturier grapevines, known for their pigmented flesh berries due to anthocyanin production, are valuable for enhancing the pigmentation of wine, for potential health benefits, and for investigating anthocyanin production in plants. Here, we assembled and annotated the Dakapo and Rubired genomes, two teinturier varieties. For Dakapo, we combined Nanopore sequencing, Illumina sequencing, and scaffolding to the existing grapevine assembly to generate a final assembly of 508.5 Mbp. Combining <i>de novo</i> annotation and lifting over annotations from the existing grapevine reference produced 36,940 gene annotations for Dakapo. For Rubired, PacBio HiFi reads were assembled, scaffolded, and phased to generate a diploid assembly with two haplotypes 474.7-476.0 Mbp long. <i>De novo</i> annotation of the diploid Rubired genome yielded annotations for 56,681 genes. Both genomes are highly contiguous and complete. 
The Dakapo and Rubired genome assemblies provide genetic resources for investigations into berry flesh pigmentation and other traits of interest in grapevine.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2025 ","pages":"gigabyte149"},"PeriodicalIF":0.0,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11891882/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143598414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Draft genome of the endangered visayan spotted deer (<i>Rusa alfredi</i>), a Philippine endemic species.","authors":"Ma Carmel F Javier, Albert C Noblezada, Persie Mark Q Sienes, Robert S Guino-O, Nadia Palomar-Abesamis, Maria Celia D Malay, Carmelo S Del Castillo, Victor Marco Emmanuel N Ferriols","doi":"10.46471/gigabyte.150","DOIUrl":"10.46471/gigabyte.150","url":null,"abstract":"<p><p>The Visayan Spotted Deer (VSD), or <i>Rusa alfredi</i>, is an endangered and endemic species in the Philippines. Despite its status, genomic information on <i>R. alfredi</i>, and the genus <i>Rusa</i> in general, is missing. This study presents the first draft genome assembly of the VSD using the Illumina short-read sequencing technology. The resulting RusAlf_1.1 assembly has a 2.52 Gb total length, with a contig N50 of 46 Kb and scaffold N50 size of 75 Mb. The assembly has a BUSCO complete score of 95.5%, demonstrating the genome's completeness, and includes the annotation of 24,531 genes. Our phylogenetic analysis based on single-copy orthologs revealed a close evolutionary relationship between <i>R. alfredi</i> and the genus <i>Cervus</i>. RusAlf_1.1 represents a significant advancement in our understanding of the VSD. 
It opens opportunities for further research in population genetics and evolutionary biology, potentially contributing to more effective conservation and management strategies for this endangered species.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2025 ","pages":"gigabyte150"},"PeriodicalIF":0.0,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11876970/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143560181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}