GigaScience, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giad099
Jasper Ouwerkerk, Helena Rasche, John D Spalding, Saskia Hiltemann, Andrew P Stubbs
{"title":"FAIR data retrieval for sensitive clinical research data in Galaxy.","authors":"Jasper Ouwerkerk, Helena Rasche, John D Spalding, Saskia Hiltemann, Andrew P Stubbs","doi":"10.1093/gigascience/giad099","DOIUrl":"10.1093/gigascience/giad099","url":null,"abstract":"<p><strong>Background: </strong>In clinical research, data have to be accessible and reproducible, but the generated data are becoming larger and analysis complex. Here we propose a platform for Findable, Accessible, Interoperable, and Reusable (FAIR) data access and creating reproducible findings. Standardized access to a major genomic repository, the European Genome-Phenome Archive (EGA), has been achieved with API services like PyEGA3. We aim to provide a FAIR data analysis service in Galaxy by retrieving genomic data from the EGA and provide a generalized \"omics\" platform for FAIR data analysis.</p><p><strong>Results: </strong>To demonstrate this, we implemented an end-to-end Galaxy workflow to replicate the findings from an RD-Connect synthetic dataset Beyond the 1 Million Genomes (synB1MG) available from the EGA. We developed the PyEGA3 connector within Galaxy to easily download multiple datasets from the EGA. We added the gene.iobio tool, a diagnostic environment for precision genomics, to Galaxy and demonstrate that it provides a more dynamic and interpretable view for trio analysis results. We developed a Galaxy trio analysis workflow to determine the pathogenic variants from the synB1MG trios using the GEMINI and gene.iobio tool. The complete workflow is available at WorkflowHub, and an associated tutorial was created in the Galaxy Training Network, which helps researchers unfamiliar with Galaxy to run the workflow.</p><p><strong>Conclusions: </strong>We showed the feasibility of reusing data from the EGA in Galaxy via PyEGA3 and validated the workflow by rediscovering spiked-in variants in synthetic data. 
Finally, we improved existing tools in Galaxy and created a workflow for trio analysis to demonstrate the value of FAIR genomics analysis in Galaxy.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"13 ","pages":""},"PeriodicalIF":3.5,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10821763/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139570419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giad119
Emma de Jong, Lara Parata, Philipp E Bayer, Shannon Corrigan, Richard J Edwards
{"title":"Toward genome assemblies for all marine vertebrates: current landscape and challenges.","authors":"Emma de Jong, Lara Parata, Philipp E Bayer, Shannon Corrigan, Richard J Edwards","doi":"10.1093/gigascience/giad119","DOIUrl":"10.1093/gigascience/giad119","url":null,"abstract":"<p><p>Marine vertebrate biodiversity is fundamental to ocean ecosystem health but is threatened by climate change, overharvesting, and habitat degradation. High-quality reference genomes are valuable foundational scientific resources that can inform conservation efforts. Consequently, global consortia are striving to produce reference genomes for representatives of all life. Here, we summarize the current landscape of available marine vertebrate reference genomes, including their phylogenetic diversity and geographic hotspots of production. We discuss key logistical and technical challenges that remain to be overcome if we are to realize the vision of a comprehensive reference genome library of all marine vertebrates.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"13 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10821707/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139570422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae029
Scott Ferguson, Ashley Jones, Kevin Murray, Rose L Andrew, Benjamin Schwessinger, Helen Bothwell, Justin Borevitz
{"title":"Exploring the role of polymorphic interspecies structural variants in reproductive isolation and adaptive divergence in Eucalyptus.","authors":"Scott Ferguson, Ashley Jones, Kevin Murray, Rose L Andrew, Benjamin Schwessinger, Helen Bothwell, Justin Borevitz","doi":"10.1093/gigascience/giae029","DOIUrl":"10.1093/gigascience/giae029","url":null,"abstract":"<p><p>Structural variations (SVs) play a significant role in speciation and adaptation in many species, yet few studies have explored the prevalence and impact of different categories of SVs. We conducted a comparative analysis of long-read assembled reference genomes of closely related Eucalyptus species to identify candidate SVs potentially influencing speciation and adaptation. Interspecies SVs can be either fixed differences or polymorphic in one or both species. To describe SV patterns, we employed short-read whole-genome sequencing on over 600 individuals of Eucalyptus melliodora and Eucalyptus sideroxylon, along with recent high-quality genome assemblies. We aligned reads and genotyped interspecies SVs predicted between species reference genomes. Our results revealed that 49,756 of 58,025 and 39,536 of 47,064 interspecies SVs could be typed with short reads in E. melliodora and E. sideroxylon, respectively. Focusing on inversions and translocations, symmetric SVs that are readily genotyped within both populations, 24 were found to be structural divergences, 2,623 structural polymorphisms, and 928 shared structural polymorphisms. We assessed the functional significance of fixed interspecies SVs by examining differences in estimated recombination rates and genetic differentiation between species, revealing a complex history of natural selection. Shared structural polymorphisms displayed enrichment of potentially adaptive genes. 
Understanding how different classes of genetic mutations contribute to genetic diversity and reproductive barriers is essential for understanding how organisms enhance fitness, adapt to changing environments, and diversify. Our findings reveal the prevalence of interspecies SVs and elucidate their role in genetic differentiation, adaptive evolution, and species divergence within and between populations.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"13 ","pages":""},"PeriodicalIF":3.5,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11170218/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141310458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae026
Bryan Raubenolt, Daniel Blankenberg
{"title":"Generalized open-source workflows for atomistic molecular dynamics simulations of viral helicases.","authors":"Bryan Raubenolt, Daniel Blankenberg","doi":"10.1093/gigascience/giae026","DOIUrl":"10.1093/gigascience/giae026","url":null,"abstract":"<p><p>Viral helicases are promising targets for the development of antiviral therapies. Given their vital function of unwinding double-stranded nucleic acids, inhibiting them blocks the viral replication cycle. Previous studies have elucidated key structural details of these helicases, including the location of substrate binding sites, flexible domains, and the discovery of potential inhibitors. Here we present a series of new Galaxy tools and workflows for performing and analyzing molecular dynamics simulations of viral helicases. We first validate them by demonstrating recapitulation of data from previous simulations of Zika (NS3) and SARS-CoV-2 (NSP13) helicases in apo and complex with inhibitors. We further demonstrate the utility and generalizability of these Galaxy workflows by applying them to new cases, proving their usefulness as a widely accessible method for exploring antiviral activity.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"13 ","pages":""},"PeriodicalIF":3.5,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11170216/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141310459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae039
Sokratis Kariotis, Pei Fang Tan, Haiping Lu, Christopher J Rhodes, Martin R Wilkins, Allan Lawrie, Dennis Wang
{"title":"Omada: robust clustering of transcriptomes through multiple testing.","authors":"Sokratis Kariotis, Pei Fang Tan, Haiping Lu, Christopher J Rhodes, Martin R Wilkins, Allan Lawrie, Dennis Wang","doi":"10.1093/gigascience/giae039","DOIUrl":"10.1093/gigascience/giae039","url":null,"abstract":"<p><strong>Background: </strong>Cohort studies increasingly collect biosamples for molecular profiling and are observing molecular heterogeneity. High-throughput RNA sequencing is providing large datasets capable of reflecting disease mechanisms. Clustering approaches have produced a number of tools to help dissect complex heterogeneous datasets, but selecting the appropriate method and parameters to perform exploratory clustering analysis of transcriptomic data requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent. To address this, we have developed Omada, a suite of tools aiming to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning-based functions.</p><p><strong>Findings: </strong>The efficiency of each tool was tested with 7 datasets characterized by different expression signal strengths to capture a wide spectrum of RNA expression datasets. Our toolkit's decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Within datasets with less clear biological distinctions, our tools either formed stable subgroups with different expression profiles and robust clinical associations or revealed signs of problematic data such as biased measurements.</p><p><strong>Conclusions: </strong>In conclusion, Omada successfully automates the robust unsupervised clustering of transcriptomic data, making advanced analysis accessible and reliable even for those without extensive machine learning expertise. 
Implementation of Omada is available at http://bioconductor.org/packages/omada/.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"13 ","pages":""},"PeriodicalIF":3.5,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11238428/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141590107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae044
Rui Tang, Cong Huang, Jun Yang, Zhong-Chen Rao, Li Cao, Peng-Hua Bai, Xin-Cheng Zhao, Jun-Feng Dong, Xi-Zhong Yan, Fang-Hao Wan, Nan-Ji Jiang, Ri-Chou Han
{"title":"A ghost moth olfactory prototype of the lepidopteran sex communication.","authors":"Rui Tang, Cong Huang, Jun Yang, Zhong-Chen Rao, Li Cao, Peng-Hua Bai, Xin-Cheng Zhao, Jun-Feng Dong, Xi-Zhong Yan, Fang-Hao Wan, Nan-Ji Jiang, Ri-Chou Han","doi":"10.1093/gigascience/giae044","DOIUrl":"10.1093/gigascience/giae044","url":null,"abstract":"<p><p>Sex role differentiation is a widespread phenomenon. Sex pheromones are often associated with sex roles and convey sex-specific information. In Lepidoptera, females release sex pheromones to attract males, which evolve sophisticated olfactory structures to relay pheromone signals. However, in some primitive moths, sex role differentiation becomes diverged. Here, we introduce the chromosome-level genome assembly from ancestral Himalaya ghost moths, revealing a unique olfactory evolution pattern and sex role parity among Lepidoptera. These olfactory structures of the ghost moths are characterized by a dense population of trichoid sensilla, both larger male and female antennal entry parts of brains, compared to the evolutionary later Lepidoptera. Furthermore, a unique tandem of 34 odorant receptor 19 homologs in Thitarodes xiaojinensis (TxiaOr19) has been identified, which presents overlapped motifs with pheromone receptors (PRs). Interestingly, the expanded TxiaOr19 was predicted to have unconventional tuning patterns compared to canonical PRs, with nonsexual dimorphic olfactory neuropils discovered, which contributes to the observed equal sex roles in Thitarodes adults. Additionally, transposable element activity bursts have provided traceable loci landscapes where parallel diversifications occurred between TxiaOr19 and PRs, indicating that the Or19 homolog expansions were diversified to PRs during evolution and thus established the classic sex roles in higher moths. 
This study elucidates an olfactory prototype of intermediate sex communication from Himalaya ghost moths.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"13 ","pages":""},"PeriodicalIF":3.5,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11258902/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141727097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae088
Pavel Akhtyamov, Ausaaf Nabi, Vladislav Gafurov, Alexey Sizykh, Alexander Favorov, Yulia Medvedeva, Alexey Stupnikov
{"title":"GPU-accelerated Kendall distance computation for large or sparse data.","authors":"Pavel Akhtyamov, Ausaaf Nabi, Vladislav Gafurov, Alexey Sizykh, Alexander Favorov, Yulia Medvedeva, Alexey Stupnikov","doi":"10.1093/gigascience/giae088","DOIUrl":"10.1093/gigascience/giae088","url":null,"abstract":"<p><strong>Background: </strong>Current experimental practices typically produce large multidimensional datasets. Distance matrix calculation between elements (e.g., samples) for such data, although often necessary in preprocessing for statistical inference or visualization, can be computationally demanding. Data sparsity, which is often observed in various experimental data modalities, such as single-cell sequencing in bioinformatics or collaborative filtering in recommendation systems, may pose additional algorithmic challenges.</p><p><strong>Results: </strong>We present GPU-Assisted Distance Estimation Software (GADES), a graphics processing unit (GPU)-enhanced package that allows for massively parallel computation of Kendall-τ distance matrices. The package's architecture involves specific memory management, which lifts the limits for the data size imposed by GPU memory capacity. Additional algorithmic solutions provide a means to address the data sparsity problem and reinforce the acceleration effect for sparse datasets. Benchmarking against available central processing unit-based packages on simulated and real experimental single-cell RNA sequencing or single-cell ATAC sequencing datasets demonstrated significantly higher speed for GADES compared to other methods for both sparse and dense data processing, with additional performance boost for the sparse data.</p><p><strong>Conclusions: </strong>This work significantly contributes to the development of computational strategies for high-performance Kendall distance matrices computation and allows for the efficient processing of Big Data with the power of GPU. 
GADES is freely available at https://github.com/lab-medvedeva/GADES-main.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"13 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631066/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142806507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae109
Teresa D Shippy, Prashant S Hosmani, Mirella Flores-Gonzalez, Marina Mann, Sherry Miller, Matthew T Weirauch, Chad Vosberg, Crissy Massimino, Will Tank, Lucas de Oliveira, Chang Chen, Stephanie Hoyt, Rebekah Adams, Samuel Adkins, Samuel T Bailey, Xiaoting Chen, Nina Davis, Yesmarie DeLaFlor, Michelle Espino, Kylie Gervais, Rebecca Grace, Douglas Harper, Denisse L Hasan, Maria Hoang, Rachel Holcomb, Margaryta R Jernigan, Melissa Kemp, Bailey Kennedy, Kyle Kercher, Stefan Klaessan, Angela Kruse, Sophia Licata, Andrea Lu, Ron Masse, Anuja Mathew, Sarah Michels, Elizabeth Michels, Alan Neiman, Seantel Norman, Jordan Norus, Yasmin Ortiz, Naftali Panitz, Thomson Paris, Kitty M R Perentesis, Michael Perry, Max Reynolds, Madison M Sena, Blessy Tamayo, Amanda Thate, Sara Vandervoort, Jessica Ventura, Nicholas Weis, Tanner Wise, Robert G Shatters, Michelle Heck, Joshua B Benoit, Wayne B Hunter, Lukas A Mueller, Susan J Brown, Tom D'Elia, Surya Saha
{"title":"Diaci v3.0: chromosome-level assembly, de novo transcriptome, and manual annotation of Diaphorina citri, insect vector of Huanglongbing.","authors":"Teresa D Shippy, Prashant S Hosmani, Mirella Flores-Gonzalez, Marina Mann, Sherry Miller, Matthew T Weirauch, Chad Vosberg, Crissy Massimino, Will Tank, Lucas de Oliveira, Chang Chen, Stephanie Hoyt, Rebekah Adams, Samuel Adkins, Samuel T Bailey, Xiaoting Chen, Nina Davis, Yesmarie DeLaFlor, Michelle Espino, Kylie Gervais, Rebecca Grace, Douglas Harper, Denisse L Hasan, Maria Hoang, Rachel Holcomb, Margaryta R Jernigan, Melissa Kemp, Bailey Kennedy, Kyle Kercher, Stefan Klaessan, Angela Kruse, Sophia Licata, Andrea Lu, Ron Masse, Anuja Mathew, Sarah Michels, Elizabeth Michels, Alan Neiman, Seantel Norman, Jordan Norus, Yasmin Ortiz, Naftali Panitz, Thomson Paris, Kitty M R Perentesis, Michael Perry, Max Reynolds, Madison M Sena, Blessy Tamayo, Amanda Thate, Sara Vandervoort, Jessica Ventura, Nicholas Weis, Tanner Wise, Robert G Shatters, Michelle Heck, Joshua B Benoit, Wayne B Hunter, Lukas A Mueller, Susan J Brown, Tom D'Elia, Surya Saha","doi":"10.1093/gigascience/giae109","DOIUrl":"https://doi.org/10.1093/gigascience/giae109","url":null,"abstract":"<p><strong>Background: </strong>Diaphorina citri is an insect vector of \"Candidatus Liberibacter asiaticus\" (CLas), the gram-negative bacterial pathogen associated with citrus greening disease. Control measures rely on pesticides with negative impacts on the environment, natural ecosystems, and human and animal health. In contrast, gene-targeting methods have the potential to specifically target the vector species and/or reduce pathogen transmission.</p><p><strong>Results: </strong>To improve the genomic resources needed for targeted pest control, we assembled a D. citri genome based on PacBio long reads followed by proximity ligation-based scaffolding. The 474-Mb genome has 13 chromosomal-length scaffolds. 
In total, 1,036 genes were manually curated as part of a community annotation project, composed primarily of undergraduate students. We also computationally identified a total of 1,015 putative transcription factors (TFs) and were able to infer motifs for 337 TFs (33%). In addition, we produced a genome-independent transcriptome and genomes for D. citri endosymbionts.</p><p><strong>Conclusions: </strong>Manual annotation provided more accurate gene models for use by researchers and provided an excellent training opportunity for students from multiple institutions. All resources are available on CitrusGreening.org and NCBI. The chromosomal-length D. citri genome assembly serves as a blueprint for the development of collaborative genomics projects for other medically and agriculturally significant insect vectors.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"13 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142863048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae105
Yalan He, Jiyin Lai, Qian Wang, Bingyue Pan, Siyuan Li, Xilong Zhao, Ziyi Wang, Yongbao Zhang, Yujie Tang, Junwei Han
{"title":"ssMutPA: single-sample mutation-based pathway analysis approach for cancer precision medicine.","authors":"Yalan He, Jiyin Lai, Qian Wang, Bingyue Pan, Siyuan Li, Xilong Zhao, Ziyi Wang, Yongbao Zhang, Yujie Tang, Junwei Han","doi":"10.1093/gigascience/giae105","DOIUrl":"https://doi.org/10.1093/gigascience/giae105","url":null,"abstract":"<p><strong>Background: </strong>Single-sample pathway enrichment analysis is an effective approach for identifying cancer subtypes and pathway biomarkers, facilitating the development of precision medicine. However, the existing approaches focused on investigating the changes in gene expression levels but neglected somatic mutations, which play a crucial role in cancer development.</p><p><strong>Findings: </strong>In this study, we proposed a novel single-sample mutation-based pathway analysis approach (ssMutPA) to infer individualized pathway activities by integrating somatic mutation data and the protein-protein interaction network. For each sample, ssMutPA first uses local and global weighted strategies to evaluate the effects of genes from mutations according to the network topology and then calculates a single-sample mutation-based pathway enrichment score (ssMutPES) to reflect the accumulated effect of mutations of each pathway. To illustrate the performance of ssMutPA, we applied it to 33 cancer cohorts from The Cancer Genome Atlas database and revealed patient stratification with significantly different prognosis in each cancer type based on the ssMutPES profiles. We also found that the identified characteristic pathways with high overlap across different cancers could be used as potential prognosis biomarkers. 
Moreover, we applied ssMutPA to 2 melanoma cohorts with immunotherapy and identified a subgroup of patients who may benefit from therapy.</p><p><strong>Conclusions: </strong>We provided evidence that ssMutPA could infer mutation-based individualized pathway activity profiles and complement the current individualized pathway analysis approaches focused on gene expression data, which may offer the potential for the development of precision medicine. ssMutPA is available at https://CRAN.R-project.org/package=ssMutPA.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"13 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142863306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience, Pub Date: 2024-01-02, DOI: 10.1093/gigascience/giae006
Yujie Huang, Longbiao Guo, Lingjuan Xie, Nianmin Shang, Dongya Wu, Chuyu Ye, Eduardo Carlos Rudell, Kazunori Okada, Qian-Hao Zhu, Beng-Kah Song, Daguang Cai, Aldo Merotto Junior, Lianyang Bai, Longjiang Fan
{"title":"A reference genome of Commelinales provides insights into the commelinids evolution and global spread of water hyacinth (Pontederia crassipes).","authors":"Yujie Huang, Longbiao Guo, Lingjuan Xie, Nianmin Shang, Dongya Wu, Chuyu Ye, Eduardo Carlos Rudell, Kazunori Okada, Qian-Hao Zhu, Beng-Kah Song, Daguang Cai, Aldo Merotto Junior, Lianyang Bai, Longjiang Fan","doi":"10.1093/gigascience/giae006","DOIUrl":"10.1093/gigascience/giae006","url":null,"abstract":"<p><p>Commelinales belongs to the commelinids clade, which also comprises Poales that includes the most important monocot species, such as rice, wheat, and maize. No reference genome of Commelinales is currently available. Water hyacinth (Pontederia crassipes or Eichhornia crassipes), a member of Commelinales, is one of the devastating aquatic weeds, although it is also grown as an ornamental and medicinal plant. Here, we present a chromosome-scale reference genome of the tetraploid water hyacinth with a total length of 1.22 Gb (over 95% of the estimated size) across 8 pseudochromosome pairs. With the representative genomes, we reconstructed a phylogeny of the commelinids, which supported Zingiberales and Commelinales being sister lineages of Arecales and shed light on the controversial relationship of the orders. We also reconstructed ancestral karyotypes of the commelinids clade and confirmed the ancient commelinids genome having 8 chromosomes but not 5 as previously reported. Gene family analysis revealed contraction of disease-resistance genes during polyploidization of water hyacinth, likely a result of fitness requirement for its role as a weed. Genetic diversity analysis using 9 water hyacinth lines from 3 continents (South America, Asia, and Europe) revealed very closely related nuclear genomes and almost identical chloroplast genomes of the materials, as well as provided clues about the global dispersal of water hyacinth. The genomic resources of P. 
crassipes reported here contribute a crucial missing link of the commelinids species and offer novel insights into their phylogeny.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"13 ","pages":""},"PeriodicalIF":3.5,"publicationDate":"2024-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10938897/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140133765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}