GigaScience | Pub Date: 2025-01-06 | DOI: 10.1093/gigascience/giaf035
Maximilian Wess, Maria K Andersen, Elise Midtbust, Juan Carlos Cabellos Guillem, Trond Viset, Øystein Størkersen, Sebastian Krossa, Morten Beck Rye, May-Britt Tessem
{"title":"Spatial integration of multi-omics data from serial sections using the novel Multi-Omics Imaging Integration Toolset.","authors":"Maximilian Wess, Maria K Andersen, Elise Midtbust, Juan Carlos Cabellos Guillem, Trond Viset, Øystein Størkersen, Sebastian Krossa, Morten Beck Rye, May-Britt Tessem","doi":"10.1093/gigascience/giaf035","DOIUrl":"10.1093/gigascience/giaf035","url":null,"abstract":"<p><strong>Background: </strong>Truly understanding the cancer biology of heterogeneous tumors in precision medicine requires capturing the complexities of multiple omics levels and the spatial heterogeneity of cancer tissue. Techniques like mass spectrometry imaging (MSI) and spatial transcriptomics (ST) achieve this by spatially detecting metabolites and RNA but are often applied to serial sections. To fully leverage the advantage of such multi-omics data, the individual measurements need to be integrated into 1 dataset.</p><p><strong>Results: </strong>We present the Multi-Omics Imaging Integration Toolset (MIIT), a Python framework for integrating spatially resolved multi-omics data. A key component of MIIT's integration is the registration of serial sections for which we developed a nonrigid registration algorithm, GreedyFHist. We validated GreedyFHist on 244 images from fresh-frozen serial sections, achieving state-of-the-art performance. 
As a proof of concept, we used MIIT to integrate ST and MSI data from prostate tissue samples and assessed the correlation of a gene signature for citrate-spermine secretion derived from ST with metabolic measurements from MSI.</p><p><strong>Conclusion: </strong>MIIT is a highly accurate, customizable, open-source framework for integrating spatial omics technologies performed on different serial sections.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12077394/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144076950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience | Pub Date: 2025-01-06 | DOI: 10.1093/gigascience/giaf050
January Adams, Rafal Cymerys, Karol Szuster, Daniel Hekman, Zoryana Salo, Rutvik Solanki, Muhammad Mamdani, Alistair Johnson, Katarzyna Ryniak, Tom Pollard, David Rotenberg, Benjamin Haibe-Kains
{"title":"Health Data Nexus: an open data platform for AI research and education in medicine.","authors":"January Adams, Rafal Cymerys, Karol Szuster, Daniel Hekman, Zoryana Salo, Rutvik Solanki, Muhammad Mamdani, Alistair Johnson, Katarzyna Ryniak, Tom Pollard, David Rotenberg, Benjamin Haibe-Kains","doi":"10.1093/gigascience/giaf050","DOIUrl":"10.1093/gigascience/giaf050","url":null,"abstract":"<p><p>We outline the development of the Health Data Nexus, a data platform that enables data storage and access management with a cloud-based computational environment. We describe the importance of this secure platform in an evolving public-sector research landscape that utilizes significant quantities of data, particularly clinical data acquired from health systems, as well as the importance of providing meaningful benefits for three targeted user groups: data providers, researchers, and educators. We then describe the implementation of governance practices, technical standards, and data security, and the privacy protections needed to build this platform, as well as example use-cases highlighting the strengths of the platform in facilitating dataset acquisition, novel research, and hosting educational courses, workshops, and datathons. 
Finally, we discuss the key principles that informed the platform's development, highlighting the importance of flexible uses, collaborative development, and open-source science.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12131319/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144208238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A telomere-to-telomere gapless genome reveals SlPRR1 control of circadian rhythm and photoperiodic flowering in tomato.","authors":"Hui Liu, Jia-Qi Zhang, Jian-Ping Tao, Chen Chen, Li-Yao Su, Jin-Song Xiong, Ai-Sheng Xiong","doi":"10.1093/gigascience/giaf058","DOIUrl":"10.1093/gigascience/giaf058","url":null,"abstract":"<p><p>Cultivated tomato (Solanum lycopersicum) is a major vegetable crop of high economic value that serves as an important model for studying flowering time in day-neutral plants. A complete, continuous, and gapless genome of cultivated tomato is essential for genetic research and breeding programs. Here, we report the construction of a telomere-to-telomere (T2T) gap-free genome of S. lycopersicum cv. VF36 using a combination of sequencing technologies. The 815.27-Mb T2T \"VF36\" genome contained 600.23 Mb of transposable elements. Through comparative genomics and phylogenetic analysis, we identified structural variations between the \"VF36\" and \"Heinz 1706\" genomes and found no evidence of a recent species-specific whole-genome duplication in the \"VF36\" tomato. Furthermore, a core circadian oscillator, SlPRR1, was identified, which peaked at night in a circadian rhythm. CRISPR/Cas9-mediated knockdown of SlPRR1 in tomatoes demonstrated that slprr1 mutant lines exhibited significantly earlier flowering under long-day condition than wild type. We present a hypothetical model of how SlPRR1 regulates flowering time and chlorophyll biosynthesis in response to photoperiod. 
This T2T genomic resource will accelerate the genetic improvement of large-fruited tomatoes, and the SlPRR1-related hypothetical model will enhance our understanding of the photoperiodic response in cultivated tomatoes, revealing a regulatory mechanism for manipulating flowering time.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12218202/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144553222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A near-complete genome assembly of the bearded dragon Pogona vitticeps provides insights into the origin of Pogona sex chromosomes.","authors":"Qunfei Guo, Youliang Pan, Wei Dai, Fei Guo, Tao Zeng, Wanyi Chen, Yaping Mi, Yanshu Zhang, Shuaizhen Shi, Wei Jiang, Huimin Cai, Beiying Wu, Yang Zhou, Ying Wang, Chentao Yang, Xiao Shi, Xu Yan, Junyi Chen, Chongyang Cai, Jingnan Yang, Xun Xu, Ying Gu, Yuliang Dong, Qiye Li","doi":"10.1093/gigascience/giaf079","DOIUrl":"10.1093/gigascience/giaf079","url":null,"abstract":"<p><strong>Background: </strong>Vertebrate sex is typically determined either by genetic factors, such as sex chromosomes, or by environmental cues like temperature. Therefore, the agamid dragon lizard Pogona vitticeps is remarkable in this regard, as it exhibits both ZZ/ZW genetic and temperature-dependent sex determination. However, complete sequence and full gene content of P. vitticeps sex chromosomes remain unclear, hindering the investigation of sex-determining cascade in this model lizard.</p><p><strong>Results: </strong>Using CycloneSEQ and DNBSEQ sequencing technologies, we generated a near-complete chromosome-scale genome assembly for a ZZ male P. vitticeps. Compared with previous reference genome (GCF_900067755.1/Pvi1.1), this ∼1.8-Gb new assembly displayed >5,700-fold improvement in contiguity (contig N50: 202.5 Mb vs. 35.5 kb) and achieved complete chromosome anchoring (16 vs. 13,749 scaffolds). We found that over 80% of the P. vitticeps Z chromosome remains as a pseudo-autosomal region, where recombination is not suppressed. The sexually differentiated region (SDR) is small and occupied mostly by transposons, yet it aggregates genes involved in male development, such as AMH, AMHR2, and BMPR1A. Finally, by tracking the evolutionary origin and developmental expression of SDR genes, we proposed a model for the origin of P. 
vitticeps sex chromosomes that considered the Z-linked AMH as the master sex-determining gene.</p><p><strong>Conclusions: </strong>In this study, we fully characterized the Z sex chromosome of P. vitticeps, identified AMH as the candidate sex-determining gene, and proposed a new model for the origin of P. vitticeps sex chromosomes. The near-complete P. vitticeps reference genome will also benefit future study of reptile evolution.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12360845/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144872647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience | Pub Date: 2025-01-06 | DOI: 10.1093/gigascience/giaf109
Yichun Feng, Jiawei Wang, Ruikun He, Lu Zhou, Yixue Li
{"title":"A retrieval-augmented knowledge mining method with deep thinking LLMs for biomedical research and clinical support.","authors":"Yichun Feng, Jiawei Wang, Ruikun He, Lu Zhou, Yixue Li","doi":"10.1093/gigascience/giaf109","DOIUrl":"10.1093/gigascience/giaf109","url":null,"abstract":"<p><strong>Background: </strong>Knowledge graphs and large language models (LLMs) are key tools for biomedical knowledge integration and reasoning, facilitating structured organization of scientific articles and discovery of complex semantic relationships. However, current methods face challenges: knowledge graph construction is limited by complex terminology, data heterogeneity, and rapid knowledge evolution, while LLMs show limitations in retrieval and reasoning, making it difficult to uncover cross-document associations and reasoning pathways.</p><p><strong>Results: </strong>We propose a pipeline that uses LLMs to construct a Biomedical Stratified Knowledge Graph (BioStrataKG) from large-scale articles and builds the Biomedical Cross-Document Question Answering Dataset (BioCDQA) to evaluate latent knowledge retrieval and multihop reasoning. We then introduce Integrated and Progressive Retrieval-Augmented Reasoning (IP-RAR) to enhance retrieval accuracy and knowledge reasoning. IP-RAR maximizes information recall through integrated reasoning-based retrieval and refines knowledge via progressive reasoning-based generation, using self-reflection to achieve deep thinking and precise contextual understanding. 
Experiments show that IP-RAR improves document retrieval F1 score by 20% and answer generation accuracy by 25% over existing methods.</p><p><strong>Conclusions: </strong>The IP-RAR helps doctors efficiently integrate treatment evidence to inform the development of personalized medication plans and enables researchers to analyze advancements and research gaps, accelerating the hypothesis generation phase of scientific discovery and decision-making.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12448786/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145091620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience | Pub Date: 2025-01-06 | DOI: 10.1093/gigascience/giaf116
Thomas Barba, Bryce A Bagley, Sandra Steyaert, Francisco Carrillo-Perez, Christoph Sadée, Michael Iv, Olivier Gevaert
{"title":"DUNE: a versatile neuroimaging encoder captures brain complexity across 3 major diseases: cancer, dementia, and schizophrenia.","authors":"Thomas Barba, Bryce A Bagley, Sandra Steyaert, Francisco Carrillo-Perez, Christoph Sadée, Michael Iv, Olivier Gevaert","doi":"10.1093/gigascience/giaf116","DOIUrl":"10.1093/gigascience/giaf116","url":null,"abstract":"<p><strong>Background: </strong>Magnetic resonance imaging (MRI) of the brain contains complex data that pose significant challenges for computational analysis. While models proposed for brain MRI analyses yield encouraging results, the high complexity of neuroimaging data hinders generalizability and clinical application. We introduce DUNE, a neuroimaging-oriented workflow that transforms raw brain MRI scans into standardized compact patient-level embeddings through integrated preprocessing and deep feature extraction, thereby enabling their processing by basic machine learning algorithms. A UNet-based autoencoder was trained using 3,814 selected scans of morphologically normal (healthy volunteers) or abnormal (glioma patients) brains, to generate comprehensive compact representations of the full-sized images. To evaluate their quality, these embeddings were utilized to train machine learning models to predict a wide range of clinical variables.</p><p><strong>Results: </strong>Embeddings were extracted for cohorts used for the model development (21,102 individuals), along with 3 additional independent cohorts (Alzheimer's disease, schizophrenia, and glioma cohorts, 1,322 individuals), to evaluate the model's generalization capabilities. The embeddings extracted from healthy volunteers' scans could predict a broad spectrum of clinical parameters, including volumetry metrics, cardiovascular disease (area under the receiver operating characteristic curve [AUROC] = 0.80) and alcohol consumption (AUROC = 0.99), and more nuanced parameters such as the Alzheimer's predisposing APOE4 allele (AUROC = 0.67). 
Embeddings derived from the validation cohorts successfully predicted the diagnoses of Alzheimer's dementia (AUROC = 0.92) and schizophrenia (AUROC = 0.64). Embeddings extracted from glioma scans successfully predicted survival (C-index = 0.608) and IDH molecular status (AUROC = 0.92), matching the performances of previous task-oriented models.</p><p><strong>Conclusion: </strong>DUNE efficiently represents clinically relevant patterns from full-size brain MRI scans across several disease areas, opening ways for innovative clinical applications in neurology.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12527335/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145299561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PanGIA: A universal framework for identifying association between ncRNAs and diseases.","authors":"Xiaoyuan Liu, Xiye Lü, Qiuhao Chen, Jiqiu Sun, Tianyi Zhao, Yan Zhu","doi":"10.1093/gigascience/giaf123","DOIUrl":"10.1093/gigascience/giaf123","url":null,"abstract":"<p><strong>Background: </strong>With the growing recognition of the important roles noncoding RNAs (ncRNAs) play in various biological functions, especially their potential involvement in many human diseases, predicting ncRNA-disease associations has become a key challenge in biomedical research.</p><p><strong>Results: </strong>Although many computational methods have been proposed to predict ncRNA-disease associations, most of these methods focus on a single type of ncRNA. However, the competitive and cooperative interactions among different types of ncRNAs are closely related to their functional roles in disease associations. To address this limitation, we propose a novel computational framework, PanGIA (Pan-ncRNA Graph-Interaction Attention network), designed to simultaneously predict potential associations between multiple types of noncoding RNAs, including microRNAs (miRNAs), long noncoding RNAs (lncRNAs), circular RNAs (circRNAs), and PIWI-interacting RNAs (piRNAs), and diseases. Experimental results show that PanGIA outperforms type-specific SOTA methods in both individual and comprehensive predictions. It remains robust even when nodes or ncRNA types are removed, and ablation studies confirm the benefits of cross-type information. PanGIA also outperforms several single-type state-of-the-art methods across multiple metrics.</p><p><strong>Conclusions: </strong>PanGIA demonstrates significant advantages in predicting disease associations for different types of ncRNAs, including miRNAs, lncRNAs, circRNAs, and piRNAs. Case studies further confirm the accuracy of the model's predictions, as all high-confidence associations were supported by literature evidence. 
This demonstrates the model's strong biological interpretability and promising potential for practical applications. The successful application of PanGIA provides a new paradigm for exploring disease-associated ncRNAs, highlighting their immense potential in the field of biomedical research.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12532321/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145307641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience | Pub Date: 2025-01-06 | DOI: 10.1093/gigascience/giae111
Daniel Jacob, François Ehrenmann, Romain David, Joseph Tran, Cathleen Mirande-Ney, Philippe Chaumeil
{"title":"An ecosystem for producing and sharing metadata within the web of FAIR Data.","authors":"Daniel Jacob, François Ehrenmann, Romain David, Joseph Tran, Cathleen Mirande-Ney, Philippe Chaumeil","doi":"10.1093/gigascience/giae111","DOIUrl":"https://doi.org/10.1093/gigascience/giae111","url":null,"abstract":"<p><strong>Background: </strong>Descriptive metadata are vital for reporting, discovering, leveraging, and mobilizing research datasets. However, resolving metadata issues as part of a data management plan can be complex for data producers. To organize and document data, various descriptive metadata must be created. Furthermore, when sharing data, it is important to ensure metadata interoperability in line with FAIR (Findable, Accessible, Interoperable, Reusable) principles. Given the practical nature of these challenges, there is a need for management tools that can assist data managers effectively. Additionally, these tools should meet the needs of data producers and be user-friendly, requiring minimal training.</p><p><strong>Results: </strong>We developed Maggot (Metadata Aggregation on Data Storage), a web-based tool to locally manage a data catalog using high-level metadata. The main goal was to facilitate easy data dissemination and deposition in data repositories. With Maggot, users can easily generate and attach high-level metadata to datasets, allowing for seamless sharing in a collaborative environment. This approach aligns with many data management plans as it effectively addresses challenges related to data organization, documentation, storage, and the sharing of metadata based on FAIR principles within and beyond the collaborative group. 
Furthermore, Maggot enables metadata crosswalks (i.e., generated metadata can be converted to the schema used by a specific data repository or be exported using a format suitable for data collection by third-party applications).</p><p><strong>Conclusion: </strong>The primary purpose of Maggot is to streamline the collection of high-level metadata using carefully chosen schemas and standards. Additionally, it simplifies data accessibility via metadata, typically a requirement for publicly funded projects. As a result, Maggot can be utilized to promote effective local management with the goal of facilitating data sharing while adhering to the FAIR principles. Furthermore, it can contribute to the preparation of the future EOSC FAIR Web of Data within the European Open Science Cloud framework.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11707607/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142947509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GigaScience | Pub Date: 2025-01-06 | DOI: 10.1093/gigascience/giae122
Coline Royaux, Jean-Baptiste Mihoub, Marie Jossé, Dominique Pelletier, Olivier Norvez, Yves Reecht, Anne Fouilloux, Helena Rasche, Saskia Hiltemann, Bérénice Batut, Eléaume Marc, Pauline Seguineau, Guillaume Massé, Alan Amossé, Claire Bissery, Romain Lorrilliere, Alexis Martin, Yves Bas, Thimothée Virgoulay, Valentin Chambon, Elie Arnaud, Elisa Michon, Clara Urfer, Eloïse Trigodet, Marie Delannoy, Gregoire Loïs, Romain Julliard, Björn Grüning, Yvan Le Bras
{"title":"Guidance framework to apply best practices in ecological data analysis: lessons learned from building Galaxy-Ecology.","authors":"Coline Royaux, Jean-Baptiste Mihoub, Marie Jossé, Dominique Pelletier, Olivier Norvez, Yves Reecht, Anne Fouilloux, Helena Rasche, Saskia Hiltemann, Bérénice Batut, Eléaume Marc, Pauline Seguineau, Guillaume Massé, Alan Amossé, Claire Bissery, Romain Lorrilliere, Alexis Martin, Yves Bas, Thimothée Virgoulay, Valentin Chambon, Elie Arnaud, Elisa Michon, Clara Urfer, Eloïse Trigodet, Marie Delannoy, Gregoire Loïs, Romain Julliard, Björn Grüning, Yvan Le Bras","doi":"10.1093/gigascience/giae122","DOIUrl":"10.1093/gigascience/giae122","url":null,"abstract":"<p><p>Numerous conceptual frameworks exist for best practices in research data and analysis (e.g., Open Science and FAIR principles). In practice, there is a need for further progress to improve transparency, reproducibility, and confidence in ecology. Here, we propose a practical and operational framework for researchers and experts in ecology to achieve best practices for building analytical procedures from individual research projects to production-level analytical pipelines. We introduce the concept of atomization to identify analytical steps that support generalization by allowing us to go beyond single analyses. The term atomization is employed to convey the idea of single analytical steps as \"atoms\" composing an analytical procedure. When generalized, \"atoms\" can be used in more than a single case analysis. These guidelines were established during the development of the Galaxy-Ecology initiative, a web platform dedicated to data analysis in ecology. 
Galaxy-Ecology allows us to demonstrate a way to reach higher levels of reproducibility in ecological sciences by increasing the accessibility and reusability of analytical workflows once atomized and generalized.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11816794/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143407005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A high-quality assembly revealing the PMEL gene for the unique plumage phenotype in Liancheng ducks.","authors":"Zhen Wang, Zhanbao Guo, Hongfei Liu, Tong Liu, Dapeng Liu, Simeng Yu, Hehe Tang, He Zhang, Qiming Mou, Bo Zhang, Junting Cao, Martine Schroyen, Shuisheng Hou, Zhengkui Zhou","doi":"10.1093/gigascience/giae114","DOIUrl":"10.1093/gigascience/giae114","url":null,"abstract":"<p><strong>Background: </strong>Plumage coloration is a distinctive trait in ducks, and the Liancheng duck, characterized by its white plumage and black beak and webbed feet, serves as an excellent subject for such studies. However, academic comprehension of the genetic mechanisms underlying duck plumage coloration remains limited. To this end, the Liancheng duck genome (GCA_039998735.1) was hereby de novo assembled using HiFi reads, and F2 segregating populations were generated from Liancheng and Pekin ducks. The aim was to identify the genetic mechanism of white plumage in Liancheng ducks.</p><p><strong>Results: </strong>In this study, 1.29 Gb Liancheng duck genome was de novo assembled, involving a contig N50 of 12.17 Mb and a scaffold N50 of 83.98 Mb. Beyond the epistatic effect of the MITF gene, genome-wide association study analysis pinpointed a 0.8-Mb genomic region encompassing the PMEL gene. This gene encoded a protein specific to pigment cells and was essential for the formation of fibrillar sheets within melanosomes, the organelles responsible for pigmentation. Additionally, linkage disequilibrium analysis revealed 2 candidate single-nucleotide polymorphisms (Chr33: 5,303,994A>G; 5,303,997A>G) that might alter PMEL transcription, potentially influencing plumage coloration in Liancheng ducks.</p><p><strong>Conclusions: </strong>Our study has assembled a high-quality genome for the Liancheng duck and has presented compelling evidence that the white plumage characteristic of this breed is attributable to the PMEL gene. 
Overall, these findings offer significant insights and direction for future studies and breeding programs aimed at understanding and manipulating avian plumage coloration.</p>","PeriodicalId":12581,"journal":{"name":"GigaScience","volume":"14 ","pages":""},"PeriodicalIF":11.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11727711/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142977794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}