GigaScience: latest articles

Knowledge graph-based thought: a knowledge graph-enhanced LLM framework for pan-cancer question answering.
IF 11.8, Zone 2, Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giae082
Yichun Feng, Lu Zhou, Chao Ma, Yikai Zheng, Ruikun He, Yixue Li
Background: In recent years, large language models (LLMs) have shown promise in various domains, notably in biomedical sciences. However, their real-world application is often limited by issues like erroneous outputs and hallucinatory responses.
Results: We developed the knowledge graph-based thought (KGT) framework, an innovative solution that integrates LLMs with knowledge graphs (KGs) to improve their initial responses by utilizing verifiable information from KGs, thus significantly reducing factual errors in reasoning. The KGT framework demonstrates strong adaptability and performs well across various open-source LLMs. Notably, KGT can facilitate the discovery of new uses for existing drugs through potential drug-cancer associations and can assist in predicting resistance by analyzing relevant biomarkers and genetic mechanisms. To evaluate the knowledge graph question answering task within biomedicine, we utilize a pan-cancer knowledge graph to develop a pan-cancer question answering benchmark, named pan-cancer question answering.
Conclusions: The KGT framework substantially improves the accuracy and utility of LLMs in the biomedical field. This study serves as a proof of concept, demonstrating its exceptional performance in biomedical question answering.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11702363/pdf/
Citations: 0
Characteristics and filtering of low-frequency artificial short deletion variations based on nanopore sequencing.
IF 11.8, Zone 2, Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf018
Fuqiang Ye, Juanjuan Zhu, Xiaomin Zhang, Jiarong Zhang, Zihan Xie, Tingting Yang, Yifang Han, Xiaohong Yang, Zilin Ren, Ming Ni
Background: Nanopore sequencing is characterized by high portability and long reads, albeit accompanied by systematic errors causing short deletions. Few tools can filter low-frequency artificial deletions, especially in single samples.
Results: To solve this problem, we first synthesized or purchased 17 DNA/RNA standards for nanopore sequencing with R9 and R10 flowcells to obtain benchmarking datasets. False-positive (FP) deletions were prevalent (75.86%-96.26%), while the majority (62.07%-79.68%) were located in homopolymeric regions. The 10-mer base-quality scores (Q scores) and sequencing speeds flanking the FP homopolymeric deletions marginally differed from the true-positive (TP) deletions. We thus investigated the raw current signals after normalizing them by length. We found more significant differences in current signals between the reads with and without FP deletions. Indexes including the MRPP A (Multiple Response Permutation Procedure, statistic A), the accumulative difference of normalized current signals, and the Q score were tested for the power of distinguishing between FP and TP deletions. MRPP A outperformed the other indexes in homopolymeric regions and achieved the highest accuracy of 76.73% for challenging 1-base homopolymeric deletions. When sequencing depth was low, the Q score performed better than MRPP A. We developed Delter (Deletion filter) to filter low-frequency FP deletions of nanopore sequencing in single samples, which removed 60.98% to 100% of artificial homopolymeric deletions in real samples.
Conclusions: Low-frequency artificial short deletion variations, especially the most challenging homopolymeric deletions, could be effectively filtered by Delter using normalized current signals or Q scores according to the employed sequencing strategies.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11927395/pdf/
Citations: 0
Healthy microbiome-moving towards functional interpretation.
IF 11.8, Zone 2, Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf015
Kinga Zielińska, Klas I Udekwu, Witold Rudnicki, Alina Frolova, Paweł P Łabaj
Background: Microbiome-based disease prediction has significant potential as an early, noninvasive marker of multiple health conditions linked to dysbiosis of the human gut microbiota, thanks in part to decreasing sequencing and analysis costs. Microbiome health indices and other computational tools currently proposed in the field are often based on a microbiome's species richness and are completely reliant on taxonomic classification. A resurgent interest in a metabolism-centric, ecological approach has led to an increased understanding of microbiome metabolic and phenotypic complexity, revealing substantial restrictions of taxonomy-reliant approaches.
Findings: In this study, we introduce a new metagenomic health index developed as an answer to recent developments in microbiome definitions, in an effort to distinguish between healthy and unhealthy microbiomes, focusing here on inflammatory bowel disease (IBD). The novelty of our approach is a shift from a traditional Linnaean phylogenetic classification toward a more holistic consideration of the metabolic functional potential underlying ecological interactions between species. Based on well-explored data cohorts, we compare our method and its performance with the most comprehensive indices to date, the taxonomy-based Gut Microbiome Health Index (GMHI) and the high-dimensional principal component analysis (hiPCA) methods, as well as with the standard taxon- and function-based Shannon entropy scoring. After demonstrating better performance than other methods on the initially targeted IBD cohorts, we retrain our index on an additional 27 datasets obtained from different clinical conditions and validate our index's ability to distinguish between healthy and disease states using a variety of complementary benchmarking approaches. Finally, we demonstrate its superiority over the GMHI and the hiPCA on a longitudinal COVID-19 cohort and highlight the distinct robustness of our method to sequencing depth.
Conclusions: Overall, we emphasize the potential of this metagenomic approach and advocate a shift toward functional approaches to better understand and assess microbiome health as well as provide directions for future index enhancements. Our method, q2-predict-dysbiosis (Q2PD), is freely available (https://github.com/Kizielins/q2-predict-dysbiosis).
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11927397/pdf/
Citations: 0
A telomere-to-telomere phased genome of an octoploid strawberry reveals a receptor kinase conferring anthracnose resistance.
IF 11.8, Zone 2, Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf005
Hyeondae Han, Natalia Salinas, Christopher R Barbey, Yoon Jeong Jang, Zhen Fan, Sujeet Verma, Vance M Whitaker, Seonghee Lee
Background: Cultivated strawberry (Fragaria × ananassa Duch.), an allo-octoploid species arising from at least 3 diploid progenitors, poses a challenge for genomic analysis due to its high levels of heterozygosity and the complex nature of its polyploid genome.
Results: This study developed the complete haplotype-phased genome sequence from a short-day strawberry, 'Florida Brilliance', without parental data, assembling 56 chromosomes from telomere to telomere. This assembly was achieved with high-fidelity long reads and high-throughput chromatin conformation capture sequencing (Hi-C). The centromere core regions and 96,104 genes were annotated using long-read isoform RNA sequencing. Using the high quality of the haplotype-phased reference genome, FaFB1, we identified the causal mutation within the gene encoding Leaf Rust 10 Disease-Resistance Locus Receptor-like Protein Kinase (LRK10) that confers resistance to anthracnose fruit rot (AFR). This disease is caused by the Colletotrichum acutatum species complex and results in significant economic losses in strawberry production. Comparison of resistant and susceptible haplotype assemblies and full-length transcript data revealed a 29-bp insertion at the first exon of the susceptible allele, leading to a premature stop codon and loss of gene function. The functional role of LRK10 in resistance to AFR was validated using a simplified Agrobacterium-based transformation method for transient gene expression analysis in strawberry fruits. Transient knockdown and overexpression of LRK10 in fruit indicate a key role for LRK10 in AFR resistance in strawberry.
Conclusions: The FaFB1 assembly along with other resources will be valuable for the discovery of additional candidate genes associated with disease resistance and fruit quality, which will not only advance our understanding of genes and their functions but also facilitate advancements in genome editing in strawberry.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11899574/pdf/
Citations: 0
Observational, causal relationship and shared genetic basis between cholelithiasis and gastroesophageal reflux disease: evidence from a cohort study and comprehensive genetic analysis.
IF 11.8, Zone 2, Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf023
Yanlin Lyu, Shuangshuang Tong, Wentao Huang, Yuying Ma, Ruijie Zeng, Rui Jiang, Ruibang Luo, Felix W Leung, Qizhou Lian, Weihong Sha, Hao Chen
Objective: Cholelithiasis and gastroesophageal reflux disease (GERD) contribute to significant health concerns. We aimed to investigate the potential observational, causal, and genetic relationships between cholelithiasis and GERD.
Design: The observational correlations were assessed based on the prospective cohort study from UK Biobank. Then, by leveraging the genome-wide summary statistics of cholelithiasis (N = 334,277) and GERD (N = 332,601), the bidirectional causal associations were evaluated using Mendelian randomization (MR) analysis. Subsequently, a series of genetic analyses was used to assess the genetic correlation, shared loci, and genes between cholelithiasis and GERD.
Results: The prospective cohort analyses revealed a significantly increased risk of GERD in individuals with cholelithiasis (hazard ratio [HR] = 1.99; 95% confidence interval [CI], 1.89-2.10) and a higher risk of cholelithiasis among patients with GERD (HR = 2.30; 95% CI, 2.18-2.44). The MR study indicated the causal effect of genetic liability to cholelithiasis on the incidence of GERD (odds ratio [OR] = 1.08; 95% CI, 1.05-1.11) and the causal effect of genetically predicted GERD on cholelithiasis (OR = 1.15; 95% CI, 1.02-1.31). In addition, cholelithiasis and GERD exhibited a strong genetic association. Cross-trait meta-analyses identified 5 novel independent loci shared between cholelithiasis and GERD. Three shared genes, including SUN2, CBY1, and JOSD1, were further identified as novel risk genes.
Conclusion: The elucidation of the shared genetic basis underlying the phenotypic relationship of these 2 complex phenotypes offers new insights into the intrinsic linkage between cholelithiasis and GERD, providing a novel research direction for future therapeutic strategy and risk prediction.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11943489/pdf/
Citations: 0
EnrichDO: a global weighted model for Disease Ontology enrichment analysis.
IF 11.8, Zone 2, Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf021
Haixiu Yang, Hongyu Fu, Meiyi Zhang, Yangyang Liu, Yongqun Oliver He, Chao Wang, Liang Cheng
Background: Disease Ontology (DO) has been widely studied in biomedical research and clinical practice to describe the roles of genes. DO enrichment analysis is an effective means to discover associations between genes and diseases. Compared to hundreds of Gene Ontology (GO)-based enrichment analysis methods, however, DO-based methods are relatively scarce, and most current DO-based approaches are term-for-term and thus are unable to solve over-enrichment problems caused by the "true-path" rule.
Results: Here, we describe a novel double-weighted model, EnrichDO, which leverages the latest annotations of the human genome with DO terms and integrates DO graph topology on a global scale. Compared to classic enrichment methods (mainly for GO) and existing DO-based enrichment tools, EnrichDO performs better in both GO and DO enrichment analysis cases. It can accurately identify more specific terms, without ignoring the truly associated parent terms, as shown in the Alzheimer's disease (AD) case (AD ranked first). Moreover, both a simulated test and a data perturbation test validate the accuracy and robustness of EnrichDO. Finally, EnrichDO is applied to other types of datasets to expand its application, including gene expression profile datasets, a host gene set of microorganisms, and hallmark gene sets. Based on the findings reported here, EnrichDO shows significant improvement via all experimental results.
Conclusions: EnrichDO provides an effective DO enrichment analysis model for gaining insight into the significance of a particular gene set in the context of disease. To increase the usability of EnrichDO, we have developed an R-based software package, which is freely available through Bioconductor (https://bioconductor.org/packages/release/bioc/html/EnrichDO.html) or at https://github.com/liangcheng-hrbmu/EnrichDO.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11945307/pdf/
Citations: 0
High-fidelity wheat plant reconstruction using 3D Gaussian splatting and neural radiance fields.
IF 11.8, Zone 2, Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf022
Lewis A G Stuart, Darren M Wells, Jonathan A Atkinson, Simon Castle-Green, Jack Walker, Michael P Pound
Background: The reconstruction of 3-dimensional (3D) plant models can offer advantages over traditional 2-dimensional approaches by more accurately capturing the complex structure and characteristics of different crops. Conventional 3D reconstruction techniques often produce sparse or noisy representations of plants using software or are expensive to capture in hardware. Recently, view synthesis models have been developed that can generate detailed 3D scenes, and even 3D models, from only RGB images and camera poses. These models offer unparalleled accuracy but are currently data hungry, requiring large numbers of views with very accurate camera calibration.
Results: In this study, we present a view synthesis dataset comprising 20 individual wheat plants captured across 6 different time frames over a 15-week growth period. We develop a camera capture system using 2 robotic arms combined with a turntable, controlled by a re-deployable and flexible image capture framework. We trained each plant instance using two recent view synthesis models: 3D Gaussian splatting (3DGS) and neural radiance fields (NeRF). Our results show that both 3DGS and NeRF produce high-fidelity reconstructed images of a plant subject from views not captured in the initial training sets. We also show that these approaches can be used to generate accurate 3D representations of these plants as point clouds, with 0.74-mm and 1.43-mm average accuracy compared with a handheld scanner for 3DGS and NeRF, respectively.
Conclusion: We believe that these new methods will be transformative in the field of 3D plant phenotyping, plant reconstruction, and active vision. To further this cause, we release all robot configuration and control software, alongside our extensive multiview dataset. We also release all scripts necessary to train both 3DGS and NeRF, all trained model data, and final 3D point cloud representations. Our dataset can be accessed via https://plantimages.nottingham.ac.uk/ or https://doi.org/10.5524/102661. Our software can be accessed via https://github.com/Lewis-Stuart-11/3D-Plant-View-Synthesis.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11945317/pdf/
Citations: 0
Opaque ontology: neuroimaging classification of ICD-10 diagnostic groups in the UK Biobank.
IF 11.8, Zone 2, Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giae119
Ty Easley, Xiaoke Luo, Kayla Hannon, Petra Lenzini, Janine Bijsterbosch
Background: The use of machine learning to classify diagnostic cases versus controls defined based on diagnostic ontologies such as the International Classification of Diseases, Tenth Revision (ICD-10) from neuroimaging features is now commonplace across a wide range of diagnostic fields. However, transdiagnostic comparisons of such classifications are lacking. Such transdiagnostic comparisons are important to establish the specificity of classification models, set benchmarks, and assess the value of diagnostic ontologies.
Results: We investigated case-control classification accuracy in 17 different ICD-10 diagnostic groups from Chapter V (mental and behavioral disorders) and Chapter VI (diseases of the nervous system) using data from the UK Biobank. Classification models were trained using either neuroimaging (structural or functional brain magnetic resonance imaging feature sets) or sociodemographic features. Random forest classification models were adopted using rigorous shuffle-splits to estimate stability as well as accuracy of case-control classifications. Diagnostic classification accuracies were benchmarked against age classification (oldest vs. youngest) from the same feature sets and against additional classifier types (k-nearest neighbors and linear support vector machine). In contrast to age classification accuracy, which was high for all feature sets, few ICD-10 diagnostic groups were classified significantly above chance (namely, demyelinating diseases based on structural neuroimaging features and depression based on sociodemographic and functional neuroimaging features).
Conclusion: These findings highlight challenges with the current disease classification system, leading us to recommend caution with the use of ICD-10 diagnostic groups as target labels in brain-based disease prediction studies.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11811528/pdf/
Citations: 0
Enhancing inbreeding estimation and global conservation insights through chromosome-level assemblies of the Chinese and Malayan pangolin.
IF 11.8, Zone 2, Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf003
Tianming Lan, Yinping Tian, Minhui Shi, Boyang Liu, Yu Lin, Yanling Xia, Yue Ma, Sunil Kumar Sahu, Qing Wang, Jun Li, Jin Chen, Fanghui Hou, Chuanling Yin, Kai Wang, Yuan Fu, Tengcheng Que, Wenjian Liu, Huan Liu, Haimeng Li, Yan Hua
A high-quality reference genome coupled with resequencing data is a promising strategy to address issues in conservation genomics. This has greatly enhanced the development of conservation plans for endangered species. Pangolins are fascinating animals with a variety of unique features. Unfortunately, they are the most trafficked wild animal in the world. In this study, we assembled a chromosome-scale genome with HiFi long reads and Hi-C short reads for the Chinese and Malayan pangolin and provided two new representative reference genomes for the pangolin species. We found a great improvement in the evaluation of genetic diversity and inbreeding based on these high-quality genomes and obtained different results for the detection of genome-wide extinction risks compared with genomes assembled using short reads. Moderate inbreeding and genetic diversity were reverified in these two pangolin species, except for one Malayan pangolin population with high inbreeding and low genetic diversity. Moreover, we identified a much higher inbreeding level (FROH = 0.54) in the Chinese pangolin individual from Taiwan Province compared with that from Mainland China, but more than 99.6% of runs of homozygosity (ROH) fragments were restricted to less than 1 Mb, indicating that the high FROH in Taiwan Chinese pangolins may have accumulated from historical inbreeding events. Furthermore, our study is the first to detect relatively mild genetic purging in pangolin populations. These two high-quality reference genomes will provide valuable genetic resources for future studies and contribute to the protection and conservation of pangolins.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11825179/pdf/
Citations: 0
A novel dataset for nuclei and tissue segmentation in melanoma with baseline nuclei segmentation and tissue segmentation benchmarks.
IF 11.8, Zone 2, Biology
GigaScience Pub Date : 2025-01-06 DOI: 10.1093/gigascience/giaf011
Mark Schuiveling, Hong Liu, Daniel Eek, Gerben E Breimer, Karijn P M Suijkerbuijk, Willeke A M Blokx, Mitko Veta
Background: Melanoma is an aggressive form of skin cancer in which tumor-infiltrating lymphocytes (TILs) are a biomarker for recurrence and treatment response. Manual TIL assessment is prone to interobserver variability, and current deep learning models are not publicly accessible or have low performance. Deep learning models, however, have the potential of consistent spatial evaluation of TILs and other immune cell subsets with the potential of improved prognostic and predictive value. To make the development of these models possible, we created the Panoptic Segmentation of nUclei and tissue in advanced MelanomA (PUMA) dataset and assessed the performance of several state-of-the-art deep learning models. In addition, we show how to improve model performance further by using heuristic postprocessing in which nuclei classes are updated based on their tissue localization.
Results: The PUMA dataset includes 155 primary and 155 metastatic melanoma hematoxylin and eosin-stained regions of interest with nuclei and tissue annotations from a single melanoma referral institution. The Hover-NeXt model, trained on the PUMA dataset, demonstrated the best performance for lymphocyte detection, approaching human interobserver agreement. In addition, heuristic postprocessing of deep learning models improved the detection of noncommon classes, such as epithelial nuclei.
Conclusion: The PUMA dataset is the first melanoma-specific dataset that can be used to develop melanoma-specific nuclei and tissue segmentation models. These models can, in turn, be used for prognostic and predictive biomarker development. Incorporating tissue and nuclei segmentation is a step toward improved deep learning nuclei segmentation performance. To support the development of these models, this dataset is used in the PUMA challenge.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11837757/pdf/
Citations: 0