GigaScience: Latest Articles

Interspecific hybridization in Brassica species leads to changes in agronomic traits through the regulation of gene expression by chromatin accessibility and DNA methylation.
IF 11.8, Zone 2 (Biology)
GigaScience, vol. 14. Pub Date: 2025-01-06. DOI: 10.1093/gigascience/giaf029
Chengtao Quan, Qin Zhang, Xiaoni Zhang, Kexin Chai, Guoting Cheng, Chaozhi Ma, Cheng Dai
Abstract: Interspecific hybridization is a common method in plant breeding to combine traits from different species, resulting in allopolyploidization and significant genetic and epigenetic changes. However, our understanding of genome-wide chromatin and gene expression dynamics during allopolyploidization remains limited. This study generated two Brassica allotriploid hybrids via interspecific hybridization. We observed that accessible chromatin regions (ACRs) and DNA methylation interact to regulate gene expression after interspecific hybridization, ultimately influencing the agronomic traits of the hybrids. In total, 234,649 ACRs were identified in the parental lines and hybrids; hybridization changed the distribution and abundance of ACRs, particularly within and near genes. Genes associated with proximal ACRs were more highly expressed than those associated with distal and genic ACRs. More than half of the novel ACRs drove transgressive gene expression in the hybrids, and the transgressively upregulated genes showed significant enrichment in metal ion binding, especially magnesium ion, calcium ion, and potassium ion binding. We also identified Bna.bZIP11 in the single-parent activation ACR, which binds to BnaA06.UF3GT to promote anthocyanin accumulation in F1 hybrids. DNA methylation plays a role in repressing gene expression, and unmethylated ACRs are more transcriptionally active. Additionally, the A-subgenome ACRs were associated with genome dosage rather than DNA methylation. The interplay among DNA methylation, transposable elements, and sRNA contributes to the dynamic landscape of ACRs during interspecific hybridization, resulting in distinct gene expression patterns on the genome.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12012897/pdf/
Citations: 0
Guidance framework to apply best practices in ecological data analysis: lessons learned from building Galaxy-Ecology.
IF 11.8, Zone 2 (Biology)
GigaScience, vol. 14. Pub Date: 2025-01-06. DOI: 10.1093/gigascience/giae122
Coline Royaux, Jean-Baptiste Mihoub, Marie Jossé, Dominique Pelletier, Olivier Norvez, Yves Reecht, Anne Fouilloux, Helena Rasche, Saskia Hiltemann, Bérénice Batut, Eléaume Marc, Pauline Seguineau, Guillaume Massé, Alan Amossé, Claire Bissery, Romain Lorrilliere, Alexis Martin, Yves Bas, Thimothée Virgoulay, Valentin Chambon, Elie Arnaud, Elisa Michon, Clara Urfer, Eloïse Trigodet, Marie Delannoy, Gregoire Loïs, Romain Julliard, Björn Grüning, Yvan Le Bras
Abstract: Numerous conceptual frameworks exist for best practices in research data and analysis (e.g., Open Science and FAIR principles). In practice, there is a need for further progress to improve transparency, reproducibility, and confidence in ecology. Here, we propose a practical and operational framework for researchers and experts in ecology to achieve best practices for building analytical procedures from individual research projects to production-level analytical pipelines. We introduce the concept of atomization to identify analytical steps that support generalization by allowing us to go beyond single analyses. The term atomization is employed to convey the idea of single analytical steps as "atoms" composing an analytical procedure. When generalized, "atoms" can be used in more than a single case analysis. These guidelines were established during the development of the Galaxy-Ecology initiative, a web platform dedicated to data analysis in ecology. Galaxy-Ecology allows us to demonstrate a way to reach higher levels of reproducibility in ecological sciences by increasing the accessibility and reusability of analytical workflows once atomized and generalized.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11816794/pdf/
Citations: 0
A high-quality assembly revealing the PMEL gene for the unique plumage phenotype in Liancheng ducks.
IF 11.8, Zone 2 (Biology)
GigaScience, vol. 14. Pub Date: 2025-01-06. DOI: 10.1093/gigascience/giae114
Zhen Wang, Zhanbao Guo, Hongfei Liu, Tong Liu, Dapeng Liu, Simeng Yu, Hehe Tang, He Zhang, Qiming Mou, Bo Zhang, Junting Cao, Martine Schroyen, Shuisheng Hou, Zhengkui Zhou
Background: Plumage coloration is a distinctive trait in ducks, and the Liancheng duck, characterized by its white plumage and black beak and webbed feet, serves as an excellent subject for such studies. However, the genetic mechanisms underlying duck plumage coloration remain poorly understood. To this end, the Liancheng duck genome (GCA_039998735.1) was de novo assembled using HiFi reads, and F2 segregating populations were generated from Liancheng and Pekin ducks. The aim was to identify the genetic mechanism of white plumage in Liancheng ducks.
Results: In this study, a 1.29-Gb Liancheng duck genome was de novo assembled, with a contig N50 of 12.17 Mb and a scaffold N50 of 83.98 Mb. Beyond the epistatic effect of the MITF gene, genome-wide association analysis pinpointed a 0.8-Mb genomic region encompassing the PMEL gene. This gene encodes a protein specific to pigment cells and is essential for the formation of fibrillar sheets within melanosomes, the organelles responsible for pigmentation. Additionally, linkage disequilibrium analysis revealed 2 candidate single-nucleotide polymorphisms (Chr33: 5,303,994A>G; 5,303,997A>G) that might alter PMEL transcription, potentially influencing plumage coloration in Liancheng ducks.
Conclusions: Our study has assembled a high-quality genome for the Liancheng duck and has presented compelling evidence that the white plumage characteristic of this breed is attributable to the PMEL gene. Overall, these findings offer significant insights and direction for future studies and breeding programs aimed at understanding and manipulating avian plumage coloration.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11727711/pdf/
Citations: 0
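The Liancheng duck entry above reports a contig N50 and a scaffold N50 without defining the statistic. As a minimal sketch, N50 is the length L such that sequences of length L or longer together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions, not values or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # assembly is the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))  # prints 70
```

Here the total is 290, so half is 145; the two longest contigs (80 + 70 = 150) already reach it, which makes 70 the N50.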
NanoMnT: an STR analysis tool for Oxford Nanopore sequencing data driven by a comprehensive analysis of error profile in STR regions.
IF 11.8, Zone 2 (Biology)
GigaScience, vol. 14. Pub Date: 2025-01-06. DOI: 10.1093/gigascience/giaf013
Gyumin Park, Hyunsu An, Han Luo, Jihwan Park
Abstract: Oxford Nanopore Technology (ONT) sequencing is a third-generation sequencing technology that enables cost-effective long-read sequencing, with broad applications in biological research. However, its high sequencing error rate in low-complexity regions hampers its applications in short tandem repeat (STR)-related research. To address this, we generated a comprehensive STR error profile of ONT by analyzing publicly available Nanopore sequencing datasets. We show that the sequencing error rate is influenced not only by STR length but also by the repeat unit and the flanking sequences of STR regions. Interestingly, certain flanking sequences were associated with higher sequencing accuracy, suggesting that certain STR loci are more suitable for Nanopore sequencing than others. While base quality scores of substitution errors within the STR regions were lower than those of correctly sequenced bases, such patterns were not observed for indel errors. Furthermore, choosing the most recent basecaller version and using the super accuracy model significantly improved STR sequencing accuracy. Finally, we present NanoMnT, a lightweight Python tool that corrects STR sequencing errors in sequencing data and estimates STR allele sizes. NanoMnT leverages the characteristics of ONT when estimating STR allele size and exhibits superior results for 1-bp and 2-bp repeat STRs compared to existing tools. By integrating our findings, we improved STR allele estimation accuracy for Ax10 repeats from 55% to 78% and up to 85% when excluding loci with unfavorable flanking sequences. Using NanoMnT, we present the utility of our findings by identifying microsatellite instability status in cancer sequencing data. NanoMnT is publicly available at https://github.com/18parkky/NanoMnT.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11912559/pdf/
Citations: 0
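To make the terms in the NanoMnT entry concrete (1-bp and 2-bp repeat units, "Ax10" meaning ten consecutive A's), the following sketch locates perfect tandem repeats in a sequence with a regular expression. It is not NanoMnT's algorithm or API: the function `find_strs`, its parameters, and the example sequences are hypothetical, and real ONT reads would need error-tolerant matching.

```python
import re


def find_strs(seq, unit_len=1, min_repeats=5):
    """Return (start, unit, n_repeats) for perfect tandem repeats.

    Only exact repeats of a unit_len-bp unit are reported; sequencing
    errors inside a repeat would split or shorten the match.
    """
    if unit_len < 1 or min_repeats < 2:
        raise ValueError("unit_len must be >= 1 and min_repeats >= 2")
    # A captured unit followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        # Skip homopolymers reported as longer units, e.g. "AA" repeated.
        if unit_len > 1 and len(set(unit)) == 1:
            continue
        hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits


# Hypothetical sequence containing an Ax10 repeat with short flanks.
print(find_strs("GCT" + "A" * 10 + "CGT"))                 # [(3, 'A', 10)]
# Hypothetical sequence containing an (AC)x6 dinucleotide repeat.
print(find_strs("TT" + "AC" * 6 + "GG", unit_len=2))       # [(2, 'AC', 6)]
```

The repeat count returned for each hit is the simplest possible allele-size estimate for a single error-free read; the abstract's point is that on real ONT data this estimate also depends on the repeat unit, flanking sequence, and basecaller.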
Genomes reveal pervasive distant hybridization in nature among cyprinid fishes.
IF 11.8, Zone 2 (Biology)
GigaScience, vol. 14. Pub Date: 2025-01-06. DOI: 10.1093/gigascience/giae117
Li Ren, Xiaolong Tu, Mengxue Luo, Qizhi Liu, Jialin Cui, Xin Gao, Hong Zhang, Yakui Tai, Yiyan Zeng, Mengdan Li, Chang Wu, Wuhui Li, Jing Wang, Dongdong Wu, Shaojun Liu
Background: Genomic data have unveiled a fascinating aspect of the evolutionary past, showing that the mingling of different species through hybridization has left its mark on the histories of numerous life forms. However, the relationship between hybridization events and the origins of cyprinid fishes remains unclear.
Results: In this study, we generated de novo assembled genomes of 8 cyprinid fishes and conducted phylogenetic analyses on 24 species. Widespread allele sharing across species boundaries was observed within 7 subfamilies of cyprinid fishes. Based on a systematic analysis of multiple tissues, we found that the testis exhibited a conserved pattern of divergence between the herbivorous Megalobrama amblycephala and the carnivorous Culter alburnus, suggesting a potential link to incomplete reproductive isolation. Significant differences in the expression of 4 genes (dpp2, ctrl, psb7, and ppce) in the liver and intestine, accompanied by variations in enzyme activities, indicated swift divergence in digestive enzyme secretion. Moreover, we identified introgressed genes linked to organ development in sympatric fishes with analogous feeding habits within the Cultrinae and Leuciscinae subfamilies.
Conclusions: Our findings highlight the significant role played by incomplete reproductive isolation and frequent gene flow events, particularly those associated with the development of digestive organs, in driving speciation among cyprinid fishes in diverse freshwater ecosystems.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11779505/pdf/
Citations: 0
Correction to: Habitat suitability maps for Australian flora and fauna under CMIP6 climate scenarios.
IF 11.8, Zone 2 (Biology)
GigaScience, vol. 14. Pub Date: 2025-01-06. DOI: 10.1093/gigascience/giae031
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11880536/pdf/
Citations: 0
TransHLA: a Hybrid Transformer model for HLA-presented epitope detection.
IF 11.8, Zone 2 (Biology)
GigaScience, vol. 14. Pub Date: 2025-01-06. DOI: 10.1093/gigascience/giaf008
Tianchi Lu, Xueying Wang, Wan Nie, Miaozhe Huo, Shuaicheng Li
Background: Precise prediction of epitope presentation on human leukocyte antigen (HLA) molecules is crucial for advancing vaccine development and immunotherapy. Conventional HLA-peptide binding affinity prediction tools often focus on specific alleles and lack a universal approach for comprehensive HLA site analysis. This limitation hinders efficient filtering of invalid peptide segments.
Results: We introduce TransHLA, a pioneering tool designed for epitope prediction across all HLA alleles, integrating Transformer and Residue CNN architectures. TransHLA utilizes the ESM2 large language model for sequence and structure embeddings, achieving high predictive accuracy. For HLA class I, it reaches an accuracy of 84.72% and an area under the curve (AUC) of 91.95% on IEDB test data. For HLA class II, it achieves 79.94% accuracy and an AUC of 88.14%. Our case studies using datasets like CEDAR and VDJdb demonstrate that TransHLA surpasses existing models in specificity and sensitivity for identifying immunogenic epitopes and neoepitopes.
Conclusions: TransHLA significantly enhances vaccine design and immunotherapy by efficiently identifying broadly reactive peptides. Our resources, including data and code, are publicly accessible at https://github.com/SkywalkerLuke/TransHLA.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11878767/pdf/
Citations: 0
Cerebellocerebral connectivity predicts body mass index: a new open-source Python-based framework for connectome-based predictive modeling.
IF 11.8, Zone 2 (Biology)
GigaScience, vol. 14. Pub Date: 2025-01-06. DOI: 10.1093/gigascience/giaf010
Tobias Bachmann, Karsten Mueller, Simon N A Kusnezow, Matthias L Schroeter, Paolo Piaggi, Christopher M Weise
Background: The cerebellum is one of the major central nervous structures consistently altered in obesity. Its role in higher cognitive function, parts of which are affected by obesity, is mediated through projections to and from the cerebral cortex. We therefore investigated the relationship between body mass index (BMI) and cerebellocerebral connectivity.
Methods: We utilized the Human Connectome Project's Young Adults dataset, including functional magnetic resonance imaging (fMRI) and behavioral data, to perform connectome-based predictive modeling (CPM) restricted to cerebellocerebral connectivity of resting-state fMRI and task-based fMRI. We developed a Python-based open-source framework to perform CPM, a data-driven technique with built-in cross-validation to establish brain-behavior relationships. Significance was assessed with permutation analysis.
Results: We found that (i) cerebellocerebral connectivity predicted BMI, (ii) task-general cerebellocerebral connectivity predicted BMI more reliably than resting-state fMRI and individual task-based fMRI separately, (iii) predictive networks derived this way overlapped with established functional brain networks (namely, frontoparietal networks, the somatomotor network, the salience network, and the default mode network), and (iv) there was an inverse overlap between networks predictive of BMI and networks predictive of cognitive measures adversely affected by overweight/obesity.
Conclusions: Our results suggest obesity-specific alterations in cerebellocerebral connectivity, specifically with regard to task execution. With brain areas and brain networks relevant to task performance implicated, these alterations seem to reflect a neurobiological substrate for task performance adversely affected by obesity.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11899596/pdf/
Citations: 0
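The Methods of the entry above state that significance was assessed with permutation analysis. The sketch below shows the general idea on a predicted-versus-observed correlation: shuffle the observed values many times and count how often the shuffled correlation is at least as large as the real one. This is a simplification and an assumption, not the authors' framework: in CPM the whole cross-validated pipeline is typically re-run on each shuffled behavioral vector, and the names `pearson` and `permutation_p_value` as well as the toy data are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def permutation_p_value(predicted, observed, n_perm=1000, seed=0):
    """One-sided permutation p-value for the predicted/observed correlation."""
    rng = random.Random(seed)          # seeded for reproducibility
    actual = pearson(predicted, observed)
    shuffled = list(observed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # break the pairing with the predictions
        if pearson(predicted, shuffled) >= actual:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return actual, (hits + 1) / (n_perm + 1)


# Hypothetical predicted and observed scores, for illustration only.
predicted = [21.0, 23.5, 24.0, 26.5, 28.0, 30.5, 31.0, 33.5]
observed = [20.0, 24.0, 23.0, 27.0, 29.0, 29.5, 32.0, 34.0]
r, p = permutation_p_value(predicted, observed)
print(f"r = {r:.3f}, permutation p = {p:.4f}")
```

The add-one correction reflects that the observed arrangement is itself one of the possible permutations, so the smallest attainable p-value is 1 / (n_perm + 1).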
HVSeeker: a deep-learning-based method for identification of host and viral DNA sequences.
IF 11.8, Zone 2 (Biology)
GigaScience, vol. 14. Pub Date: 2025-01-06. DOI: 10.1093/gigascience/giaf037
Abdullatif Al-Najim, Sven Hauns, Van Dinh Tran, Rolf Backofen, Omer S Alkhnbashi
Background: Bacteriophages are among the most abundant organisms on Earth, significantly impacting ecosystems and human society. The identification of viral sequences, especially novel ones, from mixed metagenomes is a critical first step in analyzing the viral components of host samples. This plays a key role in many downstream tasks. However, this is a challenging task due to their rapid evolution rate. The identification process typically involves two steps: distinguishing viral sequences from the host and identifying if they come from novel viral genomes. Traditional metagenomic techniques that rely on sequence similarity with known entities often fall short, especially when dealing with short or novel genomes. Meanwhile, deep learning has demonstrated its efficacy across various domains, including the bioinformatics field.
Results: We have developed HVSeeker, a host/virus seeker method based on deep learning, to distinguish between bacterial and phage sequences. HVSeeker consists of two separate models: one analyzing DNA sequences and the other focusing on proteins. In addition to the robust architecture of HVSeeker, three distinct preprocessing methods were introduced to enhance the learning process: padding, contigs assembly, and sliding window. This method has shown promising results on sequences with various lengths, ranging from 200 to 1,500 base pairs. Tested on both NCBI and IMGVR databases, HVSeeker outperformed several methods from the literature such as Seeker, Rnn-VirSeeker, DeepVirFinder, and PPR-Meta. Moreover, when compared with other methods on benchmark datasets, HVSeeker has shown better performance, establishing its effectiveness in identifying unknown phage genomes.
Conclusions: These results demonstrate the exceptional structure of HVSeeker, which encompasses both the preprocessing methods and the model design. The advancements provided by HVSeeker are significant for identifying viral genomes and developing new therapeutic approaches, such as phage therapy. Therefore, HVSeeker serves as an essential tool in prokaryotic and phage taxonomy, offering a crucial first step toward analyzing the host-viral component of samples by identifying the host and viral sequences in mixed metagenomes.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12080225/pdf/
Citations: 0
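The HVSeeker entry above names padding and sliding window among its preprocessing methods but gives no parameters. A minimal sketch of how such preprocessing could produce fixed-length inputs for a sequence model follows. It is not HVSeeker's code: the function `sliding_windows`, the window and step sizes, and the use of "N" as the padding character are all assumptions made for illustration.

```python
def sliding_windows(seq, window=1000, step=500, pad_char="N"):
    """Cut a DNA sequence into fixed-length chunks.

    Sequences no longer than the window are right-padded with pad_char;
    longer ones are split into overlapping windows, with one extra window
    aligned to the end so the tail is always covered.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if len(seq) <= window:
        return [seq.ljust(window, pad_char)]      # padding case
    chunks = [seq[start:start + window]
              for start in range(0, len(seq) - window + 1, step)]
    if (len(seq) - window) % step != 0:
        chunks.append(seq[-window:])              # cover the remaining tail
    return chunks


# Hypothetical toy sequences; real inputs would be hundreds of bases long.
print(sliding_windows("ACGTACGTAC", window=4, step=3))  # ['ACGT', 'TACG', 'GTAC']
print(sliding_windows("ACG", window=5, step=2))         # ['ACGNN']
```

Every returned chunk has exactly `window` characters, which is the property a fixed-input-size network needs; the overlap between consecutive windows is controlled by `step`.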
Telomere-to-telomere genome of common bean (Phaseolus vulgaris L., YP4).
IF 11.8, Zone 2 (Biology)
GigaScience, vol. 14. Pub Date: 2025-01-06. DOI: 10.1093/gigascience/giaf001
Yan Wang, Xiaopeng Hao, Chunhai Chen, Haigang Wang, Peng Gao, Xukui Yang, Xue Dong, Huibin Qin, Meng Li, Sen Hou, Jianbo Jian, Jianwu Chang, Jing Wu, Zhixin Mu
Background: Common bean is a significant grain legume in human diets. However, the lack of a complete reference genome for common beans has hindered efforts to improve agronomic cultivars.
Findings: Herein, we present the first telomere-to-telomere (T2T) genome assembly of common bean (Phaseolus vulgaris L., YP4) using PacBio High-Fidelity reads, ONT ultra-long sequencing, and Hi-C technologies. The assembly resulted in a genome size of 560.30 Mb with an N50 of 55.11 Mb, exhibiting high completeness and accuracy (BUSCO score: 99.5%, quality value (QV): 54.86). The sequences were anchored into 11 chromosomes, with 20 of 22 telomeres identified, leading to the formation of 9 T2T pseudomolecules. Furthermore, we identified repetitive elements accounting for 61.20% of the genome and predicted 29,925 protein-coding genes. Phylogenetic analysis suggested an estimated divergence time of approximately 11.6 million years ago between P. vulgaris and Vigna angularis. Comparative genome analysis revealed the expanded gene families and variations between YP4 and G19833 associated with defense response.
Conclusions: The T2T reference genome and genomic insights presented here are crucial for future genetic studies not only in common bean but also in other legumes.
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12077395/pdf/
Citations: 0