GigaScience: Latest Articles

Chromosome-level genome of the poultry shaft louse Menopon gallinae provides insight into the host-switching and adaptive evolution of parasitic lice.
IF 3.5, Region 2, Biology
GigaScience Pub Date : 2024-01-02 DOI: 10.1093/gigascience/giae004
Ye Xu, Ling Ma, Shanlin Liu, Yanxin Liang, Qiaoqiao Liu, Zhixin He, Li Tian, Yuange Duan, Wanzhi Cai, Hu Li, Fan Song
Abstract
Background: Lice (Psocodea: Phthiraptera) are an important group of parasites that infect birds and mammals. It is believed that the ancestor of parasitic lice originated on an ancient avian host, and that ancient mammals acquired these parasites via host-switching from birds. Here we present the first chromosome-level genome of Menopon gallinae in Amblycera (the earliest diverging lineage of parasitic lice). We explore the transition of louse host-switching from birds to mammals at the genomic level by identifying numerous idiosyncratic genomic variations.
Results: The assembled genome is 155 Mb in length, with a contig N50 of 27.42 Mb. Hi-C scaffolding assigned 97% of the bases to 5 chromosomes. The genome of M. gallinae retains a basal insect repertoire of 11,950 protein-coding genes. By comparing the genomes of lice to those of multiple representative insects in other orders, we discovered that gene families related to digestion, detoxification, and immunity are generally conserved between bird lice and mammal lice, while mammal lice have undergone a significant reduction in genes related to chemosensory systems and temperature. This suggests that mammal lice have lost some of these genes through adaptation to the environment and temperatures after host-switching. Furthermore, 7 genes related to hematophagy were positively selected in mammal lice, suggesting their involvement in hematophagous behavior.
Conclusions: Our high-quality genome of M. gallinae provides a valuable resource for comparative genomic research in Phthiraptera and facilitates further studies on adaptive evolution of host-switching within parasitic lice.
Journal Article. GigaScience, volume 13, issue 1. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10904027/pdf/
Citations: 0
Leveraging citizen science for monitoring urban forageable plants.
IF 3.5, Region 2, Biology
GigaScience Pub Date : 2024-01-02 DOI: 10.1093/gigascience/giae007
Filipi Miranda Soares, Luís Ferreira Pires, Maria Carolina Garcia, Yamine Bouzembrak, Lidio Coradin, Natalia Pirani Ghilardi-Lopes, Rubens Rangel Silva, Aline Martins de Carvalho, Benildes Coura Moreira Dos Santos Maculan, Sheina Koffler, Uiara Bandineli Montedo, Debora Pignatari Drucker, Raquel Santiago, Anand Gavai, Maria Clara Peres de Carvalho, Ana Carolina da Silva Lima, Hillary Dandara Elias Gabriel, Stephanie Gabriele Mendonça de França, Karoline Reis de Almeida, Bárbara Junqueira Dos Santos, Antonio Mauro Saraiva
Abstract
Urbanization brings forth social challenges in emerging countries such as Brazil, encompassing food scarcity, health deterioration, air pollution, and biodiversity loss. Despite this, urban areas like the city of São Paulo still boast ample green spaces, offering opportunities for nature appreciation and conservation, enhancing city resilience and livability. Citizen science is a collaborative endeavor between professional scientists and nonprofessional scientists in scientific research that may help to understand the dynamics of urban ecosystems. We believe citizen science has the potential to promote human and nature connection in urban areas and provide useful data on urban biodiversity.
Journal Article. GigaScience, volume 13. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10914215/pdf/
Citations: 0
RicePilaf: a post-GWAS/QTL dashboard to integrate pangenomic, coexpression, regulatory, epigenomic, ontology, pathway, and text-mining information to provide functional insights into rice QTLs and GWAS loci.
IF 11.8, Region 2, Biology
GigaScience Pub Date : 2024-01-02 DOI: 10.1093/gigascience/giae013
Anish M S Shrestha, Mark Edward M Gonzales, Phoebe Clare L Ong, Pierre Larmande, Hyun-Sook Lee, Ji-Ung Jeung, Ajay Kohli, Dmytro Chebotarov, Ramil P Mauleon, Jae-Sung Lee, Kenneth L McNally
Abstract
Background: As the number of genome-wide association study (GWAS) and quantitative trait locus (QTL) mappings in rice continues to grow, so does the already long list of genomic loci associated with important agronomic traits. Typically, loci implicated by GWAS/QTL analysis contain tens to hundreds to thousands of single-nucleotide polymorphisms (SNPs)/genes, not all of which are causal and many of which are in noncoding regions. Unraveling the biological mechanisms that tie the GWAS regions and QTLs to the trait of interest is challenging, especially since it requires collating functional genomics information about the loci from multiple, disparate data sources.
Results: We present RicePilaf, a web app for post-GWAS/QTL analysis, that performs a slew of novel bioinformatics analyses to cross-reference GWAS results and QTL mappings with a host of publicly available rice databases. In particular, it integrates (i) pangenomic information from high-quality genome builds of multiple rice varieties, (ii) coexpression information from genome-scale coexpression networks, (iii) ontology and pathway information, (iv) regulatory information from rice transcription factor databases, (v) epigenomic information from multiple high-throughput epigenetic experiments, and (vi) text-mining information extracted from scientific abstracts linking genes and traits. We demonstrate the utility of RicePilaf by applying it to analyze GWAS peaks of preharvest sprouting and genes underlying yield-under-drought QTLs.
Conclusions: RicePilaf enables rice scientists and breeders to shed functional light on their GWAS regions and QTLs, and it provides them with a means to prioritize SNPs/genes for further experiments. The source code, a Docker image, and a demo version of RicePilaf are publicly available at https://github.com/bioinfodlsu/rice-pilaf.
Journal Article. GigaScience, volume 13. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11148593/pdf/
Citations: 0
CheRRI-Accurate classification of the biological relevance of putative RNA-RNA interaction sites.
IF 3.5, Region 2, Biology
GigaScience Pub Date : 2024-01-02 DOI: 10.1093/gigascience/giae022
Teresa Müller, Stefan Mautner, Pavankumar Videm, Florian Eggenhofer, Martin Raden, Rolf Backofen
Abstract
Background: RNA-RNA interactions are key to a wide range of cellular functions. The detection of potential interactions helps to understand the underlying processes. However, potential interactions identified via in silico or experimental high-throughput methods can lack precision because of a high false-positive rate.
Results: We present CheRRI, the first tool to evaluate the biological relevance of putative RNA-RNA interaction sites. CheRRI filters candidates via a machine learning-based model trained on experimental RNA-RNA interactome data. Its unique setup combines interactome data and an established thermodynamic prediction tool to integrate experimental data with state-of-the-art computational models. Applying these data to an automated machine learning approach provides the opportunity to not only filter data for potential false positives but also tailor the underlying interaction site model to specific needs.
Conclusions: CheRRI is a stand-alone postprocessing tool to filter either predicted or experimentally identified potential RNA-RNA interactions on a genomic level to enhance the quality of interaction candidates. It is easy to install (via conda, pip packages), use (via Galaxy), and integrate into existing RNA-RNA interaction pipelines.
Journal Article. GigaScience, volume 13. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11152173/pdf/
Citations: 0
PhageGE: an interactive web platform for exploratory analysis and visualization of bacteriophage genomes.
IF 3.5, Region 2, Biology
GigaScience Pub Date : 2024-01-02 DOI: 10.1093/gigascience/giae074
Jinxin Zhao, Jiru Han, Yu-Wei Lin, Yan Zhu, Michael Aichem, Dimitar Garkov, Phillip J Bergen, Sue C Nang, Jian-Zhong Ye, Tieli Zhou, Tony Velkov, Jiangning Song, Falk Schreiber, Jian Li
Abstract
Background: Antimicrobial resistance is a serious threat to global health. Due to the stagnant antibiotic discovery pipeline, bacteriophages (phages) have been proposed as an alternative therapy for the treatment of infections caused by multidrug-resistant pathogens. Genomic features play an important role in phage pharmacology. However, our knowledge of phage genomics is sparse, and the use of existing bioinformatic pipelines and tools requires considerable bioinformatic expertise. These challenges have substantially limited the clinical translation of phage therapy.
Findings: We have developed PhageGE (Phage Genome Explorer), a user-friendly graphical interface application for the interactive analysis of phage genomes. PhageGE enables users to perform key analyses, including phylogenetic analysis, visualization of phylogenetic trees, prediction of phage life cycle, and comparative analysis of phage genome annotations. The new R Shiny web server, PhageGE, integrates existing R packages and combines them with several newly developed functions to facilitate these analyses. Additionally, the web server provides interactive visualization capabilities and allows users to directly export publication-quality images.
Conclusions: PhageGE is a valuable tool that simplifies the analysis of phage genome data and may expedite the development and clinical translation of phage therapy. PhageGE is publicly available at https://jason-zhao.shinyapps.io/PhageGE_Update/.
Journal Article. GigaScience, volume 13. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11423353/pdf/
Citations: 0
AltaiR: a C toolkit for alignment-free and temporal analysis of multi-FASTA data.
IF 11.8, Region 2, Biology
GigaScience Pub Date : 2024-01-02 DOI: 10.1093/gigascience/giae086
Jorge M Silva, Armando J Pinho, Diogo Pratas
Abstract
Background: Most viral genome sequences generated during the latest pandemic have presented new challenges for computational analysis. Analyzing millions of viral genomes in multi-FASTA format is computationally demanding, especially when using alignment-based methods. Most existing methods are not designed to handle such large datasets, often requiring the analysis to be divided into smaller parts to obtain results using available computational resources.
Findings: We introduce AltaiR, a toolkit for analyzing multiple sequences in multi-FASTA format using exclusively alignment-free methodologies. AltaiR enables the identification of singularity and similarity patterns within sequences and computes static and temporal dynamics without restrictions on the number or size of input sequences. It automatically filters low-quality, biased, or deviant data. We demonstrate AltaiR's capabilities by analyzing more than 1.5 million full severe acute respiratory syndrome coronavirus 2 sequences, revealing interesting observations regarding viral genome characteristics over time, such as shifts in nucleotide composition, decreases in average Kolmogorov sequence complexity, and the evolution of the smallest sequences not found in the human host.
Conclusions: AltaiR can identify temporal characteristics and trends in large numbers of sequences, making it ideal for scenarios involving endemic or epidemic outbreaks with vast amounts of available sequence data. Implemented in C with multithreading and methodological optimizations, AltaiR is computationally efficient, flexible, and dependency-free. It accepts any sequence in FASTA format, including amino acid sequences. The complete toolkit is freely available at https://github.com/cobilab/altair.
Journal Article. GigaScience, volume 13. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11590114/pdf/
Citations: 0
Construction and analysis of telomere-to-telomere genomes for 2 sweet oranges: Longhuihong and Newhall (Citrus sinensis).
IF 11.8, Region 2, Biology
GigaScience Pub Date : 2024-01-02 DOI: 10.1093/gigascience/giae084
Lin Hong, Xin-Dong Xu, Lei Yang, Min Wang, Shuang Li, Haijian Yang, Si-Ying Ye, Ling-Ling Chen, Jia-Ming Song
Abstract
Background: Sweet orange (Citrus sinensis Osbeck) is a fruit crop of high nutritional value that is widely consumed around the world. However, its susceptibility to low-temperature stress limits its cultivation and production in regions prone to frost damage, severely impacting the sustainable development of the sweet orange industry. Therefore, developing cold-resistant sweet orange varieties is of great necessity. Traditional hybrid breeding methods are not feasible due to the polyembryonic phenomenon in sweet oranges, necessitating the enhancement of its germplasm through molecular breeding. High-quality reference genomes are valuable for studying crop resistance to biotic and abiotic stresses. However, the lack of genomic resources for cold-resistant sweet orange varieties has hindered the progress in developing such varieties and researching their molecular mechanisms of cold resistance.
Findings: This study integrated PacBio HiFi, ONT, Hi-C, and Illumina sequencing data to assemble telomere-to-telomere (T2T) reference genomes for the cold-resistant sweet orange mutant "Longhuihong" (Citrus sinensis [L.] Osb. cv. LHH) and its wild-type counterpart "Newhall" (C. sinensis [L.] Osb. cv. Newhall). Comprehensive evaluations based on multiple criteria revealed that both genomes exhibit high continuity, completeness, and accuracy. The genome sizes were 340.28 Mb and 346.33 Mb, with contig N50 of 39.31 Mb and 36.77 Mb, respectively. In total, 31,456 and 30,021 gene models were annotated in the respective genomes. Leveraging these assembled genomes, comparative genomics analyses were performed, elucidating the evolutionary history of the sweet orange genome. Moreover, the study identified 2,886 structural variants between the 2 genomes, with several SVs located in the upstream, downstream, or intronic regions of homologous genes known to be associated with cold resistance.
Conclusions: The study de novo assembled 2 T2T reference genomes of sweet orange varieties exhibiting different levels of cold tolerance. These genomes serve as valuable foundational resources for genomic research and molecular breeding aimed at enhancing cold tolerance in sweet oranges. Additionally, they expand the existing repository of reference genomes and sequencing data resources for C. sinensis. Moreover, these genomes provide a critical data foundation for comparative genomics analyses across different plant species.
Journal Article. GigaScience, volume 13. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11590112/pdf/
Citations: 0
demuxSNP: supervised demultiplexing single-cell RNA sequencing using cell hashing and SNPs.
IF 11.8, Region 2, Biology
GigaScience Pub Date : 2024-01-02 DOI: 10.1093/gigascience/giae090
Michael P Lynch, Yufei Wang, Shannan Ho Sui, Laurent Gatto, Aedin C Culhane
Abstract
Background: Multiplexing single-cell RNA sequencing experiments reduces sequencing cost and facilitates larger-scale studies. However, factors such as cell hashing quality and class size imbalance impact demultiplexing algorithm performance, reducing cost-effectiveness.
Findings: We propose a supervised algorithm, demuxSNP, which leverages both cell hashing and genetic variation between individuals (single-nucleotide polymorphisms [SNPs]). demuxSNP addresses fundamental limitations in demultiplexing methods that use only one data modality. Some cells may be confidently demultiplexed using probabilistic hashing methods. demuxSNP uses these data to infer the genotype of singlet and doublet clusters and predict on cells assigned as negative, uncertain, or doublet using a nearest-neighbor approach adapted for missing data. We benchmarked demuxSNP against hashing, genotype-free SNP and hybrid methods on simulated and real data from renal cell cancer. demuxSNP outperformed standalone hashing methods on a low-quality hashing data benchmark, improved overall classification accuracy, and allowed more high RNA quality cells to be recovered. Through varying simulated doublet rates, we showed that genotype-free SNP and hybrid methods that leverage them were impacted by class size imbalance and doublet rate. demuxSNP's supervised approach was more robust to doublet rate in experiments with class size imbalance.
Conclusions: demuxSNP uses hashing and SNP data to demultiplex datasets with low hashing quality where biological samples are genetically distinct. Unassigned or negative cells with high RNA quality are recovered, making more cells available for analysis. Data simulation and benchmarking pipelines as well as processed benchmarking data for 5-50% doublets are publicly available. demuxSNP is available as an R/Bioconductor package (https://doi.org/doi:10.18129/B9.bioc.demuxSNP).
Journal Article. GigaScience, volume 13. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11604057/pdf/
Citations: 0
stMMR: accurate and robust spatial domain identification from spatially resolved transcriptomics with multimodal feature representation.
IF 11.8, Region 2, Biology
GigaScience Pub Date : 2024-01-02 DOI: 10.1093/gigascience/giae089
Daoliang Zhang, Na Yu, Zhiyuan Yuan, Wenrui Li, Xue Sun, Qi Zou, Xiangyu Li, Zhiping Liu, Wei Zhang, Rui Gao
Abstract
Background: Deciphering spatial domains using spatially resolved transcriptomics (SRT) is of great value for characterizing and understanding tissue architecture. However, the inherent heterogeneity and varying spatial resolutions present challenges in the joint analysis of multimodal SRT data.
Results: We introduce a multimodal geometric deep learning method, named stMMR, to effectively integrate gene expression, spatial location, and histological information for accurately identifying spatial domains from SRT data. stMMR uses graph convolutional networks and a self-attention module for deep embedding of features within each modality and incorporates similarity contrastive learning for integrating features across modalities.
Conclusions: Comprehensive benchmark analysis on various types of spatial data shows superior performance of stMMR in multiple analyses, including spatial domain identification, pseudo-spatiotemporal analysis, and domain-specific gene discovery. In chicken heart development, stMMR reconstructed the spatiotemporal lineage structures, indicating an accurate developmental sequence. In breast cancer and lung cancer, stMMR clearly delineated the tumor microenvironment and identified marker genes associated with diagnosis and prognosis. Overall, stMMR is capable of effectively utilizing the multimodal information of various SRT data to explore and characterize tissue architectures of homeostasis, development, and tumor.
Journal Article. GigaScience, volume 13. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11604062/pdf/
Citations: 0
PlasGO: enhancing GO-based function prediction for plasmid-encoded proteins based on genetic structure.
IF 11.8, Region 2, Biology
GigaScience Pub Date : 2024-01-02 DOI: 10.1093/gigascience/giae104
Yongxin Ji, Jiayu Shang, Jiaojiao Guan, Wei Zou, Herui Liao, Xubo Tang, Yanni Sun
Abstract
Background: Plasmids, as mobile genetic elements, play a pivotal role in facilitating the transfer of traits, such as antimicrobial resistance, among the bacterial community. Annotating plasmid-encoded proteins with the widely used Gene Ontology (GO) vocabulary is a fundamental step in various tasks, including plasmid mobility classification. However, GO prediction for plasmid-encoded proteins faces 2 major challenges: the high diversity of functions and the limited availability of high-quality GO annotations.
Results: In this study, we introduce PlasGO, a tool that leverages a hierarchical architecture to predict GO terms for plasmid proteins. PlasGO utilizes a powerful protein language model to learn the local context within protein sentences and a BERT model to capture the global context within plasmid sentences. Additionally, PlasGO allows users to control the precision by incorporating a self-attention confidence weighting mechanism. We rigorously evaluated PlasGO and benchmarked it against 7 state-of-the-art tools in a series of experiments. The experimental results collectively demonstrate that PlasGO has achieved commendable performance. PlasGO significantly expanded the annotations of the plasmid-encoded protein database by assigning high-confidence GO terms to over 95% of previously unannotated proteins, showcasing impressive precision of 0.8229, 0.7941, and 0.8870 for the 3 GO categories, respectively, as measured on the novel protein test set.
Conclusions: PlasGO, a hierarchical tool incorporating protein language models and BERT, significantly expanded plasmid protein annotations by predicting high-confidence GO terms. These annotations have been compiled into a database, which will serve as a valuable contribution to downstream plasmid analysis and research.
Journal Article. GigaScience, volume 13. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11659980/pdf/
Citations: 0