Genome Biology: latest articles

seqQscorer: automated quality control of next-generation sequencing data using machine learning.
IF 12.3, Zone 1, Biology
Genome Biology Pub Date : 2021-03-05 DOI: 10.1186/s13059-021-02294-2
Steffen Albrecht, Maximilian Sprang, Miguel A Andrade-Navarro, Jean-Fred Fontaine
Abstract: Controlling quality of next-generation sequencing (NGS) data files is a necessary but complex task. To address this problem, we statistically characterize common NGS quality features and develop a novel quality control procedure involving tree-based and deep learning classification algorithms. Predictive models, validated on internal and external functional genomics datasets, are to some extent generalizable to data from unseen species. The derived statistical guidelines and predictive models represent a valuable resource for users of NGS data to better understand quality issues and perform automatic quality control. Our guidelines and software are available at https://github.com/salbrec/seqQscorer.
Genome Biology 22(1), 75 (Journal Article).
Citations: 8
Single cell eQTL analysis identifies cell type-specific genetic control of gene expression in fibroblasts and reprogrammed induced pluripotent stem cells.
IF 12.3, Zone 1, Biology
Genome Biology Pub Date : 2021-03-05 DOI: 10.1186/s13059-021-02293-3
Drew Neavin, Quan Nguyen, Maciej S Daniszewski, Helena H Liang, Han Sheng Chiu, Yong Kiat Wee, Anne Senabouth, Samuel W Lukowski, Duncan E Crombie, Grace E Lidgerwood, Damián Hernández, James C Vickers, Anthony L Cook, Nathan J Palpant, Alice Pébay, Alex W Hewitt, Joseph E Powell
Abstract:
Background: The discovery that somatic cells can be reprogrammed to induced pluripotent stem cells (iPSCs) has provided a foundation for in vitro human disease modelling, drug development and population genetics studies. Gene expression plays a critical role in complex disease risk and therapeutic response. However, while the genetic background of reprogrammed cell lines has been shown to strongly influence gene expression, the effect has not been evaluated at the level of individual cells, which would provide significant resolution. By integrating single cell RNA-sequencing (scRNA-seq) and population genetics, we apply a framework in which to evaluate cell type-specific effects of genetic variation on gene expression.
Results: Here, we perform scRNA-seq on 64,018 fibroblasts from 79 donors and map expression quantitative trait loci (eQTLs) at the level of individual cell types. We demonstrate that the majority of eQTLs detected in fibroblasts are specific to an individual cell subtype. To address whether the allelic effects on gene expression are maintained following cell reprogramming, we generate scRNA-seq data in 19,967 iPSCs from 31 reprogrammed donor lines. We again identify highly cell type-specific eQTLs in iPSCs and show that the eQTLs in fibroblasts almost entirely disappear during reprogramming.
Conclusions: This work provides an atlas of how genetic variation influences gene expression across cell subtypes and provides evidence for patterns of genetic architecture that lead to cell type-specific eQTL effects.
Genome Biology 22(1), 76 (Journal Article).
Citations: 48
simATAC: a single-cell ATAC-seq simulation framework.
IF 12.3, Zone 1, Biology
Genome Biology Pub Date : 2021-03-04 DOI: 10.1186/s13059-021-02270-w
Zeinab Navidi, Lin Zhang, Bo Wang
Abstract: Single-cell assay for transposase-accessible chromatin sequencing (scATAC-seq) identifies regulated chromatin accessibility modules at the single-cell resolution. Robust evaluation is critical to the development of scATAC-seq pipelines, which calls for reproducible datasets for benchmarking. We hereby present the simATAC framework, an R package that generates scATAC-seq count matrices that highly resemble real scATAC-seq datasets in library size, sparsity, and chromatin accessibility signals. simATAC deploys statistical models derived from analyzing 90 real scATAC-seq cell groups. simATAC provides a robust and systematic approach to generate in silico scATAC-seq samples with known cell labels for assessing analytical pipelines.
Genome Biology 22(1), 74 (Journal Article).
Citations: 8
2passtools: two-pass alignment using machine-learning-filtered splice junctions increases the accuracy of intron detection in long-read RNA sequencing.
IF 12.3, Zone 1, Biology
Genome Biology Pub Date : 2021-03-01 DOI: 10.1186/s13059-021-02296-0
Matthew T Parker, Katarzyna Knop, Geoffrey J Barton, Gordon G Simpson
Abstract: Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long reads reveals the true complexity of processing. However, the relatively high error rates of long-read sequencing technologies can reduce the accuracy of intron identification. Here we apply alignment metrics and machine-learning-derived sequence information to filter spurious splice junctions from long-read alignments and use the remaining junctions to guide realignment in a two-pass approach. This method, available in the software package 2passtools (https://github.com/bartongroup/2passtools), improves the accuracy of spliced alignment and transcriptome assembly for species both with and without existing high-quality annotations.
Genome Biology 22(1), 72 (Journal Article).
Citations: 13
Re-evaluating experimental validation in the Big Data Era: a conceptual argument.
IF 12.3, Zone 1, Biology
Genome Biology Pub Date : 2021-02-24 DOI: 10.1186/s13059-021-02292-4
Mohieddin Jafari, Yuanfang Guan, David C Wedge, Naser Ansari-Pour
Abstract: not provided.
Genome Biology 22(1), 71 (Journal Article).
Citations: 8
MEDALT: single-cell copy number lineage tracing enabling gene discovery.
IF 12.3, Zone 1, Biology
Genome Biology Pub Date : 2021-02-23 DOI: 10.1186/s13059-021-02291-5
Fang Wang, Qihan Wang, Vakul Mohanty, Shaoheng Liang, Jinzhuang Dou, Jincheng Han, Darlan Conterno Minussi, Ruli Gao, Li Ding, Nicholas Navin, Ken Chen
Abstract: We present a Minimal Event Distance Aneuploidy Lineage Tree (MEDALT) algorithm that infers the evolution history of a cell population based on single-cell copy number (SCCN) profiles, and a statistical routine named lineage speciation analysis (LSA), which facilitates discovery of fitness-associated alterations and genes from SCCN lineage trees. MEDALT appears more accurate than phylogenetics approaches in reconstructing copy number lineage. From data from 20 triple-negative breast cancer patients, our approaches effectively prioritize genes that are essential for breast cancer cell fitness and predict patient survival, including those implicating convergent evolution. The source code of our study is available at https://github.com/KChen-lab/MEDALT.
Genome Biology 22(1), 70 (Journal Article).
Citations: 0
Megabase-scale methylation phasing using nanopore long reads and NanoMethPhase.
IF 12.3, Zone 1, Biology
Genome Biology Pub Date : 2021-02-22 DOI: 10.1186/s13059-021-02283-5
Vahid Akbari, Jean-Michel Garant, Kieran O'Neill, Pawan Pandoh, Richard Moore, Marco A Marra, Martin Hirst, Steven J M Jones
Abstract: The ability of nanopore sequencing to simultaneously detect modified nucleotides while producing long reads makes it ideal for detecting and phasing allele-specific methylation. However, there is currently no complete software for detecting SNPs, phasing haplotypes, and mapping methylation to these from nanopore sequence data. Here, we present NanoMethPhase, a software tool to phase 5-methylcytosine from nanopore sequencing. We also present SNVoter, which can post-process nanopore SNV calls to improve accuracy in low coverage regions. Together, these tools can accurately detect allele-specific methylation genome-wide using nanopore sequence data with low coverage of about ten-fold redundancy.
Genome Biology 22(1), 68 (Journal Article).
Citations: 25
scSorter: assigning cells to known cell types according to marker genes.
IF 12.3, Zone 1, Biology
Genome Biology Pub Date : 2021-02-22 DOI: 10.1186/s13059-021-02281-7
Hongyu Guo, Jun Li
Abstract: On single-cell RNA-sequencing data, we consider the problem of assigning cells to known cell types, assuming that the identities of cell-type-specific marker genes are given but their exact expression levels are unavailable, that is, without using a reference dataset. Based on an observation that the expected over-expression of marker genes is often absent in a nonnegligible proportion of cells, we develop a method called scSorter. scSorter allows marker genes to express at a low level and borrows information from the expression of non-marker genes. On both simulated and real data, scSorter shows much higher power compared to existing methods.
Genome Biology 22(1), 69 (Journal Article).
Citations: 0
FlsnRNA-seq: protoplasting-free full-length single-nucleus RNA profiling in plants.
IF 12.3, Zone 1, Biology
Genome Biology Pub Date : 2021-02-19 DOI: 10.1186/s13059-021-02288-0
Yanping Long, Zhijian Liu, Jinbu Jia, Weipeng Mo, Liang Fang, Dongdong Lu, Bo Liu, Hong Zhang, Wei Chen, Jixian Zhai
Abstract: The broad application of single-cell RNA profiling in plants has been hindered by the prerequisite of protoplasting that requires digesting the cell walls from different types of plant tissues. Here, we present a protoplasting-free approach, flsnRNA-seq, for large-scale full-length RNA profiling at a single-nucleus level in plants using isolated nuclei. Combined with 10x Genomics and Nanopore long-read sequencing, we validate the robustness of this approach in Arabidopsis root cells and the developing endosperm. Sequencing results demonstrate that it allows for uncovering alternative splicing and polyadenylation-related RNA isoform information at the single-cell level, which facilitates characterizing cell identities.
Genome Biology 22(1), 66 (Journal Article).
Citations: 47
ReSeq simulates realistic Illumina high-throughput sequencing data.
IF 12.3, Zone 1, Biology
Genome Biology Pub Date : 2021-02-19 DOI: 10.1186/s13059-021-02265-7
Stephan Schmeing, Mark D Robinson
Abstract: In high-throughput sequencing data, performance comparisons between computational tools are essential for making informed decisions at each step of a project. Simulations are a critical part of method comparisons, but for standard Illumina sequencing of genomic DNA, they are often oversimplified, which leads to optimistic results for most tools. ReSeq improves the authenticity of synthetic data by extracting and reproducing key components from real data. Major advancements are the inclusion of systematic errors, a fragment-based coverage model and sampling-matrix estimates based on two-dimensional margins. These improvements lead to more faithful performance evaluations. ReSeq is available at https://github.com/schmeing/ReSeq.
Genome Biology 22(1), 67 (Journal Article).
Citations: 5