Annual Review of Biomedical Data Science: Latest Articles

Sketching and Sublinear Data Structures in Genomics
IF 6
Annual Review of Biomedical Data Science Pub Date : 2019-07-20 DOI: 10.1146/ANNUREV-BIODATASCI-072018-021156
G. Marçais, Brad Solomon, Robert Patro, Carl Kingsford
Abstract: Large-scale genomics demands computational methods that scale sublinearly with the growth of data. We review several data structures and sketching techniques that have been used in genomic analysis methods. Specifically, we focus on four key ideas that take different approaches to achieve sublinear space usage and processing time: compressed full-text indices, approximate membership query data structures, locality-sensitive hashing, and minimizers schemes. We describe these techniques at a high level and give several representative applications of each.
Citations: 35
Genomic Data Compression
IF 6
Annual Review of Biomedical Data Science Pub Date : 2019-07-20 DOI: 10.1146/ANNUREV-BIODATASCI-072018-021229
M. Hernaez, Dmitri S. Pavlichin, T. Weissman, Idoia Ochoa
Abstract: Recently, there has been growing interest in genome sequencing, driven by advances in sequencing technology, in terms of both efficiency and affordability. These developments have allowed many to envision whole-genome sequencing as an invaluable tool for both personalized medical care and public health. As a result, increasingly large and ubiquitous genomic data sets are being generated. This poses a significant challenge for the storage and transmission of these data. Already, it is more expensive to store genomic data for a decade than it is to obtain the data in the first place. This situation calls for efficient representations of genomic information. In this review, we emphasize the need for designing specialized compressors tailored to genomic data and describe the main solutions already proposed. We also give general guidelines for storing these data and conclude with our thoughts on the future of genomic formats and compressors.
Citations: 22
Scientific Discovery Games for Biomedical Research
IF 6
Annual Review of Biomedical Data Science Pub Date : 2019-07-01 DOI: 10.1146/annurev-biodatasci-072018-021139
Rhiju Das, Benjamin Keep, Peter Washington, Ingmar H Riedel-Kruse
Abstract: Over the past decade, scientific discovery games (SDGs) have emerged as a viable approach for biomedical research, engaging hundreds of thousands of volunteer players and resulting in numerous scientific publications. After describing the origins of this novel research approach, we review the scientific output of SDGs across molecular modeling, sequence alignment, neuroscience, pathology, cellular biology, genomics, and human cognition. We find compelling results and technical innovations arising in problem-oriented games such as Foldit and Eterna and in data-oriented games such as EyeWire and Project Discovery. We discuss emergent properties of player communities shared across different projects, including the diversity of communities and the extraordinary contributions of some volunteers, such as paper writing. Finally, we highlight connections to artificial intelligence, biological cloud laboratories, new game genres, science education, and open science that may drive the next generation of SDGs.
Pages: 253-279
Citations: 17
RNA Sequencing Data: Hitchhiker's Guide to Expression Analysis
IF 6
Annual Review of Biomedical Data Science Pub Date : 2018-10-17 DOI: 10.1146/ANNUREV-BIODATASCI-072018-021255
K. Van den Berge, Katharina M. Hembach, C. Soneson, S. Tiberi, L. Clement, M. Love, Robert Patro, M. Robinson
Abstract: Gene expression is the fundamental level at which the results of various genetic and regulatory programs are observable. The measurement of transcriptome-wide gene expression has convincingly switched from microarrays to sequencing in a matter of years. RNA sequencing (RNA-seq) provides a quantitative and open system for profiling transcriptional outcomes on a large scale and therefore facilitates a large diversity of applications, including basic science studies, but also agricultural or clinical situations. In the past 10 years or so, much has been learned about the characteristics of the RNA-seq data sets, as well as the performance of the myriad of methods developed. In this review, we give an overview of the developments in RNA-seq data analysis, including experimental design, with an explicit focus on the quantification of gene expression and statistical approaches for differential expression. We also highlight emerging data types, such as single-cell RNA-seq and gene expression profiling using long-read technologies.
Citations: 83
Visualization of Biomedical Data
IF 6
Annual Review of Biomedical Data Science Pub Date : 2018-07-20 DOI: 10.1146/ANNUREV-BIODATASCI-080917-013424
S. O’Donoghue, B. Baldi, S. Clark, A. Darling, J. Hogan, Sandeep Kaur, L. Maier-Hein, Davis J. McCarthy, W. Moore, Esther Stenau, J. Swedlow, Jenny Vuong, J. Procter
Abstract: The rapid increase in volume and complexity of biomedical data requires changes in research, communication, and clinical practices. This includes learning how to effectively integrate automated analysis with high–data density visualizations that clearly express complex phenomena. In this review, we summarize key principles and resources from data visualization research that help address this difficult challenge. We then survey how visualization is being used in a selection of emerging biomedical research areas, including three-dimensional genomics, single-cell RNA sequencing (RNA-seq), the protein structure universe, phosphoproteomics, augmented reality–assisted surgery, and metagenomics. While specific research areas need highly tailored visualizations, there are common challenges that can be addressed with general methods and strategies. Also common, however, are poor visualization practices. We outline ongoing initiatives aimed at improving visualization practices in biomedical research via better tools, peer-to-peer learning, and interdisciplinary collaboration with computer scientists, science communicators, and graphic designers. These changes are revolutionizing how we see and think about our data.
Citations: 63
Computational Methods for Understanding Mass Spectrometry–Based Shotgun Proteomics Data
IF 6
Annual Review of Biomedical Data Science Pub Date : 2018-07-20 DOI: 10.1146/ANNUREV-BIODATASCI-080917-013516
Pavel Sinitcyn, J. Rudolph, J. Cox
Abstract: Computational proteomics is the data science concerned with the identification and quantification of proteins from high-throughput data and the biological interpretation of their concentration changes, posttranslational modifications, interactions, and subcellular localizations. Today, these data most often originate from mass spectrometry–based shotgun proteomics experiments. In this review, we survey computational methods for the analysis of such proteomics data, focusing on the explanation of the key concepts. Starting with mass spectrometric feature detection, we then cover methods for the identification of peptides. Subsequently, protein inference and the control of false discovery rates are highly important topics covered. We then discuss methods for the quantification of peptides and proteins. A section on downstream data analysis covers exploratory statistics, network analysis, machine learning, and multiomics data integration. Finally, we discuss current developments and provide an outlook on what the near future of computational proteomics might bear.
Citations: 102
Opportunities and Challenges of Whole-Cell and -Tissue Simulations of the Outer Retina in Health and Disease
IF 6
Annual Review of Biomedical Data Science Pub Date : 2018-07-20 DOI: 10.1146/ANNUREV-BIODATASCI-080917-013356
P. Luthert, Luis Serrano, C. Kiel
Abstract: Visual processing starts in the outer retina, where photoreceptor cells sense photons that trigger electrical responses. Retinal pigment epithelial cells are located external to the photoreceptor layer and have critical functions in supporting cell and tissue homeostasis and thus sustaining a healthy retina. The high level of specialization makes the retina vulnerable to alterations that promote retinal degeneration. In this review, we discuss opportunities and challenges in proposing whole-cell and -tissue simulations of the human outer retina. An implicit position taken throughout this review is that mapping diverse data sets onto integrative computational models is likely to be a pivotal approach to understanding complex disease and developing novel interventions.
Citations: 4
Science as a Culinary Art: How Data Science and Informatics Will Change Knowledge Discovery for Everyone
IF 6
Annual Review of Biomedical Data Science Pub Date : 2018-07-20 DOI: 10.1146/ANNUREV-BD-01-041718-100011
N. Tatonetti
Abstract: There are 7.6 billion scientists on this planet. Every one of us uses the scientific method in our daily lives. We are continually forming new hypotheses—the fastest route for the morning commute, the best strategy for keeping an orchid healthy, or the appropriate cooking time for a bone-in ribeye. We then test these hypotheses against our observations, reevaluate and adjust our views, and then do it all over again. Granted, these are not the rigorous randomized experiments used by research laboratories, but not all knowledge comes from controlled studies. The example of cooking is especially interesting, as I personally think culinary science to be humanity’s most advanced. For 1.9 million years (1), nearly every human has come up with new ideas about how to prepare food. Today alone, billions will form hypotheses about the right combination of spices, temperatures, and wine pairings. Each of these hypotheses will be tested, evaluated for their success, and accepted or rejected, ultimately contributing to the body of human culinary knowledge. Imagine how advanced medicine would be if every human was equipped to form and test biomedical research hypotheses the way that we do for cooking! Not only would the mass of knowledge be greater, but it would arguably be more useful as well. The knowledge generated would naturally be contextual—in other words, knowledge specific to particular regions or subpopulations would emerge. Medicine as a scientific discipline will especially benefit from contextual knowledge. The needs and risks of those living in, say, sub-Saharan Africa are much different than those of Inuits living near the Arctic Circle. The push toward precision medicine is evidence that contextual knowledge is recognized as necessary to advance human health. Contextual knowledge made possible by newly available data,
Citations: 0
Large-Scale Analysis of Genetic and Clinical Patient Data
IF 6
Annual Review of Biomedical Data Science Pub Date : 2018-07-20 DOI: 10.1146/ANNUREV-BIODATASCI-080917-013508
M. Ritchie
Abstract: Biomedical data science has experienced an explosion of new data over the past decade. Abundant genetic and genomic data are increasingly available in large, diverse data sets due to the maturation of modern molecular technologies. Along with these molecular data, dense, rich phenotypic data are also available on comprehensive clinical data sets from health care provider organizations, clinical trials, population health registries, and epidemiologic studies. The methods and approaches for interrogating these large genetic/genomic and clinical data sets continue to evolve rapidly, as our understanding of the questions and challenges continue to emerge. In this review, the state-of-the-art methodologies for genetic/genomic analysis along with complex phenomics will be discussed. This field is changing and adapting to the novel data types made available, as well as technological advances in computation and machine learning. Thus, I will also discuss the future challenges in this exciting and innovative space. The promises of precision medicine rely heavily on the ability to marry complex genetic/genomic data with clinical phenotypes in meaningful ways.
Citations: 13
From Tissues to Cell Types and Back: Single-Cell Gene Expression Analysis of Tissue Architecture
IF 6
Annual Review of Biomedical Data Science Pub Date : 2018-07-20 DOI: 10.1146/ANNUREV-BIODATASCI-080917-013452
Xi Chen, S. Teichmann, K. Meyer
Abstract: With the recent transformative developments in single-cell genomics and, in particular, single-cell gene expression analysis, it is now possible to study tissues at the single-cell level, rather than having to rely on data from bulk measurements. Here we review the rapid developments in single-cell RNA sequencing (scRNA-seq) protocols that have the potential for unbiased identification and profiling of all cell types within a tissue or organism. In addition, novel approaches for spatial profiling of gene expression allow us to map individual cells and cell types back into the three-dimensional context of organs. The combination of in-depth single-cell and spatial gene expression data will reveal tissue architecture in unprecedented detail, generating a wealth of biological knowledge and a better understanding of many diseases.
Citations: 77