Data Come First: Discussion of “Co-citation and Co-authorship Networks of Statisticians”

D. Donoho
DOI: https://doi.org/10.1080/07350015.2022.2055356
Published: 2022-04-03

Abstract

I salute the authors for their gift to the world of this new dataset! They have clearly invested plenty of time, effort, and IQ points in the study of the statistics literature as a bibliometric laboratory, and our field will grow and develop because of this dataset, as well as the methodology the authors developed and/or fine-tuned with those data. Strikingly, the article also conveys a great deal of enthusiasm for the data! This seems such a departure from the pattern of many articles in statistics today. The enthusiastic spirit reminds me of some classic work by great figures in the history of statistics, who often were fascinated by new kinds of data that were just becoming available in their day, and who were inspired by the new data to invent fundamental new statistical tools and mathematical machinery. Francis Galton was interested in the relationship between father’s height and son’s height; he himself compiled an extensive bivariate dataset of such heights, which led to the invention of the bivariate normal distribution and the correlation coefficient. Time and time again, new types of data came first, new types of models and methodology later. Indeed, this seems almost inevitable. As new technologies come onstream, new kinds of measurements become available, and new settings for data analysis and statistical inference emerge. This is plain to see in recent decades, where computational biology produced gene expression data, DNA sequence data, SNP data, and RNA-Seq data, each new data type leading to interesting methodological challenges and scientific progress. For me, each effort by a statistics researcher to understand a newly available type of data enlarges our field; it should be a primary part of the career of statisticians to take an interest in cultivating new types of datasets, so that new methodology can be discovered and developed.