Data Come First: Discussion of “Co-citation and Co-authorship Networks of Statisticians”

IF 2.9 · Zone 2 (Mathematics) · JCR Q1 (ECONOMICS)

Journal of Business & Economic Statistics Pub Date : 2022-04-03 DOI:10.1080/07350015.2022.2055356

D. Donoho

Volume: 40 1 · Pages: 491 - 491 · Publication type: Journal Article · Open access: no

Citations: 0

Abstract

I salute the authors for their gift to the world of this new dataset! They have clearly invested plenty of time, effort, and IQ points in the study of the statistics literature as a bibliometric laboratory, and our field will grow and develop because of this dataset, as well as methodology the authors developed and/or fine-tuned with those data. Strikingly, the article also conveys a great deal of enthusiasm for the data! This seems such a departure from the pattern of many articles in statistics today. The enthusiastic spirit reminds me of some classic work by great figures in the history of statistics, who often were fascinated by new kinds of data which were just becoming available in their day, and who were inspired by the new data to invent fundamental new statistical tools and mathematical machinery. Francis Galton was interested in the relationships between father’s height and son’s height, himself compiling an extensive bivariate dataset of such heights, leading to the invention of the bivariate normal distribution and the correlation coefficient. Time and time again, new types of data came first, new types of models and methodology later. Indeed, this seems almost inevitable. As new technologies come onstream, new kinds of measurements become available, and new settings for data analysis and statistical inference emerge. This is plain to see in recent decades, where computational biology produced gene expression data, DNA sequence data, SNP data, and RNA-Seq data, each new data type leading to interesting methodological challenges and scientific progress. For me, each effort by a statistics researcher to understand a newly available type of data enlarges our field; it should be a primary part of the career of statisticians to cultivate an interest in cultivating new types of datasets, so that new methodology can be discovered and developed.


Source journal

Journal of Business & Economic Statistics (Mathematics: Statistics and Probability)

CiteScore: 5.00

Self-citation rate: 6.70%

Articles published: not listed

Review time: >12 weeks

About the journal: The Journal of Business and Economic Statistics (JBES) publishes a range of articles, primarily applied statistical analyses of microeconomic, macroeconomic, forecasting, business, and finance-related topics. More general papers in statistics, econometrics, computation, simulation, or graphics are also appropriate if they are immediately applicable to the journal's general topics of interest. Articles published in JBES contain significant results, high-quality methodological content, and excellent exposition, and usually include a substantive empirical application.