Exploring variability during data preparation: a way to connect data, chance, and context when working with complex public datasets

IF 1.5 Zone 4 (Education) Q2 EDUCATION & EDUCATIONAL RESEARCH

Mathematical Thinking and Learning Pub Date : 2021-05-24 DOI:10.1080/10986065.2021.1922838

Michelle Wilkerson, Kathryn A. Lanouette, Rebecca Shareff

Journal: Mathematical Thinking and Learning, volume 24, pages 312-330. https://doi.org/10.1080/10986065.2021.1922838

Citations: 6

Abstract

Data preparation (also called “wrangling” or “cleaning”) – the evaluation and manipulation of data prior to formal analysis – is often dismissed as a precursor to meaningful engagement with a dataset. Here, we re-envision data preparation in light of calls to prepare students for a data-rich world. Traditionally, curricular statistics explorations involve data that are derived from observations that students record themselves or that reflect familiar, relatively closed systems. In contrast, pre-constructed public datasets are much larger in scope and involve temporal, geographic, and other dimensions that complicate inference and blur boundaries between “signal” and “noise.” As a result, students have fewer opportunities to consider sources of variability in such datasets. Due to these constraints, we argue that data preparation becomes an important site for students to reason about variability with public data. Through analyses of repeated task-based interviews with five pairs of adolescent participants, we find that specific actions during data preparation, such as filtering data or calculating new measures, presented opportunities to engage learners with variability as they prepared and analyzed several public socioscientific datasets. More broadly, our study highlights some changes to theory and curriculum in statistics education that are necessitated by a focus on “big data literacy.”


Source journal

Mathematical Thinking and Learning (EDUCATION & EDUCATIONAL RESEARCH)

CiteScore

4.40

Self-citation rate

6.20%

Articles published