Introducing Data Science Techniques by Connecting Database Concepts and dplyr

Impact factor: 2.2 · JCR Q3 (Social Sciences)
Jennifer Broatch, S. Dietrich, Don Goelman
Journal of Statistics Education, vol. 27, pp. 147-153. Published 2019-09-02.
DOI: 10.1080/10691898.2019.1647768 (https://doi.org/10.1080/10691898.2019.1647768)
Citations: 12

Abstract

Early exposure to data science skills, such as working with relational databases, is essential for students in statistics and many other disciplines in an increasingly data-driven society. The goal of the presented pedagogy is to introduce undergraduate students to fundamental database concepts and to illuminate the connection between these concepts and the functionality provided by the dplyr package for R. Specifically, students are introduced to relational database concepts through visualizations designed for students with no data science or computing background. These educational tools, which are freely available on the Web, engage students through a dynamic presentation that gently introduces relational databases and how to ask questions of the data stored in them. The visualizations are designed for self-study and include a formative self-assessment feature. Students are then assigned a corresponding statistics lesson that uses R within the dplyr framework and emphasizes the need for these database skills. This article describes a pilot experience of introducing this pedagogy into a calculus-based introductory statistics course for mathematics and statistics majors, and provides a brief evaluation of the experience from the student perspective. Supplementary materials for this article are available online.
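As a rough illustration of the connection the abstract describes, the sketch below runs two relational queries and notes in comments the dplyr verbs that roughly correspond to each. It is written in Python with the standard-library sqlite3 module rather than R. The tables, column names, and values are hypothetical and do not come from the article or its supplementary materials.

```python
import sqlite3

# Hypothetical toy database, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, major TEXT);
    CREATE TABLE enrollment (student_id INTEGER, course TEXT, grade REAL);
    INSERT INTO student VALUES
        (1, 'Ana', 'Statistics'),
        (2, 'Ben', 'Mathematics'),
        (3, 'Cy',  'Statistics');
    INSERT INTO enrollment VALUES
        (1, 'Intro Stats', 4.0),
        (2, 'Intro Stats', 2.0),
        (3, 'Intro Stats', 3.0);
""")


def statistics_majors(db):
    """Selection (WHERE) followed by projection (SELECT name).

    Roughly: student %>% filter(major == "Statistics") %>% select(name)
    """
    rows = db.execute(
        "SELECT name FROM student WHERE major = 'Statistics' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def mean_grade_by_major(db):
    """Join the two tables, then aggregate per group.

    Roughly: inner_join(student, enrollment, by = c("id" = "student_id")) %>%
             group_by(major) %>% summarise(mean_grade = mean(grade))
    """
    rows = db.execute(
        """
        SELECT s.major, AVG(e.grade)
        FROM student AS s
        JOIN enrollment AS e ON s.id = e.student_id
        GROUP BY s.major
        ORDER BY s.major
        """
    ).fetchall()
    return dict(rows)


majors = statistics_majors(conn)
means = mean_grade_by_major(conn)
print(majors)
print(means)
```

The two functions cover the operations usually introduced first in a relational setting: selection, projection, join, and grouped aggregation. Each has a close dplyr counterpart, which is the kind of correspondence the pedagogy aims to make visible.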
Source journal
Journal of Statistics Education (category: Education, Scientific Disciplines)
CiteScore: 1.20
Self-citation rate: 0.00%
Articles published: 0
Review time: 12 weeks
Journal description: The "Datasets and Stories" department of the Journal of Statistics Education provides a forum for exchanging interesting datasets and discussing ways they can be used effectively in teaching statistics. This section of JSE is described fully in the article "Datasets and Stories: Introduction and Guidelines" by Robin H. Lock and Tim Arnold (1993). The Journal of Statistics Education maintains a Data Archive that contains the datasets described in "Datasets and Stories" articles, as well as additional datasets useful to statistics teachers. Lock and Arnold (1993) describe several criteria that will be considered before datasets are placed in the JSE Data Archive.