{"title":"Student-Developed Shiny Applications for Teaching Statistics","authors":"Sabrina Luxin Wang, A. Y. Zhang, Samuel Messer, A. Wiesner, Dennis K. Pearl","doi":"10.1080/26939169.2021.1995545","DOIUrl":"https://doi.org/10.1080/26939169.2021.1995545","url":null,"abstract":"Abstract This article describes a suite of student-created Shiny apps for teaching statistics and a field test of their short-term effectiveness. To date, more than 50 Shiny apps and a growing collection of associated lesson plans, designed to enrich the teaching of both introductory and upper division statistics courses, have been developed. The apps are available for free use and their open source code can be adapted as desired. We report on the experimental testing of four of these Shiny apps to examine short-term learning outcomes in an introductory statistical concepts course.","PeriodicalId":34851,"journal":{"name":"Journal of Statistics and Data Science Education","volume":"29 1","pages":"218 - 227"},"PeriodicalIF":1.7,"publicationDate":"2021-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45773220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How to Get Away With Statistics: Gamification of Multivariate Statistics","authors":"Jacopo Di Iorio, S. Vantini","doi":"10.1080/26939169.2021.1997128","DOIUrl":"https://doi.org/10.1080/26939169.2021.1997128","url":null,"abstract":"Abstract In this article, we discuss our attempt to teach applied statistics techniques typically taught in advanced courses, such as clustering and principal component analysis, to a non-mathematical educated audience. Considering the negative attitude and inclination toward mathematical disciplines of our students we introduce them to our topics using four different games. The four games are all user-centric, score-based arcade experiences intended to be played under the supervision of an instructor. They are developed using the Shiny web-based application framework for R. In every activity students have to follow the instructions and to interact with plots to minimize a score with a statistical meaning. No other knowledge than elementary geometry and Euclidean distance is required to complete the tasks. Results from a student questionnaire give us some confidence that the experience has benefited students, not only in terms of their ability to understand and use the explained methods but also regarding their confidence and overall satisfaction with the course. This fact suggests that these or similar activities could greatly improve the diffusion of statistical thinking at different levels of education.","PeriodicalId":34851,"journal":{"name":"Journal of Statistics and Data Science Education","volume":"29 1","pages":"241 - 250"},"PeriodicalIF":1.7,"publicationDate":"2021-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42389743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trends in Teaching Advanced Placement Statistics: Results from a National Survey","authors":"Hollylynne S. Lee, Taylor Harrison","doi":"10.1080/26939169.2021.1965509","DOIUrl":"https://doi.org/10.1080/26939169.2021.1965509","url":null,"abstract":"Abstract This study provides a glimpse into the professional learning, beliefs, and practices of high school teachers of Advanced Placement (AP) Statistics. Data are from a survey of 445 AP Statistics teachers in late 2018. Results indicate many AP Statistics teachers have taken several statistics courses and engage in professional development related to statistics sponsored by the College Board (summer institutes, exam readings, and online community). They generally do not engage with resources developed by the American Statistical Association and the statistics education community. While AP Statistics teachers structure class time with student–student interaction and use student-centered activities, they generally do not use statistics-specific technology tools and rarely engage students with datasets larger than 100 cases or with multiple variables. Teachers’ beliefs about teaching statistics do not always reflect their teaching practices. Personal time to improve, time with students (especially those on a blocked semester schedule), structure of curriculum and exam schedule, and lack of access to technology often prevent teachers from making changes to their practices. Findings call for targeted efforts to reach high school statistics teachers, engage them more in the statistics education community, and encourage curriculum and instructional approaches that more closely align with recommendations and trends in college-level introductory statistics.","PeriodicalId":34851,"journal":{"name":"Journal of Statistics and Data Science Education","volume":"29 1","pages":"317 - 327"},"PeriodicalIF":1.7,"publicationDate":"2021-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48294609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancement of the Command-Line Environment for use in the Introductory Statistics Course and Beyond","authors":"D. Gerbing","doi":"10.1080/26939169.2021.1999871","DOIUrl":"https://doi.org/10.1080/26939169.2021.1999871","url":null,"abstract":"ABSTRACT R and Python are commonly used software languages for data analytics. Using these languages as the course software for the introductory course gives students practical skills for applying statistical concepts to data analysis. However, the reliance upon the command line is perceived by the typical nontechnical introductory student as sufficiently esoteric that its use detracts from the teaching of statistical concepts and data analysis. An R package was developed based on the successive feedback of hundreds of introductory statistics students over multiple years to provide a set of functions that apply basic statistical principles with command-line R. The package offers gentler error checking and many visualizations and analytics, successfully serving as the course software for teaching and homework. This software includes pedagogical functions, data analytic functions for a variety of analyses, and the foundation for access to the entire R ecosystem and, by extension, any command-line environment.","PeriodicalId":34851,"journal":{"name":"Journal of Statistics and Data Science Education","volume":"29 1","pages":"251 - 266"},"PeriodicalIF":1.7,"publicationDate":"2021-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47377963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Do Students Learn More from Erroneous Code? Exploring Student Performance and Satisfaction in an Error-Free Versus an Error-full SAS® Programming Environment","authors":"H. Hoffman, Angelo F. Elmi","doi":"10.1080/26939169.2021.1967229","DOIUrl":"https://doi.org/10.1080/26939169.2021.1967229","url":null,"abstract":"Abstract Teaching students statistical programming languages while simultaneously teaching them how to debug erroneous code is challenging. The traditional programming course focuses on error-free learning in class while students’ experiences outside of class typically involve error-full learning. While error-free teaching consists of focused lectures emphasizing correct coding, error-full teaching would follow such lectures with debugging sessions. We aimed to explore these two approaches by conducting a pilot study of 18 graduate students who voluntarily attended a SAS programming seminar held weekly from September 2018 through November 2018. Each seminar had a 10-min error-free lecture, 15-min programming assignment, 5-min break, 10-min error-full lecture, and 15-min programming assignment. We examined student performance and preference. While four students successfully completed both assignments and ten students did not successfully complete either assignment, one student successfully completed only the first assignment that directly followed the error-free lecture and three students successfully completed only the second assignment that directly followed the error-full lecture. Of the 15 students who responded, twelve (80%) preferred error-full to error-free learning. We will evaluate error-full learning on a larger scale in an introductory SAS course. Supplemental files are available online for this article.","PeriodicalId":34851,"journal":{"name":"Journal of Statistics and Data Science Education","volume":"29 1","pages":"228 - 240"},"PeriodicalIF":1.7,"publicationDate":"2021-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47691692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Note from the Editor","authors":"J. Witmer","doi":"10.1080/26939169.2021.1959224","DOIUrl":"https://doi.org/10.1080/26939169.2021.1959224","url":null,"abstract":"","PeriodicalId":34851,"journal":{"name":"Journal of Statistics and Data Science Education","volume":"29 1","pages":"155 - 155"},"PeriodicalIF":1.7,"publicationDate":"2021-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41647866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Letter to the Journal of Statistics and Data Science Education — A Call for Review of “OkCupid Data for Introductory Statistics and Data Science Courses” by Albert Y. Kim and Adriana Escobedo-Land","authors":"Tiffany Xiao, Yifan Ma","doi":"10.1080/26939169.2021.1930812","DOIUrl":"https://doi.org/10.1080/26939169.2021.1930812","url":null,"abstract":"As Big Data continues to rise in popularity, so does an increased need for protection against potential misuses of data. We are a group of undergraduate Statistical and Data Science major students from Smith College that are actively engaged in ethical discussions concerning the use of data in our society. It can be challenging to predict future trends and technologies in data science that could cause concerns. However, we believe that some essential protections and procedures should be in place to help prevent misuses of data. In particular, we are writing to you to address our concerns with the article “OkCupid Data for Introductory Statistics and Data Science Courses” by Albert Y. Kim and Adriana Escobedo-Land that was published in your journal (Kim and Escobedo-Land 2015). In light of ethical concerns surrounding the article, herein we describe the background of how the dataset was found to contain identifiable information. We communicated this to the authors, who correspondingly corrected the article. In our opinion, there is no doubt that the dataset presented in the article holds pedagogical value as well as research value. One aspect of the educational value of the dataset is the fact that the context of possible analysis could better drive students’ interests. The research value of the data lies within the self-reported nature of the dataset, which usually is the private property of corporations and could be hard to obtain for researchers in universities. Another context in which the pedagogical value of the dataset remains is where students could use this as a case study in discussions of the ethical implications of such data, even practicing anonymization skills with the data. However, we do believe that for the dataset to be used for pedagogical purposes, further anonymizations to the dataset were necessary. Some ways that datasets like this one could be better anonymized in the future include removing unimportant variables that have identification power disproportionate to their value to research. For example, in the case of the OkCupid dataset associated with the paper, the time the data was collected could be removed, since this fact is not particularly essential but can be used for identification. Other sources of concern for this dataset are the variables that reveal geographical and temporal information on individuals. Another method could","PeriodicalId":34851,"journal":{"name":"Journal of Statistics and Data Science Education","volume":"29 1","pages":"214 - 215"},"PeriodicalIF":1.7,"publicationDate":"2021-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/26939169.2021.1930812","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42510109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interdisciplinary Approaches and Strategies from Research Reproducibility 2020: Educating for Reproducibility","authors":"M. Rethlefsen, H. Norton, Sarah L. Meyer, Katherine A. MacWilkinson, Plato L. Smith II, Haoyang Ye","doi":"10.1080/26939169.2022.2104767","DOIUrl":"https://doi.org/10.1080/26939169.2022.2104767","url":null,"abstract":"Abstract Research Reproducibility: Educating for Reproducibility, Pathways to Research Integrity was an interdisciplinary conference hosted virtually by the University of Florida in December 2020. This event brought together educators, researchers, students, policy makers, and industry representatives from across the globe to explore best practices, innovations, and new ideas for education around reproducibility and replicability. Emphasizing a broad view of rigor and reproducibility, the conference touched on many aspects of introducing learners to transparency, rigorous study design, data science, data management, replications, and more. Transdisciplinary themes emerged from the panels, keynote, and submitted papers and poster presentations. The identified themes included lifelong learning, cultivating bottom-up change, “sneaking in” learning, just-in-time learning, targeting learners by career stage, learning by doing, learning how to learn, establishing communities of practice, librarians as interdisciplinary leaders, teamwork skills, rewards and incentives, and implementing top-down change. For each of these themes, we share ideas, practices, and actions as discussed by the conference speakers and attendees.","PeriodicalId":34851,"journal":{"name":"Journal of Statistics and Data Science Education","volume":"30 1","pages":"219 - 227"},"PeriodicalIF":1.7,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47260615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building a Multiple Linear Regression Model With LEGO Brick Data","authors":"Anna D. Peterson, Laura E. Ziegler","doi":"10.1080/26939169.2021.1946450","DOIUrl":"https://doi.org/10.1080/26939169.2021.1946450","url":null,"abstract":"Abstract We present an innovative activity that uses data about LEGO sets to help students self-discover multiple linear regressions. Students are guided to predict the price of a LEGO set posted on Amazon.com (Amazon price) using LEGO characteristics such as the number of pieces, the theme (i.e., product line), and the general size of the pieces. By starting with graphical displays and simple linear regression, students are able to develop additive multiple linear regression models as well as interaction models to accomplish the task. We provide examples of student responses to the activity and suggestions for teachers based on our experiences. Supplementary materials for this article are available online.","PeriodicalId":34851,"journal":{"name":"Journal of Statistics and Data Science Education","volume":"29 1","pages":"297 - 303"},"PeriodicalIF":1.7,"publicationDate":"2021-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/26939169.2021.1946450","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46325594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Might Temporal Logic Improve the Specification of Directed Acyclic Graphs (DAGs)?","authors":"G. Ellison","doi":"10.1080/26939169.2021.1936311","DOIUrl":"https://doi.org/10.1080/26939169.2021.1936311","url":null,"abstract":"Abstract Temporality-driven covariate classification had limited impact on: the specification of directed acyclic graphs (DAGs) by 85 novice analysts (medical undergraduates); or the risk of bias in DAG-informed multivariable models designed to generate causal inference from observational data. Only 71 students (83.5%) managed to complete the “Temporality-driven Covariate Classification” task, and fewer still completed the “DAG Specification” task (77.6%) or both tasks in succession (68.2%). Most students who completed the first task misclassified at least one covariate (84.5%), and misclassification rates were even higher among students who specified a DAG (92.4%). Nonetheless, across the 512 and 517 covariates considered by each of these tasks, “confounders” were far less likely to be misclassified (11/252, 4.4% and 8/261, 3.1%) than “mediators” (70/123, 56.9% and 56/115, 48.7%) or “competing exposures” (93/137, 67.9% and 86/138, 62.3%), respectively. Since estimates of total causal effects are biased in multivariable models that: fail to adjust for “confounders”; or adjust for “mediators” (or “consequences of the outcome”) misclassified as “confounders” or “competing exposures,” a substantial proportion of any models informed by the present study’s DAGs would have generated biased estimates of total causal effects (50/66, 76.8%); and this would have only been slightly lower for models informed by temporality-driven covariate classification alone (47/71, 66.2%). Supplementary materials for this article are available online.","PeriodicalId":34851,"journal":{"name":"Journal of Statistics and Data Science Education","volume":"29 1","pages":"202 - 213"},"PeriodicalIF":1.7,"publicationDate":"2021-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/26939169.2021.1936311","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44493667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}