{"title":"Philosophy as Integral to a Data Science Ethics Course","authors":"Sara Colando, Johanna Hardin","doi":"arxiv-2310.02444","DOIUrl":"https://doi.org/arxiv-2310.02444","url":null,"abstract":"There is wide agreement that ethical considerations are a valuable aspect of a data science curriculum, and to that end, many data science programs offer courses in data science ethics. There are not always, however, explicit connections between data science ethics and the centuries-old work on ethics within the discipline of philosophy. Here, we present a framework for bringing together key data science practices with ethical topics. The ethical topics were collated from sixteen data science ethics courses with public-facing syllabi and reading lists. We encourage individuals who are teaching data science ethics to engage with the philosophical literature and its connection to current data science practices, which are rife with potentially morally charged decision points.","PeriodicalId":501323,"journal":{"name":"arXiv - STAT - Other Statistics","volume":"27 11","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526885","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The fiducial-Bayes fusion: A general theory of statistical inference","authors":"Russell J. Bowater","doi":"arxiv-2310.01533","DOIUrl":"https://doi.org/arxiv-2310.01533","url":null,"abstract":"An overview is presented of a general theory of statistical inference that is referred to as the fiducial-Bayes fusion. This theory combines organic fiducial inference and Bayesian inference. The aim is that the reader is given a clear summary of the conceptual framework of the fiducial-Bayes fusion as well as pointers to further reading about its more technical aspects. Particular attention is paid to the issue of how much importance should be attached to the role of Bayesian inference within this framework. The appendix contains a substantive example of the application of the theory of the fiducial-Bayes fusion, which supplements various other examples of the application of this theory that are referenced in the paper.","PeriodicalId":501323,"journal":{"name":"arXiv - STAT - Other Statistics","volume":"59 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Science at the Singularity","authors":"David Donoho","doi":"arxiv-2310.00865","DOIUrl":"https://doi.org/arxiv-2310.00865","url":null,"abstract":"A purported `AI Singularity' has been in the public eye recently. Mass media and US national political attention focused on `AI Doom' narratives hawked by social media influencers. The European Commission is announcing initiatives to forestall `AI Extinction'. In my opinion, `AI Singularity' is the wrong narrative for what's happening now; recent happenings signal something else entirely. Something fundamental to computation-based research really changed in the last ten years. In certain fields, progress is dramatically more rapid than previously, as the fields undergo a transition to frictionless reproducibility (FR). This transition markedly changes the rate of spread of ideas and practices, affects mindsets, and erases memories of much that came before. The emergence of frictionless reproducibility follows from the maturation of 3 data science principles in the last decade. Those principles involve data sharing, code sharing, and competitive challenges, however implemented in the particularly strong form of frictionless open services. Empirical Machine Learning (EML) is today's leading adherent field, and its consequent rapid changes are responsible for the AI progress we see. Still, other fields can and do benefit when they adhere to the same principles. Many rapid changes from this maturation are misidentified. The advent of FR in EML generates a steady flow of innovations; this flow stimulates outsider intuitions that there's an emergent superpower somewhere in AI. This opens the way for PR to push worrying narratives: not only `AI Extinction', but also the supposed monopoly of big tech on AI research. The helpful narrative observes that the superpower of EML is adherence to frictionless reproducibility practices; these practices are responsible for the striking progress in AI that we see everywhere.","PeriodicalId":501323,"journal":{"name":"arXiv - STAT - Other Statistics","volume":"21 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A review of undergraduate courses in Design of Experiments offered by American universities","authors":"Alan R. Vazquez, Xiaocong Xuan","doi":"arxiv-2309.16961","DOIUrl":"https://doi.org/arxiv-2309.16961","url":null,"abstract":"Design of Experiments (DoE) is a relevant class to undergraduate programs in the sciences, because it teaches students how to plan, conduct, and analyze experiments. In the literature on DoE, there are several contributions to its pedagogy, such as easy-to-use class experiments, virtual experiments, and software for constructing experimental designs. However, there are virtually no systematic assessments of the actual DoE pedagogy. To address this issue, we build the first database of undergraduate DoE courses offered in the United States of America. The database has records on courses offered from 2019 to 2022 by the best universities in the US News Best National Universities ranking of 2022. Specifically, it has data on 18 general and content-specific features of 206 courses. To study the DoE pedagogy, we analyze the database using descriptive statistics and text mining. Our main findings include that most undergraduate DoE courses follow the textbook \"Design and Analysis of Experiments\" by Douglas Montgomery, use the R software, and emphasize the learning of multifactor designs, randomization restrictions, data analysis, and applications. Based on our analysis, we provide instructors with recommendations and teaching material to enhance their DoE courses. The database and material are included in the supplementary material.","PeriodicalId":501323,"journal":{"name":"arXiv - STAT - Other Statistics","volume":"59 5","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classroom Community amid Covid-19: A Mixed-Methods Study of Undergraduate Students in Introductory Mathematics and Statistics","authors":"Shira Viel, Maria Tackett, Sarwari Das, Joseph Choo","doi":"arxiv-2309.11739","DOIUrl":"https://doi.org/arxiv-2309.11739","url":null,"abstract":"A strong sense of classroom community is associated with many positive learning outcomes and is a critical contributor to undergraduate students' persistence in STEM, particularly for women and students of color. This manuscript describes a mixed-methods investigation into the relationship between classroom community and course attributes in introductory undergraduate mathematics and statistics courses, mediated by student demographics. The primary quantitative instrument is the validated Classroom Community Scale - Short Form survey. Data were collected from online courses in the 2020-21 academic year along with hybrid and in-person courses in Fall 2021 and analyzed using structural equation modeling. These quantitative results are complemented and contextualized by thematic and textual analyses of focus group data gathered using a newly developed protocol piloted at the close of Fall 2021. All data come from a highly selective private university in the United States. While the study was conducted amidst the height of the Covid-19 pandemic, potential ramifications extend more broadly. These preliminary practical implications of the study include the value of synchronous participation in fostering connectedness and the importance of attending to students' personal identities in understanding their experiences of belonging.","PeriodicalId":501323,"journal":{"name":"arXiv - STAT - Other Statistics","volume":"21 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rational Aversion to Information","authors":"Sven Neth","doi":"arxiv-2309.12374","DOIUrl":"https://doi.org/arxiv-2309.12374","url":null,"abstract":"Is more information always better? Or are there some situations in which more information can make us worse off? Good (1966) argues that expected utility maximizers should always accept more information if the information is cost-free and relevant. But Good's argument presupposes that you are certain you will update by conditionalization. If we relax this assumption and allow agents to be uncertain about updating, these agents can be rationally required to reject free and relevant information. Since there are good reasons to be uncertain about updating, rationality can require you to prefer ignorance.","PeriodicalId":501323,"journal":{"name":"arXiv - STAT - Other Statistics","volume":"28 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How does international guidance for statistical practice align with the ASA Ethical Guidelines?","authors":"Rochelle E. Tractenberg, Jennifer Park","doi":"arxiv-2309.08713","DOIUrl":"https://doi.org/arxiv-2309.08713","url":null,"abstract":"Gillikin (2017) defines a 'practice standard' as a document to 'define the way the profession's body of knowledge is ethically translated into day-to-day activities' (Gillikin 2017, p. 1). Such documents fulfill three objectives: they 1) define the profession; 2) communicate uniform standards to stakeholders; and 3) reduce conflicts between personal and professional conduct (Gillikin 2017, p. 2). However, there are many guidelines - this is due to different purposes that guidance writers may have, as well as to the fact that there are different audiences for the many guidance documents. The existence of diverse statements does not necessarily make it clear that there are commonalities; and while some statements are explicitly aspirational, professionals as well as the public need to know that ethically-trained practitioners follow accepted practice standards. This paper applies the methodological approach described in Tractenberg (2023) and demonstrated in Park and Tractenberg (2023) to study alignment among international guidance for official statistics, and between these guidance documents and the ASA Ethical Guidelines for Statistical Practice functioning as an ethical practice standard (Tractenberg, 2022-A, 2022-B; after Gillikin 2017). In the spirit of exchanging experiences and lessons learned, we discuss how our findings could inform closer examination, clarification, and, if beneficial, possible revision of guidance in the future.","PeriodicalId":501323,"journal":{"name":"arXiv - STAT - Other Statistics","volume":"3 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Creating Community in a Data Science Classroom","authors":"David Kane","doi":"arxiv-2309.06983","DOIUrl":"https://doi.org/arxiv-2309.06983","url":null,"abstract":"A community is a collection of people who know and care about each other. The vast majority of college courses are not communities. This is especially true of statistics and data science courses, both because our classes are larger and because we are more likely to lecture. However, it is possible to create a community in your classroom. This article offers an idiosyncratic set of practices for creating community. I have used these techniques successfully in first and second semester statistics courses with enrollments ranging from 40 to 120. The key steps are knowing names, cold calling, classroom seating, a shallow learning curve, Study Halls, Recitations and rotating-one-on-one final project presentations.","PeriodicalId":501323,"journal":{"name":"arXiv - STAT - Other Statistics","volume":"57 6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How do ASA Ethical Guidelines Support U.S. Guidelines for Official Statistics?","authors":"Jennifer Park, Rochelle E. Tractenberg","doi":"arxiv-2309.07180","DOIUrl":"https://doi.org/arxiv-2309.07180","url":null,"abstract":"In 2022, the American Statistical Association revised its Ethical Guidelines for Statistical Practice. Originally issued in 1982, these Guidelines describe responsibilities of the 'ethical statistical practitioner' to their profession, to their research subjects, as well as to their community of practice. These guidelines are intended as a framework to assist decision-making by statisticians working across academic, research, and government environments. For the first time, the 2022 Guidelines describe the ethical obligations of organizations and institutions that use statistical practice. This paper examines alignment between the ASA Ethical Guidelines and other long-established normative guidelines for US official statistics: the OMB Statistical Policy Directives 1, 2, and 2a, NASEM Principles and Practices, and the OMB Data Ethics Tenets. Our analyses ask how the recently updated ASA Ethical Guidelines can support these guidelines for federal statistics and data science. The analysis uses a form of qualitative content analysis, the alignment model, to identify patterns of alignment, and potential for tensions, within and across guidelines. The paper concludes with recommendations to policy makers when using ethical guidance to establish parameters for policy change and the administrative and technical controls that necessarily follow.","PeriodicalId":501323,"journal":{"name":"arXiv - STAT - Other Statistics","volume":"27 7","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Monografía de Estadística Bayesiana","authors":"Arturo Erdely, Eduardo Gutiérrez-Peña","doi":"arxiv-2309.06601","DOIUrl":"https://doi.org/arxiv-2309.06601","url":null,"abstract":"Course notes about an introduction to Bayesian Statistics. First, an explanation of the Bayesian paradigm is motivated and explained in detail (first three chapters). Then, a brief introduction to the basics about Decision Theory in chapter four, which is self-contained, with the purpose of introducing parametric Bayesian inference as a decision problem in chapter five.","PeriodicalId":501323,"journal":{"name":"arXiv - STAT - Other Statistics","volume":"28 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138526863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}