{"title":"Passed the Turing Test: Living in Turing Futures","authors":"Bernardo Gonçalves","doi":"arxiv-2409.07656","DOIUrl":"https://doi.org/arxiv-2409.07656","url":null,"abstract":"The world has seen the emergence of machines based on pretrained models, transformers, also known as generative artificial intelligences for their ability to produce various types of content, including text, images, audio, and synthetic data. Without resorting to preprogramming or special tricks, their intelligence grows as they learn from experience, and to ordinary people, they can appear human-like in conversation. This means that they can pass the Turing test, and that we are now living in one of many possible Turing futures where machines can pass for what they are not. However, the learning machines that Turing imagined would pass his imitation tests were machines inspired by the natural development of the low-energy human cortex. They would be raised like human children and naturally learn the ability to deceive an observer. These \"child machines,\" Turing hoped, would be powerful enough to have an impact on society and nature.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"12 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Still More Shades of Null: A Benchmark for Responsible Missing Value Imputation","authors":"Falaah Arif Khan, Denys Herasymuk, Nazar Protsiv, Julia Stoyanovich","doi":"arxiv-2409.07510","DOIUrl":"https://doi.org/arxiv-2409.07510","url":null,"abstract":"We present Shades-of-NULL, a benchmark for responsible missing value imputation. Our benchmark includes state-of-the-art imputation techniques, and embeds them into the machine learning development lifecycle. We model realistic missingness scenarios that go beyond Rubin's classic Missing Completely at Random (MCAR), Missing At Random (MAR) and Missing Not At Random (MNAR), to include multi-mechanism missingness (when different missingness patterns co-exist in the data) and missingness shift (when the missingness mechanism changes between training and test). Another key novelty of our work is that we evaluate imputers holistically, based on the predictive performance, fairness and stability of the models that are trained and tested on the data they produce. We use Shades-of-NULL to conduct a large-scale empirical study involving 20,952 experimental pipelines, and find that, while there is no single best-performing imputation approach for all missingness types, interesting performance patterns do emerge when comparing imputer performance in simpler vs. more complex missingness scenarios. Further, while predictive performance, fairness and stability can be seen as orthogonal, we identify trade-offs among them that arise due to the combination of missingness scenario, the choice of an imputer, and the architecture of the model trained on the data post-imputation. We make Shades-of-NULL publicly available, and hope to enable researchers to comprehensively and rigorously evaluate new missing value imputation methods on a wide range of evaluation metrics, in plausible and socially meaningful missingness scenarios.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection and Classification of Twitter Users' Opinions on Drought Crises in Iran Using Machine Learning Techniques","authors":"Somayeh Labafi, Leila Rabiei, Zeinab Rajabi","doi":"arxiv-2409.07611","DOIUrl":"https://doi.org/arxiv-2409.07611","url":null,"abstract":"The main objective of this research is to identify and classify the opinions of Persian-speaking Twitter users related to drought crises in Iran and subsequently develop a model for detecting these opinions on the platform. To achieve this, a model has been developed using machine learning and text mining methods to detect the opinions of Persian-speaking Twitter users regarding the drought issues in Iran. The statistical population for the research included 42,028 drought-related tweets posted over a one-year period. These tweets were extracted from Twitter using keywords related to the drought crises in Iran. Subsequently, a sample of 2,300 tweets was qualitatively analyzed, labeled, categorized, and examined. Next, a four-category classification of users' opinions regarding drought crises and Iranians' resilience to these crises was identified. Based on these four categories, a machine learning model based on logistic regression was trained to predict and detect various opinions in Twitter posts. The developed model exhibits an accuracy of 66.09% and an F-score of 60%, indicating that this model has good performance for detecting Iranian Twitter users' opinions regarding drought crises. The ability to detect opinions regarding drought crises on platforms like Twitter using machine learning methods can intelligently represent the resilience level of the Iranian society in the face of these crises, and inform policymakers in this area about changes in public opinion.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Fairer Health Recommendations: finding informative unbiased samples via Word Sense Disambiguation","authors":"Gavin Butts, Pegah Emdad, Jethro Lee, Shannon Song, Chiman Salavati, Willmar Sosa Diaz, Shiri Dori-Hacohen, Fabricio Murai","doi":"arxiv-2409.07424","DOIUrl":"https://doi.org/arxiv-2409.07424","url":null,"abstract":"There have been growing concerns around high-stake applications that rely on models trained with biased data, which consequently produce biased predictions, often harming the most vulnerable. In particular, biased medical data could cause health-related applications and recommender systems to create outputs that jeopardize patient care and widen disparities in health outcomes. A recent framework titled Fairness via AI posits that, instead of attempting to correct model biases, researchers must focus on their root causes by using AI to debias data. Inspired by this framework, we tackle bias detection in medical curricula using NLP models, including LLMs, and evaluate them on a gold standard dataset containing 4,105 excerpts annotated by medical experts for bias from a large corpus. We build on previous work by coauthors which augments the set of negative samples with non-annotated text containing social identifier terms. However, some of these terms, especially those related to race and ethnicity, can carry different meanings (e.g., \"white matter of spinal cord\"). To address this issue, we propose the use of Word Sense Disambiguation models to refine dataset quality by removing irrelevant sentences. We then evaluate fine-tuned variations of BERT models as well as GPT models with zero- and few-shot prompting. We found LLMs, considered SOTA on many NLP tasks, unsuitable for bias detection, while fine-tuned BERT models generally perform well across all evaluated metrics.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"157 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Legal Fact Prediction: Task Definition and Dataset Construction","authors":"Junkai Liu, Yujie Tong, Hui Huang, Shuyuan Zheng, Muyun Yang, Peicheng Wu, Makoto Onizuka, Chuan Xiao","doi":"arxiv-2409.07055","DOIUrl":"https://doi.org/arxiv-2409.07055","url":null,"abstract":"Legal facts refer to the facts that can be proven by acknowledged evidence in a trial. They form the basis for the determination of court judgments. This paper introduces a novel NLP task: legal fact prediction, which aims to predict the legal fact based on a list of evidence. The predicted facts can instruct the parties and their lawyers involved in a trial to strengthen their submissions and optimize their strategies during the trial. Moreover, since real legal facts are difficult to obtain before the final judgment, the predicted facts also serve as an important basis for legal judgment prediction. We construct a benchmark dataset consisting of evidence lists and ground-truth legal facts for real civil loan cases, LFPLoan. Our experiments on this dataset show that this task is non-trivial and requires further considerable research efforts.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Insuring Uninsurable Risks from AI: The State as Insurer of Last Resort","authors":"Cristian Trout","doi":"arxiv-2409.06672","DOIUrl":"https://doi.org/arxiv-2409.06672","url":null,"abstract":"Many experts believe that AI systems will sooner or later pose uninsurable risks, including existential risks. This creates an extreme judgment-proof problem: few if any parties can be held accountable ex post in the event of such a catastrophe. This paper proposes a novel solution: a government-provided, mandatory indemnification program for AI developers. The program uses risk-priced indemnity fees to induce socially optimal levels of care. Risk-estimates are determined by surveying experts, including indemnified developers. The Bayesian Truth Serum mechanism is employed to incent honest and effortful responses. Compared to alternatives, this approach arguably better leverages all private information, and provides a clearer signal to indemnified developers regarding what risks they must mitigate to lower their fees. It's recommended that collected fees be used to help fund the safety research developers need, employing a fund matching mechanism (Quadratic Financing) to induce an optimal supply of this public good. Under Quadratic Financing, safety research projects would compete for private contributions from developers, signaling how much each is to be supplemented with public funds.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"66 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Liability and Insurance for Catastrophic Losses: the Nuclear Power Precedent and Lessons for AI","authors":"Cristian Trout","doi":"arxiv-2409.06673","DOIUrl":"https://doi.org/arxiv-2409.06673","url":null,"abstract":"As AI systems become more autonomous and capable, experts warn of them potentially causing catastrophic losses. Drawing on the successful precedent set by the nuclear power industry, this paper argues that developers of frontier AI models should be assigned limited, strict, and exclusive third party liability for harms resulting from Critical AI Occurrences (CAIOs) - events that cause or easily could have caused catastrophic losses. Mandatory insurance for CAIO liability is recommended to overcome developers' judgment-proofness, mitigate winner's curse dynamics, and leverage insurers' quasi-regulatory abilities. Based on theoretical arguments and observations from the analogous nuclear power context, insurers are expected to engage in a mix of causal risk-modeling, monitoring, lobbying for stricter regulation, and providing loss prevention guidance in the context of insuring against heavy-tail risks from AI. While not a substitute for regulation, clear liability assignment and mandatory insurance can help efficiently allocate resources to risk-modeling and safe design, facilitating future regulatory efforts.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"73 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142223991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding and Mitigating the Impacts of Differentially Private Census Data on State Level Redistricting","authors":"Christian Cianfarani, Aloni Cohen","doi":"arxiv-2409.06801","DOIUrl":"https://doi.org/arxiv-2409.06801","url":null,"abstract":"Data from the Decennial Census is published only after applying a disclosure avoidance system (DAS). Data users were shaken by the adoption of differential privacy in the 2020 DAS, a radical departure from past methods. The change raises the question of whether redistricting law permits, forbids, or requires taking account of the effect of disclosure avoidance. Such uncertainty creates legal risks for redistricters, as Alabama argued in a lawsuit seeking to prevent the 2020 DAS's deployment. We consider two redistricting settings in which a data user might be concerned about the impacts of privacy preserving noise: drawing equal population districts and litigating voting rights cases. What discrepancies arise if the user does nothing to account for disclosure avoidance? How might the user adapt her analyses to mitigate those discrepancies? We study these questions by comparing the official 2010 Redistricting Data to the 2010 Demonstration Data -- created using the 2020 DAS -- in an analysis of millions of algorithmically generated state legislative redistricting plans. In both settings, we observe that an analyst may come to incorrect conclusions if they do not account for noise. With minor adaptations, though, the underlying policy goals remain achievable: tweaking selection criteria enables a redistricter to draw balanced plans, and illustrative plans can still be used as evidence of the maximum number of majority-minority districts that are possible in a geography. At least for state legislatures, Alabama's claim that differential privacy \"inhibits a State's right to draw fair lines\" appears unfounded.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalized Knowledge Tracing through Student Representation Reconstruction and Class Imbalance Mitigation","authors":"Zhiyu Chen, Wei Ji, Jing Xiao, Zitao Liu","doi":"arxiv-2409.06745","DOIUrl":"https://doi.org/arxiv-2409.06745","url":null,"abstract":"Knowledge tracing is a technique that predicts students' future performance by analyzing their learning process through historical interactions with intelligent educational platforms, enabling a precise evaluation of their knowledge mastery. Recent studies have achieved significant progress by leveraging powerful deep neural networks. These models construct complex input representations using questions, skills, and other auxiliary information but overlook individual student characteristics, which limits the capability for personalized assessment. Additionally, the available datasets in the field exhibit class imbalance issues. The models that simply predict all responses as correct without substantial effort can yield impressive accuracy. In this paper, we propose PKT, a novel approach for personalized knowledge tracing. PKT reconstructs representations from sequences of interactions with a tutoring platform to capture latent information about the students. Moreover, PKT incorporates focal loss to prioritize minority classes, thereby achieving more balanced predictions. Extensive experimental results on four publicly available educational datasets demonstrate the advanced predictive performance of PKT in comparison with 16 state-of-the-art models. To ensure the reproducibility of our research, the code is publicly available at https://anonymous.4open.science/r/PKT.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"23 13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142227781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The limits of progress in the digital era","authors":"Joaquin Luque","doi":"arxiv-2409.05082","DOIUrl":"https://doi.org/arxiv-2409.05082","url":null,"abstract":"The concept of progress clearly percolates the activities in science, technology, economy and society. It is a driving vector (probably the main vector) of our daily activity as researchers. The InThisGen initiative, proudly displayed in places across the University of Berkeley campus, and its headline lemma (what can we change in a single generation?) are clear exponents of the underlying assumption that progress is not only possible but also desirable. But about the concept of progress two major concerns arise. First of all, progress means some kind of going forward, that is a direction in a journey. But deciding the way in the route clearly implies that we are explicitly or implicitly defining the goals, as individuals and as society. That is, the concept of progress has a set of underlying values. Additionally, the conceptual paradigm in scientific research (and probably in the whole spirit of our times) is assuming some kind of endless progress. It is true that many technological innovations and their subsequent impact on society have found resistance, from Luddites to ecologist movements. But the last 150 years (the age of our university) have been witness of an enormous and general increase in knowledge, wealth and welfare, showing how progress can be sustained in the long-term and positively influence the human beings and the society. In this contribution we will try to discuss these bounds, addressing the limits of materials, scientific knowledge and technological know-how. We will mainly focus on the limitations in technological knowledge in the software design, a key aspect of the digital era. Our main thesis, which will be addressed through the paper, is that there are intrinsic limits to technological knowledge and the concept of progress should take them into account.","PeriodicalId":501112,"journal":{"name":"arXiv - CS - Computers and Society","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142184060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}