Can you spot the bot? Identifying AI-generated writing in college essays

Impact Factor: 3.8 (Q1, Education & Educational Research)
Tal Waltzer, Celeste Pilegard, Gail D. Heyman
{"title":"Can you spot the bot? Identifying AI-generated writing in college essays","authors":"Tal Waltzer, Celeste Pilegard, Gail D. Heyman","doi":"10.1007/s40979-024-00158-3","DOIUrl":null,"url":null,"abstract":"<p>The release of ChatGPT in 2022 has generated extensive speculation about how Artificial Intelligence (AI) will impact the capacity of institutions for higher learning to achieve their central missions of promoting learning and certifying knowledge. Our main questions were whether people could identify AI-generated text and whether factors such as expertise or confidence would predict this ability. The present research provides empirical data to inform these speculations through an assessment given to a convenience sample of 140 college instructors and 145 college students (Study 1) as well as to ChatGPT itself (Study 2). The assessment was administered in an online survey and included an AI Identification Test which presented pairs of essays: In each case, one was written by a college student during an in-class exam and the other was generated by ChatGPT. Analyses with binomial tests and linear modeling suggested that the AI Identification Test was challenging: On average, instructors were able to guess which one was written by ChatGPT only 70% of the time (compared to 60% for students and 63% for ChatGPT). Neither experience with ChatGPT nor content expertise improved performance. Even people who were confident in their abilities struggled with the test. ChatGPT responses reflected much more confidence than human participants despite performing just as poorly. ChatGPT responses on an AI Attitude Assessment measure were similar to those reported by instructors and students except that ChatGPT rated several AI uses more favorably and indicated substantially more optimism about the positive educational benefits of AI. The findings highlight challenges for scholars and practitioners to consider as they navigate the integration of AI in education.</p>","PeriodicalId":44838,"journal":{"name":"International Journal for Educational Integrity","volume":"19 1","pages":""},"PeriodicalIF":3.8000,"publicationDate":"2024-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal for Educational Integrity","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s40979-024-00158-3","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
Citations: 0

Abstract

The release of ChatGPT in 2022 has generated extensive speculation about how Artificial Intelligence (AI) will impact the capacity of institutions of higher learning to achieve their central missions of promoting learning and certifying knowledge. Our main questions were whether people could identify AI-generated text and whether factors such as expertise or confidence would predict this ability. The present research provides empirical data to inform these speculations through an assessment given to a convenience sample of 140 college instructors and 145 college students (Study 1), as well as to ChatGPT itself (Study 2). The assessment was administered in an online survey and included an AI Identification Test, which presented pairs of essays: in each pair, one essay was written by a college student during an in-class exam and the other was generated by ChatGPT. Analyses with binomial tests and linear modeling suggested that the AI Identification Test was challenging: on average, instructors identified the ChatGPT-written essay only 70% of the time (compared to 60% for students and 63% for ChatGPT). Neither experience with ChatGPT nor content expertise improved performance, and even people who were confident in their abilities struggled with the test. ChatGPT's responses reflected much more confidence than those of the human participants, despite it performing just as poorly. ChatGPT's responses on an AI Attitude Assessment were similar to those reported by instructors and students, except that ChatGPT rated several AI uses more favorably and indicated substantially more optimism about the positive educational benefits of AI. The findings highlight challenges for scholars and practitioners to consider as they navigate the integration of AI in education.
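The following is a minimal sketch of the kind of binomial test the abstract mentions: checking whether a group's identification accuracy exceeds the 50% chance level expected from guessing. The abstract reports accuracy rates but not raw counts or the number of essay pairs each participant judged, so the counts below are illustrative assumptions, not the authors' data or analysis code.

```python
# Sketch of a binomial test against chance performance on the
# AI Identification Test (illustrative values, not the study's data).
from scipy.stats import binomtest

# Assumption: 140 instructors each judged 6 essay pairs (the number of
# pairs per participant is not reported in the abstract).
n_trials = 140 * 6
n_correct = round(0.70 * n_trials)  # ~70% accuracy, as reported

# One-sided test: is accuracy greater than the 50% chance level?
result = binomtest(n_correct, n_trials, p=0.5, alternative="greater")
print(f"correct: {n_correct}/{n_trials}, p-value: {result.pvalue:.2e}")
```

With these assumed counts the test would reject chance-level guessing, which is consistent with the abstract's framing: accuracy is above chance, yet far from reliable enough to certify authorship in practice.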


Source journal: International Journal for Educational Integrity (Education & Educational Research)
CiteScore: 6.90
Self-citation rate: 26.10%
Articles per year: 25
Review time: 22 weeks