An explainable framework for assisting the detection of AI-generated textual content

IF 6.7 | CAS Tier 1, Computer Science | JCR Q1, Computer Science, Artificial Intelligence
Sen Yan, Zhiyi Wang, David Dobolyi
{"title":"An explainable framework for assisting the detection of AI-generated textual content","authors":"Sen Yan,&nbsp;Zhiyi Wang,&nbsp;David Dobolyi","doi":"10.1016/j.dss.2025.114498","DOIUrl":null,"url":null,"abstract":"<div><div>The recent development of generative AI (GenAI) algorithms has allowed machines to create new content in a realistic way, driving the spread of AI-generated content (AIGC) on the Internet. However, generative AI models and AIGC have exacerbated several societal challenges such as security threats (e.g., misinformation), trust issues, ethical concerns, and intellectual property regulation, calling for effective detection methods and a better understanding of AI-generated vs. human-written content. In this paper, we focus on AI-generated texts produced by large language models (LLMs) and extend prior detection methods by proposing a novel framework that combines semantic information and linguistic features. Based on potential semantic and linguistic differences in AI vs. human writing, we design our Semantic-Linguistic-Detector (SemLinDetector) framework by integrating a transformer-based semantic encoder and a linguistic encoder with parallel linguistic representations. By comparing a series of benchmark models on datasets collected from various LLMs and human writers in multiple domains, our experiments show that the proposed detection framework outperforms other benchmarks in a consistent and robust manner. Moreover, our model interpretability analysis showcases our framework's potential to help understand the reasoning behind prediction outcomes and identify patterns of differences in AI-generated and human-written content. Our research adds to the growing space of GenAI by proposing an effective and responsible detection system to address the risks and challenges of GenAI, offering implications for researchers and practitioners to better understand and regulate AIGC.</div></div>","PeriodicalId":55181,"journal":{"name":"Decision Support Systems","volume":"196 ","pages":"Article 114498"},"PeriodicalIF":6.7000,"publicationDate":"2025-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Decision Support Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167923625000995","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

The recent development of generative AI (GenAI) algorithms has allowed machines to create new content in a realistic way, driving the spread of AI-generated content (AIGC) on the Internet. However, generative AI models and AIGC have exacerbated several societal challenges such as security threats (e.g., misinformation), trust issues, ethical concerns, and intellectual property regulation, calling for effective detection methods and a better understanding of AI-generated vs. human-written content. In this paper, we focus on AI-generated texts produced by large language models (LLMs) and extend prior detection methods by proposing a novel framework that combines semantic information and linguistic features. Based on potential semantic and linguistic differences in AI vs. human writing, we design our Semantic-Linguistic-Detector (SemLinDetector) framework by integrating a transformer-based semantic encoder and a linguistic encoder with parallel linguistic representations. By comparing a series of benchmark models on datasets collected from various LLMs and human writers in multiple domains, our experiments show that the proposed detection framework outperforms other benchmarks in a consistent and robust manner. Moreover, our model interpretability analysis showcases our framework's potential to help understand the reasoning behind prediction outcomes and identify patterns of differences in AI-generated and human-written content. Our research adds to the growing space of GenAI by proposing an effective and responsible detection system to address the risks and challenges of GenAI, offering implications for researchers and practitioners to better understand and regulate AIGC.
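The abstract describes a two-branch design: a transformer-based semantic encoder and a linguistic encoder over parallel linguistic feature representations, whose outputs are combined to classify text as AI-generated or human-written. The sketch below is a minimal illustration of that general idea, not the paper's implementation; the class name SemLinDetectorSketch, all dimensions, the generic nn.TransformerEncoder used as a stand-in for the pretrained semantic encoder, the mean-pooling step, and the concatenation-based fusion are assumptions.

```python
import torch
import torch.nn as nn

class SemLinDetectorSketch(nn.Module):
    """Illustrative two-branch detector: a semantic branch (transformer encoder)
    and a linguistic branch (MLP over hand-crafted linguistic features),
    fused by concatenation before a binary AI-vs-human classification head.
    Module names and dimensions are assumptions, not the paper's."""

    def __init__(self, vocab_size=30522, d_model=256, n_heads=4, n_layers=2,
                 n_ling_features=32, d_ling=64, n_classes=2):
        super().__init__()
        # Semantic branch: token embeddings + transformer encoder (a stand-in for
        # the pretrained transformer-based semantic encoder described in the abstract).
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=512, batch_first=True)
        self.semantic_encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # Linguistic branch: encodes pre-computed linguistic feature vectors
        # (e.g., readability, lexical diversity, POS ratios) into a dense representation.
        self.linguistic_encoder = nn.Sequential(
            nn.Linear(n_ling_features, d_ling), nn.ReLU(),
            nn.Linear(d_ling, d_ling), nn.ReLU(),
        )
        # Fusion + classification head over the concatenated representations.
        self.classifier = nn.Linear(d_model + d_ling, n_classes)

    def forward(self, token_ids, ling_features):
        # token_ids: (batch, seq_len) integer token ids
        # ling_features: (batch, n_ling_features) pre-computed linguistic features
        h = self.semantic_encoder(self.embed(token_ids))        # (batch, seq_len, d_model)
        sem = h.mean(dim=1)                                     # mean-pool to a document vector
        lin = self.linguistic_encoder(ling_features)            # (batch, d_ling)
        return self.classifier(torch.cat([sem, lin], dim=-1))   # (batch, n_classes) logits


if __name__ == "__main__":
    model = SemLinDetectorSketch()
    tokens = torch.randint(0, 30522, (4, 128))   # dummy token ids
    feats = torch.randn(4, 32)                    # dummy linguistic features
    print(model(tokens, feats).shape)             # torch.Size([4, 2])
```

In a full pipeline, the semantic branch would typically be a pretrained language model and the linguistic features would come from a separate feature-extraction step applied to the raw text before training.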
Source Journal
Decision Support Systems
Engineering & Technology: Computer Science, Artificial Intelligence
CiteScore: 14.70
Self-citation rate: 6.70%
Articles per year: 119
Review time: 13 months
Journal description: The common thread of articles published in Decision Support Systems is their relevance to theoretical and technical issues in the support of enhanced decision making. The areas addressed may include foundations, functionality, interfaces, implementation, impacts, and evaluation of decision support systems (DSSs).