A Framework for Detecting AI-Generated Text in Research Publications
Paria Sarzaeim, Aarya Mayurpalsingh Doshi, Qusay H. Mahmoud
DOI: https://doi.org/10.58190/icat.2023.28
Published: 2023-08-19
Venue as listed: PROCEEDINGS OF THE III INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES IN MATERIALS SCIENCE, MECHANICAL AND AUTOMATION ENGINEERING: MIP: Engineering-III – 2021
Citation count: 0
Abstract
Generative artificial intelligence is increasingly used to create content in formats such as text, video, and images. Because misuse of these technologies can raise scientific and social challenges, content generated by humans needs to be distinguishable from content generated by AI. There are also concerns about the reliability and comprehensiveness of AI-generated content that has not been validated by a human. This paper presents a framework for detecting AI-generated text. The prototype implementation trains a model on predefined datasets and deploys it on a cloud-based service to predict whether a text was written by a human or by AI. The approach focuses specifically on scientific writing and research papers rather than general text. The proposed framework is compared with recently developed tools such as OpenAI Text Classifier, ZeroGPT, and Turnitin. The results show that a trained text classifier can be highly useful for detecting whether a text was written by a human or by AI. The source code and dataset are released as open source so that others can experiment with the prototype implementation and use it in future research.
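The train-then-predict step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the model, the features, the dataset, or the cloud service, so everything here is an assumption made for illustration: a multinomial Naive Bayes classifier over word counts stands in for the trained model, the class and function names are hypothetical, the four training sentences are invented toy data, and the cloud deployment step is left out entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Illustrative stand-in only; not the model used in the paper.
    """

    def fit(self, texts, labels):
        # Count documents per class and word occurrences per class.
        self.class_doc_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_doc_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # The vocabulary is every word seen in any training text.
        self.vocabulary = set()
        for counts in self.word_counts.values():
            self.vocabulary.update(counts)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior of each class for the text."""
        total_docs = sum(self.class_doc_counts.values())
        vocab_size = len(self.vocabulary)
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        scores = {}
        for label, doc_count in self.class_doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(doc_count / total_docs)  # class prior
            for token in tokens:
                # Smoothed likelihood of the token under this class.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        """Return the label with the highest log posterior."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy data, for illustration only (not the paper's dataset).
texts = [
    "we measured the samples twice and honestly the second run looked odd",
    "our results were messy but the trend held up after we reran the assay",
    "in conclusion this study provides a comprehensive overview of the topic",
    "overall this comprehensive analysis highlights the significance of the topic",
]
labels = ["human", "human", "ai", "ai"]

model = NaiveBayesTextClassifier().fit(texts, labels)
print(model.predict("this comprehensive overview highlights the topic"))
print(model.predict("we reran the assay and the trend looked odd"))
```

With only four training sentences the predictions mean nothing beyond the toy vocabulary. A realistic detector would need a large labeled corpus of scientific text and a held-out evaluation set, and any comparison with other tools would be run on that held-out set.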