Distinguishing Human-Written and ChatGPT-Generated Text Using Machine Learning
Authors: Hosam Alamleh, A. A. AlQahtani, A. ElSaid
Published in: 2023 Systems and Information Engineering Design Symposium (SIEDS), 2023-04-27
DOI: 10.1109/SIEDS58326.2023.10137767 (https://doi.org/10.1109/SIEDS58326.2023.10137767)
Citations: 2
Abstract
The use of sophisticated Artificial Intelligence (AI) language models, including ChatGPT, has led to growing concerns regarding the ability to distinguish between human-written and AI-generated text in academic and scholarly settings. This study seeks to evaluate the effectiveness of machine learning algorithms in differentiating between human-written and AI-generated text. To accomplish this, we collected responses from Computer Science students for both essay and programming assignments. We then trained and evaluated several machine learning models, including Logistic Regression (LR), Decision Trees (DT), Support Vector Machines (SVM), Neural Networks (NN), and Random Forests (RF), based on accuracy, computational efficiency, and confusion matrices. By comparing the performance of these models, we identified the most suitable one for the task at hand. The use of machine learning algorithms for detecting text generated by AI has significant potential for applications in content moderation, plagiarism detection, and quality control for text generation systems, thereby contributing to the preservation of academic integrity in the face of rapidly advancing AI-driven content generation.
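The abstract names three evaluation criteria (accuracy, computational efficiency, and confusion matrices) but does not describe the features or implementation used. The following is a minimal, standard-library sketch of how such a comparison could be scored, not the authors' code. The `length_rule` classifier, the label names, and the four sample texts are hypothetical stand-ins invented for illustration; any trained model exposing a per-text prediction function could be passed to `evaluate` in its place.

```python
import time

HUMAN, AI = "human", "ai"  # hypothetical label names


def confusion_matrix(y_true, y_pred, labels=(HUMAN, AI)):
    """Count outcomes as counts[true_label][predicted_label]."""
    counts = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    return counts


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def evaluate(name, predict, texts, y_true):
    """Score one classifier on accuracy, confusion matrix, and prediction time."""
    start = time.perf_counter()
    y_pred = [predict(text) for text in texts]
    elapsed = time.perf_counter() - start
    return {
        "model": name,
        "accuracy": accuracy(y_true, y_pred),
        "confusion": confusion_matrix(y_true, y_pred),
        "seconds": elapsed,
    }


def length_rule(text):
    """Toy stand-in classifier (assumption): long answers are labelled AI."""
    return AI if len(text.split()) > 8 else HUMAN


# Made-up examples, purely for illustration.
texts = [
    "I think loops are fun",
    "As an overview, recursion is a technique in which a function calls itself",
    "My code broke again so I rewrote the whole parser last night",
    "Sorting arranges items in order",
]
labels = [HUMAN, AI, HUMAN, AI]

report = evaluate("length_rule", length_rule, texts, labels)
print(report["model"], report["accuracy"], report["confusion"])
```

Calling `evaluate` once per candidate model and comparing the resulting dictionaries gives a side-by-side view of the three criteria. The confusion matrix is worth keeping alongside accuracy because the two error types differ in cost here: labelling a student's own work as AI-generated is a different failure from missing AI-generated text.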