{"title":"Network Analysis for the investigation of rater effects in language assessment: A comparison of ChatGPT vs human raters","authors":"Iasonas Lamprianou","doi":"10.1016/j.rmal.2025.100205","DOIUrl":null,"url":null,"abstract":"<div><div>A recent study by Yamashita (2024) showcases the usefulness of Many-Facet Rasch Model (MFRM) for the analysis of rater effects within the context of Automated Essay Scoring (AES). Building upon Yamashita's work, we break new ground by using Network Analysis (NA) to interrogate the same dataset comparing ChatGPT and human raters for the evaluation of 136 essays. We replicate the analysis of the original study and show a near-perfect agreement between the results of NA and MFRM. We extend the original study by providing strong evidence of halo effect in the data (including the ChatGPT ratings) and propose two new statistics to assess the consistency of raters. We also present simulation studies to show that the NA estimation algorithm is robust, even with small and sparse datasets. Finally, we provide practical guidelines for researchers seeking to use NA with their own datasets. We argue that NA can complement established methodologies, such as the MFRM, but can also be used independently, leveraging its strong visual representations. Relevant algorithms and R code are provided in the Online Appendix to support researchers and practitioners in replicating our findings.</div></div>","PeriodicalId":101075,"journal":{"name":"Research Methods in Applied Linguistics","volume":"4 2","pages":"Article 100205"},"PeriodicalIF":0.0000,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Research Methods in Applied Linguistics","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2772766125000266","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
A recent study by Yamashita (2024) showcases the usefulness of the Many-Facet Rasch Model (MFRM) for the analysis of rater effects in the context of Automated Essay Scoring (AES). Building on Yamashita's work, we break new ground by using Network Analysis (NA) to interrogate the same dataset, which compares ChatGPT and human raters in the evaluation of 136 essays. We replicate the analysis of the original study and show near-perfect agreement between the results of NA and the MFRM. We extend the original study by providing strong evidence of a halo effect in the data (including the ChatGPT ratings) and propose two new statistics for assessing rater consistency. We also present simulation studies showing that the NA estimation algorithm is robust even with small and sparse datasets. Finally, we provide practical guidelines for researchers seeking to apply NA to their own datasets. We argue that NA can complement established methodologies such as the MFRM, but can also be used independently, leveraging its strong visual representations. Relevant algorithms and R code are provided in the Online Appendix to support researchers and practitioners in replicating our findings.
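To make the general idea of a rater network concrete, the sketch below is a minimal, hypothetical R illustration, not the authors' Online Appendix code: raters (including a ChatGPT column) become nodes, and edges are weighted by how similarly two raters rank the same essays. The toy ratings matrix, the rater labels, and the choice of Spearman correlations as edge weights are all assumptions made for illustration.

```r
# Hypothetical sketch of a rater network; NOT the method from the paper.
# Nodes = raters, edge weights = pairwise Spearman correlations of their scores.
library(igraph)

set.seed(1)
# Toy data: 10 essays scored on a 1-5 scale by three human raters and "GPT".
ratings <- matrix(sample(1:5, 40, replace = TRUE), nrow = 10,
                  dimnames = list(NULL, c("R1", "R2", "R3", "GPT")))

# Edge weights: rank-based agreement between each pair of raters.
w <- cor(ratings, method = "spearman")
g <- graph_from_adjacency_matrix(w, mode = "undirected",
                                 weighted = TRUE, diag = FALSE)

# Thicker edges connect raters who order the essays more similarly,
# giving the kind of visual summary of rater behaviour the abstract describes.
plot(g, edge.width = 3 * abs(E(g)$weight))
```

In a real analysis one would use the estimation procedures and R code from the paper's Online Appendix rather than raw correlations; the sketch only conveys the node-and-edge representation on which such analyses rest.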