Rubric Development and Validation for Assessing Tasks' Solving via AI Chatbots

Impact Factor: 2.4 | JCR Q1 | Education & Educational Research
Mohammad Hmoud, Hadeel Swaity, Eman Anjass, Eva María Aguaded-Ramírez
{"title":"通过人工智能聊天机器人评估任务解决情况的评分标准开发与验证","authors":"Mohammad Hmoud, Hadeel Swaity, Eman Anjass, Eva María Aguaded-Ramírez","doi":"10.34190/ejel.22.6.3292","DOIUrl":null,"url":null,"abstract":"This research aimed to develop and validate a rubric to assess Artificial Intelligence (AI) chatbots' effectiveness in accomplishing tasks, particularly within educational contexts. Given the rapidly growing integration of AI in various sectors, including education, a systematic and robust tool for evaluating AI chatbot performance is essential. This investigation involved a rigorous process including expert involvement to ensure content validity, as well as the application of statistical tests for assessing internal consistency and reliability. Factor analysis also revealed two significant domains, \"Quality of Content\" and \"Quality of Expression\", which further enhanced the construct validity of the evaluation scale. The results from this investigation robustly affirm the reliability and validity of the developed rubric, thus marking a significant advancement in the sphere of AI chatbot performance evaluation within educational contexts. Nonetheless, the study simultaneously emphasizes the requirement for additional validation research, specifically those entailing a variety of tasks and diverse AI chatbots, to further corroborate these findings. The ramifications of this research are profound, offering both researchers and practitioners engaged in chatbot development and evaluation a comprehensive and validated framework for the assessment of chatbot performance.","PeriodicalId":46105,"journal":{"name":"Electronic Journal of e-Learning","volume":null,"pages":null},"PeriodicalIF":2.4000,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Rubric Development and Validation for Assessing Tasks' Solving via AI Chatbots\",\"authors\":\"Mohammad Hmoud, Hadeel Swaity, Eman Anjass, Eva María Aguaded-Ramírez\",\"doi\":\"10.34190/ejel.22.6.3292\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This research aimed to develop and validate a rubric to assess Artificial Intelligence (AI) chatbots' effectiveness in accomplishing tasks, particularly within educational contexts. Given the rapidly growing integration of AI in various sectors, including education, a systematic and robust tool for evaluating AI chatbot performance is essential. This investigation involved a rigorous process including expert involvement to ensure content validity, as well as the application of statistical tests for assessing internal consistency and reliability. Factor analysis also revealed two significant domains, \\\"Quality of Content\\\" and \\\"Quality of Expression\\\", which further enhanced the construct validity of the evaluation scale. The results from this investigation robustly affirm the reliability and validity of the developed rubric, thus marking a significant advancement in the sphere of AI chatbot performance evaluation within educational contexts. Nonetheless, the study simultaneously emphasizes the requirement for additional validation research, specifically those entailing a variety of tasks and diverse AI chatbots, to further corroborate these findings. 
The ramifications of this research are profound, offering both researchers and practitioners engaged in chatbot development and evaluation a comprehensive and validated framework for the assessment of chatbot performance.\",\"PeriodicalId\":46105,\"journal\":{\"name\":\"Electronic Journal of e-Learning\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.4000,\"publicationDate\":\"2024-05-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Electronic Journal of e-Learning\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.34190/ejel.22.6.3292\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"EDUCATION & EDUCATIONAL RESEARCH\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Electronic Journal of e-Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.34190/ejel.22.6.3292","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}
Citations: 0

Abstract

This research aimed to develop and validate a rubric for assessing Artificial Intelligence (AI) chatbots' effectiveness in accomplishing tasks, particularly within educational contexts. Given the rapidly growing integration of AI across sectors, including education, a systematic and robust tool for evaluating AI chatbot performance is essential. The investigation followed a rigorous process, including expert review to ensure content validity and statistical tests of internal consistency and reliability. Factor analysis revealed two significant domains, "Quality of Content" and "Quality of Expression", which further strengthened the construct validity of the evaluation scale. The results robustly affirm the reliability and validity of the developed rubric, marking a significant advancement in AI chatbot performance evaluation within educational contexts. Nonetheless, the study also emphasizes the need for further validation studies, specifically ones covering a variety of tasks and diverse AI chatbots, to corroborate these findings. The ramifications of this research are far-reaching, offering researchers and practitioners engaged in chatbot development and evaluation a comprehensive, validated framework for assessing chatbot performance.
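The abstract names two standard psychometric steps: testing internal consistency/reliability and running a factor analysis that surfaced two domains. As a rough illustration of how such checks are commonly implemented, the sketch below computes Cronbach's alpha (one common internal-consistency statistic; the paper does not specify which tests or software it used) and fits a two-factor model with scikit-learn. The simulated ratings, the 8-item rubric, and the per-domain item split are assumptions for illustration only, not the authors' data or procedure.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Synthetic ratings for 60 chatbot responses on a hypothetical 8-item rubric,
# driven by two latent traits standing in for "Quality of Content" (items 1-4)
# and "Quality of Expression" (items 5-8), plus noise. Not the authors' data.
latent = rng.normal(size=(60, 2))
true_loadings = np.array([[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4)  # shape (8, 2)
ratings = latent @ true_loadings.T + rng.normal(scale=0.5, size=(60, 8))

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (observations x items) score matrix."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)

# Internal consistency per assumed domain.
print(f"alpha, items 1-4: {cronbach_alpha(ratings[:, :4]):.3f}")
print(f"alpha, items 5-8: {cronbach_alpha(ratings[:, 4:]):.3f}")

# Exploratory two-factor model, mirroring the two domains the paper reports.
fa = FactorAnalysis(n_components=2, random_state=0).fit(ratings)
print("Estimated loadings (items x factors):")
print(np.round(fa.components_.T, 2))
```

In a result like the paper's, each item would load strongly on exactly one of the two factors, which is what supports interpreting the rubric as measuring two distinct domains.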
Source journal: Electronic Journal of e-Learning (Education & Educational Research)
CiteScore: 5.90
Self-citation rate: 18.20%
Annual articles: 34
Review time: 20 weeks