Piyanut Xuto, Piyaporn Prasitwattanaseree, Tareewan Chaiboonruang, Sujitra Chaiwuth, Podjanee Khwanngern, Chadchadaporn Nuntakwang, Karnjana Nimarangkul, Wara Suwansin, Lawitra Khiaokham, Daniel Bressington
{"title":"Development and Evaluation of an AI-Assisted Answer Assessment (4A) for Cognitive Assessments in Nursing Education.","authors":"Piyanut Xuto, Piyaporn Prasitwattanaseree, Tareewan Chaiboonruang, Sujitra Chaiwuth, Podjanee Khwanngern, Chadchadaporn Nuntakwang, Karnjana Nimarangkul, Wara Suwansin, Lawitra Khiaokham, Daniel Bressington","doi":"10.3390/nursrep15030080","DOIUrl":null,"url":null,"abstract":"<p><p>Artificial intelligence (AI) can potentially enhance cognitive assessment practices in maternal and child health nursing education. <b>Objectives</b>: To evaluate the reliability, accuracy and precision, and external validity of an AI-assisted answer assessment (4A) program for cognitive assessments in nursing education. <b>Methods</b>: This study is a validation study. Initially, 170 nursing students from northern Thailand participated, with 52 randomly selected for detailed testing. Agreement testing between the 4A program and human experts was conducted using the intraclass correlation coefficient (ICC). Accuracy and precision testing compared 4A scores with human expert assessments via the McNemar test. External validation involved 138 participants to compare the 4A program's assessments against national examination outcomes using logistic regression. <b>Results</b>: Results indicated a high level of consistency between the 4A program and human experts (ICC = 0.886). With an accuracy of 0.808 and a precision of 0.913, compared to the human expert's accuracy of 0.923 and precision of 1.000. The McNemar test (χ<sup>2</sup> = 0.4, <i>p</i> = 0.527) showed no significant difference in evaluation performance between AI and human experts. Higher scores on the 4A program significantly predicted success in the national nursing examination (OR: 1.124, <i>p</i> = 0.031). <b>Conclusions</b>: The 4A program demonstrates potential in reliably assessing nursing students' cognitive abilities and predicting exam success. 
This study advocates for the continued integration of AI in educational assessments and the importance of refining AI systems to better align with traditional assessment methods.</p>","PeriodicalId":40753,"journal":{"name":"Nursing Reports","volume":"15 3","pages":""},"PeriodicalIF":2.4000,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11945599/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nursing Reports","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/nursrep15030080","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"NURSING","Score":null,"Total":0}
Citations: 0
Abstract
Artificial intelligence (AI) can potentially enhance cognitive assessment practices in maternal and child health nursing education. Objectives: To evaluate the reliability, accuracy and precision, and external validity of an AI-assisted answer assessment (4A) program for cognitive assessments in nursing education. Methods: This is a validation study. Initially, 170 nursing students from northern Thailand participated, with 52 randomly selected for detailed testing. Agreement between the 4A program and human experts was tested using the intraclass correlation coefficient (ICC). Accuracy and precision testing compared 4A scores with human expert assessments via the McNemar test. External validation involved 138 participants, whose 4A assessments were compared against national examination outcomes using logistic regression. Results: Results indicated a high level of consistency between the 4A program and human experts (ICC = 0.886). The 4A program achieved an accuracy of 0.808 and a precision of 0.913, compared to the human experts' accuracy of 0.923 and precision of 1.000. The McNemar test (χ2 = 0.4, p = 0.527) showed no significant difference in evaluation performance between AI and human experts. Higher scores on the 4A program significantly predicted success in the national nursing examination (OR: 1.124, p = 0.031). Conclusions: The 4A program demonstrates potential for reliably assessing nursing students' cognitive abilities and predicting exam success. This study advocates for the continued integration of AI in educational assessments and highlights the importance of refining AI systems to better align with traditional assessment methods.
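The accuracy, precision, and McNemar statistics reported above can be illustrated with a minimal sketch. This is not the study's actual analysis code; the function names and the example discordant-pair counts are hypothetical, and the McNemar statistic here uses the uncorrected form on binary grader-agreement data.

```python
import math

def accuracy(pred, truth):
    """Fraction of items where the grader's pass/fail call matches the reference."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def precision(pred, truth):
    """Of the items marked correct (1), the fraction that truly are correct."""
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))
    return tp / (tp + fp)

def mcnemar(b, c):
    """McNemar chi-square (1 df, no continuity correction) from the two
    discordant cell counts of a 2x2 paired table: b = AI right / expert wrong,
    c = AI wrong / expert right. Returns (chi2, p-value)."""
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical illustration: 3 vs 5 discordant pairs
chi2, p = mcnemar(3, 5)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

A non-significant p-value (as in the study's χ2 = 0.4, p = 0.527) indicates no detectable difference between the AI grader and the human experts on paired items.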
About the Journal:
Nursing Reports is an open access, peer-reviewed, online-only journal that aims to influence the art and science of nursing by making rigorously conducted research accessible to, and understood by, the full spectrum of practicing nurses, academics, educators and interested members of the public. The journal represents an exhilarating opportunity to make a unique and significant contribution to nursing and the wider community by addressing topics, theories and issues that concern the whole field of Nursing Science, including research, practice, policy and education. The primary intent of the journal is to present scientifically sound and influential empirical and theoretical studies, critical reviews and open debates to the global community of nurses. Short reports, opinions and insight into the plight of nurses the world over will provide a voice for those of all cultures, governments and perspectives. The emphasis of Nursing Reports is on ensuring that the highest quality of evidence and contribution is made available to the greatest number of nurses. Nursing Reports aims to make original, evidence-based, peer-reviewed research available to the global community of nurses and to interested members of the public. In addition, reviews of the literature, open debates on professional issues and short reports from around the world are invited to contribute to our vibrant and dynamic journal. All published work will adhere to the most stringent ethical standards and journalistic principles of fairness, worth and credibility. The journal publishes Editorials, Original Articles, Review Articles, Critical Debates, Short Reports from Around the Globe and Letters to the Editor.