Leveraging artificial intelligence chatbots for anemia prevention: A comparative study of ChatGPT-3.5, Copilot, and Gemini outputs against Google Search results
Authors: Shinya Ito, Emi Furukawa, Tsuyoshi Okuhara, Hiroko Okada, Takahiro Kiuchi
Journal: PEC Innovation, Volume 6, Article 100390
Publication date: 2025-04-01
DOI: 10.1016/j.pecinn.2025.100390
URL: https://www.sciencedirect.com/science/article/pii/S2772628225000196
Citations: 0
Abstract
Aim
This study evaluated the understandability, actionability, and readability of text on anemia generated by artificial intelligence (AI) chatbots.
Methods
This cross-sectional study compared texts generated by ChatGPT-3.5, Microsoft Copilot, and Google Gemini at three levels: “normal,” “6th grade,” and “PEMAT-P version.” Additionally, texts retrieved from the top eight Google Search results for relevant keywords were included for comparison. All texts were written in Japanese. The Japanese version of the PEMAT-P was used to assess understandability and actionability, while jReadability was used for readability. A systematic comparison was conducted to identify the strengths and weaknesses of each source.
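For readers unfamiliar with PEMAT-P scoring, the method above can be sketched as follows. This is a minimal illustration, not the study's code: each PEMAT-P item is rated agree (1) or disagree (0), "not applicable" items are excluded, and the percentage of agreed items among applicable ones is the score, with ≥70 % commonly treated as adequate. The item names below are hypothetical stand-ins, not the actual Japanese PEMAT-P items.

```python
def pemat_score(ratings):
    """Percentage PEMAT-P score.

    ratings: dict mapping item name -> 1 (agree), 0 (disagree),
    or None (not applicable; excluded from the denominator).
    """
    applicable = [v for v in ratings.values() if v is not None]
    if not applicable:
        raise ValueError("no applicable items")
    return 100.0 * sum(applicable) / len(applicable)


# Illustrative ratings for one text (item names are hypothetical).
ratings = {
    "uses_plain_language": 1,
    "defines_medical_terms": 1,
    "uses_visual_aids": None,   # not applicable for plain text output
    "breaks_content_into_steps": 0,
}

score = pemat_score(ratings)
print(f"score: {score:.1f} %, adequate (>= 70 %): {score >= 70}")
```

Understandability and actionability are scored separately over their own item subsets; the ≥70 % threshold reported in the Results applies to each score.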
Results
Texts generated by Gemini at the 6th-grade level (n = 26, 86.7 %) and PEMAT-P version (n = 27, 90.0 %), as well as ChatGPT-3.5 at the normal level (n = 21, 80.8 %), achieved significantly higher scores (≥70 %) for understandability and actionability compared to Google Search results (n = 17, 25.4 %, p < 0.001). For readability, Copilot and Gemini texts demonstrated significantly higher percentages of "very readable" to "somewhat difficult" levels than texts retrieved from Google Search (p < 0.001 to p = 0.007).
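The group differences above can be illustrated with a two-proportion z-test on the share of texts meeting the ≥70 % threshold. This is a hedged sketch, not the paper's analysis code, and the group sizes (30 Gemini texts, 67 Google Search texts) are hypothetical counts chosen only to make the arithmetic concrete.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Two-sided two-proportion z-test with a pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF via math.erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value


# Hypothetical counts: 26 of 30 Gemini 6th-grade texts vs. 17 of 67
# Google Search texts reaching the >= 70 % PEMAT-P threshold.
z, p = two_proportion_z(26, 30, 17, 67)
print(f"z = {z:.2f}, p = {p:.2g}")
```

With these illustrative counts the difference is highly significant (p well below 0.001), consistent in direction with the p < 0.001 reported above.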
Innovation
This study is the first to objectively and quantitatively evaluate the understandability and actionability of educational materials on anemia prevention. By utilizing PEMAT-P and jReadability, the study demonstrated the superiority of Gemini in terms of understandability and readability through measurable data. This innovative approach highlights the potential of AI chatbots as a novel method for providing public health information and addressing health disparities.
Conclusion
AI-generated texts on anemia were found to be more readable and easier to understand than traditional web-based texts, with Gemini demonstrating the highest level of understandability. Moving forward, prompt refinements will be needed so that AI chatbots better integrate the visual elements that support actionable guidance.