AI-driven abstract generating: evaluating LLMs with a tailored prompt under the PRISMA-A framework

IF 3.1 | CAS Region 2, Medicine | Q1 DENTISTRY, ORAL SURGERY & MEDICINE
Gizem Boztaş Demir, Şule Gökmen, Yağızalp Süküt, Kübra Gülnur Topsakal, Serkan Görgülü
{"title":"人工智能驱动的抽象生成:在PRISMA-A框架下使用定制提示评估法学硕士。","authors":"Gizem Boztaş Demi̇r, Şule Gökmen, Yağızalp Süküt, Kübra Gülnur Topsakal, Serkan Görgülü","doi":"10.1186/s12903-025-06982-4","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>This study aimed to assess and compare ChatGPT-4o and Gemini Pro's ability to generate structured abstracts from full-text systematic reviews and meta-analyses in orthodontics, based on adherence to the PRISMA Abstract (PRISMA-A) Checklist, using a customised prompt developed for this purpose.</p><p><strong>Materials and methods: </strong>A total of 162 full-text systematic reviews and meta-analyses published in Q1-ranked orthodontic journals since January 2019 were included. Each full-text article was processed by ChatGPT-4o and Gemini Pro, using a PRISMA-A Checklist-aligned structured prompt. Outputs were scored using a tailored Overall quality Score OQS derived from 11 PRISMA-A checklist. Inter-rater and time-dependent reliability were assessed with Intraclass Correlation Coefficients (ICCs), and model outputs were compared using Mann-Whitney U tests.</p><p><strong>Results: </strong>Both models yielded satisfactory OQS in generating PRISMA-A checklist compliant abstracts; however, ChatGPT-4o consistently achieved higher scores than Gemini Pro. The most notable differences were observed in the \"Included Studies\" and \"Synthesis of Results\" sections, where ChatGPT-4o produced more complete and structurally coherent outputs. ChatGPT-4o achieved a mean OQS of 21.67 (SD 0.58) versus 21.00 (SD 0.71) for Gemini Pro, a difference that was highly significant (p < 0.001).</p><p><strong>Conclusions: </strong>Both LLMs demonstrated the ability to generate PRISMA-A-compliant abstracts from systematic reviews, with ChatGPT-4o consistently achieving higher quality scores than Gemini Pro. While tested in orthodontics, the approach holds potential for broader applications across evidence-based dental and medical research. Systematic reviews and meta-analyses are essential to evidence-based dentistry but can be challenging and time-consuming to report in accordance with established standards. The structured prompt developed in this study may assist researchers in generating PRISMA-A-compliant outputs more efficiently, helping to accelerate the completion and standardisation of high-level clinical evidence reporting.</p>","PeriodicalId":9072,"journal":{"name":"BMC Oral Health","volume":"25 1","pages":"1594"},"PeriodicalIF":3.1000,"publicationDate":"2025-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12512598/pdf/","citationCount":"0","resultStr":"{\"title\":\"AI-driven abstract generating: evaluating LLMs with a tailored prompt under the PRISMA-A framework.\",\"authors\":\"Gizem Boztaş Demi̇r, Şule Gökmen, Yağızalp Süküt, Kübra Gülnur Topsakal, Serkan Görgülü\",\"doi\":\"10.1186/s12903-025-06982-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>This study aimed to assess and compare ChatGPT-4o and Gemini Pro's ability to generate structured abstracts from full-text systematic reviews and meta-analyses in orthodontics, based on adherence to the PRISMA Abstract (PRISMA-A) Checklist, using a customised prompt developed for this purpose.</p><p><strong>Materials and methods: </strong>A total of 162 full-text systematic reviews and meta-analyses published in Q1-ranked orthodontic journals since January 2019 were included. 
Each full-text article was processed by ChatGPT-4o and Gemini Pro, using a PRISMA-A Checklist-aligned structured prompt. Outputs were scored using a tailored Overall quality Score OQS derived from 11 PRISMA-A checklist. Inter-rater and time-dependent reliability were assessed with Intraclass Correlation Coefficients (ICCs), and model outputs were compared using Mann-Whitney U tests.</p><p><strong>Results: </strong>Both models yielded satisfactory OQS in generating PRISMA-A checklist compliant abstracts; however, ChatGPT-4o consistently achieved higher scores than Gemini Pro. The most notable differences were observed in the \\\"Included Studies\\\" and \\\"Synthesis of Results\\\" sections, where ChatGPT-4o produced more complete and structurally coherent outputs. ChatGPT-4o achieved a mean OQS of 21.67 (SD 0.58) versus 21.00 (SD 0.71) for Gemini Pro, a difference that was highly significant (p < 0.001).</p><p><strong>Conclusions: </strong>Both LLMs demonstrated the ability to generate PRISMA-A-compliant abstracts from systematic reviews, with ChatGPT-4o consistently achieving higher quality scores than Gemini Pro. While tested in orthodontics, the approach holds potential for broader applications across evidence-based dental and medical research. Systematic reviews and meta-analyses are essential to evidence-based dentistry but can be challenging and time-consuming to report in accordance with established standards. The structured prompt developed in this study may assist researchers in generating PRISMA-A-compliant outputs more efficiently, helping to accelerate the completion and standardisation of high-level clinical evidence reporting.</p>\",\"PeriodicalId\":9072,\"journal\":{\"name\":\"BMC Oral Health\",\"volume\":\"25 1\",\"pages\":\"1594\"},\"PeriodicalIF\":3.1000,\"publicationDate\":\"2025-10-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12512598/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"BMC Oral Health\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1186/s12903-025-06982-4\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"DENTISTRY, ORAL SURGERY & MEDICINE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"BMC Oral Health","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1186/s12903-025-06982-4","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"DENTISTRY, ORAL SURGERY & MEDICINE","Score":null,"Total":0}
引用次数: 0

Abstract


Background: This study aimed to assess and compare ChatGPT-4o and Gemini Pro's ability to generate structured abstracts from full-text systematic reviews and meta-analyses in orthodontics, based on adherence to the PRISMA Abstract (PRISMA-A) Checklist, using a customised prompt developed for this purpose.
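The study's actual prompt is not reproduced in the abstract. The sketch below is a minimal illustration of what a PRISMA-A Checklist-aligned structured prompt could look like: the item list is drawn from the PRISMA for Abstracts checklist (the paper explicitly names "Included Studies" and "Synthesis of Results"), but the instruction wording and the `build_prompt` helper are assumptions, not the authors' prompt.

```python
# Illustrative only: the section names come from the PRISMA-A checklist,
# but the exact item subset and prompt wording are assumptions.
PRISMA_A_ITEMS = [
    "Title", "Objectives", "Eligibility Criteria", "Information Sources",
    "Risk of Bias", "Included Studies", "Synthesis of Results",
    "Interpretation", "Strengths and Limitations", "Funding", "Registration",
]

def build_prompt(full_text: str) -> str:
    """Assemble a structured prompt asking the model to draft an abstract
    with one labelled block per PRISMA-A item."""
    items = "\n".join(f"- {item}" for item in PRISMA_A_ITEMS)
    return (
        "You are drafting a structured abstract for a systematic review.\n"
        "Address every PRISMA for Abstracts (PRISMA-A) item listed below,\n"
        "using each item name as a labelled section heading, and rely only\n"
        "on information present in the article text.\n\n"
        f"PRISMA-A items:\n{items}\n\n"
        f"Article text:\n{full_text}"
    )
```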

Materials and methods: A total of 162 full-text systematic reviews and meta-analyses published in Q1-ranked orthodontic journals since January 2019 were included. Each full-text article was processed by ChatGPT-4o and Gemini Pro, using a PRISMA-A Checklist-aligned structured prompt. Outputs were scored using a tailored Overall Quality Score (OQS) derived from 11 PRISMA-A checklist items. Inter-rater and time-dependent reliability were assessed with Intraclass Correlation Coefficients (ICCs), and model outputs were compared using Mann-Whitney U tests.
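As a minimal sketch of this analysis pipeline, the snippet below runs the two named tests on made-up per-article OQS values (the abstract does not include the underlying data, so the scores and rater setup here are assumptions). `scipy.stats.mannwhitneyu` and `pingouin.intraclass_corr` are standard implementations of the tests the paper names.

```python
import numpy as np
import pandas as pd
import pingouin as pg
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(seed=42)
n = 162  # number of included systematic reviews, per the Methods

# Hypothetical integer OQS per article for each model (not the study's data)
oqs_gpt4o = rng.integers(20, 23, size=n)
oqs_gemini = rng.integers(19, 22, size=n)

# Between-model comparison with the Mann-Whitney U test, as in the paper
u_stat, p_value = mannwhitneyu(oqs_gpt4o, oqs_gemini, alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {p_value:.3g}")

# Inter-rater reliability via ICC: two hypothetical raters scoring the
# same generated abstracts, in the long format pingouin expects
scores = pd.DataFrame({
    "abstract": np.tile(np.arange(n), 2),
    "rater": np.repeat(["rater_1", "rater_2"], n),
    "oqs": np.concatenate([oqs_gpt4o,
                           oqs_gpt4o + rng.integers(-1, 2, size=n)]),
})
icc = pg.intraclass_corr(data=scores, targets="abstract",
                         raters="rater", ratings="oqs")
print(icc[["Type", "ICC", "CI95%"]])
```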

Results: Both models yielded satisfactory OQS in generating PRISMA-A checklist-compliant abstracts; however, ChatGPT-4o consistently achieved higher scores than Gemini Pro. The most notable differences were observed in the "Included Studies" and "Synthesis of Results" sections, where ChatGPT-4o produced more complete and structurally coherent outputs. ChatGPT-4o achieved a mean OQS of 21.67 (SD 0.58) versus 21.00 (SD 0.71) for Gemini Pro, a difference that was highly significant (p < 0.001).

Conclusions: Both LLMs demonstrated the ability to generate PRISMA-A-compliant abstracts from systematic reviews, with ChatGPT-4o consistently achieving higher quality scores than Gemini Pro. While tested in orthodontics, the approach holds potential for broader applications across evidence-based dental and medical research. Systematic reviews and meta-analyses are essential to evidence-based dentistry but can be challenging and time-consuming to report in accordance with established standards. The structured prompt developed in this study may assist researchers in generating PRISMA-A-compliant outputs more efficiently, helping to accelerate the completion and standardisation of high-level clinical evidence reporting.

Source journal: BMC Oral Health (DENTISTRY, ORAL SURGERY & MEDICINE)
CiteScore: 3.90
Self-citation rate: 6.90%
Articles published: 481
Review time: 6-12 weeks
About the journal: BMC Oral Health is an open access, peer-reviewed journal that considers articles on all aspects of the prevention, diagnosis and management of disorders of the mouth, teeth and gums, as well as related molecular genetics, pathophysiology, and epidemiology.