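Performance of a Generative Pre-Trained Transformer in Generating Scientific Abstracts in Dentistry: A Comparative Observational Study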

IF 1.9 · Zone 4 (Education) · JCR Q3, Dentistry, Oral Surgery & Medicine

European Journal of Dental Education · Pub Date: 2024-11-19 · DOI: 10.1111/eje.13057

Caio Alencar-Palha, Thais Ocampo, Thaisa Pinheiro Silva, Frederico Sampaio Neves, Matheus L. Oliveira

Citations: 0

Abstract

Objectives

To evaluate the performance of a Generative Pre-trained Transformer (GPT) in generating scientific abstracts in dentistry.

Methods

Ten scientific articles in dental radiology had their original abstracts collected, while another 10 articles had their methodology and results added to a ChatGPT prompt to generate an abstract. All abstracts were randomised and compiled into a single file for subsequent assessment. Five evaluators classified whether the abstract was generated by a human using a 5-point scale and provided justifications within seven aspects: formatting, information accuracy, orthography, punctuation, terminology, text fluency, and writing style. Furthermore, an online GPT detector provided “Human Score” values, and a plagiarism detector assessed similarity with existing literature.

Results

Sensitivity values for detecting human writing ranged from 0.20 to 0.70, with a mean of 0.58; specificity values ranged from 0.40 to 0.90, with a mean of 0.62; and accuracy values ranged from 0.50 to 0.80, with a mean of 0.60. Orthography and Punctuation were the most indicated aspects for the abstract generated by ChatGPT. The GPT detector revealed confidence levels for a “Human Score” of 16.9% for the AI-generated texts and plagiarism levels averaging 35%.
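The three metrics above follow from each evaluator's confusion counts, with human-written abstracts treated as the positive class. The following is a minimal sketch in Python, not the authors' analysis code. The counts are hypothetical, and it assumes each evaluator's 5-point rating has already been reduced to a binary human/AI call, a step the abstract does not describe.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion counts for one evaluator; 'positive' means human-written."""
    tp: int  # human-written abstracts judged human
    fn: int  # human-written abstracts judged AI-generated
    tn: int  # AI-generated abstracts judged AI-generated
    fp: int  # AI-generated abstracts judged human


def sensitivity(c: Counts) -> float:
    # Share of human-written abstracts correctly identified as human.
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    # Share of AI-generated abstracts correctly identified as AI.
    return c.tn / (c.tn + c.fp)


def accuracy(c: Counts) -> float:
    # Share of all abstracts classified correctly.
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical counts for one evaluator (not data from the study):
# 10 human-written and 10 AI-generated abstracts.
example = Counts(tp=6, fn=4, tn=7, fp=3)

sens = sensitivity(example)
spec = specificity(example)
acc = accuracy(example)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")

# With equal class sizes, accuracy is the mean of the other two metrics.
balanced = math.isclose(acc, (sens + spec) / 2)
print("accuracy equals mean of sensitivity and specificity:", balanced)
```

If every evaluator rated all 10 original and 10 generated abstracts, accuracy is the plain average of sensitivity and specificity. That is consistent with the reported means: 0.58 and 0.62 average to 0.60.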

Conclusion

The GPT exhibited commendable performance in generating scientific abstracts when evaluated by humans, as the generated abstracts were indistinguishable from those generated by humans. When evaluated by an online GPT detector, the use of GPT became apparent.

Source journal

European Journal of Dental Education (Medicine: Education, Scientific Disciplines)

CiteScore: 4.10
Self-citation rate: 16.70%
Articles published: 127
Review time: 6-12 weeks

About the journal: The aim of the European Journal of Dental Education is to publish original topical and review articles of the highest quality in the field of Dental Education. The Journal seeks to disseminate widely the latest information on curriculum development, teaching methodologies, assessment techniques and quality assurance in the fields of dental undergraduate and postgraduate education and dental auxiliary personnel training. The scope includes the dental educational aspects of the basic medical sciences, the behavioural sciences, the interface with medical education, information technology and distance learning, and educational audit. Papers embodying the results of high-quality educational research of relevance to dentistry are particularly encouraged, as are evidence-based reports of novel and established educational programmes and their outcomes.