Emily Rinderknecht, Anna Schmelzer, Anton Kravchuk, Christopher Goßler, Johannes Breyer, Christian Gilfrich, Maximilian Burger, Simon Engelmann, Veronika Saberi, Clemens Kirschner, Dominik von Winning, Roman Mayr, Christian Wülfing, Hendrik Borgmann, Stephan Buse, Maximilian Haas, Matthias May
Current Oncology, vol. 32, no. 2. Published 2025-02-11. Publication type: Journal Article. DOI: 10.3390/curroncol32020102 (https://doi.org/10.3390/curroncol32020102). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11854015/pdf/. Impact factor: 2.8; JCR: Q2 (Oncology).
Citations: 0
Leveraging Large Language Models for High-Quality Lay Summaries: Efficacy of ChatGPT-4 with Custom Prompts in a Consecutive Series of Prostate Cancer Manuscripts.
Abstract
Clear and accessible lay summaries are essential for enhancing the public understanding of scientific knowledge. This study aimed to evaluate whether ChatGPT-4 can generate high-quality lay summaries that are both accurate and comprehensible for prostate cancer research in Current Oncology. To achieve this, it systematically assessed ChatGPT-4's ability to summarize 80 prostate cancer articles published in the journal between July 2022 and June 2024 using two distinct prompt designs: a basic "simple" prompt and an enhanced "extended" prompt. Readability was assessed using established metrics, including the Flesch-Kincaid Reading Ease (FKRE), while content quality was evaluated with a 5-point Likert scale for alignment with source material. The extended prompt demonstrated significantly higher readability (median FKRE: 40.9 vs. 29.1, p < 0.001), better alignment with quality thresholds (86.2% vs. 47.5%, p < 0.001), and reduced the required reading level, making content more accessible. Both prompt designs produced content with high comprehensiveness (median Likert score: 5). This study highlights the critical role of tailored prompt engineering in optimizing large language models (LLMs) for medical communication. Limitations include the exclusive focus on prostate cancer, the use of predefined prompts without iterative refinement, and the absence of a direct comparison with human-crafted summaries. These findings underscore the transformative potential of LLMs like ChatGPT-4 to streamline the creation of lay summaries, reduce researchers' workload, and enhance public engagement. Future research should explore prompt variability, incorporate patient feedback, and extend applications across broader medical domains.
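The readability metric named in the abstract can be illustrated with a short sketch. It assumes that "FKRE" refers to the standard Flesch Reading Ease formula, 206.835 − 1.015 × (words / sentences) − 84.6 × (syllables / words). The function names `count_syllables` and `flesch_reading_ease` are hypothetical, and the syllable counter is a rough vowel-group heuristic, not the tool the authors used, so its scores will differ from those reported in the study.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the study's method):
    count vowel groups and drop a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a final 'e' as silent, except in '-le' endings such as "table".
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Standard Flesch Reading Ease score; higher means easier to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    plain = "The cat sat on the mat."
    dense = "Radical prostatectomy demonstrates considerable oncological efficacy."
    print(round(flesch_reading_ease(plain), 3))
    print(round(flesch_reading_ease(dense), 3))
```

Higher scores indicate easier text, so the reported medians (40.9 for the extended prompt vs. 29.1 for the simple prompt) correspond to the extended prompt producing shorter sentences, shorter words, or both.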
About the journal:
Current Oncology is a peer-reviewed, Canadian-based, internationally respected journal. It is a multidisciplinary medium in which health care workers in the field of cancer therapy in Canada report on and review progress in the management of this disease.
We encourage submissions from all fields of cancer medicine, including radiation oncology, surgical oncology, medical oncology, pediatric oncology, pathology, and cancer rehabilitation and survivorship. Articles published in the journal typically contain information that is relevant directly to clinical oncology practice, and have clear potential for application to the current or future practice of cancer medicine.