Exploring the role of artificial intelligence, large language models: Comparing patient-focused information and clinical decision support capabilities to the gynecologic oncology guidelines.
Lee Reicher, Guy Lutsker, Nadav Michaan, Dan Grisaru, Ido Laskov
International Journal of Gynecology & Obstetrics, pp. 419-427. Published 2025-02-01 (Epub 2024-08-20). DOI: 10.1002/ijgo.15869 (https://doi.org/10.1002/ijgo.15869)
Citation count: 0
Abstract
Gynecologic cancer requires personalized care to improve outcomes. Large language models (LLMs) hold the potential to provide intelligent question-answering with reliable information about medical queries in clear and plain English, which can be understood by both healthcare providers and patients. We aimed to evaluate two freely available LLMs (ChatGPT and Google's Bard) in answering questions regarding the management of gynecologic cancer. The LLMs' performance was evaluated by developing a set of questions that addressed common gynecologic oncologic findings from a patient's perspective, together with more complex questions designed to elicit recommendations from a clinician's perspective. Each question was presented to the LLM interface, and the responses generated by the artificial intelligence (AI) model were recorded. The responses were assessed for adherence to the National Comprehensive Cancer Network and European Society of Gynecological Oncology guidelines. This evaluation aimed to determine the accuracy and appropriateness of the information provided by LLMs. We showed that the models provided largely appropriate responses to questions regarding common cervical cancer screening tests and BRCA-related questions. Answers to complex and controversial gynecologic oncology cases were less useful, as assessed against the common guidelines. ChatGPT and Bard lacked knowledge of regional guideline variations; however, they provided practical and multifaceted advice to patients and caregivers regarding the next steps of management and follow-up. We conclude that LLMs may have a role as an adjunct informational tool to improve outcomes.
About the journal:
The International Journal of Gynecology & Obstetrics publishes articles on all aspects of basic and clinical research in the fields of obstetrics and gynecology and related subjects, with emphasis on matters of worldwide interest.