Concordance of ChatGPT artificial intelligence decision-making in colorectal cancer multidisciplinary meetings: retrospective study
Dimitrios Chatziisaak, Pascal Burri, Moritz Sparn, Dieter Hahnloser, Thomas Steffen, Stephan Bischofberger
BJS Open, Volume 9, Issue 3, published 7 May 2025. DOI: 10.1093/bjsopen/zraf040
Abstract
Background: The objective of this study was to evaluate the concordance between therapeutic recommendations proposed by a multidisciplinary team meeting and those generated by a large language model (ChatGPT) for colorectal cancer. Although multidisciplinary teams represent the 'standard' for decision-making in cancer treatment, they require significant resources and may be susceptible to human bias. Artificial intelligence, particularly large language models such as ChatGPT, has the potential to support and streamline these decision-making processes. The present study examines the potential for integrating artificial intelligence into clinical practice by comparing multidisciplinary team decisions with those generated by ChatGPT.
Methods: A retrospective, single-centre study was conducted involving consecutive patients with newly diagnosed colorectal cancer discussed at our multidisciplinary team meeting. Pre- and post-therapeutic multidisciplinary team recommendations were assessed for concordance with the recommendations generated by ChatGPT-4.
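To illustrate how a case might be submitted to a GPT-4 model for a treatment recommendation, the following is a minimal sketch using the OpenAI chat completions API. The prompt wording, the case fields, and the model identifier are illustrative assumptions, not the authors' actual study protocol.

```python
# Illustrative sketch only: submitting an anonymized pretherapeutic case summary
# to a GPT-4 model and retrieving a treatment recommendation. The prompt and
# case details are hypothetical, not those used in the study.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

case_summary = (
    "72-year-old patient, newly diagnosed sigmoid adenocarcinoma, "
    "cT3 N1 M0, ASA II, no prior abdominal surgery."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a colorectal cancer tumour board. Propose a "
                "guideline-based treatment recommendation for the case below."
            ),
        },
        {"role": "user", "content": case_summary},
    ],
)

print(response.choices[0].message.content)
```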
Results: One hundred consecutive patients with newly diagnosed colorectal cancer of all stages were included. In the pretherapeutic discussions, complete concordance was observed in 72.5% of cases, partial concordance in 10.2%, and discordance in 17.3%. For post-therapeutic discussions, complete concordance increased to 82.8%; 11.8% of decisions displayed partial concordance and 5.4% demonstrated discordance. Discordance was more frequent in patients older than 77 years and in those with an American Society of Anesthesiologists (ASA) classification of III or higher.
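For clarity on how concordance rates of this kind are derived, the sketch below tallies per-patient grades into percentages. The category counts are hypothetical placeholders (the abstract reports percentages only, not raw counts); the grading scheme of complete, partial, and discordant follows the categories named in the study.

```python
# Minimal sketch: tallying per-patient concordance grades into percentages.
# The counts below are hypothetical and do not reproduce the study data.
from collections import Counter

# Hypothetical per-patient grades: "complete", "partial", or "discordant"
grades = ["complete"] * 71 + ["partial"] * 10 + ["discordant"] * 17

counts = Counter(grades)
total = len(grades)
for category in ("complete", "partial", "discordant"):
    n = counts[category]
    print(f"{category}: {n}/{total} = {100 * n / total:.1f}%")
```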
Conclusion: There is substantial concordance between the recommendations generated by ChatGPT and those provided by traditional multidisciplinary team meetings, indicating the potential utility of artificial intelligence in supporting clinical decision-making for colorectal cancer management.