Unsupervised Large Language Models to Identify Topics in Cancer Center Patient Portal Messages
Ji Hyun Chang, Amir Ashraf-Ganjouei, Isabel Friesner, Ryzen Benson, Travis Zack, Sumi Sinha, Jason Chan, Steve Braunstein, Amy Lin, Lisa Singer, Julian C Hong
JCO Clinical Cancer Informatics, volume 9, e2500102. Published 2025-10-01. Journal Article.
DOI: 10.1200/CCI-25-00102 (https://doi.org/10.1200/CCI-25-00102)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490804/pdf/
Abstract
Purpose: The increasing use of patient portal messages has enhanced patient-provider communication. However, the high volume of these messages has also contributed to physician burnout.
Methods: Patient-generated portal messages sent to a single cancer center from 2011 to 2023 were extracted. BERTopic, a natural language processing topic modeling technique based on large language models, was optimized. For further categorization, the topic words were labeled using GPT-4, and the labels were then reviewed by two oncologists. Uniform Manifold Approximation and Projection (UMAP) was used for dimensionality reduction and topic visualization. Changes in message volume over time were assessed using a Student's t test.
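The volume comparison described in the Methods can be illustrated with a minimal sketch of a two-sample Student's t statistic. This is not the authors' analysis code: the function name `pooled_t_statistic`, the equal-variance (pooled) form, and the monthly counts below are all assumptions made for illustration, and the counts are invented rather than taken from the study.

```python
import math
import statistics


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a two-sample Student's t test.

    Assumes equal variances in both groups (the classic pooled form).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance, n - 1 denominator
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_b - mean_a) / std_err, df


# Hypothetical monthly message counts, invented for illustration only.
early_year = [1900, 2000, 2100, 2050, 2150, 2200]
late_year = [41000, 42500, 43000, 44000, 44500, 45600]

t_stat, df = pooled_t_statistic(early_year, late_year)
print(f"t = {t_stat:.2f}, df = {df}")
```

A P value would follow from comparing the statistic against a t distribution with `df` degrees of freedom. The standard library has no t distribution, so that step is left out here.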
Results: A total of 2,280,851 messages were analyzed. The monthly average number of messages increased from 2,071 in 2012 to 43,430 in 2022 (P < .001). There was a significant rise in message volume after the COVID-19 pandemic, with a posterior probability of a causal effect of 96.4% (P = .04). Scheduling-related messages were the most frequent across departments, whereas symptoms and health concerns were the second or third most common topics. Topics on prescriptions and medications were more common in medical oncology and surgical oncology than in radiation oncology and gynecologic oncology. Despite concurrent institutional changes in self-scheduling systems, scheduling-related messages did not decrease over time.
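The per-department topic ranking reported in the Results can be sketched as a simple frequency count once each message has a topic label. The function name `rank_topics_by_department`, the `(department, topic)` input shape, and the sample rows are assumptions for illustration; the rows are invented and do not reproduce the study data.

```python
from collections import Counter, defaultdict


def rank_topics_by_department(labeled_messages):
    """Map each department to its topics sorted from most to least frequent."""
    counts = defaultdict(Counter)
    for department, topic in labeled_messages:
        counts[department][topic] += 1
    return {
        department: [topic for topic, _ in topic_counts.most_common()]
        for department, topic_counts in counts.items()
    }


# Hypothetical (department, topic) pairs, invented for illustration only.
labeled_messages = [
    ("medical oncology", "scheduling"),
    ("medical oncology", "scheduling"),
    ("medical oncology", "scheduling"),
    ("medical oncology", "prescriptions"),
    ("medical oncology", "prescriptions"),
    ("medical oncology", "symptoms"),
    ("radiation oncology", "scheduling"),
    ("radiation oncology", "scheduling"),
    ("radiation oncology", "symptoms"),
]

rankings = rank_topics_by_department(labeled_messages)
for department, topics in rankings.items():
    print(department, "->", topics)
```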
Conclusion: The substantial increase in patient portal messages, particularly scheduling-related inquiries, underscores the need for streamlined communication to reduce the burden on health care providers. These findings call for strategies to manage message volume and mitigate physician burnout, and they lay the groundwork for future artificial intelligence-driven triage systems to improve message management and patient care.