Elucidating Celecoxib's Preventive Effect in Capecitabine-Induced Hand-Foot Syndrome Using Medical Natural Language Processing
Authors: Masami Tsuchiya, Yoshimasa Kawazoe, Kiminori Shimamoto, Tomohisa Seki, Shungo Imai, Hayato Kizaki, Emiko Shinohara, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki, Satoko Hori
Journal: JCO Clinical Cancer Informatics, volume 9, e2500096 (published 2025-08-01; Epub 2025-08-12)
DOI: 10.1200/CCI-25-00096 (https://doi.org/10.1200/CCI-25-00096)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12341754/pdf/
Citation count: 0
Abstract
Purpose: Capecitabine, an oral anticancer agent, frequently causes hand-foot syndrome (HFS), affecting patients' quality of life and treatment adherence. However, such symptomatic toxicities are often difficult to detect in structured electronic health record (EHR) data. This study primarily aimed to validate a natural language processing (NLP) approach to identifying capecitabine-induced HFS from unstructured clinical text and demonstrate its application in evaluating medication-associated adverse event trends in real-world settings.
Methods: We conducted a retrospective cohort study using EHRs from the University of Tokyo Hospital (2004-2021). HFS cases were identified using the MedNERN-CR-JA NLP model. After propensity score matching, we compared capecitabine users with and without celecoxib and assessed time to HFS onset using Cox proportional hazards models. NLP-based HFS detection was validated through manual annotation of aggregated clinical notes. Negative control and sensitivity analyses were performed to assess robustness.
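The abstract does not say which matching algorithm was used, so the following is only a minimal sketch of one common variant: greedy 1:1 nearest-neighbor matching without replacement on precomputed propensity scores. The function name `greedy_match`, the caliper of 0.05, and all patient IDs and scores are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def greedy_match(
    treated: Dict[str, float],
    controls: Dict[str, float],
    caliper: float = 0.05,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching without replacement.

    `treated` and `controls` map a patient ID to a precomputed propensity
    score. A pair is rejected when the scores differ by more than `caliper`.
    """
    available = dict(controls)  # controls that have not been matched yet
    pairs: List[Tuple[str, str]] = []
    # Iterate in sorted ID order so the result is reproducible.
    for t_id, t_score in sorted(treated.items()):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical scores, for illustration only (not study data).
treated_scores = {"T1": 0.62, "T2": 0.35, "T3": 0.90}
control_scores = {"C1": 0.60, "C2": 0.33, "C3": 0.36, "C4": 0.10}
matched_pairs = greedy_match(treated_scores, control_scores)
print(matched_pairs)
```

In this toy input, `T3` stays unmatched because no remaining control falls within the caliper. A time-to-event model such as Cox regression would then be fitted on the matched pairs only.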
Results: Among 44,502 patients with cancer, 669 capecitabine users were analyzed. HFS incidence was significantly higher among capecitabine users than among nonusers (hazard ratio [HR], 1.93 [95% CI, 1.48 to 2.52]; P < .001). Celecoxib use showed a suggestive but not statistically significant association with reduced HFS risk (HR, 0.51 [95% CI, 0.24 to 1.07]; P = .073). The NLP model identified HFS accurately, with a precision of 0.875, recall of 1.000, and F1 score of 0.933, based on manual annotation of patient-level clinical notes. Outcome trends remained consistent when manually annotated HFS case labels were used instead of NLP-detected events, supporting the method's robustness.
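The reported figures can be checked for internal consistency with a few lines of arithmetic. The sketch below recomputes the F1 score from the stated precision and recall, and back-calculates an approximate two-sided P value from each hazard ratio and its 95% CI. It assumes the intervals are Wald intervals that are symmetric on the log scale, which the abstract does not state. The helper names `f1_score` and `wald_p_from_ci` are illustrative.

```python
import math


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def wald_p_from_ci(hr: float, lower: float, upper: float) -> float:
    """Approximate two-sided P value from a hazard ratio and its 95% CI.

    Assumes a Wald interval symmetric around log(hr); this is an
    assumption, not something the abstract confirms.
    """
    # Standard error of log(HR), recovered from the width of the log-scale CI.
    se = (math.log(upper) - math.log(lower)) / (2 * 1.959964)
    z = math.log(hr) / se
    # Two-sided normal tail: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


f1 = f1_score(0.875, 1.000)
p_celecoxib = wald_p_from_ci(0.51, 0.24, 1.07)
p_capecitabine = wald_p_from_ci(1.93, 1.48, 2.52)

print(f"F1 = {f1:.3f}")
print(f"celecoxib: approximate P = {p_celecoxib:.3f}")
print(f"capecitabine: P < .001 is {p_capecitabine < 0.001}")
```

The recomputed F1 rounds to the reported 0.933. The back-calculated celecoxib P value falls within about 0.01 of the reported P = .073; a small gap is expected because the published HR and confidence bounds are rounded to two decimals.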
Conclusion: These findings demonstrate the effectiveness of NLP in detecting HFS from real-world clinical records. Its application to the association between celecoxib and HFS illustrates the potential utility of this approach for retrospective safety analysis. Further work is needed to evaluate generalizability across diverse clinical settings.