Prognosis of p16 and Human Papillomavirus Discordant Oropharyngeal Cancers and the Exploration of Using Natural Language Processing to Analyze Free-Text Pathology Reports.
Ethan Shin, Justin Choi, Tony K W Hung, Chester Poon, Nadeem Riaz, Yao Yu, Jung Julie Kang
JCO Clinical Cancer Informatics, volume 9, e2400177. Published 2025-02-01 (Epub 2025-02-18). DOI: 10.1200/CCI-24-00177 (https://doi.org/10.1200/CCI-24-00177)
Abstract
Purpose: Treatment deintensification for human papillomavirus-positive (HPV+) oropharyngeal cancer (OPC) has been a major focus of experts worldwide. In situ hybridization is optimal for identifying HPV+ OPC, but immunohistochemistry for its surrogate p16INK4a (p16) is standard of care given its availability and sensitivity. HPV testing is not required for clinical management, so treatments are often administered on the basis of p16 status alone. However, the prognosis of p16/HPV discordant tumors is uncertain.
Materials and methods: This cohort study included 727 consecutive patients with OPC who received curative radiation therapy at an academic cancer center and had digitized, unstructured pathology reports. Natural language processing (NLP) methods were used to classify biomarker status, and the results were compared against manually derived classification. Patients were excluded if either p16 or HPV testing was not performed or was equivocal. Primary end points were progression-free survival (PFS), cancer-specific survival (CSS), and overall survival.
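The abstract does not say which NLP methods were used. As a purely illustrative sketch, and not the authors' pipeline, a rule-based classifier could pull p16 and HPV status out of free text and apply the exclusion rule described above. The patterns, the function names `classify` and `assign_group`, and the sample report text are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical patterns: a marker name followed, within the same sentence,
# by a status word. Real pathology reports vary far more than this.
_STATUS = r"(positive|negative|equivocal)"
_PATTERNS = {
    "p16": re.compile(rf"\bp16(?:INK4a)?\b[^.\n]{{0,60}}?\b{_STATUS}\b", re.IGNORECASE),
    "HPV": re.compile(rf"\bHPV\b[^.\n]{{0,60}}?\b{_STATUS}\b", re.IGNORECASE),
}


def classify(report: str, marker: str) -> Optional[str]:
    """Return 'positive', 'negative', or 'equivocal', or None if no mention is found."""
    match = _PATTERNS[marker].search(report)
    return match.group(1).lower() if match else None


def assign_group(report: str) -> Optional[str]:
    """Map a report to one of the four p16/HPV groups.

    Returns None when either test is missing or equivocal, mirroring the
    exclusion rule stated in the abstract.
    """
    p16 = classify(report, "p16")
    hpv = classify(report, "HPV")
    sign = {"positive": "+", "negative": "-"}
    if p16 not in sign or hpv not in sign:
        return None
    return f"p16{sign[p16]}/HPV{sign[hpv]}"


# Made-up report text, for illustration only.
example = (
    "Immunohistochemistry for p16 is diffusely positive. "
    "HPV in situ hybridization: NEGATIVE."
)
print(assign_group(example))
```

A sketch this simple would miss negations, abbreviations, and results spread across sentences, which is one plausible reason a share of real reports cannot be interpreted automatically.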
Results: NLP classified p16 and HPV status from a majority (91%) of reports. Accuracy, positive predictive value, sensitivity, and F-score for NLP-derived p16/HPV were 84%/82%, 91%/87%, 90%/89%, and 90%/88%, respectively. Four groups were identified: p16-positive (p16+)/HPV+ (75%), p16+/HPV-negative (HPV-; 13%), p16-negative (p16-)/HPV- (10%), and p16-/HPV+ (2%). There was no statistically significant difference in outcomes between p16+/HPV- and p16-/HPV- patients (5-year PFS 76.1% v 68.9%; P = .12; 5-year CSS 81.5% v 84.9%; P = .22). Number-needed-to-harm calculations estimated one excess cancer-related death for every 10 p16+/HPV- patients, relative to what would be expected among p16+/HPV+ patients.
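The reported metrics can be checked for internal consistency with the standard definitions. The sketch below assumes the F-score is the usual F1 (harmonic mean of positive predictive value and sensitivity) and that number needed to harm is the reciprocal of the absolute risk increase. The abstract does not report the p16+/HPV+ cancer-specific death rate, so the 0.10 risk increase used here is only the value implied by "one excess death for every 10 patients", not a figure taken from the paper.

```python
def f_score(ppv: float, sensitivity: float) -> float:
    """F1: harmonic mean of positive predictive value and sensitivity."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


def number_needed_to_harm(absolute_risk_increase: float) -> float:
    """NNH = 1 / (risk in the group of interest - risk in the reference group)."""
    if absolute_risk_increase <= 0:
        raise ValueError("NNH is undefined without an excess risk")
    return 1 / absolute_risk_increase


# PPV and sensitivity as reported in the abstract (p16, then HPV).
p16_f = f_score(0.91, 0.90)
hpv_f = f_score(0.87, 0.89)
print(f"p16 F-score: {p16_f:.0%}")
print(f"HPV F-score: {hpv_f:.0%}")

# Assumed input: a 10-percentage-point excess in cancer-related deaths,
# implied by the stated NNH of 10 rather than reported directly.
print(f"NNH: {number_needed_to_harm(0.10):.0f}")
```

Under these assumptions, the recomputed F-scores round to the reported 90% and 88%.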
Conclusion: NLP classification of head and neck cancer pathology reports showed high concordance with gold-standard categorization, but a notable portion of reports could not be interpreted. p16/HPV discordant OPC constitutes a noteworthy minority of patients. The inferior prognosis of p16+/HPV- tumors suggests that p16 alone is insufficient for prognostication, especially when considering treatment de-escalation.