JCO Clinical Cancer Informatics: Latest Articles (Page 6)

Enhancing Patient-Trial Matching With Large Language Models: A Scoping Review of Emerging Applications and Approaches.

IF 3.3

JCO Clinical Cancer Informatics Pub Date : 2025-06-01 Epub Date: 2025-06-09 DOI: 10.1200/CCI-25-00071

Hongyu Chen, Xiaohan Li, Xing He, Aokun Chen, James McGill, Emily C Webber, Hua Xu, Mei Liu, Jiang Bian

{"title":"Enhancing Patient-Trial Matching With Large Language Models: A Scoping Review of Emerging Applications and Approaches.","authors":"Hongyu Chen, Xiaohan Li, Xing He, Aokun Chen, James McGill, Emily C Webber, Hua Xu, Mei Liu, Jiang Bian","doi":"10.1200/CCI-25-00071","DOIUrl":"10.1200/CCI-25-00071","url":null,"abstract":"Purpose: Patient recruitment remains a major bottleneck in clinical trial execution, with inefficient patient-trial matching often causing delays and failures. Recent advancements in large language models (LLMs) offer a promising avenue for automating and improving this process. This scoping review aims to provide a comprehensive synthesis of the emerging applications of LLMs in patient-trial matching.Methods: A comprehensive search was conducted in PubMed, Web of Science, and OpenAlex for literature published between December 1, 2022, and December 31, 2024. Studies were included if they explicitly integrated LLMs into patient-trial matching systems. Data extraction focused on system architectures, patient data processing, eligibility criteria processing, matching techniques, evaluation metrics, and performance.Results: Of the 2,357 studies initially identified, 24 met the inclusion criteria. The majority (21/24) were published in 2024, highlighting the rapid adoption of LLMs in this domain. Most systems used patient-centric matching (17/24), with OpenAI's generative pretrained transformer models being the most commonly used LLM. Core components of these systems included eligibility criteria processing, patient data processing, and matching, with some incorporating retrieval algorithms to enhance computational efficiency. 
LLM-integrated approaches demonstrated improved accuracy and scalability in patient-trial matching, although challenges such as performance variability, interpretability, and reliance on synthetic data sets remain significant.Conclusion: LLM-based patient-trial matching systems present a transformative opportunity to enhance the efficiency and accuracy of clinical trial recruitment. Despite current limitations related to model generalizability, explainability, and data constraints, future advancements in hybrid modeling strategies, domain-specific fine-tuning, and real-world data set integration could further optimize LLM-based trial matching. Addressing these challenges will be crucial to realizing the full potential of LLMs in streamlining patient recruitment and accelerating clinical trial execution.","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2500071"},"PeriodicalIF":3.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12169815/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144259398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Accuracy and Reproducibility of ChatGPT Responses to Breast Cancer Tumor Board Patients.

IF 3.3

JCO Clinical Cancer Informatics Pub Date : 2025-06-01 Epub Date: 2025-06-04 DOI: 10.1200/CCI-25-00001

Ning Liao, Cheukfai Li, William J Gradishar, V Suzanne Klimberg, Joshua A Roshal, Taize Yuan, Sanjiv S Agarwala, Vincente K Valero, Sandra M Swain, Julie A Margenthaler, Isabel T Rubio, Sara A Hurvitz, Charles E Geyer, Nancy U Lin, Hope S Rugo, Guochun Zhang, Nanqiu Liu, Charles M Balch

{"title":"Accuracy and Reproducibility of ChatGPT Responses to Breast Cancer Tumor Board Patients.","authors":"Ning Liao, Cheukfai Li, William J Gradishar, V Suzanne Klimberg, Joshua A Roshal, Taize Yuan, Sanjiv S Agarwala, Vincente K Valero, Sandra M Swain, Julie A Margenthaler, Isabel T Rubio, Sara A Hurvitz, Charles E Geyer, Nancy U Lin, Hope S Rugo, Guochun Zhang, Nanqiu Liu, Charles M Balch","doi":"10.1200/CCI-25-00001","DOIUrl":"https://doi.org/10.1200/CCI-25-00001","url":null,"abstract":"Purpose: We assessed the accuracy and reproducibility of Chat Generative Pre-Trained Transformer's (ChatGPT) recommendations in response to breast cancer patients by comparing generated outputs with consensus expert opinions.Methods: 362 consecutive breast cancer patients sourced from a weekly international breast cancer webinar series were submitted to a tumor board of renowned experts. The same 362 clinical patients were also prompted to ChatGPT-4.0 three separate times to examine reproducibility.Results: Only 46% of ChatGPT-generated content was entirely concordant with the recommendations of breast cancer experts, and only 39% of ChatGPT's responses demonstrated inter-response similarity. ChatGPT's responses demonstrated higher concordance with CEN experts in earlier stages of breast cancer (0, I, II, III) compared to advanced (IV) patients (P = .019). There were less accurate responses from ChatGPT when responding to patients involving molecular markers and genetic testing (P = .025), and in patients involving antibody drug conjugates (P = .006). ChatGPT's responses were not necessarily incorrect but often omitted specific details about clinical management. 
When the same prompt was independently sent to CEN into the model on three occasions, each time by different users, ChatGPT's responses exhibited variable content and formatting in 68% (246 out of 362) of patients and were entirely consistent with one another in only 32% of responses.Conclusion: Since this promising clinical decision-making support tool is widely used currently by physicians worldwide, it is important for the user to understand its limitations as currently constructed when responding to multidisciplinary breast cancer patients, and for researchers in the field to continue improving its ability with contemporary, accurate and complete breast cancer information. As currently constructed, ChatGPT is not engineered to generate identical outputs to the same input and was less likely to correctly interpret and recommend treatments for complex breast cancer patients.","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2500001"},"PeriodicalIF":3.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144227518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Deep Learning Model for Natural Language to Assess Effectiveness of Patients With Non-Muscle Invasive Bladder Cancer Receiving Intravesical Bacillus Calmette-Guérin Therapy.

IF 3.3

JCO Clinical Cancer Informatics Pub Date : 2025-06-01 Epub Date: 2025-06-27 DOI: 10.1200/CCI-24-00249

Makito Miyake, Naohiro Yonemoto, Kanae Togo, Linghua Xu, Tomoyo Oguri, Masayuki Tanaka, Yoshiyuki Hasegawa, Yoshinobu Izawa, Kenji Araki

{"title":"Deep Learning Model for Natural Language to Assess Effectiveness of Patients With Non-Muscle Invasive Bladder Cancer Receiving Intravesical Bacillus Calmette-Guérin Therapy.","authors":"Makito Miyake, Naohiro Yonemoto, Kanae Togo, Linghua Xu, Tomoyo Oguri, Masayuki Tanaka, Yoshiyuki Hasegawa, Yoshinobu Izawa, Kenji Araki","doi":"10.1200/CCI-24-00249","DOIUrl":"10.1200/CCI-24-00249","url":null,"abstract":"Purpose: Collecting information on clinical outcomes (recurrence/progression) from complex treatment courses in non-muscle invasive bladder cancer (NMIBC) is challenging and time-consuming. We developed a deep learning natural language processing model to assess outcomes in patients with NMIBC using vast data from electronic health records (EHRs).Methods: This retrospective study analyzed data from Japanese adults with NMIBC who started Bacillus Calmette-Guérin (BCG) induction therapy between April 2016 and June 2022. A Bidirectional Encoder Representations from Transformers (BERT) model was trained to classify outcomes, supported by human review for past history records. The model's performance was assessed by precision, recall, and F1 scores. We compared the effectiveness of BCG therapy between completion (patients who completed therapy) and non-completion groups.Results: Of 372 patients studied, 79.3% and 20.7% were in the completion group and the non-completion group, respectively. The final BERT model achieved average F1 scores of 0.91 and 0.98 for time to recurrence (TTR), and 0.74 and 0.94 for time to progression (TTP) before and after human support, respectively. The hazard ratio for TTR in BCG completion versus non-completion groups was 0.40 (95% CI, 0.26 to 0.62) by a multivariate Cox proportional hazard model and 0.41 (95% CI, 0.26 to 0.63) by inverse probability of treatment weighting.Conclusion: The developed model could compare the clinical outcomes between treatments in patients with NMIBC using EHRs. 
Human support, although required, was needed in only 10% of documents and was deemed feasible. The model was able to demonstrate the difference in TTR and TTP between BCG completion and non-completion groups.","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2400249"},"PeriodicalIF":3.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12233173/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144512787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Open-Source Hybrid Large Language Model Integrated System for Extraction of Breast Cancer Treatment Pathway From Free-Text Clinical Notes.

IF 3.3

JCO Clinical Cancer Informatics Pub Date : 2025-06-01 Epub Date: 2025-06-27 DOI: 10.1200/CCI-25-00002

Amara Tariq, Madhu Sikha, Allison W Kurian, Kevin Ward, Theresa H M Keegan, Daniel L Rubin, Imon Banerjee

{"title":"Open-Source Hybrid Large Language Model Integrated System for Extraction of Breast Cancer Treatment Pathway From Free-Text Clinical Notes.","authors":"Amara Tariq, Madhu Sikha, Allison W Kurian, Kevin Ward, Theresa H M Keegan, Daniel L Rubin, Imon Banerjee","doi":"10.1200/CCI-25-00002","DOIUrl":"10.1200/CCI-25-00002","url":null,"abstract":"Purpose: Automated curation of breast cancer treatment data with minimal human involvement could accelerate the collection of statewide and nationwide evidence for patient management and assessing the effectiveness of treatment pathways. The primary challenges are the complexity and inconsistency of structured clinical data streams and accurate extraction of this information from free-text clinical narratives.Materials and methods: We proposed a hybrid two-phase information extraction framework that combined a Unified Medical Language System parser (phase-1) with a fine-tuned large language model (LLM; phase-2) to extract longitudinal treatment timelines from time-stamped clinical notes. Our framework was developed through end-to-end joint learning as a question-answering model, where the model was trained to simultaneously answer five questions, each corresponding to a specific treatment.Results: We fine-tuned and internally validated the model on 26,692 patients with breast cancer (diagnosed between 2013 and 2020) receiving treatment at Mayo Clinic and externally validated the model on 162 randomly selected patients from Stanford Healthcare. Zero-shot LLM (out-of-the-box) had high specificity but low sensitivity, indicating that although these frameworks are useful for generic language understanding, they are lacking in terms of targeted clinical tasks. The proposed model achieved 0.942 average AUROC on the internal and 0.924 on the external data, demonstrating only marginal drop in performance when evaluated on external. 
The proposed model also achieved better trade-off between sensitivity (average: 79.2%) and specificity (average: 76.2%) compared with rule-based (average sensitivity: 70.5%, average specificity: 68.1%) and structured codes (average sensitivity: 64.1%, average specificity: 83.5%).Conclusion: The proposed framework can extract temporal information about cancer treatments from various time-stamped clinic notes, regardless of the setting of treatment administration (inpatient or outpatient) or time frame. To support the cancer research community for such data curation and longitudinal analysis, we have packaged the code as a Docker image, which needs minimal system reconfiguration, and shared it under an open-source academic license.","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2500002"},"PeriodicalIF":3.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12208650/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144512802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Clinical Trial Design Approach to Auditing Language Models in Health Care Setting.

IF 3.3

JCO Clinical Cancer Informatics Pub Date : 2025-06-01 Epub Date: 2025-06-03 DOI: 10.1200/CCI-24-00331

Lovedeep Gondara, Jonathan Simkin, Shebnum Devji

{"title":"Clinical Trial Design Approach to Auditing Language Models in Health Care Setting.","authors":"Lovedeep Gondara, Jonathan Simkin, Shebnum Devji","doi":"10.1200/CCI-24-00331","DOIUrl":"https://doi.org/10.1200/CCI-24-00331","url":null,"abstract":"Purpose: Rapid advancements in natural language processing have led to the development of sophisticated language models. Inspired by their success, these models are now used in health care for tasks such as clinical documentation and medical record classification. However, language models are prone to errors, which can have serious consequences in critical domains such as health care; ensuring their reliability is essential to maintaining patient safety and data integrity.Methods: To address this, we propose an innovative auditing process based on principles from clinical trial design. Our approach involves subject matter experts (SMEs) manually reviewing pathology reports without previous knowledge of the model's classification. This single-blind setup minimizes bias and allows us to apply statistical rigor to assess model performance.Results: Deployed at the British Columbia Cancer Registry, our audit process effectively identified the core issues in the operational models. Early interventions addressed these issues, maintaining data integrity and patient care standards.Conclusion: The audit provides real-world performance metrics and underscores the importance of human-in-the-loop machine learning. Even advanced models require SME oversight to ensure accuracy and reliability. To our knowledge, we have developed the first continuous audit process for language models in health care, modeled after clinical trial principles. 
This methodology ensures that audits are statistically sound and operationally feasible, setting a new standard for evaluating language models in critical applications.","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2400331"},"PeriodicalIF":3.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144217490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Extremity Soft Tissue Sarcoma Reconstruction Nomograms: A Clinicoradiomic, Machine Learning-Powered Predictor of Postoperative Outcomes.

IF 3.3

JCO Clinical Cancer Informatics Pub Date : 2025-06-01 Epub Date: 2025-06-11 DOI: 10.1200/CCI-25-00007

Rami Elmorsi, Luis D Camacho, David D Krijgh, Heather Lyu, Margaret S Roubaud, Keila Torres, Valerae Lewis, Christina L Roland, Alexander F Mericli

{"title":"Extremity Soft Tissue Sarcoma Reconstruction Nomograms: A Clinicoradiomic, Machine Learning-Powered Predictor of Postoperative Outcomes.","authors":"Rami Elmorsi, Luis D Camacho, David D Krijgh, Heather Lyu, Margaret S Roubaud, Keila Torres, Valerae Lewis, Christina L Roland, Alexander F Mericli","doi":"10.1200/CCI-25-00007","DOIUrl":"https://doi.org/10.1200/CCI-25-00007","url":null,"abstract":"Purpose: The choice of wound closure modality after limb-sparing extremity soft-tissue sarcoma (eSTS) resection is fraught with uncertainty. Leveraging machine learning and clinicoradiomic data, we developed Sarcoma Reconstruction Nomograms (SARCON), a tool that provides probabilistic estimates of five adverse outcomes on the basis of the selected reconstructive modality.Methods: This retrospective cohort study of limb-sparing eSTS resections integrated clinical variables and radiomic features, including eSTS and limb dimensions. Target outcomes included surgical site infections (SSI), wound dehiscence (WD), seroma formation, and minor and major complications. For each outcome, three machine learning classifiers-Logistic Regression with Lasso regularization, Naïve Bayes, and FasterRisk-were developed and evaluated using 10-fold cross-validation (CV), 50 random 80%-20% splits, leave-one-out CV, and a test data set. The best-performing model for each outcome was used to construct a respective nomogram.Results: A total of 316 limb-sparing eSTS resections were analyzed, predominantly located in the thigh (54%), lower leg (17%), and upper arm (11%). Postoperative outcomes included SSI (12%), WD (16%), seroma formation (8.5%), minor complications (34%), and major complications (25%). 
Logistic Regression with Lasso regularization consistently outperformed the other models across all outcomes, achieving areas under the receiver operating characteristic curve ranging from 0.83 to 0.93 in all tests.Conclusion: By providing probabilistic estimates of adverse outcomes on the basis of reconstructive modality, SARCON empowers surgeons to anticipate complications and optimize reconstructive strategies.","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2500007"},"PeriodicalIF":3.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Reply to: Machine-Learning Algorithms and Treatment Response in Advanced Melanoma.

IF 3.3

JCO Clinical Cancer Informatics Pub Date : 2025-06-01 Epub Date: 2025-06-26 DOI: 10.1200/CCI-25-00128

Richard Brohet, Jan Willem de Groot

Citations: 0

Reliability of Large Language Model Knowledge Across Brand and Generic Cancer Drug Names.

IF 3.3

JCO Clinical Cancer Informatics Pub Date : 2025-06-01 Epub Date: 2025-06-16 DOI: 10.1200/CCI-24-00257

Jack Gallifant, Shan Chen, Sandeep K Jain, Pedro Moreira, Umit Topaloglu, Hugo J W L Aerts, Jeremy L Warner, William G La Cava, Danielle S Bitterman

{"title":"Reliability of Large Language Model Knowledge Across Brand and Generic Cancer Drug Names.","authors":"Jack Gallifant, Shan Chen, Sandeep K Jain, Pedro Moreira, Umit Topaloglu, Hugo J W L Aerts, Jeremy L Warner, William G La Cava, Danielle S Bitterman","doi":"10.1200/CCI-24-00257","DOIUrl":"10.1200/CCI-24-00257","url":null,"abstract":"Purpose: To evaluate the performance and consistency of large language models (LLMs) across brand and generic oncology drug names in various clinical tasks, addressing concerns about potential fluctuations in LLM performance because of subtle phrasing differences that could affect patient care.Methods: This study evaluated three LLMs (GPT-3.5-turbo-0125, GPT-4-turbo, and GPT-4o) using drug names from the HemOnc ontology. The assessment included 367 generic-to-brand and 2,516 brand-to-generic pairs, 1,000 drug-drug interaction (DDI) synthetic patient cases, and 2,438 immune-related adverse event (irAE) cases. LLMs were tested on drug name recognition, word association, DDI detection, and irAE diagnosis using both brand and generic drug names.Results: LLMs demonstrated high accuracy in matching brand and generic names (GPT-4o: 97.38% for brand, 94.71% for generic, P < .01). However, they showed significant inconsistencies in word association tasks. GPT-3.5-turbo-0125 exhibited biases favoring brand names for effectiveness (odds ratio [OR], 1.43, P < .05) and being side-effect-free (OR, 1.76, P < .05). DDI detection accuracy was poor across all models (<26%), with no significant differences between brand and generic names. Sentiment analysis revealed significant differences, particularly in GPT-3.5-turbo-0125 (brand mean 0.67, generic mean 0.95, P < .01). Consistency in irAE diagnosis varied across models.Conclusion: Despite high proficiency in name-matching, LLMs exhibit inconsistencies when processing brand versus generic drug names in more complex tasks. 
These findings highlight the need for increased awareness, improved robustness assessment methods, and the development of more consistent systems for handling nomenclature variations in clinical applications of LLMs.","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2400257"},"PeriodicalIF":3.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144310800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Prediction of Lymph Node Metastasis in Non-Small Cell Lung Carcinoma Using Primary Tumor Somatic Mutation Data.

IF 3.3

JCO Clinical Cancer Informatics Pub Date : 2025-06-01 Epub Date: 2025-05-30 DOI: 10.1200/CCI-24-00303

Victor Lee, Nicholas S Moore, Joshua Doyle, Daniel Hicks, Patrick Oh, Shari Bodofsky, Sajid Hossain, Abhijit A Patel, Sanjay Aneja, Robert Homer, Henry S Park

{"title":"Prediction of Lymph Node Metastasis in Non-Small Cell Lung Carcinoma Using Primary Tumor Somatic Mutation Data.","authors":"Victor Lee, Nicholas S Moore, Joshua Doyle, Daniel Hicks, Patrick Oh, Shari Bodofsky, Sajid Hossain, Abhijit A Patel, Sanjay Aneja, Robert Homer, Henry S Park","doi":"10.1200/CCI-24-00303","DOIUrl":"https://doi.org/10.1200/CCI-24-00303","url":null,"abstract":"Purpose: Lymph node metastasis (LNM) significantly affects prognosis and treatment strategies in non-small cell lung cancer (NSCLC). Current diagnostic methods, including imaging and histopathology, have limited sensitivity and specificity. This study aims to develop and evaluate machine learning (ML) models that predict LNM in NSCLC using single-nucleotide polymorphism (SNP) data from The Cancer Genome Atlas.Methods: A cohort of 542 patients with NSCLC with comprehensive SNP data were analyzed. After preprocessing, feature selection was performed using chi-square tests to identify SNPs significantly associated with LNM. Twelve ML models, including Logistic Regression, Naive Bayes, and Support Vector Machines, were trained and evaluated using bootstrapped data sets. Model performance was assessed using metrics such as accuracy, area under the receiver operating characteristic curve (AUC), and F1 score. Shapley additive explanations values were used for feature interpretability, and survival analysis was conducted to assess clinical outcomes.Results: Naive Bayes and Logistic Regression models achieved the highest predictive performance, with median AUCs of 0.93 and 0.91, respectively. Key SNPs, including mutations in TANC2, KCNT2, and CENPF, were consistently identified as predictive features. Survival analysis demonstrated significant differences in outcomes on the basis of model-predicted LNM status (log-rank P = .0268). 
Feature selection improved model accuracy and robustness, highlighting the biological relevance of selected SNPs.Conclusion: ML models leveraging primary tumor SNP data can enhance LNM prediction in NSCLC, outperforming traditional diagnostic methods. These findings underscore the potential of integrating genomics and ML to develop noninvasive biomarkers, enabling precise risk stratification and personalized treatment strategies in oncology.","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2400303"},"PeriodicalIF":3.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144188467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0

Using an Integrated, Digital Framework to Standardize and Expand a Multisite Lung Cancer Screening Program.

IF 3.3

JCO Clinical Cancer Informatics Pub Date : 2025-06-01 Epub Date: 2025-06-25 DOI: 10.1200/CCI-24-00322

Maria Katerina C Alfaro, Christine S Shusted, Teresa Giamboy, Gregory C Kane, Nathaniel R Evans, Brooke M Ruane, Eboni Gatson-Anderson, Emily Muse, Mary McMullen, Anne Marie Kinsey, Sandra Murray, Christopher McNair, Julie A Barta

{"title":"Using an Integrated, Digital Framework to Standardize and Expand a Multisite Lung Cancer Screening Program.","authors":"Maria Katerina C Alfaro, Christine S Shusted, Teresa Giamboy, Gregory C Kane, Nathaniel R Evans, Brooke M Ruane, Eboni Gatson-Anderson, Emily Muse, Mary McMullen, Anne Marie Kinsey, Sandra Murray, Christopher McNair, Julie A Barta","doi":"10.1200/CCI-24-00322","DOIUrl":"10.1200/CCI-24-00322","url":null,"abstract":"Purpose: Lung cancer screening (LCS) is one of the most potentially impactful interventions of the past two decades for reducing lung cancer mortality. However, no current standard exists in the field for comprehensive data collection and tracking of LCS, despite availability of electronic health records (EHRs) and LCS management tools. In a widely expanding LCS program, harmonization of data becomes critical for decisions surrounding clinical care coordination and operational management.Methods: This article summarizes the implementation of an integrated, digital framework within the Jefferson Health System using the Epic EHR and its customized SmartForms as well as Research Electronic Data Capture application. Leveraging these tools has allowed for standardized documentation across the LCS process continuum for each patient: LCS eligibility, shared decision making, low-dose computed tomography, and follow-up.Results: Since the initial rollout in October 2022, 11 program sites across four regional hubs have adopted this framework. A standardized process paired with interoperability between systems has resulted in a centralized data repository, increased communication and transparency within and between program sites, and decreased duplicative or manual processes across the entire LCS program.Conclusion: The resultant digital framework is poised for scale-up and sustainment across the Jefferson Health System, and it can also be replicated across other LCS programs. 
Future iterations of the current work or adoption by other programs should take into account the complexities of the EHR itself and data provenance to ensure success. Active participation among stakeholders for synchronous coordination of building, implementing, and troubleshooting a comprehensive repository for LCS data can ultimately facilitate measurement of quality metrics and develop future research in early detection of lung cancer.","PeriodicalId":51626,"journal":{"name":"JCO Clinical Cancer Informatics","volume":"9 ","pages":"e2400322"},"PeriodicalIF":3.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12208652/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144499102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Citations: 0