H. Ikushima, K. Watanabe, A. Shinozaki-Ushiku, K. Oda, H. Kage
ESMO Open, volume 9, issue 12, Article 103998. Published 2024-11-25. DOI: 10.1016/j.esmoop.2024.103998. Full text: https://www.sciencedirect.com/science/article/pii/S205970292401768X
A machine learning-based analysis of nationwide cancer comprehensive genomic profiling data across cancer types to identify features associated with recommendation of genome-matched therapy
Background
The low probability of identifying druggable mutations through comprehensive genomic profiling (CGP), together with its financial and time costs, hinders its widespread adoption. To enhance the effectiveness and efficiency of cancer precision medicine, it is critical to identify the characteristics of patients who are most likely to benefit from CGP.
Patients and methods
This nationwide retrospective study employed machine learning models to predict whether CGP would identify a genome-matched therapy, using a national database covering 99.7% of the patients who underwent CGP in Japan from June 2019 to November 2023. Prediction models were constructed for the overall cancer population, for specific cancer types, and for the adolescent and young adult (AYA) group. The SHapley Additive exPlanations (SHAP) algorithm was applied to elucidate the clinical features contributing to model predictions.
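To make the feature-attribution step more concrete, the following is a minimal sketch of the Shapley-value idea that SHAP builds on. It is not the authors' pipeline: the scoring function `toy_model`, its coefficients, and the feature names are hypothetical and chosen only for illustration, and the attribution is computed by brute-force enumeration against a single all-zero reference input rather than with the SHAP library.

```python
from itertools import combinations
from math import factorial

def toy_model(x):
    """Hypothetical risk score; coefficients are arbitrary, not from the study."""
    score = 0.10
    score += 0.20 * x["breast_or_lung"]
    score -= 0.05 * x["liver_metastasis"]
    score += 0.02 * x["n_metastatic_sites"]
    # An interaction term, so that credit has to be shared between two features.
    score += 0.03 * x["breast_or_lung"] * x["liver_metastasis"]
    return score

def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature."""
    names = list(x)
    n = len(names)

    def value(coalition):
        # Features in the coalition take the patient's value, the rest the baseline.
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi

# Hypothetical patient and reference input.
patient = {"breast_or_lung": 1, "liver_metastasis": 1, "n_metastatic_sites": 2}
baseline = {"breast_or_lung": 0, "liver_metastasis": 0, "n_metastatic_sites": 0}

phi = shapley_values(toy_model, patient, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the reference score, which is the property that lets per-feature attributions be read as a decomposition of a single prediction. Exhaustive enumeration is only feasible for a handful of features; it is used here purely to show the definition.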
Results
This study included 60 655 patients [mean age (standard deviation), 60.8 years (14.5 years); 50.1% males]. CGP identified at least one genome-matched therapy in 11 227 cases (18.5%). The best prediction model was eXtreme Gradient Boosting (XGBoost), with an area under the receiver operating characteristic curve of 0.819. Cancer type was the most important predictor (negative for pancreas and positive for breast and lung), followed by age, presence of liver metastasis, and number of metastatic sites. Analysis of cancer type-specific models identified several organ-specific features, including sex, interval between cancer diagnosis and CGP, sampling site, and CGP panel. Among 3455 AYA patients, genome-matched therapies were identified in 459 patients (13.3%). The AYA-specific model achieved an area under the receiver operating characteristic curve of 0.768, with bone tumor identified as a negative predictor in addition to those identified in the overall cancer population model.
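The results rest on two kinds of figures: simple proportions and the area under the receiver operating characteristic curve (AUROC). The sketch below recomputes the two reported proportions from the counts given above and shows one standard way to compute an AUROC, as the fraction of positive/negative pairs that the model ranks correctly (ties count as half). The helper names `percent` and `auroc` and the four-case example are illustrative assumptions; the study's own evaluation code is not described in the abstract.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)

def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))

# Counts taken from the abstract.
print(percent(11227, 60655))  # 18.5 (overall cohort)
print(percent(459, 3455))     # 13.3 (AYA group)

# Made-up labels and scores, only to exercise the function.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
print(auroc(example_labels, example_scores))  # 0.75
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported values of 0.819 and 0.768 mean that a randomly chosen patient with a genome-matched therapy received a higher predicted score than a randomly chosen patient without one in roughly 82% and 77% of pairs, respectively. The pairwise loop is quadratic in cohort size and is meant for illustration, not for a cohort of this scale.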
Conclusion
Several factors predicting the identification of genome-matched therapies through CGP were identified for the overall cancer population and cancer type-specific subpopulations. Expedited CGP is recommended for patients who match the identified profile to facilitate early targeted therapy.
About the journal:
ESMO Open is the online-only, open access journal of the European Society for Medical Oncology (ESMO). It is a peer-reviewed publication dedicated to sharing high-quality medical research and educational materials from various fields of oncology. The journal specifically focuses on showcasing innovative clinical and translational cancer research.
ESMO Open aims to publish a wide range of research articles covering all aspects of oncology, including experimental studies, translational research, diagnostic advancements, and therapeutic approaches. The content of the journal includes original research articles, reviews, editorials, and correspondence. The journal also welcomes the submission of phase I trials and meta-analyses. It features reviews from major ESMO conferences and meetings and publishes important position statements on behalf of ESMO.
Overall, ESMO Open offers a platform for scientists, clinicians, and researchers in the field of oncology to share their valuable insights and contribute to advancing the understanding and treatment of cancer. The journal serves as a source of up-to-date information and fosters collaboration within the oncology community.