Application of machine learning to classify cancers of unknown primary
Shuvam Sarkar, Daniel T. Baptista-Hon
MedComm – Oncology, 2(4). Published 2023-12-19. DOI: 10.1002/mog2.63
https://onlinelibrary.wiley.com/doi/10.1002/mog2.63
Abstract
A recent study by Moon et al. [1], published in Nature Medicine, highlights the role of OncoNPC, a machine learning tool, in diagnosing cancers of unknown primary (CUP). The study offers insight into the efficacy and accessibility of OncoNPC relative to traditional diagnostic tools and highlights the wider implications of machine learning technologies for delivering precision medicine.
The emergence of targeted immunotherapy over the past decade has led to a paradigm shift in clinical oncology. The efficacy of therapeutic agents in treating cancers such as chronic myeloid leukemia and HER2-positive breast cancer, for example, is now well established [2]. However, CUPs, metastatic diseases in which the primary tumor cannot be identified, present a significant challenge in this new era of precision medicine. CUPs account for 3%–5% of cancer diagnoses, and diagnostic uncertainty and limited therapeutic targets make targeted therapy difficult to provide [3]. Indeed, the mortality rate in patients with CUPs is up to 80% at 12 months postdiagnosis [4].
Several authors have hypothesized pathological mechanisms that might underlie CUPs. Lopez-Lazaro [5] reconciled existing research on stem cells driving tumorigenesis by suggesting that CUPs may arise from stem cell migration followed by malignant transformation. This could, in theory, present as metastatic cancer in the absence of a clear primary tumor. Other studies have suggested that CUPs arise from early dissemination of a primary tumor, resulting in rapidly progressive metastatic disease [4]. This would account for the significant mortality rate associated with CUPs, as early dissemination could increase metastatic burden and limit therapeutic interventions.
Current approaches for investigating CUPs focus primarily on immunohistochemistry (IHC) techniques or molecular profiling of tumor samples. Interpretation of IHC results can be inherently subjective, and studies using IHC techniques to investigate CUPs were able to suggest a primary tumor in only 25% of patients [6]. Molecular profiling comprises several techniques, such as whole genome sequencing and gene expression analysis, that determine the primary tumor from the molecular characteristics of tumor cells. The efficacy of these methods remains unclear, however, as their implementation in clinical practice is often limited by cost.
Moon et al. used next-generation sequencing (NGS) data in this study to guide genomic profiling of CUPs [1]. NGS produces a cellular genetic profile by analyzing millions of DNA fragments simultaneously. The method is relatively cost-effective, and a significant amount of tumor NGS data already exists [7]. The study therefore used NGS data in conjunction with electronic health records to retrospectively predict a primary tumor in 971 patients with CUP. The authors developed OncoNPC, a novel machine learning tool, which was trained on NGS data from patients with known primary tumor types. OncoNPC was able to classify 22 cancer types from patients with known primary tumors with high confidence while accounting for shifts in patient demographics. Interestingly, common cancer subtypes were identified with greater accuracy than rare groups. OncoNPC was then applied to patients with CUP and predicted a primary cancer with high confidence in 41% of patients. Given the lower accuracy for rare cancer types, this suggests that a high proportion of CUPs are rare tumors. The commonest predicted primary tumors were lung, pancreatic, and bowel cancers, which is consistent with pre-existing autopsy data from CUP mortalities [8]. In contrast to existing techniques such as IHC, OncoNPC therefore offers a more objective method of analyzing CUPs: predictions of the primary subtype, and their associated confidence intervals, do not depend on user experience. Additionally, once the tool has been trained on baseline data, clinical application is not resource intensive and is therefore more accessible than IHC or molecular profiling.
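The idea of reporting a predicted primary only when the classifier is confident can be sketched as follows. This is a minimal illustration and not the OncoNPC implementation: the cutoff value, the function names, and the example probabilities are all assumptions made up for this sketch, since the commentary does not state how "high confidence" was defined.

```python
from typing import Dict, List, Tuple

# Hypothetical cutoff; the commentary does not give the threshold actually used.
HIGH_CONFIDENCE_THRESHOLD = 0.5


def call_primary(
    probabilities: Dict[str, float],
    threshold: float = HIGH_CONFIDENCE_THRESHOLD,
) -> Tuple[str, float, bool]:
    """Return (top cancer type, its probability, whether it clears the threshold)."""
    top_type = max(probabilities, key=probabilities.get)
    top_prob = probabilities[top_type]
    return top_type, top_prob, top_prob >= threshold


def high_confidence_fraction(
    cohort: List[Dict[str, float]],
    threshold: float = HIGH_CONFIDENCE_THRESHOLD,
) -> float:
    """Fraction of patients whose top prediction clears the threshold."""
    if not cohort:
        raise ValueError("cohort must contain at least one patient")
    confident = sum(1 for p in cohort if call_primary(p, threshold)[2])
    return confident / len(cohort)


# Made-up class probabilities for four illustrative patients (not study data).
cohort = [
    {"lung": 0.82, "pancreatic": 0.10, "bowel": 0.08},
    {"lung": 0.40, "pancreatic": 0.35, "bowel": 0.25},
    {"lung": 0.05, "pancreatic": 0.90, "bowel": 0.05},
    {"lung": 0.34, "pancreatic": 0.33, "bowel": 0.33},
]

for patient in cohort:
    print(call_primary(patient))
print(high_confidence_fraction(cohort))
```

In this toy cohort, two of the four patients receive a confident call; the other two would remain unclassified, which mirrors how a tool of this kind can leave a large share of a CUP cohort without a high-confidence prediction.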
The study also attempted to characterize risk in patients with CUP based on predicted cancer subtype. The authors calculated a polygenic risk score from germline variation data and found that patients with CUP had greater germline risk than patients with known primary cancers. OncoNPC was also able to stratify risk by predicted cancer subtype, with gastric and pancreatic cancers showing the worst prognosis. A retrospective analysis of 158 CUP patients treated with palliative intent found that treatment concordant with the predicted tumor subtype was associated with significantly better survival outcomes. Notably, OncoNPC identified a further 24 patients within this cohort who may have been suitable for targeted genomic therapy before palliative care.
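For readers unfamiliar with polygenic risk scores, the general form is a weighted sum of risk-allele counts across variants. The sketch below shows that general form only; the variant names, weights, and the choice to treat a missing variant as zero copies are assumptions for illustration, and the commentary does not describe how the study computed its score.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant).

    Variants absent from `dosages` are treated as 0 copies, which is an
    assumption of this sketch rather than a rule from the study.
    """
    score = 0.0
    for variant, weight in weights.items():
        dosage = dosages.get(variant, 0)
        if dosage not in (0, 1, 2):
            raise ValueError(f"invalid dosage {dosage!r} for {variant}")
        score += weight * dosage
    return score


# Hypothetical per-variant effect sizes and one patient's genotype (not study data).
weights = {"variant_a": 0.20, "variant_b": -0.10, "variant_c": 0.05}
patient = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

# 0.20 * 2 + (-0.10) * 1 + 0.05 * 0, roughly 0.30
print(polygenic_risk_score(patient, weights))
```

A higher score indicates more inherited risk alleles weighted by their estimated effect, which is the sense in which one group can be said to carry greater germline risk than another.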
This study offers insight into the role of machine-learning tools in facilitating the emergence of personalized medicine, as well as in identifying potential therapeutic targets in patients with CUP. Given the need for early diagnosis and intervention in this patient cohort, OncoNPC could form a useful adjunct in the diagnostic workup for CUPs (Figure 1). OncoNPC offers a more objective and cost-effective method for analyzing CUPs than traditional methods, with demonstrated efficacy in identifying tumor profiles. Although it made high-confidence predictions in just 41% of the patient cohort, this study could pave the way for future research in which predictive capabilities are augmented with clinical information, pathology reports, and imaging results. Interestingly, the authors demonstrated that OncoNPC was able to assess germline risk for tumors. The increased germline risk score for CUPs compared with cancers with known primaries corresponds to an increased propensity for these tumors to metastasize and present as clinically aggressive disease. This could be due to greater mutational burden within CUPs. Accurately determining the risk of tumor spread could therefore allow OncoNPC to become a powerful prognostic tool and guide clinical practice. Indeed, retrospective analysis showed that patients treated in concordance with OncoNPC results could have better survival outcomes. Additionally, prognostic information from this tool could guide appropriate transitions to palliative treatment and ultimately improve the quality of end-of-life care. Perhaps the most significant finding of this study is that OncoNPC identified 15% of patients within the palliative cohort who may have been suited to targeted genomic therapy. This shows the impact OncoNPC could have in guiding clinical decision making and management plans. Ultimately, the findings from this study offer a glimpse into machine-learning tools and highlight the role they could play in this new era of precision medicine.
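The headline proportions quoted above can be cross-checked with simple arithmetic. The short sketch below assumes that the 15% figure refers to the 24 patients identified within the 158-patient palliative cohort, and that the 41% figure applies to all 971 patients with CUP; both readings are inferred from the text rather than stated explicitly.

```python
def share_percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# 24 of 158 palliative-intent patients flagged for possible targeted therapy.
palliative_share = round(share_percent(24, 158), 1)

# Approximate head count implied by a 41% high-confidence rate in 971 patients.
approx_high_confidence = round(0.41 * 971)

print(palliative_share)        # 15.2
print(approx_high_confidence)  # 398
```

The first result agrees with the rounded 15% quoted in the text; the second is only an approximation, because 41% is itself a rounded figure.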
Shuvam Sarkar: Data curation (lead); formal analysis (lead); methodology (lead); writing—original draft (equal); writing—review and editing (equal). Daniel T. Baptista-Hon: Conceptualization (lead); supervision (lead); validation (lead); visualization (lead); writing—original draft (equal); writing—review and editing (equal). Both authors have read and approved the final manuscript.
The authors declare no conflict of interest.
This research paper did not utilize any animals or human participants and therefore did not require any ethics approval.