{"title":"Computational Methods for Structure-to-Function Analysis of Diet-Derived Catechins-Mediated Targeting of In Vitro Vasculogenic Mimicry.","authors":"Abicumaran Uthamacumaran, Narjara Gonzalez Suarez, Abdoulaye Baniré Diallo, Borhane Annabi","doi":"10.1177/11769351211009229","DOIUrl":"https://doi.org/10.1177/11769351211009229","url":null,"abstract":"<p><strong>Background: </strong>Vasculogenic mimicry (VM) is an adaptive biological phenomenon wherein cancer cells spontaneously self-organize into 3-dimensional (3D) branching network structures. This emergent behavior is considered central in promoting an invasive, metastatic, and therapy-resistant molecular signature in cancer cells. The quantitative analysis of such complex phenotypic systems could require the use of computational approaches, including machine learning algorithms originating from complexity science.</p><p><strong>Procedures: </strong><i>In vitro</i> 3D VM was performed with SKOV3 and ES2 ovarian cancer cells cultured on Matrigel. Disruption of VM by diet-derived catechins was monitored at 24 hours using pictures taken with an inverted microscope. Three computational algorithms for complex feature extraction relevant for 3D VM, including 2D wavelet analysis, fractal dimension, and percolation clustering scores, were assessed in combination with machine learning classifiers.</p><p><strong>Results: </strong>These algorithms demonstrated the structure-to-function impact of the galloyl moiety on VM for each of the gallated catechins tested, and were shown to be applicable in quantifying the drug-mediated structural changes in VM processes.</p><p><strong>Conclusions: </strong>Our study provides evidence of how appropriate 3D VM compression and feature extractors coupled with classification/regression methods could be efficient for studying <i>in vitro</i> drug-induced perturbation of complex processes. 
Such approaches could be exploited in the development and characterization of drugs targeting VM.</p>","PeriodicalId":35418,"journal":{"name":"Cancer Informatics","volume":"20 ","pages":"11769351211009229"},"PeriodicalIF":2.0,"publicationDate":"2021-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/11769351211009229","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38954027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
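Fractal dimension is one of the three feature extractors named in the abstract above. The abstract does not say which estimator the authors used; the following is a minimal sketch of the common box-counting estimate, assuming the microscope image has already been binarized into a set of foreground pixel coordinates. The function names and the pixel-set input format are illustrative assumptions, not the authors' code.

```python
import math


def box_count(pixels, size):
    """Count the boxes of side `size` that contain at least one foreground pixel."""
    return len({(x // size, y // size) for x, y in pixels})


def box_counting_dimension(pixels, sizes=(1, 2, 4, 8, 16)):
    """Estimate the fractal dimension as the least-squares slope of
    log N(s) against log(1/s), where N(s) is the occupied-box count."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Synthetic sanity checks (not data from the study):
# a filled square should be close to 2, a straight line close to 1.
filled_square = {(x, y) for x in range(64) for y in range(64)}
straight_line = {(x, 0) for x in range(64)}
print(box_counting_dimension(filled_square))
print(box_counting_dimension(straight_line))
```

A branching network such as a VM structure would be expected to fall between these two extremes, which is what makes the estimate usable as a feature for a downstream classifier.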
Cancer Informatics | Pub Date: 2021-03-19 | eCollection Date: 2021-01-01 | DOI: 10.1177/11769351211002494
Jason D Wells, Jacqueline R Griffin, Todd W Miller
{"title":"Pan-Cancer Transcriptional Models Predicting Chemosensitivity in Human Tumors.","authors":"Jason D Wells, Jacqueline R Griffin, Todd W Miller","doi":"10.1177/11769351211002494","DOIUrl":"10.1177/11769351211002494","url":null,"abstract":"<p><strong>Motivation: </strong>Despite increasing understanding of the molecular characteristics of cancer, chemotherapy success rates remain low for many cancer types. Studies have attempted to identify patient and tumor characteristics that predict sensitivity or resistance to different types of conventional chemotherapies, yet a concise model that predicts chemosensitivity based on gene expression profiles across cancer types remains to be formulated. We attempted to generate pan-cancer models predictive of chemosensitivity and chemoresistance. Such models may increase the likelihood of identifying the type of chemotherapy most likely to be effective for a given patient based on the overall gene expression of their tumor.</p><p><strong>Results: </strong>Gene expression and drug sensitivity data from solid tumor cell lines were used to build predictive models for 11 individual chemotherapy drugs. Models were validated using datasets from solid tumors from patients. For all drug models, accuracy ranged from 0.81 to 0.93 when applied to all relevant cancer types in the testing dataset. When considering how well the models predicted chemosensitivity or chemoresistance within individual cancer types in the testing dataset, accuracy was as high as 0.98. Cell line-derived pan-cancer models were able to statistically significantly predict sensitivity in human tumors in some instances; for example, a pan-cancer model predicting sensitivity in patients with bladder cancer treated with cisplatin was able to significantly segregate sensitive and resistant patients based on recurrence-free survival times (<i>P</i> = .048) and in patients with pancreatic cancer treated with gemcitabine (<i>P</i> = .038). 
These models can predict chemosensitivity and chemoresistance across cancer types with clinically useful levels of accuracy.</p>","PeriodicalId":35418,"journal":{"name":"Cancer Informatics","volume":"20 ","pages":"11769351211002494"},"PeriodicalIF":2.0,"publicationDate":"2021-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/11769351211002494","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25555274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cancer Informatics | Pub Date: 2021-01-05 | eCollection Date: 2021-01-01 | DOI: 10.1177/1176935120985132
Jimmy T Efird
{"title":"Goldilocks Rounding: Achieving Balance Between Accuracy and Parsimony in the Reporting of Relative Effect Estimates.","authors":"Jimmy T Efird","doi":"10.1177/1176935120985132","DOIUrl":"https://doi.org/10.1177/1176935120985132","url":null,"abstract":"<p><p>Researchers often report a measure to several decimal places more than what is sensible or realistic. Rounding involves replacing a number with a value of lesser accuracy while minimizing the practical loss of validity. This practice is generally acceptable to simplify data presentation and to facilitate the communication and comparison of research results. Rounding also may reduce spurious accuracy when the extraneous digits are not justified by the exactness of the recording instrument or data collection procedure. However, substituting a more explicit or simpler representation for an original measure may not be practicable or acceptable if an adequate degree of accuracy is not retained. The error introduced by rounding exact numbers may lead to misleading conclusions and misinterpretation of study findings. For example, rounding an upper confidence limit of 0.996 for a relative effect estimate to 2 decimal places may obscure the statistical significance of the result. When presenting the findings of a study, authors need to be careful that they do not report numbers that contain too few significant digits. 
Equally important, they should avoid providing more significant figures than are warranted to convey the underlying meaning of the result.</p>","PeriodicalId":35418,"journal":{"name":"Cancer Informatics","volume":"20 ","pages":"1176935120985132"},"PeriodicalIF":2.0,"publicationDate":"2021-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/1176935120985132","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38827602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
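The abstract's example of 0.996 can be checked numerically. The sketch below uses Python's `decimal` module with half-up rounding and treats 1 as the null value of a relative effect estimate; the helper names are illustrative, and only the value 0.996 comes from the abstract.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value: str, places: int) -> Decimal:
    """Round a decimal string to `places` decimal places, ties away from zero."""
    quantum = Decimal(1).scaleb(-places)  # places=2 gives a quantum of 0.01
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


def excludes_null(upper_limit: Decimal, null_value: Decimal = Decimal(1)) -> bool:
    """True if an upper confidence limit lies strictly below the null value."""
    return upper_limit < null_value


upper = "0.996"  # upper confidence limit from the abstract's example
for places in (3, 2):
    rounded = round_half_up(upper, places)
    print(places, rounded, excludes_null(rounded))
# 3 0.996 True
# 2 1.00 False
```

At 3 decimal places the interval still excludes 1; at 2 decimal places the reported limit becomes 1.00 and the result no longer reads as statistically significant.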
Cancer Informatics | Pub Date: 2020-12-10 | eCollection Date: 2017-01-01 | DOI: 10.1177/1176935116684825
Yu Jiang, Yuan Huang, Yinhao Du, Yinjun Zhao, Jie Ren, Shuangge Ma, Cen Wu
{"title":"Identification of Prognostic Genes and Pathways in Lung Adenocarcinoma Using a Bayesian Approach.","authors":"Yu Jiang, Yuan Huang, Yinhao Du, Yinjun Zhao, Jie Ren, Shuangge Ma, Cen Wu","doi":"10.1177/1176935116684825","DOIUrl":"10.1177/1176935116684825","url":null,"abstract":"<p><p>Lung cancer is the leading cause of cancer-associated mortality in the United States and the world. Adenocarcinoma, the most common subtype of lung cancer, is generally diagnosed at the late stage with poor prognosis. In the past, extensive effort has been devoted to elucidating lung cancer pathogenesis and pinpointing genes associated with survival outcomes. As the progression of lung cancer is a complex process that involves coordinated actions of functionally associated genes from cancer-related pathways, there is a growing interest in simultaneous identification of both prognostic pathways and important genes within those pathways. In this study, we analyse The Cancer Genome Atlas lung adenocarcinoma data using a Bayesian approach incorporating the pathway information as well as the interconnections among genes. The top 11 pathways have been found to play significant roles in lung adenocarcinoma prognosis, including pathways in mitogen-activated protein kinase signalling, cytokine-cytokine receptor interaction, and ubiquitin-mediated proteolysis. We have also located key gene signatures such as <i>RELB</i>, <i>MAP4K1</i>, and <i>UBE2C</i>. 
These results indicate that the Bayesian approach may facilitate discovery of important genes and pathways that are tightly associated with the survival of patients with lung adenocarcinoma.</p>","PeriodicalId":35418,"journal":{"name":"Cancer Informatics","volume":"16 ","pages":"1176935116684825"},"PeriodicalIF":2.0,"publicationDate":"2020-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/1176935116684825","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38743896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cancer Informatics | Pub Date: 2020-11-24 | eCollection Date: 2020-01-01 | DOI: 10.1177/1176935120976399
Johannes Ptok, Stephan Theiss, Heiner Schaal
{"title":"VarCon: An R Package for Retrieving Neighboring Nucleotides of an SNV.","authors":"Johannes Ptok, Stephan Theiss, Heiner Schaal","doi":"10.1177/1176935120976399","DOIUrl":"https://doi.org/10.1177/1176935120976399","url":null,"abstract":"<p><p>Reporting of a single nucleotide variant (SNV) follows the Sequence Variant Nomenclature (http://varnomen.hgvs.org/), using an unambiguous numbering scheme specific for coding and noncoding DNA. However, the corresponding sequence neighborhood of a given SNV, which is required to assess its impact on splicing regulation, is not easily accessible from this nomenclature. Providing fast and easy access to this neighborhood just from a given SNV reference, the novel tool VarCon combines information of the Ensembl human reference genome and the corresponding transcript table for accurate retrieval. VarCon also displays splice site scores (HBond and MaxEnt scores) and HEXplorer profiles of an SNV neighborhood, reflecting position-dependent splice enhancing and silencing properties.</p>","PeriodicalId":35418,"journal":{"name":"Cancer Informatics","volume":"19 ","pages":"1176935120976399"},"PeriodicalIF":2.0,"publicationDate":"2020-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/1176935120976399","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38689828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
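The core idea of retrieving the sequence neighborhood of an SNV can be sketched in a few lines. This is not VarCon's API (VarCon is an R package and works from the Ensembl reference genome and a transcript table); it is a simplified illustration that assumes the reference sequence is already available as a string and that the variant position is a 1-based coordinate on that string. The function name `snv_neighborhood` and the toy sequence are hypothetical.

```python
def snv_neighborhood(reference: str, position: int, ref: str, alt: str, flank: int = 5):
    """Return (wild_type, mutant) sequence windows around a single nucleotide variant.

    `position` is 1-based on `reference`; `flank` is the number of
    nucleotides kept on each side (fewer near the sequence ends).
    """
    index = position - 1
    if not 0 <= index < len(reference):
        raise ValueError("position lies outside the reference sequence")
    if reference[index].upper() != ref.upper():
        raise ValueError("reference allele does not match the sequence")
    start = max(0, index - flank)
    end = min(len(reference), index + flank + 1)
    upstream = reference[start:index]
    downstream = reference[index + 1:end]
    return upstream + ref + downstream, upstream + alt + downstream


# Toy sequence, not real genomic data.
wild_type, mutant = snv_neighborhood("ACGTACGTACGT", position=6, ref="C", alt="T", flank=3)
print(wild_type)  # GTACGTA
print(mutant)     # GTATGTA
```

Windows like these are the input that splice site scoring would then operate on; the scoring itself is outside the scope of this sketch.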
Cancer Informatics | Pub Date: 2020-11-11 | eCollection Date: 2020-01-01 | DOI: 10.1177/1176935120972383
Sama Rezasoltani, Mahrooyeh Hadizadeh, Mina Golmohammadi, Ehsan Nazemalhossini-Mojarad, Sina Salari, Hamid Rezvani, Hamid Asadzadeh-Aghdaei, Michael Ladomery, Chris Young, Fakhrosadat Anaraki, Sarah Almond, Maziar Ashrafian Bonab
{"title":"APC and AXIN2 Are Promising Biomarker Candidates for the Early Detection of Adenomas and Hyperplastic Polyps.","authors":"Sama Rezasoltani, Mahrooyeh Hadizadeh, Mina Golmohammadi, Ehsan Nazemalhossini-Mojarad, Sina Salari, Hamid Rezvani, Hamid Asadzadeh-Aghdaei, Michael Ladomery, Chris Young, Fakhrosadat Anaraki, Sarah Almond, Maziar Ashrafian Bonab","doi":"10.1177/1176935120972383","DOIUrl":"10.1177/1176935120972383","url":null,"abstract":"<p><p>Aberrant activation of the WNT/CTNNB1 pathway is notorious in colorectal cancer (CRC). Here, we demonstrate that the expression of specific and crucial WNT signaling pathway genes is linked to disease progression in colonic adenomatous (AP) and hyperplastic (HP) polyps in an Iranian patient population. Thus, we highlight potential gene expression profiles as candidate novel biomarkers for the early detection of CRC. From a 12-month study (2016-2017), 44 biopsy samples were collected during colonoscopy from the patients with colorectal polyps and 10 healthy subjects for normalization. Clinical and demographic data were collected in all cases, and mRNA expression of APC, CTNNB1, CDH1, AXIN1, and AXIN2 genes was investigated using real-time polymerase chain reaction (PCR). CTNNB1 and CDH1 expression levels were unaltered in AP and HP subjects, whereas mRNA expression of APC was decreased in AP contrasted with HP subjects, with a significant association between APC downregulation and polyp size. Although AXIN1 showed no changes between AP and HP groups, a significant association between AXIN1 and dysplasia grade was found. Also, significant upregulation of AXIN2 in both AP and HP subjects was detected. In summary, we have shown increased expression of AXIN2 and decreased expression of APC correlating with grade of dysplasia and polyp size. 
Hence, AXIN2 and APC should be explored as biomarker candidates for early detection of AP and HP polyps in CRC.</p>","PeriodicalId":35418,"journal":{"name":"Cancer Informatics","volume":"19 ","pages":"1176935120972383"},"PeriodicalIF":2.0,"publicationDate":"2020-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/1176935120972383","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38302848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cancer Informatics | Pub Date: 2020-11-11 | eCollection Date: 2020-01-01 | DOI: 10.1177/1176935120972377
Brian O'Sullivan, Cathal Seoighe
{"title":"vcfView: An Extensible Data Visualization and Quality Assurance Platform for Integrated Somatic Variant Analysis.","authors":"Brian O'Sullivan, Cathal Seoighe","doi":"10.1177/1176935120972377","DOIUrl":"https://doi.org/10.1177/1176935120972377","url":null,"abstract":"<p><strong>Motivation: </strong>Somatic mutations can have critical prognostic and therapeutic implications for cancer patients. Although targeted methods are often used to assay specific cancer driver mutations, high throughput sequencing is frequently applied to discover novel driver mutations and to determine the status of less-frequent driver mutations. The task of recovering somatic mutations from these data is nontrivial as somatic mutations must be distinguished from germline variants, sequencing errors, and other artefacts. Consequently, bioinformatics pipelines for recovery of somatic mutations from high throughput sequencing typically involve a large number of analytical choices in the form of quality filters.</p><p><strong>Results: </strong>We present vcfView, an interactive tool designed to support the evaluation of somatic mutation calls from cancer sequencing data. The tool takes as input a single variant call format (VCF) file and enables researchers to explore the impacts of analytical choices on the mutant allele frequency spectrum, on mutational signatures and on annotated somatic variants in genes of interest. It allows variants that have failed variant caller filters to be re-examined to improve sensitivity or guide the design of future experiments. It is extensible, allowing other algorithms to be incorporated easily.</p><p><strong>Availability: </strong>The shiny application can be downloaded from GitHub (https://github.com/BrianOSullivanGit/vcfView). All data processing is performed within <i>R</i> to ensure platform independence. The app has been tested on RStudio, version 1.1.456, with base <i>R</i> 3.6.2 and Shiny 1.4.0. 
A vignette based on a publicly available data set is also available on GitHub.</p>","PeriodicalId":35418,"journal":{"name":"Cancer Informatics","volume":"19 ","pages":"1176935120972377"},"PeriodicalIF":2.0,"publicationDate":"2020-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/1176935120972377","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38302847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
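The effect of a quality filter on the mutant allele frequency spectrum, as described in the abstract above, can be illustrated with a small parser. This is a sketch rather than vcfView's implementation (vcfView is a Shiny application written in R). It assumes a single-sample VCF whose FORMAT column carries an `AD` field with reference and alternate read depths, and it uses only the first alternate allele. The function name, the `include_failed` flag, and the two example records are invented for illustration.

```python
def variant_allele_frequencies(vcf_text, include_failed=False):
    """Return (chrom, pos, ref, alt, vaf) tuples from single-sample VCF text.

    Records whose FILTER column is neither PASS nor '.' are skipped
    unless `include_failed` is True.
    """
    results = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information, and the header
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # no sample column to read depths from
        chrom, pos, _id, ref, alt, _qual, filt, _info, fmt = fields[:9]
        if filt not in ("PASS", ".") and not include_failed:
            continue
        sample = dict(zip(fmt.split(":"), fields[9].split(":")))
        depths = sample.get("AD", "").split(",")
        if len(depths) < 2 or not all(d.isdigit() for d in depths):
            continue  # AD missing or not fully numeric
        counts = [int(d) for d in depths]
        total = sum(counts)
        if total == 0:
            continue
        results.append((chrom, int(pos), ref, alt, counts[1] / total))
    return results


# Two invented records: one passing, one that failed a caller filter.
EXAMPLE_VCF = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tTUMOR",
    "chr1\t100\t.\tA\tT\t.\tPASS\t.\tGT:AD\t0/1:30,10",
    "chr1\t200\t.\tG\tC\t.\tweak_evidence\t.\tGT:AD\t0/1:45,5",
])

print(variant_allele_frequencies(EXAMPLE_VCF))
print(variant_allele_frequencies(EXAMPLE_VCF, include_failed=True))
```

Comparing the two outputs shows how re-admitting filtered variants shifts the frequency spectrum toward low-frequency calls, which is the kind of trade-off the tool is meant to expose.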
Cancer Informatics | Pub Date: 2020-11-06 | eCollection Date: 2020-01-01 | DOI: 10.1177/1176935120969696
Madhuri Saindane, Harikrishna Reddy Rallabandi, Kyoung Sik Park, Alexander Heil, Sang Eun Nam, Young Bum Yoo, Jung-Hyun Yang, Ik Jin Yun
{"title":"Prognostic Significance of Prostaglandin-Endoperoxide Synthase-2 Expressions in Human Breast Carcinoma: A Multiomic Approach.","authors":"Madhuri Saindane, Harikrishna Reddy Rallabandi, Kyoung Sik Park, Alexander Heil, Sang Eun Nam, Young Bum Yoo, Jung-Hyun Yang, Ik Jin Yun","doi":"10.1177/1176935120969696","DOIUrl":"https://doi.org/10.1177/1176935120969696","url":null,"abstract":"<p><p>Prostaglandin-endoperoxide synthase-2 (<i>PTGS2</i>) plays a pivotal role in inflammation and carcinogenesis in human breast cancer. The aim of this study was to determine the prognostic value of <i>PTGS2</i> in breast cancer. We conducted a multiomic analysis to determine whether <i>PTGS2</i> functions as a prognostic biomarker in human breast cancer. We explored <i>PTGS2</i> mRNA expression using different public bioinformatics portals. Oncomine, Serial Analysis of Gene Expression (SAGE), GEPIA, UALCAN, PrognoScan database, Kaplan-Meier Plotter, bc-GenExMiner, UCSC Xena, and Cytoscape/STRING DB were used to identify the prognostic roles of <i>PTGS2</i> in breast cancer. Based on the clinicopathological analysis, decreased <i>PTGS2</i> expression correlated positively with older age, lymph node status, human epidermal growth factor receptor 2 (HER2) status (<i>P</i> < .0001), estrogen receptor (ER+) expression (<i>P</i> < .0001), Luminal A (<i>P</i> < .0001), and Luminal B (<i>P</i> < .0001). Interestingly, progesterone receptor (PR)-negative status (<i>P</i> < .0001) was associated with high expression of <i>PTGS2</i>. Prostaglandin-endoperoxide synthase-2 was downregulated in breast cancer tissues compared with normal tissues. In the PrognoScan database and Kaplan-Meier Scanner, downregulated expression of <i>PTGS2</i> was associated with poor overall survival (OS), relapse-free survival (RFS), and distant metastasis-free survival. The methylation levels were significantly higher in the Luminal B subtype. 
Through Oncomine coexpressed gene analysis, we found a positive correlation between <i>PTGS2</i> and interleukin-6 (<i>IL-6</i>) expression in breast cancer tissues. These results indicate that downregulated expression of <i>PTGS2</i> may serve as a promising prognostic biomarker and that Luminal B hypermethylation may play an important role in the development of breast cancers. However, further extensive study is required to confirm these results.</p>","PeriodicalId":35418,"journal":{"name":"Cancer Informatics","volume":"19 ","pages":"1176935120969696"},"PeriodicalIF":2.0,"publicationDate":"2020-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/1176935120969696","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38634073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cancer Informatics | Pub Date: 2020-11-04 | eCollection Date: 2020-01-01 | DOI: 10.1177/1176935120969692
Mohammed Amine Bendahou, Azeddine Ibrahimi, Mahjouba Boutarbouch
{"title":"Bioinformatics Analysis of Differentially Expressed Genes and miRNAs in Low-Grade Gliomas.","authors":"Mohammed Amine Bendahou, Azeddine Ibrahimi, Mahjouba Boutarbouch","doi":"10.1177/1176935120969692","DOIUrl":"https://doi.org/10.1177/1176935120969692","url":null,"abstract":"<p><p>Low-grade glioma is the most common type of primary intracranial tumor. In the last 3 years, new observations of molecular precursors in adults with gliomas have led to a modification in the histopathologic classification of these brain tumors. Among the biomarkers that have been highlighted are the micro RNAs (miRNAs), which play a crucial role in the regulation of gene expression, and the long noncoding RNAs (lncRNAs), which control various cellular and metabolic pathways. In our study, large-scale data on sequenced RNA and miRNAs from 516 patients were obtained from The Cancer Genome Atlas database using the TCGAbiolinks package. We identified the differential expression of miRNAs and genes using the limma package, and then we used the clusterProfiler package to annotate the biological pathways of the expressed genes, the survival package to perform the survival analysis, and the GDCRNATools package to determine miRNA-gene and miRNA-lncRNA interactions. 
We obtained a significant correlation between the miRNAs identified and the overall survival of the patients (log-rank <i>P</i> < .05) and we have theoretically proposed a novel network of miRNAs involved in low-grade gliomas, specifically astrocytomas and oligodendrogliomas, which combine both genes and lncRNAs.</p>","PeriodicalId":35418,"journal":{"name":"Cancer Informatics","volume":"19 ","pages":"1176935120969692"},"PeriodicalIF":2.0,"publicationDate":"2020-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/1176935120969692","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38634072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cancer Informatics | Pub Date: 2020-10-13 | eCollection Date: 2020-01-01 | DOI: 10.1177/1176935120965542
Rezvan Ehsani, Finn Drabløs
{"title":"Robust Distance Measures for <i>k</i>NN Classification of Cancer Data.","authors":"Rezvan Ehsani, Finn Drabløs","doi":"10.1177/1176935120965542","DOIUrl":"10.1177/1176935120965542","url":null,"abstract":"<p><p>The <i>k</i>-Nearest Neighbor (<i>k</i>NN) classifier represents a simple and very general approach to classification. Still, the performance of <i>k</i>NN classifiers can often compete with more complex machine-learning algorithms. The core of <i>k</i>NN depends on a \"guilt by association\" principle where classification is performed by measuring the similarity between a query and a set of training patterns, often computed as distances. The relative performance of <i>k</i>NN classifiers is closely linked to the choice of distance or similarity measure, and it is therefore relevant to investigate the effect of using different distance measures when comparing biomedical data. In this study on classification of cancer data sets, we have used both common and novel distance measures, including the novel distance measures Sobolev and Fisher, and we have evaluated the performance of <i>k</i>NN with these distances on 4 cancer data sets of different types. We find that the performance when using the novel distance measures is comparable to the performance with more well-established measures, in particular for the Sobolev distance. We define a robust ranking of all the distance measures according to overall performance. Several distance measures show robust performance in <i>k</i>NN over several data sets, in particular the Hassanat, Sobolev, and Manhattan measures. Some of the other measures show good performance on selected data sets but seem to be more sensitive to the nature of the classification data. 
It is therefore important to benchmark distance measures on similar data prior to classification to identify the most suitable measure in each case.</p>","PeriodicalId":35418,"journal":{"name":"Cancer Informatics","volume":"19 ","pages":"1176935120965542"},"PeriodicalIF":2.0,"publicationDate":"2020-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1177/1176935120965542","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"38538943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
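The dependence of kNN on the distance measure is easy to see when the distance is a pluggable function. The sketch below is not the authors' benchmark code: it implements a plain majority-vote kNN with two of the measures named in the abstract, Manhattan and Hassanat. The Hassanat formula follows the commonly cited per-dimension definition, 1 - (1 + min) / (1 + max) with both values shifted by |min| when the minimum is negative, which should be checked against the original reference before reuse. The training points are synthetic.

```python
from collections import Counter


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hassanat(a, b):
    """Hassanat distance (assumed definition, see lead-in): a bounded
    per-dimension term in [0, 1), summed over all dimensions."""
    total = 0.0
    for x, y in zip(a, b):
        low, high = min(x, y), max(x, y)
        shift = -low if low < 0 else 0.0  # move negative values up to zero
        total += 1.0 - (1.0 + low + shift) / (1.0 + high + shift)
    return total


def knn_predict(train, query, k=3, distance=manhattan):
    """Majority vote among the k training points closest to `query`.

    `train` is a list of (feature_tuple, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Synthetic two-feature examples, not taken from any cancer data set.
train = [
    ((1.0, 1.0), "normal"), ((1.2, 0.8), "normal"), ((0.9, 1.1), "normal"),
    ((5.0, 5.2), "tumor"), ((4.8, 5.1), "tumor"), ((5.3, 4.9), "tumor"),
]
for measure in (manhattan, hassanat):
    print(measure.__name__, knn_predict(train, (1.1, 1.0), distance=measure))
```

Because each Hassanat term is bounded, a single feature with an extreme value cannot dominate the sum the way it can under Manhattan distance, which is one plausible reason for its robustness across data sets.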