{"title":"Error bounds of decision templates and support vector machines in decision fusion","authors":"I. Dimou, M. Zervakis","doi":"10.1504/IJKESDP.2009.028816","DOIUrl":"https://doi.org/10.1504/IJKESDP.2009.028816","url":null,"abstract":"The need for accurate, robust, optimised classification systems has been driving information fusion methodology towards a state of early maturity throughout the last decade. Among its shortcomings we identify the lack of statistical foundation in many ad-hoc fusion methods and the lack of strong non-linear combiners with the capacity to partition complex decision spaces. In this work, we draw parallels between the well known decision templates (DT) fusion method and the nearest mean distance classifier in order to extract a useful formulation for the overall expected classification error. Additionally we evaluate DTs against a support vector machine (SVM) discriminant hyper-classifier, using two benchmark biomedical datasets. Beyond measuring performance statistics, we advocate the theoretical advantages of support vectors as multiple attractor points in a hyper-classifier's feature space.","PeriodicalId":347123,"journal":{"name":"Int. J. Knowl. Eng. Soft Data Paradigms","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127963976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Survival analysis in cancer using a partial logistic neural network model with Bayesian regularisation framework: a validation study","authors":"A. Taktak, A. Eleuteri, M. Aung, P. Lisboa, L. Desjardins, B. Damato","doi":"10.1504/IJKESDP.2009.028819","DOIUrl":"https://doi.org/10.1504/IJKESDP.2009.028819","url":null,"abstract":"This paper describes a multicentre longitudinal cohort study to evaluate the predictive accuracy of a regularised Bayesian neural network model in a prognostic application. The study sample (n = 5442) comprises subjects treated with intraocular melanoma in two different centres in Liverpool and Paris. External validation was carried out by fitting the model to the data from Liverpool set and predicting for the data from Paris. The performance of the model in out-of-sample prediction was assessed statistically for discrimination of outcomes and calibration. It was also evaluated clinically by comparing against the accepted TNM staging system. The model had good discrimination with Harrell's C index > 0.7 up to ten years of follow-up. Calibration results were also good up to ten years using a Hosmer-Lemeshow type analysis (p > 0.05). The paper: 1) deals with the issue of missing data using methods that are well accepted in the literature; 2) proposes a framework for externally validating machine learning models applied to survival analysis; 3) applies accepted methods for dealing with missing data; 4) proposes an alternative staging system based on the model. The new staging system, which takes into account histopathologic information, has several advantages over the existing staging system.","PeriodicalId":347123,"journal":{"name":"Int. J. Knowl. Eng. Soft Data Paradigms","volume":"12 13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126184343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image classification of artificial fingerprints using Gabor wavelet filters, self-organising maps and Hermite/Laguerre neural networks","authors":"Leif E. Peterson, K. Larin","doi":"10.1504/IJKESDP.2009.028817","DOIUrl":"https://doi.org/10.1504/IJKESDP.2009.028817","url":null,"abstract":"Image classification was performed using Gabor wavelet filters for image feature extraction, self-organising maps (SOM) for dimensional reduction of Gabor wavelet filters, and forward (FNN), Hermite (HNN) and Laguerre (LNN) neural networks to classify real and artificial fingerprint images from optical coherence tomography (OCT). Use of a SOM after Gabor edge detection of OCT images of fingerprint and material surfaces resulted in the greatest classification performance when compared with moments based on colour, texture and shape. The FNN and HNN performed similarly, however, the LNN performed the worst at a low number of hidden nodes but overtook performance of the FNN and HNN as the number of hidden nodes approached n = 10.","PeriodicalId":347123,"journal":{"name":"Int. J. Knowl. Eng. Soft Data Paradigms","volume":"176 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116397280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A stability-based algorithm to validate hierarchical clusters of genes","authors":"R. Avogadri, Matteo Brioschi, F. Ferrazzi, M. Ré, A. Beghini, G. Valentini","doi":"10.1504/IJKESDP.2009.028985","DOIUrl":"https://doi.org/10.1504/IJKESDP.2009.028985","url":null,"abstract":"Stability-based methods have been successfully applied in functional genomics to the analysis of the reliability of clusterings characterised by a relatively low number of examples and clusters. The application of these methods to the validation of gene clusters discovered in biomolecular data may lead to computational problems due to the large amount of possible clusters involved. To address this problem, we present a stability-based algorithm to discover significant clusters in hierarchical clusterings with a large number of examples and clusters. The reliability of clusters of genes discovered in gene expression data of patients affected by human myeloid leukaemia is analysed through the proposed algorithm, and their relationships with specific biological processes are tested by means of Gene Ontology-based functional enrichment methods.","PeriodicalId":347123,"journal":{"name":"Int. J. Knowl. Eng. Soft Data Paradigms","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125022539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concordance indices for comparing fuzzy, possibilistic, rough and grey partitions","authors":"M. Ceccarelli, A. Maratea","doi":"10.1504/IJKESDP.2009.028986","DOIUrl":"https://doi.org/10.1504/IJKESDP.2009.028986","url":null,"abstract":"Many indices have been proposed in literature for the comparison of two crisp data partitions, as resulting from two different classifications attempts, two different clustering solutions or the comparison of a predicted vs. a true labelling. Crisp partitions however cannot model ambiguity, vagueness or uncertainty in class definition and thus are not suitable to model all cases where information lacks, terms definitions are intrinsically imprecise or the classification results from a human expert knowledge representation. In presence of vagueness, it is not obvious how to quantify overlap or agreement of two different partitions of the same data and many facets of vagueness have emerged in literature through complimentary theories. The aim of the paper is to give simple numerical indices to quantify partitions agreement in the fuzzy, possibilistic, rough and grey frameworks. We propose a method based on pseudo counts, intuitive in the meaning and simple to implement that is very general and allows comparing fuzzy, possibilistic, rough and grey partitions, even with a different number of classes. The proposed method has just one free parameter used to model sensitivity to higher values of membership.","PeriodicalId":347123,"journal":{"name":"Int. J. Knowl. Eng. Soft Data Paradigms","volume":"2013 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127423931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of missing data imputation in longitudinal cohort studies in breast cancer survival","authors":"A. S. Fernandes, J. M. Fonseca, I. Jarman, T. Etchells, P. Lisboa, E. Biganzoli, C. Bajdik","doi":"10.1504/IJKESDP.2009.028818","DOIUrl":"https://doi.org/10.1504/IJKESDP.2009.028818","url":null,"abstract":"Missing values are common in medical datasets and may be amenable to data imputation when modelling a given data set or validating on an external cohort. This paper discusses model averaging over samples of the imputed distribution and extends this approach to generic non-linear modelling with the partial logistic artificial neural network (PLANN) regularised with automatic relevance determination (ARD). The study then applies the imputation to external validation, considering also predictions made for individual patients. A prognostic index is defined for the non-linear model and validation results show that four statistically significant risk groups identified at 95% level of confidence from the modelling data, from Christie Hospital (n = 931), retain good separation during external validation with data from the BC Cancer Agency (BCCA) (n = 4,083). A satisfactory discrimination and calibration performance was assessed with the time dependent C index (C td) and Hosmer-Lemeshow statistic, respectively, for both, training and validated model.","PeriodicalId":347123,"journal":{"name":"Int. J. Knowl. Eng. Soft Data Paradigms","volume":"53 3-4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123458503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Normalised compression distance and evolutionary distance of genomic sequences: comparison of clustering results","authors":"M. L. Rosa, S. Gaglio, R. Rizzo, A. Urso","doi":"10.1504/IJKESDP.2009.028987","DOIUrl":"https://doi.org/10.1504/IJKESDP.2009.028987","url":null,"abstract":"Genomic sequences are usually compared using evolutionary distance, a procedure that implies the alignment of the sequences. Alignment of long sequences is a time consuming procedure and the obtained dissimilarity results is not a metric. Recently, the normalised compression distance was introduced as a method to calculate the distance between two generic digital objects and it seems a suitable way to compare genomic strings. In this paper, the clustering and the non-linear mapping obtained using the evolutionary distance and the compression distance are compared, in order to understand if the two distances sets are similar.","PeriodicalId":347123,"journal":{"name":"Int. J. Knowl. Eng. Soft Data Paradigms","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114926893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A one class KNN for signal identification: a biological case study","authors":"V. Gesù, Giosuè Lo Bosco, Luca Pinello","doi":"10.1504/IJKESDP.2009.028989","DOIUrl":"https://doi.org/10.1504/IJKESDP.2009.028989","url":null,"abstract":"The paper describes an application of a one class KNN to identify different signal patterns embedded in a noise structured background. The problem becomes harder whenever only one pattern is well-represented in the signal; in such cases, one class classifier techniques are more indicated. The classification phase is applied after a preprocessing phase based on a multi layer model (MLM) that provides preliminary signal segmentation in an interval feature space. The one class KNN has been tested on synthetic and real (Saccharomyces cerevisiae) microarray data in the specific problem of DNA nucleosome and linker regions identification. Results have shown, in both cases, a good recognition rate.","PeriodicalId":347123,"journal":{"name":"Int. J. Knowl. Eng. Soft Data Paradigms","volume":"46 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120984423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple data structure discovery through global optimisation, meta clustering and consensus methods","authors":"Ida Bifulco, Carmine Fedullo, F. Napolitano, G. Raiconi, R. Tagliaferri","doi":"10.1504/IJKESDP.2009.028984","DOIUrl":"https://doi.org/10.1504/IJKESDP.2009.028984","url":null,"abstract":"When dealing with real data, clustering becomes a very complex problem, usually admitting many reasonable solutions. Moreover, even if completely different, such solutions can appear almost equivalent from the point of view of classical quality measures such as the distortion value. This implies that blind optimisation techniques alone are prone to discard qualitatively interesting solutions. In this work we propose a systematic approach to clustering, including the generation of a number of good solutions through global optimisation, the analysis of such solutions through meta clustering and the final construction of a small set of solutions through consensus clustering.","PeriodicalId":347123,"journal":{"name":"Int. J. Knowl. Eng. Soft Data Paradigms","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114628545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An approximate solution method based on tabu search for k-minimum spanning tree problems","authors":"H. Katagiri, Tomohiro Hayashida, I. Nishizaki, Jun Ishimatsu","doi":"10.1504/IJKESDP.2010.035908","DOIUrl":"https://doi.org/10.1504/IJKESDP.2010.035908","url":null,"abstract":"This paper considers a new tabu search-based approximate solution algorithm for k-minimum spanning tree problems. One of the features of the proposed algorithm is that it efficiently obtains local optimal solutions without applying minimum spanning tree algorithms. Numerical experimental results show that the proposed method provides a good performance especially for dense graphs in terms of solution accuracy over existing algorithms.","PeriodicalId":347123,"journal":{"name":"Int. J. Knowl. Eng. Soft Data Paradigms","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115113044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}