{"title":"Towards Achieving Diagnostic Consensus in Medical Image Interpretation","authors":"Mike Seidel, A. Rasin, J. Furst, D. Raicu","doi":"10.1109/ICDMW.2014.134","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.134","url":null,"abstract":"The workload associated with the daily job of a clinical radiologist has been steadily increasing as the volume of the archived and the newly acquired images grows. Computer-aided diagnostic systems are becoming an indispensable tool in automating image analysis and providing preliminary diagnosis that can help guide radiologist's decisions. In this paper, we introduce a novel metric to evaluate the difficulty of reaching diagnostic consensus when interpreting a case and illustrate several benefits that such insight can provide. Using a lung nodule image dataset, we demonstrate how a metric-based case partitioning can be used to better select how many radiologists are assigned to each case and how to identify image features that provide important feedback to further assist with the diagnosis. This knowledge can also be leveraged to shed 25% of radiologist annotations without any loss in predictive accuracy.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116900689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparing Classifiers in Historical Census Linkage","authors":"Laura Richards, L. Antonie, S. Areibi, G. Grewal, K. Inwood, J. A. Ross","doi":"10.1109/ICDMW.2014.160","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.160","url":null,"abstract":"Linking multiple data collections to create longitudinal data is an important research problem with multiple applications. Longitudinal data allows analysts to perform studies that would be unfeasible otherwise. In our research we are interested in linking historical census collections to create longitudinal data that would allow tracking people over time. The goal of the linking is to identify the same person in multiple census collections. A classification system is employed to decide whether two people are the same or not, based on their characteristics. In this paper we present an empirical study in which we explore the use of three different classifiers in a record linkage system and evaluate their performance.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114957182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Social Preference Ontologies for Enriching User and Item Data in Recommendation Systems","authors":"Christopher Krauss, S. Arbanowski","doi":"10.1109/ICDMW.2014.76","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.76","url":null,"abstract":"Some of the known issues of recommendation algorithms result from the so-called \"Cold Start Problem\", which is caused by a lack of sufficient data about users, items, or content, all of which are essential for calculating context-sensitive predictions. Along with this comes the \"Sparsity Problem\", which arises when recommendation systems are provided with too little user feedback, such as likes and views. As a consequence, collaborative and knowledge-based filtering algorithms are unable to make precise predictions, which causes a decline in customer satisfaction. If, beyond that, there is also a lack of metadata, the calculation of similarities through content-based filtering algorithms is likely to fail as well. This paper introduces preference ontologies and shows how they help to reduce these issues by analyzing external data, in the form of texts from social networks and other web sources. To this end, we introduce a self-designed semantic engine that performs sentiment analysis and semantic keyword extraction. 
These novel ontologies represent the mined information and thus describe the users' interest in automatically analyzed topics and map them to the metadata of items in recommendation engines.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"142 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123434986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NLMF: NonLinear Matrix Factorization Methods for Top-N Recommender Systems","authors":"Santosh Kabbur, G. Karypis","doi":"10.1109/ICDMW.2014.108","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.108","url":null,"abstract":"Many existing state-of-the-art top-N recommendation methods model users and items in the same latent space and the recommendation scores are computed via the dot product between those vectors. These methods assume that the user preference is consistent across all the items that he/she has rated. This assumption is not necessarily true, since many users can have multiple personas/interests and their preferences can vary with each such interest. To address this, a recently proposed method modeled the users with multiple interests. In this paper, we build on this approach and model users using a much richer representation. We propose a method which models the user preference as a combination of having global preference and interest-specific preference. The proposed method uses a nonlinear model for predicting the recommendation score, which is used to perform top-N recommendation task. The recommendation score is computed as a sum of the scores from the components representing global preference and interest-specific preference. 
A comprehensive set of experiments on multiple datasets shows that the proposed model outperforms other state-of-the-art methods for the top-N recommendation task.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123693814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Medical Error Prevention Based on Path Integration System Approach","authors":"S. W. Chan, C. Leung, V. Cheng, Jiming Liu","doi":"10.1109/ICDMW.2014.119","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.119","url":null,"abstract":"Recent findings show that medical errors are prevalent and lead to many unnecessary iatrogenic deaths and injuries. Medical error studies using different approaches such as person approach or system approach enable clinicians to have a better understanding and valuable insight into error prevention through employing guidelines, standardized procedures, and devices, etc. A novel approach is proposed, known as the Path Integration System Approach (PISA), based on information technology (IT), in the design of health systems processes to reduce adverse events. Unlike the person approach or the system approach, which basically addresses error-prone procedures and situations by building more error barriers or changing human behaviour, PISA is concerned with re-constructing the medical procedure paths and system operations to lower the association between clinical staff and medical errors, through the judicious deployment of Information and Communication Technology. It is shown that PISA has the potential to achieve medical error reduction in excess of 70%. Examples and guidelines are given to illustrate the application of the approach. The paper proposes an integrated approach which transforms the effective deployment of IT systems to focus on integration and communication between subsystems. 
Through the adoption of PISA, such integration may be achieved to the benefit of different groups of stakeholders in medical error prevention.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125063452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Creating Essential Hypothesis and Rules in Product Planning -- Introducing Abductive Inference Model","authors":"Jun Nakamura","doi":"10.1109/ICDMW.2014.63","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.63","url":null,"abstract":"The abductive inference model has been discussed in the context of business strategy. However, the model seems unrealistic for applications in the real business world. Therefore, this study improves the model by formalizing experimental case studies in a web-based workplace for generating product ideas. The developed model implies that the need to visualize the human thought process used to create hypotheses and rules in product planning is a clue to designing a data marketplace.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127563979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Popular Items or Niche Items: Flexible Recommendation Using Cosine Patterns","authors":"Yaqiong Wang, Junjie Wu, Zhiang Wu, Hua Yuan, Xu Zhang","doi":"10.1109/ICDMW.2014.157","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.157","url":null,"abstract":"Recent years have witnessed the explosive growth of recommender systems in various exciting application domains such as electronic commerce, social networking, and location-based services. A great many algorithms have been proposed to improve the accuracy of recommendation, but only recently has the long tail problem arising from inadequate recommendation of niche items been recognized as a real challenge to a recommender. This is particularly true for ultra-massive online retailers, who usually have a tremendous number of niche goods for sale. In light of this, we propose a pattern-based method called CORE for flexible recommendation of both popular and niche items. CORE has two notable features compared with various existing recommenders. First, it is superior to previous pattern-based methods by adopting cosine rather than frequent patterns for recommendation. This helps filter out spurious cross-support patterns harmful to recommendation. Second, compared with some benchmark methods such as SVD and LDA, CORE does well in niche item recommendation given particularly heavy tailed data sets. Indeed, the coupled configuration of the support and cosine measures enables CORE to switch freely between recommending popular and niche items. Experimental results on two benchmark data sets demonstrate the effectiveness of CORE especially in long tail recommendation. 
To the best of our knowledge, CORE is among the earliest recommenders designed purposefully for flexible recommendation of both head and tail items.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128337130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Time-Frequency Analysis Approach for Nonstationary Time Series Using Multiresolution Wavelet","authors":"Si-Rui Tan, Yang Li, Ke Li","doi":"10.1109/ICDMW.2014.89","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.89","url":null,"abstract":"An efficient time-varying autoregressive (TVAR) modeling scheme using the multiresolution wavelet method is proposed for modeling nonstationary signals, with application to time-frequency analysis (TFA) of time-varying signals. In the new parametric modeling framework, the time-dependent parameters of the TVAR model are locally represented using a novel multiresolution wavelet decomposition scheme. The wavelet coefficients are estimated using an effective orthogonal least squares (OLS) algorithm. The resultant estimation of time-dependent spectral density in the signal can simultaneously achieve high resolution in both time and frequency, making it a powerful TFA technique for nonstationary signals. An artificial EEG signal is included to show the effectiveness of the newly proposed approach. The experimental results show that the multiresolution wavelet approach is capable of achieving a more accurate time-frequency representation of nonstationary signals.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"25 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121270839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classification of 3D Surface Data Using the Concept of Vertex Unique Labelled Subgraphs","authors":"Wen Yu, Frans Coenen, M. Zito, Kwankamon Dittakan","doi":"10.1109/ICDMW.2014.125","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.125","url":null,"abstract":"An overview is presented of the use of Vertex Unique Labelled Subgraph (VULS) mining for localised classification of regions in 3D surfaces represented in terms of grid graphs. A VULS is a subgraph within some larger graph G that has a unique (\"one-of\") vertex labelling associated with it. Given a 3D surface represented as a grid graph, we can identify a number of different forms of VULS that may be discovered: (i) all, (ii) minimal, (iii) frequent and (iv) frequent minimal. Algorithms for discovering (mining) these are presented in the paper. The paper also presents the Backward Match Voting (BMV) algorithm for predicting (classifying) vertex labels associated with an \"unseen\" graph using a given collection of VULS. The operation of the VULS mining algorithms, and the BMV algorithm, is fully described and evaluated. The evaluation is conducted using satellite image data where the ground surface is represented as a 3D surface with the z dimension describing grey scale value. The idea is to predict vertex labels describing ground type. A statistical analysis of the results, using the Friedman test, is also presented so as to demonstrate the statistical significance of the VULS based 3D surface regional classification idea. 
The results indicate that the VULS concept is well suited to the task of 3D surface regional classification.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"207 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122430592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hierarchy Method Based on LDA and SVM for News Classification","authors":"Limeng Cui, Fan Meng, Yong Shi, Minqiang Li, An Liu","doi":"10.1109/ICDMW.2014.8","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.8","url":null,"abstract":"The growth of online data gives users access to information on the Internet but also creates challenges in obtaining valuable knowledge. In this paper we focus on news text classification, which is meaningful both for information providers, who need to organize and display the news, and for users, who want to reach valuable information easily. A hierarchy method based on LDA and SVM is proposed to accomplish this task and several experiments are conducted to evaluate our method. The results show that our method is promising in text classification problems.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116035491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}