{"title":"Trip Router: A Time-Sensitive Route Recommender System","authors":"Hsun-Ping Hsieh, Cheng-te Li, Shou-de Lin","doi":"10.1109/ICDMW.2014.34","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.34","url":null,"abstract":"Location-based services allow users to perform geo-spatial recording actions, which facilitates the mining of the moving activities of human beings. This paper proposes a system, Trip Router, to recommend time-sensitive trip routes consisting of a sequence of locations with associated time stamps based on knowledge extracted from large-scale location check-in data. We first propose a statistical route goodness measure considering: (a) the popularity of places, (b) the visiting order of places, (c) the proper visiting time of each place, and (d) the proper transit time from one place to another. Then we construct the time-sensitive route recommender with two major functions: (1) constructing the route based on the user-specified source location with the starting time, (2) composing the route between the specified source location and the destination location given a starting time. We devise a search method, Guidance Search, to derive the routes efficiently and effectively. Experiments on Gowalla check-in datasets with user study show the promising performance of our Trip Router system.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121290788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interoperability-Enriched App Recommendation","authors":"Shi Wen-xuan, Yin Airu","doi":"10.1109/ICDMW.2014.23","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.23","url":null,"abstract":"At present, there are three main mobile app marketplaces: the iTunes App Store, Android Market, and Windows Phone Store. With app recommendation technology, users not only discover more relevant apps, but are also more likely to engage with those apps at a higher level because the apps are relevant to their interests in the first place. Collaborative filtering (CF) methods have been applied to recommender systems, but CF techniques do not handle sparse datasets well, especially in the cold start problem, where there is not enough interaction for apps. To overcome this constraint, we propose a novel recommendation model, Interoperability-Enriched Recommendation (IER), an interoperability-enriched collaborative filtering method for multi-marketplace app recommendation based on the global app ecosystem. Experimental results on the known marketplaces app dataset demonstrate that the proposed IER method significantly outperforms the state-of-the-art CF method and context-aware recommendations (CAR) method for app recommendation, especially in the cold start scenario.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115075953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Abduction Dealing with Potential Values and Its Datasets","authors":"A. Abe","doi":"10.3233/IDT-150253","DOIUrl":"https://doi.org/10.3233/IDT-150253","url":null,"abstract":"In this paper, we introduce the concept of \"value\" to the abduction procedure. In fact, \"values\" are dealt with outside of the abduction procedure. In usual abduction, values are always considered to be included in the knowledge (hard-coded). However, for certain procedures, such values are unnecessary and sometimes harmful. Outside of the main abduction procedure, the inference system can flexibly deal with \"values\" to generate hypotheses that consider the user's preference, situation, current trends, etc. In addition, we introduce the concept of the expiration of a hypothesis in hypothesis generation: recently generated hypotheses are not generated again during such an abduction procedure. Accordingly, the system can generate rather novel hypotheses to enjoy potential chances. This type of inference can be applied to daily life situations. Of course, depending on the application, such as recommendation strategies in shops, the opposite strategy can be adopted.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"178 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115704637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quick Mining of Isomorphic Exact Large Patterns from Large Graphs","authors":"Islam Almasri, Xin Gao, N. Fedoroff","doi":"10.1109/ICDMW.2014.65","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.65","url":null,"abstract":"The applications of subgraph isomorphism search are growing with the growing number of areas that model their systems using graphs or networks. Specifically, many biological systems, such as protein interaction networks, molecular structures and protein contact maps, are modeled as graphs. Subgraph isomorphism search is concerned with finding all subgraphs that are isomorphic to a relevant query graph; the existence of such subgraphs can reflect on the characteristics of the modeled system. The most computationally expensive step in the search for isomorphic subgraphs is the backtracking algorithm that traverses the nodes of the target graph. In this paper, we propose a pruning approach, inspired by the minimum remaining value heuristic, that achieves greater scalability over large query and target graphs. Our testing on various biological networks shows that the performance enhancement of our approach over existing state-of-the-art approaches varies between 6x and 53x.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115918893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Improved Collaborative Method for Recommendation and Rating Prediction","authors":"Guoyong Cai, Rui Lv, Hao Wu, Xia Hu","doi":"10.1109/ICDMW.2014.60","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.60","url":null,"abstract":"The User-Item matrix (UI matrix) has been widely used in recommendation systems for data representation. However, as the number of users and items increases, the UI matrix becomes very sparse, which leads to unsatisfactory performance in traditional recommendation algorithms. To address this problem, this paper proposes a rating prediction method with low sensitivity to sparse datasets. This method incorporates tag information and a factor analysis approach, which has been successfully applied in various areas, to discover the most similar top-N users based on the similarity of users' inner idiosyncrasies. Based on the most similar top-N users discovered, an improved collaborative filtering method is designed for rating prediction and recommendation. Extensive experiments compare the proposed method with traditional collaborative filtering and matrix factorization methods. The results demonstrate that the proposed method achieves better accuracy and is less sensitive to the sparseness of datasets.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115469303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Community Detection on Large Graph Datasets for Recommender Systems","authors":"Rohit Parimi, Doina Caragea","doi":"10.1109/ICDMW.2014.159","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.159","url":null,"abstract":"The explosion of content on the World Wide Web (WWW) means that consumers are presented with a wide variety of items to choose from (items that concur with their taste and requirements). The generation of personalized consumer recommendations has become a crucial functionality for many web applications, yet a challenging task, given the scale and nature of the data. One popular solution to creating personalized item suggestions for users is recommender systems. In this work, we propose an approach that integrates community detection with neighborhood-based recommender systems, specifically, the Adsorption algorithm, for recommending items using implicit user preferences. Network communities represent a principled way of organizing real-world networks into densely connected clusters of nodes. We believe that these dense clusters identified by the community detection algorithm will be helpful to construct user neighborhoods for the Adsorption algorithm for recommending collaborators and books to users. Through comprehensive experimental evaluations on the DBLP co-author dataset and Book Crossing dataset, the proposed approach of integrating community detection with the Adsorption algorithm is shown to deliver good performance.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114178974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chinese Microblog Sentiment Classification Based on Deep Belief Nets with Extended Multi-Modality Features","authors":"Xiao Sun, Chengcheng Li, Wanyi Xu, F. Ren","doi":"10.1109/ICDMW.2014.101","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.101","url":null,"abstract":"This paper presents a DBN (deep belief nets) model and a multi-modality feature extraction method to extend the feature dimensionality of short text for Chinese microblogging sentiment classification. Besides traditional feature sets for document classification, comments on certain posts are also extracted as part of the microblogging features according to the relationship between commenters and posters, through constructing a microblogging social network as input information. Then, the above modality features are integrated and represented as the input vector for the DBN. In this paper, a DBN model, which is stacked with several layers of RBM (Restricted Boltzmann Machine), is implemented to initialize the structure of the neural network. The RBM layers can take probability distribution samples of the original data to learn hidden structures for better feature representation. A Class RBM (Classification RBM) layer, which is stacked on top of several RBM layers, is adapted to achieve the final sentiment classification. The results demonstrate that, with proper structure and parameters, the performance of the proposed deep learning method on sentiment classification is better than that of state-of-the-art surface learning models such as SVM or NB, which proves that DBN is suitable for short-length document classification with the proposed feature dimensionality extension method.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114592432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Optimal Approach for Pruning Annular Regularized Extreme Learning Machines","authors":"Lavneet Singh, G. Chetty","doi":"10.1109/ICDMW.2014.69","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.69","url":null,"abstract":"Larger datasets with many samples are problematic for data mining and machine learning because of increased computation time, increased complexity, and poor generalization due to outliers. Further, the accuracy and performance of machine learning and statistical models still depend on tuning and optimizing some parameters to generate better predictive models. In this paper, we propose a novel formulation of Extreme Learning Machines - the Annular ELM, with RANSAC multi model response regularization for pruning a large number of hidden nodes to achieve better optimality, generalization and classification accuracy. Experimental evaluation of the proposed ELM formulation on different benchmark datasets showed that the algorithm optimally prunes the hidden nodes, with better generalization and higher classification accuracy than other algorithms, including the well-known SVM and OP-ELM, for binary and multi-class classification and regression problems. We also extended the proposed algorithm to a more complex application context involving MRI brain image classification. For this study, we examine the performance of the proposed algorithm on magnetic resonance images (MRI) of various states of the brain by extracting the most significant features and classifying them into normal and abnormal brain images.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114634097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wineinformatics: Applying Data Mining on Wine Sensory Reviews Processed by the Computational Wine Wheel","authors":"Bernard Chen, Christopher Rhodes, Aaron Crawford, Lorri Hambuchen","doi":"10.1109/ICDMW.2014.149","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.149","url":null,"abstract":"As the world becomes more digital, data science is the successful study that incorporates varying techniques and theories from distinct fields. Among all fields, domain knowledge might be the most important, since all data science researchers need to start with the domain problem and end with useful information within the domain. Identifying a new application domain is always considered fundamental research in the area. Wine was considered a luxury in the old days; however, it is popular and enjoyed by a wide variety of people today. Professional wine reviews provide insights on tens of thousands of wines available each year. However, there is currently no systematic way to utilize this large number of reviews to benefit wine makers, distributors and consumers. This project proposes a brand new data science area named Wineinformatics. In order to automatically retrieve wines' flavors and characteristics from reviews, which are stored in human language format, we propose a novel “Computational Wine Wheel” to extract key words. Two different publicly available datasets are produced based on our new method in this paper. A hierarchical clustering algorithm is applied to the first dataset and retrieves meaningful clustering results. An association rules algorithm is performed on the second dataset to predict whether a wine is scored above 90 points or not based on the wine savory reviews. 5-fold cross validation experiments are executed based on different parameters, and results with accuracy ranging from 73% to 82% are generated. This new domain will bring huge benefits to fields as diverse as computer science, statistics, business and agriculture.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121631106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Collaborative and Adaptive Intrusion Detection Based on SVMs and Decision Trees","authors":"Luyao Teng, Shaohua Teng, Feiyi Tang, Haibin Zhu, Wei Zhang, Dongning Liu, Lu Liang","doi":"10.1109/ICDMW.2014.147","DOIUrl":"https://doi.org/10.1109/ICDMW.2014.147","url":null,"abstract":"Because network security has become one of the most serious problems in the world, intrusion detection is an important defence tool of network security. In this paper, a cooperative and adaptive intrusion detection method is proposed, and a corresponding intrusion detection model is designed and implemented. The E-CARGO model is used to build the collaborative and adaptive intrusion detection model. The roles, agents and groups based on 2-class Support Vector Machines (SVMs) and Decision Trees (DTs) are described and built, and the adaptive scheduling mechanisms are designed. Finally, the KDD CUP 1999 data set is used to verify the effectiveness of our method. Experimental results show that the collaborative and adaptive intrusion detection method proposed in this paper is superior to the SVM in detection accuracy and detection efficiency.","PeriodicalId":289269,"journal":{"name":"2014 IEEE International Conference on Data Mining Workshop","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121924235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}