{"title":"Improving Knowledge Tracing Model by Integrating Problem Difficulty","authors":"Sein Minn, Feida Zhu, M. Desmarais","doi":"10.1109/ICDMW.2018.00220","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00220","url":null,"abstract":"Intelligent Tutoring Systems (ITS) are designed to provide personalized instruction to students according to their skill needs. Dynamically assessing a student's knowledge acquisition during the learning process with an ITS is nontrivial. Knowledge tracing (KT), a popular student modeling technique for knowledge assessment in adaptive tutoring, traces a student's knowledge state and detects knowledge acquisition using decomposed individual skills or problems with a single skill per problem. Unfortunately, recent KT models fail to deal with practice on complex skill compositions and the variety of concepts included in a problem simultaneously. Our goal is to investigate a student model that is compatible with problems involving multiple skills and various concepts.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115933290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimapper: Data Density Sensitive Topological Visualization","authors":"Bishal Deb, Ankita Sarkar, Nupur Kumari, Akash Rupela, P. Gupta, Balaji Krishnamurthy","doi":"10.1109/ICDMW.2018.00153","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00153","url":null,"abstract":"Mapper is an algorithm that summarizes the topological information contained in a dataset and provides an insightful visualization. It takes as input a point cloud which is possibly high-dimensional, a filter function on it and an open cover on the range of the function. It returns the nerve simplicial complex of the pullback of the cover. Mapper can be considered a discrete approximation of the topological construct called Reeb space, as analysed in the 1-dimensional case by [Carri et al.,]. Despite its success in obtaining insights in various fields such as in [Kamruzzaman et al., 2016], Mapper is an ad hoc technique requiring lots of parameter tuning. There is also no measure to quantify goodness of the resulting visualization, which often deviates from the Reeb space in practice. In this paper, we introduce a new cover selection scheme for data that reduces the obscuration of topological information at both the computation and visualisation steps. To achieve this, we replace global scale selection of cover with a scale selection scheme sensitive to local density of data points. We also propose a method to detect some deviations in Mapper from Reeb space via computation of persistence features on the Mapper graph.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"276 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114704632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concept Analysis Based on Granular Formal Contexts","authors":"Zhen Wang, Ling Wei, Jianjun Qi","doi":"10.1109/ICDMW.2018.00077","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00077","url":null,"abstract":"Formal concept analysis (FCA) is an efficient tool for knowledge discovery and decision making from formal contexts. However, in the era of big data, FCA may face some challenges, one of which is that discovering knowledge from a big formal context may be hard. To make knowledge discovery from formal contexts easier and simpler, this study presents concept analysis based on granular formal contexts. First, the granular formal context is proposed by combining FCA with the hierarchical idea of granular computing (GrC). Then, based on this, the corresponding notions such as granular derivation operators, granular formal concept, and granular concept lattice are defined. Finally, the connections between classical and granular derivation operators/formal concepts/concept lattices are presented.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116371117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"User-Device Authentication in Mobile Banking Using APHEN for Paratuck2 Tensor Decomposition","authors":"Jérémy Charlier, Eric Falk, R. State, Jean Hilger","doi":"10.1109/ICDMW.2018.00130","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00130","url":null,"abstract":"New European financial regulations such as PSD2 are changing retail banking services. Notably, the monitoring of personal expenses is now open to institutions other than retail banks. Nonetheless, retail banks are looking to leverage user-device authentication on mobile banking applications to enhance personal financial advertisement. To address the profiling of the authentication, we rely on tensor decompositions, a higher dimensional analogue of matrix decompositions. We use Paratuck2, which expresses a tensor as a multiplication of matrices and diagonal tensors, because of the imbalance between the number of users and devices. We highlight why Paratuck2 is more appropriate in this case than the popular CP tensor decomposition, which decomposes a tensor as a sum of rank-one tensors. However, the computation of Paratuck2 is computationally intensive. We propose a new APproximate HEssian-based Newton resolution algorithm, APHEN, capable of solving Paratuck2 more accurately and faster than the other popular approaches based on alternating least squares or gradient descent. The results of Paratuck2 are used for the predictions of users' authentication with neural networks. We apply our method to the concrete case of targeting clients for financial advertising campaigns based on the authentication events generated by mobile banking applications.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124641508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Defect Detection from UAV Images Based on Region-Based CNNs","authors":"Meng Lan, Yipeng Zhang, Lefei Zhang, Bo Du","doi":"10.1109/ICDMW.2018.00063","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00063","url":null,"abstract":"With the wide application of Unmanned Aerial Vehicles (UAVs) in engineering, such as the inspection of electrical equipment from a distance, the demand for efficient object detection algorithms for the abundant images acquired by UAVs has also increased significantly in recent years. In the computer vision and data mining communities, traditional object detection methods usually train a class-specific learner (e.g., the SVM) based on low-level features to detect a single class of images by sliding a local window. Thus, they may not be suitable for UAV images with complex backgrounds and multiple kinds of objects of interest. Recently, deep convolutional neural networks (CNNs) have shown great advances in the object detection and segmentation fields and have outperformed many traditional methods that have usually been employed in the past decades. In this work, we study the performance of the region-based CNN for electrical equipment defect detection using UAV images. In order to train the detection model, we collect a UAV image dataset composed of four classes of electrical equipment defects with thousands of annotated labels. Then, based on the region-based Faster R-CNN model, we present a multi-class defect detection model for electrical equipment which is more efficient and accurate than traditional single-class detection methods. Technically, we have replaced the RoI pooling layer with a similar operation in TensorFlow and promoted the mini-batch to 128 per image in the training procedure. These improvements have slightly increased the speed of detection without any accuracy loss. Therefore, the modified region-based CNN can simultaneously detect multiple classes of defects of the electrical devices in nearly real time. Experimental results on real-world electrical equipment images demonstrate that the proposed method achieves better performance than the traditional object detection algorithms in defect detection.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124656218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Closed Frequent Itemset Mining with Arbitrary Side Constraints","authors":"Gökberk Koçak, Ozgur Akgun, Ian Miguel, P. Nightingale","doi":"10.1109/ICDMW.2018.00175","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00175","url":null,"abstract":"Frequent itemset mining (FIM) is a method for finding regularities in transaction databases. It has several application areas, such as market basket analysis, genome analysis, and drug design. Finding frequent itemsets allows further analysis to focus on a small subset of the data. For large datasets the number of frequent itemsets can also be very large, defeating their purpose. Therefore, several extensions to FIM have been studied, such as adding high-utility (or low-cost) constraints and only finding closed (or maximal) frequent itemsets. This paper presents a constraint programming based approach that combines arbitrary side constraints with closed frequent itemset mining. Our approach allows arbitrary side constraints to be expressed in a high level and declarative language which is then translated automatically for efficient solution by a SAT solver. We compare our approach with state-of-the-art algorithms via the MiningZinc system (where possible) and show significant contributions in terms of performance and applicability.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"167 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124674111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing Correlation Between Air and Noise Pollution with Influence on Air Quality Prediction","authors":"Arindam Ghosh, Prithviraj Pramanik, Kartick Das Banerjee, Ashutosh Roy, S. Nandi, Sujoy Saha","doi":"10.1109/ICDMW.2018.00133","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00133","url":null,"abstract":"Air and noise pollution are two major factors that determine the quality of life of the people living in cities. The prime reasons for the rise of air and noise pollution are imbalanced urbanization, unregulated increase in traffic and inorganic industrialization. These have resulted in compromising the well-being of the citizens. In this context, the concept of smart cities has been developed. They inherently have the ability to sense and respond to the challenges which characterize regular cities with the help of embedded intelligence. It has become important to monitor the environmental parameters for policy-making, planning and for making smart cities livable and sustainable. In a bid to make a smart city, in this work, we have studied the spatio-temporal relationship between air and noise pollution in four different locations and have also evaluated the effect of noise in predicting Air Quality (AQ). Data acquisition has been done using customized, self-developed CO_2, NO_2, PM2.5, humidity, temperature and intensity of noise. To determine the relationship between air and noise pollution, we have used Pearson correlation. Results show a strong association between the two types of pollution. For predicting the air quality, the impact of noise pollution as a feature has been investigated using three different machine learning models which are Decision Tree, Random Forest and K-Nearest Neighbors. When applicable, the results show that if noise pollution is used as a feature, we get a prediction accuracy of up to 95%, which is an improvement of 5% on average.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128301522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BOFE: Anomaly Detection in Linear Time Based on Feature Estimation","authors":"Ao Yin, C. Zhang","doi":"10.1109/ICDMW.2018.00162","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00162","url":null,"abstract":"In this paper, we propose an anomaly detection algorithm based on feature estimation. The key insight of our algorithm is a fast and accurate feature estimator based on multiple mapping tables, called the ensemble mapping table. These mapping tables, which are a novel representation of the data set transformed by mapping functions, contain the feature information and corresponding probability. By establishing these mapping tables, we can obtain the empirical probability distribution of each feature. Then we can estimate the degree of abnormality of each feature according to its probability distribution, and count the number of anomalous features. This number is treated as the anomaly score of an instance. In order to obtain an unbiased score, the final anomaly score is the average of the scores obtained from the ensemble mapping table. We derive the theoretical upper bound for the proposed algorithm and analyze the rationality of the anomaly score calculation method from a statistical perspective. Experimental evaluations on multiple benchmark data sets illustrate that, compared to the existing state-of-the-art methods, our algorithm BOFE achieves a better AUC score and needs less running time.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127100432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human-Assisted Computation for Auto-Grading","authors":"Lin Ling, Chee-Wei Tan","doi":"10.1109/ICDMW.2018.00059","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00059","url":null,"abstract":"In this paper, we present a novel auto-grading framework that can automatically grade student assignments without prior knowledge of the answers. The idea is crowd-sourcing, or human-assisted computation, which extracts knowledge from a large number of people to make predictions using hypothesis testing and Bayesian analysis. We also explore the possibilities of combining this framework with an educational chatbot software interface (e.g., the Facebook Messenger chatbot platform), in order to utilize the built-in image annotation feature that facilitates the assignment submission process in large classes.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"101 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123160063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Realization of Data Exchange and Utilization Society by Blockchain and Data Jacket: Merit of Consortium to Accelerate Co-Creation","authors":"Yusuke Ejiri, Eiji Ikeda, Hiromichi Sasaki","doi":"10.1109/ICDMW.2018.00034","DOIUrl":"https://doi.org/10.1109/ICDMW.2018.00034","url":null,"abstract":"Given the recent rapid progress of big data analysis and AI technology, it is expected that large amounts of data and the latest analysis technologies will be combined with people's creativity, accelerating co-creation across industries to create innovative services and products. To realize this, it is essential to build a system to share data securely across industries, utilize it, and create value continuously. In this document, we introduce Fujitsu's solution development to realize a data exchange and utilization society. In particular, we focus on the \"consortium\" structure, in which people are connected by mutual trust; this is a key factor in reducing the various risks and the sense of insecurity about dealing with data, and thereby accelerating data utilization.","PeriodicalId":259600,"journal":{"name":"2018 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126299725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}