Advances in data mining. Industrial Conference on Data Mining: Latest Articles

Data Privacy for Big Data Publishing Using Newly Enhanced PASS Data Mining Mechanism
Advances in data mining. Industrial Conference on Data Mining Pub Date : 2018-08-22 DOI: 10.5772/INTECHOPEN.77033
Priyank Jain, Manasi Gyanchandani, N. Khare
{"title":"Data Privacy for Big Data Publishing Using Newly Enhanced PASS Data Mining Mechanism","authors":"Priyank Jain, Manasi Gyanchandani, N. Khare","doi":"10.5772/INTECHOPEN.77033","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.77033","url":null,"abstract":"Anonymization is one of the main techniques that is being used in recent times to prevent privacy breaches on the published data; one such anonymization technique is k-anonymization technique. The anonymization is a parametric anonymization technique used for data anonymization. The aim of the k-anonymization is to generalize the tuples in a way that it cannot be identified using quasi-identifiers. In the past few years, we saw a tremendous growth in data that ultimately led to the concept of the big data. The growth in data made anonymization using conventional processing methods inefficient. To make the anonymization more efficient, we used the proposed PASS mechanism in Hadoop framework to reduce the processing time of anonymization. In this work, we have divided the whole program into the map and reduce part. Moreover, the data types used in Hadoop provide better serialization and transport of data. We performed our experiments on the large dataset. The results proved the best efficiency of our implementation.","PeriodicalId":91437,"journal":{"name":"Advances in data mining. Industrial Conference on Data Mining","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78117133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
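Note on the entry above: the abstract describes k-anonymization as generalizing tuples so that they cannot be singled out through their quasi-identifiers. The sketch below illustrates only that general idea on a tiny in-memory table. It is not the PASS mechanism and involves no Hadoop or MapReduce code. The helper names (generalize_age, is_k_anonymous), the field names, and the sample records are all hypothetical and chosen for illustration.

```python
from collections import Counter


def generalize_age(age, width=10):
    """Replace an exact age with a range such as '30-39' (hypothetical helper)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


# Made-up records; "age" and "zip" play the role of quasi-identifiers.
raw = [
    {"age": 31, "zip": "46201", "diagnosis": "flu"},
    {"age": 35, "zip": "46208", "diagnosis": "asthma"},
    {"age": 38, "zip": "46203", "diagnosis": "flu"},
    {"age": 52, "zip": "47905", "diagnosis": "diabetes"},
    {"age": 57, "zip": "47901", "diagnosis": "flu"},
]

# Generalize: ages become ten-year ranges, ZIP codes keep only their first three digits.
generalized = [
    {
        "age": generalize_age(r["age"]),
        "zip": r["zip"][:3] + "**",
        "diagnosis": r["diagnosis"],
    }
    for r in raw
]

print(is_k_anonymous(raw, ["age", "zip"], 2))
print(is_k_anonymous(generalized, ["age", "zip"], 2))
```

In this toy table every raw record is unique on (age, zip), while the generalized table falls into one group of three records and one group of two, so it satisfies the check for k = 2 but not for k = 3.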
Early Prediction of Patient Mortality Based on Routine Laboratory Tests and Predictive Models in Critically Ill Patients
Advances in data mining. Industrial Conference on Data Mining Pub Date : 2018-08-22 DOI: 10.5772/INTECHOPEN.76988
Sven Van Poucke, Ana Kovačević, M. Vukicevic
{"title":"Early Prediction of Patient Mortality Based on Routine Laboratory Tests and Predictive Models in Critically Ill Patients","authors":"Sven Van Poucke, Ana Kovačević, M. Vukicevic","doi":"10.5772/INTECHOPEN.76988","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.76988","url":null,"abstract":"We propose a method for quantitative analysis of predictive power of laboratory tests and early detection of mortality risk by usage of predictive models and feature selection techniques. Our method allows automatic feature selection, model selection, and evaluation of predictive models. Experimental evaluation was conducted on patients with renal failure admitted to ICUs (medical intensive care, surgical intensive care, cardiac, and cardiac surgery recovery units) at Boston’s Beth Israel Deaconess Medical Center. Data are extracted from Multiparameter Intelligent Monitoring in Intensive Care III (MIMIC-III) database. We built and evaluated different single (e.g. Logistic regression) and ensemble (e.g. Random Forest) learning methods. Results revealed high predictive accuracy (area under the precision-recall curve (AUPRC) values >86%) from day four, with acceptable results on the second (>81%) and third day (>85%). Random forests seem to provide the best predictive accuracy. Feature selection techniques Gini and ReliefF scored best in most cases. Lactate, white blood cells, sodium, anion gap, chloride, bicarbonate, creatinine, urea nitrogen, potassium, glucose, INR, hemoglobin, phosphate, total bilirubin, and base excess were most predictive for hospital mortality. Ensemble learning methods are able to predict hospital mortality with high accuracy, based on laboratory tests and provide ranking in predictive priority.","PeriodicalId":91437,"journal":{"name":"Advances in data mining. Industrial Conference on Data Mining","volume":"23 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82057013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Adaptive Neural Network Classifier-Based Analysis of Big Data in Health Care
Advances in data mining. Industrial Conference on Data Mining Pub Date : 2018-08-22 DOI: 10.5772/INTECHOPEN.77225
Manaswini Pradhan
{"title":"Adaptive Neural Network Classifier-Based Analysis of Big Data in Health Care","authors":"Manaswini Pradhan","doi":"10.5772/INTECHOPEN.77225","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.77225","url":null,"abstract":"Because of the massive volume, variety, and continuous updating of medical data, the efficient processing of medical data and the real-time response of the treatment recommendation has become an important issue. Fortunately, parallel computing and cloud computing provide powerful capabilities to cope with large-scale data. Therefore, in this paper, a FCM based Map-Reduce programming model is proposed for the parallel computing using AANN approach. The FCM based Map-Reduce, clusters the large medical datasets into smaller groups of certain similarity and assigns each data cluster to one Mapper, where the training of neural networks are done by the optimal selection of the interconnection weights by Whale Optimization Algorithm (WOA). Finally, the Reducer reduces all the AANN classifiers obtained from the Mappers for identifying the normal and abnormal classes of the newer medical records promptly and accurately. The proposed methodology is implemented in the working platform of JAVA using CloudSim simulator. The proposed FCM based Map-Reduce model decreases the requirement of memory while equating with other accomplishing k-means based Map-Reduce and DBSCAN method.","PeriodicalId":91437,"journal":{"name":"Advances in data mining. Industrial Conference on Data Mining","volume":"16 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82325407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Performance-Aware High-Performance Computing for Remote Sensing Big Data Analytics
Advances in data mining. Industrial Conference on Data Mining Pub Date : 2018-08-22 DOI: 10.5772/INTECHOPEN.75934
Mustafa Kemal Pektürk and Muhammet Ünal
{"title":"Performance-Aware High-Performance Computing for Remote Sensing Big Data Analytics","authors":"Mustafa Kemal Pektürk and Muhammet Ünal","doi":"10.5772/INTECHOPEN.75934","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.75934","url":null,"abstract":"The incredible increase in the volume of data emerging along with recent technological developments has made the analysis processes which use traditional approaches more difficult for many organizations. Especially applications involving subjects that require timely processing and big data such as satellite imagery, sensor data, bank operations, web servers, and social networks require efficient mechanisms for collecting, storing, processing, and analyzing these data. At this point, big data analytics, which contains data mining, machine learning, statistics, and similar techniques, comes to the help of organizations for end-to-end managing of the data. In this chapter, we introduce a novel high-performance computing system on the geo-distributed private cloud for remote sensing applications, which takes advantages of network topology, exploits utilization and workloads of CPU, storage, and memory resources in a distributed fashion, and optimizes resource allocation for realizing big data analytics efficiently.","PeriodicalId":91437,"journal":{"name":"Advances in data mining. Industrial Conference on Data Mining","volume":"62 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83036356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Mining HCI Data for Theory of Mind Induction
Advances in data mining. Industrial Conference on Data Mining Pub Date : 2018-08-22 DOI: 10.5772/INTECHOPEN.74400
O. Arnold, K. Jantke
{"title":"Mining HCI Data for Theory of Mind Induction","authors":"O. Arnold, K. Jantke","doi":"10.5772/INTECHOPEN.74400","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.74400","url":null,"abstract":"Human-computer interaction (HCI) results in enormous amounts of data-bearing potentials for understanding a human user’s intentions, goals, and desires. Knowing what users want and need is a key to intelligent system assistance. The theory of mind concept known from studies in animal behavior is adopted and adapted for expressive user modeling. Theories of mind are hypothetical user models representing, to some extent, a human user’s thoughts. A theory of mind may even reveal tacit knowledge. In this way, user modeling becomes knowledge discovery going beyond the human’s knowledge and covering domain-specific insights. Theories of mind are induced by mining HCI data. Data mining turns out to be inductive modeling. Intelligent assistant systems inductively modeling a human user’s intentions, goals, and the like, as well as domain knowledge are, by nature, learning systems. To cope with the risk of getting it wrong, learning systems are equipped with the skill of reflection.","PeriodicalId":91437,"journal":{"name":"Advances in data mining. Industrial Conference on Data Mining","volume":"84 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77654049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Identification of Research Thematic Approaches Based on Keywords Network Analysis in Colombian Social Sciences
Advances in data mining. Industrial Conference on Data Mining Pub Date : 2018-08-22 DOI: 10.5772/INTECHOPEN.76834
José Hernando Ávila-Toscano, I. Romero-Pérez, Ailed Marenco-Escuderos, Eugenio Saavedra Guajardo
{"title":"Identification of Research Thematic Approaches Based on Keywords Network Analysis in Colombian Social Sciences","authors":"José Hernando Ávila-Toscano, I. Romero-Pérez, Ailed Marenco-Escuderos, Eugenio Saavedra Guajardo","doi":"10.5772/INTECHOPEN.76834","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.76834","url":null,"abstract":"The purpose of this research was to unveil the structure of knowledge of Social Sciences in Colombia through the analysis of thematic networks and its association with different disciplines’ new knowledge production to define scenarios and trends in each. 2992 published articles in the period 2006–2015 were revised in this research, all indexed in Web of Science, Scopus and other bibliographic databases, applying the social networks analysis technique to the keywords of all. The analysis included each discipline’s clustering coefficient and group metrics. The results described in this chapter identify how social disciplines in Colombia have mainly focused its research production in topics such as armed conflict, poverty and human development.","PeriodicalId":91437,"journal":{"name":"Advances in data mining. Industrial Conference on Data Mining","volume":"14 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79182563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Semantic Infrastructure for Service Environment Supporting Successful Aging
Advances in data mining. Industrial Conference on Data Mining Pub Date : 2018-08-22 DOI: 10.5772/INTECHOPEN.76945
V. Salminen, Päivi Sanerma, S. Niittymäki, Peter W. Eklund
{"title":"Semantic Infrastructure for Service Environment Supporting Successful Aging","authors":"V. Salminen, Päivi Sanerma, S. Niittymäki, Peter W. Eklund","doi":"10.5772/INTECHOPEN.76945","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.76945","url":null,"abstract":"Demographic changes and the rapid increase of aging people are occurring throughout the world. There is a need for step-by-step developing service environment to support elderly living as old as possible at home. Digital equipment and technology solutions installed at home produce real-time data which can be used for predictive and optimized service creation. New technology solutions have to be tested at home environments to get certainty of usability, flexibility, and accessibility. The implementation of new digitalization has to happen according to ethical rules taking into account the values of elderly people. The data gathered through digital equipment is used in optimizing service processes. However, service process misses common ontology and semantic infrastructure to use the gathered data for service optimization. The service environment and semantic infrastructure, which could be used in social and health care, are introduced in this article.","PeriodicalId":91437,"journal":{"name":"Advances in data mining. Industrial Conference on Data Mining","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80096246","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Ensemble Methods in Environmental Data Mining
Advances in data mining. Industrial Conference on Data Mining Pub Date : 2018-08-22 DOI: 10.5772/INTECHOPEN.74393
Goksu Tuysuzoglu, Derya Birant, A. Pala
{"title":"Ensemble Methods in Environmental Data Mining","authors":"Goksu Tuysuzoglu, Derya Birant, A. Pala","doi":"10.5772/INTECHOPEN.74393","DOIUrl":"https://doi.org/10.5772/INTECHOPEN.74393","url":null,"abstract":"Environmental data mining is the nontrivial process of identifying valid, novel, and potentially useful patterns in data from environmental sciences. This chapter proposes ensemble methods in environmental data mining that combines the outputs from multiple classification models to obtain better results than the outputs that could be obtained by an individual model. The study presented in this chapter focuses on several ensemble strategies in addition to the standard single classifiers such as decision tree, naive Bayes, support vector machine, and k-nearest neighbor (KNN), popularly used in literature. This is the first study that compares four ensemble strategies for environmental data mining: (i) bagging, (ii) bagging combined with random feature subset selection (the random forest algorithm), (iii) boosting (the AdaBoost algorithm), and (iv) voting of different algorithms. In the experimental studies, ensemble methods are tested on different real-world environmental datasets in various subjects such as air, ecology, rainfall, and soil. [...] methods are majority voting, performance weighting, Bayesian combination, and vogging. Meta-learning methods learn from new training data created from the predictions of a set of base classifiers. The most well-known meta-learning methods are stacking [...]","PeriodicalId":91437,"journal":{"name":"Advances in data mining. Industrial Conference on Data Mining","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88483167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
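Note on the entry above: one of the four ensemble strategies named in the abstract is voting of different algorithms. The sketch below shows plain majority voting over a few toy classifiers, to make the combination step concrete. It is not the chapter's experimental setup. The threshold rules, the feature order, the cutoff values, and the labels "high"/"low" are hypothetical stand-ins for trained models such as decision trees or naive Bayes.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label that appears first."""
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def threshold_classifier(index, cutoff):
    """Build a toy base classifier that thresholds a single feature."""
    return lambda sample: "high" if sample[index] > cutoff else "low"


# Three hypothetical base classifiers, each looking at a different feature.
base_classifiers = [
    threshold_classifier(0, 50.0),
    threshold_classifier(1, 30.0),
    threshold_classifier(2, 0.5),
]


def ensemble_predict(sample):
    """Collect one prediction per base classifier and combine them by majority vote."""
    return majority_vote([clf(sample) for clf in base_classifiers])


print(ensemble_predict((60.0, 20.0, 0.9)))
```

For the sample shown, two of the three base classifiers answer "high", so the ensemble does too. Bagging and boosting differ from this in how the base models are trained, not in the basic idea of combining their outputs.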
A Decision Rule Based Approach to Generational Feature Selection
Advances in data mining. Industrial Conference on Data Mining Pub Date : 2018-07-11 DOI: 10.1007/978-3-319-95786-9_17
Wieslaw Paja
{"title":"A Decision Rule Based Approach to Generational Feature Selection","authors":"Wieslaw Paja","doi":"10.1007/978-3-319-95786-9_17","DOIUrl":"https://doi.org/10.1007/978-3-319-95786-9_17","url":null,"abstract":"","PeriodicalId":91437,"journal":{"name":"Advances in data mining. Industrial Conference on Data Mining","volume":"13 1","pages":"230-239"},"PeriodicalIF":0.0,"publicationDate":"2018-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84319817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Speeding Up Continuous kNN Join by Binary Sketches
Advances in data mining. Industrial Conference on Data Mining Pub Date : 2018-07-11 DOI: 10.1007/978-3-319-95786-9_14
Filip Nálepa, Michal Batko, P. Zezula
{"title":"Speeding Up Continuous kNN Join by Binary Sketches","authors":"Filip Nálepa, Michal Batko, P. Zezula","doi":"10.1007/978-3-319-95786-9_14","DOIUrl":"https://doi.org/10.1007/978-3-319-95786-9_14","url":null,"abstract":"","PeriodicalId":91437,"journal":{"name":"Advances in data mining. Industrial Conference on Data Mining","volume":"2 1","pages":"183-198"},"PeriodicalIF":0.0,"publicationDate":"2018-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82387124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0