{"title":"Recurrent Feature Grouping and Classification Model for Action Model Prediction in CBMR","authors":"V. Reddy, P. Sureshvarma, A. Govardhan","doi":"10.5121/IJDKP.2017.7605","DOIUrl":"https://doi.org/10.5121/IJDKP.2017.7605","url":null,"abstract":"","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132063375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Missing Data Classification of Chronic Kidney Disease","authors":"Wala Abedalkhader, Noora Abdulrahman","doi":"10.5121/IJDKP.2017.7604","DOIUrl":"https://doi.org/10.5121/IJDKP.2017.7604","url":null,"abstract":"In this paper we propose an approach to chronic kidney disease classification in the presence of missing data. We implemented a classification system to address the challenge of detecting chronic kidney disease from medical test data. The approach compares three techniques for dealing with missing data: deletion, mean imputation, and selection of the best features. Each technique is tested using the K-NN classifier, Naïve Bayes classifier, decision tree, and support vector machines (SVM). The final accuracy of each system is determined using 10-fold cross-validation.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114355089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online scalable SVM ensemble learning method (OSSELM) for spatio-temporal air pollution analysis","authors":"Shahid Ali, Simon Dacey","doi":"10.5121/IJDKP.2017.7602","DOIUrl":"https://doi.org/10.5121/IJDKP.2017.7602","url":null,"abstract":"Environmental air pollution studies fail to consider the fact that air pollution is a spatio-temporal problem. The volume and complexity of the data have created the need to explore various machine learning models; however, those models have advantages and disadvantages when applied to regional air pollution analysis. Furthermore, most environmental problems are global distribution problems. This research addresses the spatio-temporal problem using a decentralized computational technique named Online Scalable SVM Ensemble Learning Method (OSSELM). Evaluation criteria for computational air pollution analysis include accuracy, real time and prediction, and spatio-temporal and decentralised analysis. We assert that these criteria can be improved using the proposed OSSELM. Special consideration is given to the distributed ensemble to resolve the spatio-temporal data collection problem (i.e. the data collected from multiple monitoring stations dispersed over a geographical location). Moreover, the experimental results demonstrate that the proposed OSSELM produced impressive results compared to the SVM ensemble for air pollution analysis in the Auckland region.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121520116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approach for Thickening Sentence Score for Automatic Text Summarization","authors":"Michael George","doi":"10.5121/IJDKP.2017.7607","DOIUrl":"https://doi.org/10.5121/IJDKP.2017.7607","url":null,"abstract":"In this study we use an approach that combines natural language processing (NLP) with term occurrences to improve the selection of important sentences by thickening the sentence score while reducing the number of long sentences included in the final summary. There are sixteen known methods for automatic text summarization. In this paper we use the term frequency approach and build an algorithm to re-filter sentence scores.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117251020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Application of Extensive Feature Extraction as a Cost Strategy in Clinical Decision Support System","authors":"O. Henry, U. Chidiebere, Inyiama Hycinth","doi":"10.5121/IJDKP.2017.7603","DOIUrl":"https://doi.org/10.5121/IJDKP.2017.7603","url":null,"abstract":"","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132356358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Influence of Data Geometry in Random Subset Feature Selection","authors":"D. Lakshmipadmaja, B. Vishnuvardhan","doi":"10.5121/IJDKP.2017.7403","DOIUrl":"https://doi.org/10.5121/IJDKP.2017.7403","url":null,"abstract":"The geometry of data, also known as its probability distribution, is an important consideration for accurate computation in data mining tasks such as pre-processing, classification and interpretation. The data geometry influences the outcome and accuracy of statistical analysis to a large extent. This paper focuses on understanding the influence of data geometry on the feature subset selection process using the random forest algorithm. In practice, it is assumed that the data follow a normal distribution, although most of the time this may not be true. The dimensionality reduction varies with changes in the distribution of the data. A comparison is made using three standard distributions: Triangular, Uniform and Normal. The results are discussed in this paper.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123087803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predictive Modelling of Crime Dataset Using Data Mining","authors":"P. Yerpude, Vaishnavi Gudur","doi":"10.5121/IJDKP.2017.7404","DOIUrl":"https://doi.org/10.5121/IJDKP.2017.7404","url":null,"abstract":"With a substantial increase in crime across the globe, there is a need to analyze crime data in order to lower the crime rate. Such analysis helps the police and citizens take necessary actions and solve crimes faster. In this paper, data mining techniques are applied to crime data to predict the features that contribute to a high crime rate. Supervised learning uses data sets to train, test and obtain the desired results, whereas unsupervised learning divides inconsistent, unstructured data into classes or clusters. Decision trees, Naive Bayes and regression are some of the supervised learning methods in data mining and machine learning; they are applied to previously collected data and used to predict the features responsible for causing crime in a region or locality. Based on the rankings of the features, the Crimes Record Bureau and Police Department can take necessary actions to decrease the probability of occurrence of crime.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129453986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining the Gift Receiver's Mind","authors":"Yi-Ning Tu, Fong-Ling Fu","doi":"10.5121/IJDKP.2017.7202","DOIUrl":"https://doi.org/10.5121/IJDKP.2017.7202","url":null,"abstract":"Choosing an appropriate gift is difficult because the purpose of gift giving is to arouse affection in the receiver, not the giver, and too many variables influence the result. Using 600 samples and a hybrid method combining the decision tree and K-nearest neighbor approaches, this study builds DTKNN, a two-step recommendation system that achieves a precision rate higher than 80%. The contribution of this research is a new data mining technique for recommending altruistic gifts, which allows the receiver to perceive the affection desired by the giver.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114588166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}