Vishnu Murthy G, Vishnu Vardhan B, Sarangam K, V. P
{"title":"A Comparative study on Term Weighting Methods for Automated Telugu Text Categorization with Effective Classifiers","authors":"Vishnu Murthy G, Vishnu Vardhan B, Sarangam K, V. P","doi":"10.5121/IJDKP.2013.3606","DOIUrl":"https://doi.org/10.5121/IJDKP.2013.3606","url":null,"abstract":"Automatic text categorization is the process of automatically assigning one or more categories from a predefined set to a text. Text categorization is challenging in Indian languages because they have rich morphology, a large number of word forms, and large feature spaces. This paper investigates the performance of different classification approaches combined with different term weighting approaches in order to identify the one most applicable to the Telugu text classification problem. We investigate different term weighting methods for a Telugu corpus in combination with Naive Bayes (NB), Support Vector Machine (SVM), and k Nearest Neighbor (kNN) classifiers.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128598262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mohammed Abdalla, Hoda M. O. Mokhtar, M. Noureldin
{"title":"A Unified Approach for Spatial Data Query","authors":"Mohammed Abdalla, Hoda M. O. Mokhtar, M. Noureldin","doi":"10.5121/IJDKP.2013.3604","DOIUrl":"https://doi.org/10.5121/IJDKP.2013.3604","url":null,"abstract":"With the rapid development of Geographic Information Systems (GISs) and their applications, more and more geographical databases have been developed by different vendors. However, data integration and access remain a major problem for the development of GIS applications, as no interoperability exists among different spatial databases. In this paper we propose a unified approach for spatial data query. The paper describes a framework for integrating information from repositories containing different vector dataset formats and repositories containing raster datasets. The presented approach converts different vector data formats into a single unified format (File Geo-Database “GDB”). In addition, we employ “metadata” to support a wide range of user queries that retrieve relevant geographic information from heterogeneous and distributed repositories. Using metadata in this way enhances both query processing and performance.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"6 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120837525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abdullah M.E, Rady E.A, Kozea A.M, Hassanein W.A, S. Abdelbadie
{"title":"What is the Major Power Linking Statistics & Data Mining ?","authors":"Abdullah M.E, Rady E.A, Kozea A.M, Hassanein W.A, S. Abdelbadie","doi":"10.5121/IJDKP.2013.3609","DOIUrl":"https://doi.org/10.5121/IJDKP.2013.3609","url":null,"abstract":"In recent years, numerous research studies at the intersection of statistics and data mining (DM) have appeared [17, 18, 19, 24, 27, 30, 35]. This paper is devoted to answering the question posed in its title, organized around five reply trends. The 1st trend is based on an updated historical view of both statistics and DM. The 2nd trend is concerned with a significant modern theoretical reply relating statistics and DM. The major power linking statistics and DM is established in the 3rd trend. The 4th trend presents a significant comparison between statistics and DM. A conceptual classification of the Statistical Data Mining (SDM) process in Egypt is presented in the 5th reply trend. Finally, the conclusion and future work are presented.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120962654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying Similar Web Pages Based on Automated and User Preference Value Using Scoring Methods","authors":"K. Gandhimathi, Vijaya Ms","doi":"10.5121/IJDKP.2013.3603","DOIUrl":"https://doi.org/10.5121/IJDKP.2013.3603","url":null,"abstract":"","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"583 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116268475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Apriori Based Algorithm to Mine Association Rules with Inter Itemset Distance","authors":"P. Sarma, A. Mahanta","doi":"10.5121/IJDKP.2013.3605","DOIUrl":"https://doi.org/10.5121/IJDKP.2013.3605","url":null,"abstract":"Association rules discovered from transaction databases can be large in number. Reducing the number of association rules has become an issue in recent times. Conventionally, the number of rules can be increased or decreased by varying support and confidence. By combining an additional constraint with support, the number of frequent itemsets can be reduced, which leads to the generation of fewer rules. Average inter itemset distance (IID), or Spread, which is the intervening separation of itemsets in the transactions, has been used as a measure of interestingness for association rules with a view to reducing the number of association rules. In this paper, a complete algorithm based on Apriori is designed and implemented using average inter itemset distance, with a view to reducing the number of frequent itemsets and association rules and also to finding the distribution pattern of the association rules in terms of the number of transactions in which the frequent itemsets do not occur. The Apriori algorithm is also implemented and the results are compared. The theoretical concepts related to inter itemset distance are also put forward.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133184759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mohammed Reda Chbihi Louhdi, Hicham Behja, Said Ouatik El Alaoui
{"title":"A Novel Method for Generating an Elearning Ontology","authors":"Mohammed Reda Chbihi Louhdi, Hicham Behja, Said Ouatik El Alaoui","doi":"10.5121/IJDKP.2013.3610","DOIUrl":"https://doi.org/10.5121/IJDKP.2013.3610","url":null,"abstract":"","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121757144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RECOMMENDATION SYSTEM USING BLOOM FILTER IN MAPREDUCE","authors":"R. Pagare, A. Shinde","doi":"10.5121/IJDKP.2013.3608","DOIUrl":"https://doi.org/10.5121/IJDKP.2013.3608","url":null,"abstract":"Many clients like to use the Web to discover product details in the form of online reviews. The reviews are provided by other clients and specialists. Recommender systems provide an important response to the information overload problem, as they present users with more practical and personalized information. Collaborative filtering methods are a vital component of recommender systems, as they generate high-quality recommendations by leveraging the preferences of a community of similar users. Collaborative filtering assumes that people with the same tastes choose the same items. Conventional collaborative filtering systems have drawbacks such as the sparse data problem and lack of scalability. A new recommender system is required to deal with the sparse data problem and produce high-quality recommendations in a large-scale mobile environment. MapReduce is a programming model that is widely used for large-scale data analysis. The described recommendation algorithm for mobile commerce is user-based collaborative filtering using MapReduce, which reduces the scalability problem of conventional CF systems. One of the essential operations in data analysis is the join operation. However, MapReduce is not very efficient at executing joins, as it always processes all records in the datasets even when only a small fraction of them is relevant to the join. This problem can be reduced by applying the bloomjoin algorithm. Bloom filters are constructed and used to filter out redundant intermediate records. The proposed algorithm using a bloom filter will reduce the number of intermediate results and improve join performance.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116738129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Statistical Data Fusion Technique in Virtual Data Integration Environment","authors":"Mohamed M. Hafez, A. E. Bastawissy, O. H. Mohamed","doi":"10.5121/IJDKP.2013.3503","DOIUrl":"https://doi.org/10.5121/IJDKP.2013.3503","url":null,"abstract":"Data fusion in the virtual data integration environment starts after detecting and clustering duplicated records from the different integrated data sources. It refers to the process of selecting or fusing attribute values from the clustered duplicates into a single record representing the real world object. In this paper, a statistical technique for data fusion is introduced based on some probabilistic scores from both data sources and clustered duplicates.","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122594805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. VaraprasadRao, B. Vishnuvardhan, P. VijaypalReddy
{"title":"A Statistical Model for GIST Generation : A Case Study on Hindi News Article","authors":"M. VaraprasadRao, B. Vishnuvardhan, P. VijaypalReddy","doi":"10.5121/IJDKP.2013.3502","DOIUrl":"https://doi.org/10.5121/IJDKP.2013.3502","url":null,"abstract":"","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"1983 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134072968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison Between RISS and DCHARM for Mining Gene Expression Data","authors":"Shaymaa S. Mousa","doi":"10.5121/ijdkp.2013.3505","DOIUrl":"https://doi.org/10.5121/ijdkp.2013.3505","url":null,"abstract":"","PeriodicalId":131153,"journal":{"name":"International Journal of Data Mining & Knowledge Management Process","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121133970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}