{"title":"A Conceptual Framework for the Mining and Analysis of the Social Media Data","authors":"S. R. Joseph, Keletso J. Letsholo, H. Hlomani","doi":"10.14257/ijdta.2017.10.10.02","DOIUrl":"https://doi.org/10.14257/ijdta.2017.10.10.02","url":null,"abstract":"Social media data possess the characteristics of Big Data such as volume, veracity, velocity, variability and value. These characteristics make its analysis a bit more challenging than conventional data. Manual analysis approaches are unable to cope with the fast pace at which data is being generated. Processing data manually is also time consuming and requires a lot of effort as compared to using computational methods. However, computational analysis methods usually cannot capture in-depth meanings (semantics) within data. On their individual capacity, each approach is insufficient. As a solution, we propose a Conceptual Framework, which integrates both the traditional approaches and computational approaches to the mining and analysis of social media data. This allows us to leverage the strengths of traditional content analysis, with its regular meticulousness and relative understanding, whilst exploiting the extensive capacity of Big Data analytics and accuracy of computational methods. The proposed Conceptual Framework was evaluated in two stages using an example case of the political landscape of Botswana data collected from Facebook and Twitter platforms. Firstly, a user study was carried through the Inductive Content Analysis (ICA) process using the collected data. Additionally, a questionnaire was conducted to evaluate the usability of ICA as perceived by the participants. Secondly, an experimental study was conducted to evaluate the performance of data mining algorithms on the data from the ICA process. The results, from the user study, showed that the ICA process is flexible and systematic in terms of allowing the users to analyse social media data, hence reducing the time and effort required to manually analyse data. The users’ perception in terms of ease of use and usefulness of the ICA on analysing social media data is positive. The results from the experimental study show that data mining algorithms produced higher accurate results in classifying data when supplied with data from the ICA process. That is, when data mining algorithms are integrated with the ICA process, they are able to overcome the difficulty they face to capture semantics within data. Overall, the results of this study, including the Proposed Conceptual Framework are useful to scholars and practitioners who wish to do some researches on social media data mining and analysis. The Framework serves as a guide to the mining and analysis of the social media data in a systematic manner.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"2011 1","pages":"11-34"},"PeriodicalIF":0.0,"publicationDate":"2017-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87834348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Security Alarm Systems: Modeling and Analysis of SIA Protocol","authors":"Shankar Raman Ravindran","doi":"10.14257/ijdta.2017.10.9.04","DOIUrl":"https://doi.org/10.14257/ijdta.2017.10.9.04","url":null,"abstract":"","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"6 1","pages":"39-46"},"PeriodicalIF":0.0,"publicationDate":"2017-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86922857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Proposal for an Economic Attendance Management System","authors":"Bokrae Jang, J. Yim, Seunghyun Oh","doi":"10.14257/ijdta.2017.10.9.01","DOIUrl":"https://doi.org/10.14257/ijdta.2017.10.9.01","url":null,"abstract":"This paper proposes an attendance management system (AMS) after reviewing existing AMSs. The proposed AMS uses smartphones carried by students and the wireless access points that are installed to allow Wi-Fi compliant devices such as smartphones to access to the local area network. There is almost no university campus where the local area network is not available. Since the proposed AMS does not use any other devices except the smartphones and access points, it is easy and economical to be installed. The proposed system recognizes all students in the class as attendees as long as every student carries a smartphone that is registered to the AMS. The proposed system recognizes all students who are not in the classroom as absentees. During the preparation stage, each access point is assigned a unique dynamic Internet Protocol (IP) address allocation range.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"5 1","pages":"1-10"},"PeriodicalIF":0.0,"publicationDate":"2017-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81933354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Predictive analytics of Pima Diabetics using Deep Learning","authors":"H. Balaji, N. Iyengar, Ronnie D. Caytiles","doi":"10.14257/IJDTA.2017.10.9.05","DOIUrl":"https://doi.org/10.14257/IJDTA.2017.10.9.05","url":null,"abstract":"An intelligent predictive model using deep learning is proposed to predict the patient risk factor and severity of diabetics using conditional data set. The model involves deep learning in the form of a deep neural network which helps to apply predictive analytics on the diabetes data set to obtain optimal results. The existing predictive models is used to predict the severity and the risk factor of the diabetics based on the data which is processed. In our case Firstly, a feature selection algorithm is run for the selection process. Secondly, the deep learning model has a deep neural network which employs a Restricted Boltzmann Machine (RBM) as a basic unit to analyse the data by assigning weights to the each branch of the neural network. This deep neural network, coded on python, will help to obtain numeric results on the severity and the risk factor of the diabetics in the data set. At the end, a comparative study is done between the implementation of this model on type 1 diabetes mellitus, Pima Indians diabetes and the Rough set theory model. The results add value to additional reports because the number of studies done on diabetes using a deep learning model is few to none. This will help to predict diabetes with much more precision as shown by the results obtained.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"448 1","pages":"47-62"},"PeriodicalIF":0.0,"publicationDate":"2017-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88349173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Filtering Technique for Reducing Time Overhead of Dynamic Data Race Detection in Multithread Programs","authors":"Ok-Kyoon Ha, Se-Won Park, S. Heo","doi":"10.14257/IJDTA.2017.10.9.03","DOIUrl":"https://doi.org/10.14257/IJDTA.2017.10.9.03","url":null,"abstract":"Data races are the hardest defect to handle in multithread programs because they may lead to unpredictable results of the program caused by nondeterministic interleaving of concurrent threads. The main drawback of dynamic data race detection is the heavy additional overhead to monitor and analyze memory operations and thread operations during an execution of the program. It is important to reduce the additional overheads for debugging the concurrency bug. This paper presents a monitoring filtering technique that rules out repeatedly executing regions of parallel loops from the monitoring targets.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"79 1","pages":"23-38"},"PeriodicalIF":0.0,"publicationDate":"2017-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87521591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Gain in HIVE through Query Optimization using Index Joins","authors":"Stephen Neal Joshua Eali, N. Thirupathi Rao, Swathi Kalam, D. Bhattacharyya, Hye-jin Kim","doi":"10.14257/ijdta.2017.10.9.02","DOIUrl":"https://doi.org/10.14257/ijdta.2017.10.9.02","url":null,"abstract":"Index joins range unit pivotal for proficiency and quality once technique questions over colossal data. HIVE may be a cluster balanced immense data administration motor that is good for data examination applications and for OLAP for phenomenally \"specific\" inquiries whose yield sizes region unit little division from the contributing data, there the beast compel experiences poor execution because of repetitive circle I/O operations or end in starts of additional guide operations. Here all through this paper a shot is made and propose file joins procedure to rush up the inquiry strategy and incorporate it in Hive by mapping our vogue to the unique change stream to assess the execution, we've a slant to give and measure check inquiries on datasets created abuse TPC-H benchmark. Our outcomes show vital execution increase over moderately tremendous data sets and/or uncommonly specific questions having a two-way are a piece of and one be a piece of condition.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"7 1","pages":"11-22"},"PeriodicalIF":0.0,"publicationDate":"2017-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78811506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personal Information Protection Issues and Its Solutions","authors":"Seung-Il Moon, Ki-Min Song, J. Shim, Ho-young Choi","doi":"10.14257/IJDTA.2017.10.8.11","DOIUrl":"https://doi.org/10.14257/IJDTA.2017.10.8.11","url":null,"abstract":"This study aims to review the legal system concerning information efficiency and privacy protection on the U-Health infrastructure construction for the disabled. Regarding methodology, related provisions such as U-Health, legal definitions of the disabled, as well as Privacy Protection Act for security are analyzed and studied. As a result, Personal Information Control Right of the information agent should be secured in the gathering, processing, use and provision of the medical information. Also, legal norms to protect personal medical information leakage due to inadequate administrative and technical action are required.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"10 1","pages":"115-122"},"PeriodicalIF":0.0,"publicationDate":"2017-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88052152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Learning Approach for Text Summarization","authors":"Amita Arora, Akanksha Diwedy, Manjeet Singh, N. Chauhan","doi":"10.14257/ijdta.2017.10.8.08","DOIUrl":"https://doi.org/10.14257/ijdta.2017.10.8.08","url":null,"abstract":"With the abundance of interminable text documents, providing summaries can help in retrieval of relevant information very quickly. The technique is to extract those sentences from the document that contain important information. This paper presents the results of our research on extractive summarization with a method based on Support Vector Machines (SVMs). The SVMs are trained using DUC-2002 dataset and the importance of sentences is judged on the basis of salient features. To evaluate the performance of our system, comparisons are conducted with two existing methods. ROUGE scores are used to compare the system generated summaries with the human generated summaries, and the experimental results show that our system's performance achieved high metrics.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"5 1","pages":"83-90"},"PeriodicalIF":0.0,"publicationDate":"2017-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88817046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fragment Allocation and Replication in Distributed Databases","authors":"A. Amiri","doi":"10.14257/ijdta.2017.10.8.05","DOIUrl":"https://doi.org/10.14257/ijdta.2017.10.8.05","url":null,"abstract":"We study the problem of designing a distributed database system. We develop optimization models for the problem that deals simultaneously with two major design issues, namely which fragments to replicate, and where to store those fragments and replicas. Given the difficulty of the problem, we propose a solution algorithm based on a new formulation of the problem in which every server is allocated a fragment combination from a set of combinations generated by a randomized greedy heuristic. The results of a computational study show that the algorithm outperforms a standard branch & bound technique for large instances of the problem.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"38 1","pages":"43-56"},"PeriodicalIF":0.0,"publicationDate":"2017-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88388247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Study on the Visualizing Time Series Data Using R","authors":"Eunmi Jung, A. Kim, Hyenki Kim","doi":"10.14257/ijdta.2017.10.8.01","DOIUrl":"https://doi.org/10.14257/ijdta.2017.10.8.01","url":null,"abstract":"With the recent increase in data volume, there is a growing interest in Big Data technology and there is also a growing interest in techniques to visualize result of big data processing. The vast majority of people accept visual information more quickly than text. Therefore, visualization is the important thing to focus on regarding big data analysis. Therefore, the study examined various visualization methods using an open source statistical analysis software R program. The study explored a method to configure data sets and a method to implement various graphs according to visualization method using R to determine patterns in data and understand the characteristics of data at a glance through visualization of data. Through this, it was possible to determine characteristics of data that were not known only through simple regression analysis and through showing that rather than interpreting data as it is, it could be visualized in various methods through conversion of data sets, it is expected that it will help users to make various decisions.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"79 1","pages":"1-10"},"PeriodicalIF":0.0,"publicationDate":"2017-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84073371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}