{"title":"A Novel Framework for Data Extraction from Multiple Repositories and Generation of Ontologies using Inverted Indexing Technique","authors":"Sudeepthi Govathoti, M. Babu","doi":"10.14257/IJDTA.2017.10.7.07","DOIUrl":"https://doi.org/10.14257/IJDTA.2017.10.7.07","url":null,"abstract":"Recent years have seen tremendous growth of information across the large number of domains available on the web. Social media (LinkedIn, Twitter etc.) concentrate on handling massive data obtained from various sources. It is a fact that information retrieval and data extraction are difficult tasks in handling the large collection of web documents. Semantic web is a new technology used to handle the massive raw data to transform it into knowledgeable representation. Traditional search engines use page ranking algorithms to find data from large data sources. The proposed work is aimed at designing a user interface for data extraction from multiple repositories using Uniform Resource Identifiers (URIs) and applying inverted indexing techniques for generation of Ontologies. These methods may be used to develop efficient semantic web knowledge based systems for retrieving relevant information from the web.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"11 1","pages":"77-88"},"PeriodicalIF":0.0,"publicationDate":"2017-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84189587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Study of Dictionary Based Korean Semantic Role Labeling","authors":"Miran Seok, Hye-Jeong Song, Chan-Young Park, Jong-Dae Kim, Yu-Seop Kim","doi":"10.14257/ijdta.2017.10.7.06","DOIUrl":"https://doi.org/10.14257/ijdta.2017.10.7.06","url":null,"abstract":"A semantic role is information used to clarify the role of entities in an event that a sentence describes, including agent, theme, experience, object, and location. Semantic role labeling (SRL) is a process that determines the semantic relation of a predicate and its arguments in a sentence and is an important factor in the semantic analysis of natural language processing, in addition to word sense disambiguation. To date, many manual semantic tagging tasks have been constructed; however, these tasks require a great deal of time and cost. To solve this problem, we propose a method for automatic SRL using frame files included in the Korean version of Proposition Bank (PropBank), which is one of the most widely used corpora. Frame files provide guidelines for PropBank annotators and include a list of framesets, which stand for a set of syntactic frames. First, we select the proper sense of the predicate from among multiple senses of the predicate in the frame files. Senses of the predicate are classified according to the semantic and syntactic properties of the predicate’s arguments. We collect the nouns in a sample sentence of a given sense; we also collect all of the nouns that appear in a given sentence. The semantic similarities between the nouns from the sample sentence and the given sentence are measured and the sense with the highest similarity value is selected. 
The frame information of the selected sense is used for SRL of the given predicate and its arguments.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"57 1","pages":"65-76"},"PeriodicalIF":0.0,"publicationDate":"2017-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81540526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Cross-Domain Analysis using Morphological Sentence Pattern Approach for Extracting Aspect-based Lexicon","authors":"Youngsub Han, Yanggon Kim, Jin-Hee Song","doi":"10.14257/IJDTA.2017.10.7.02","DOIUrl":"https://doi.org/10.14257/IJDTA.2017.10.7.02","url":null,"abstract":"","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"46 1","pages":"13-26"},"PeriodicalIF":0.0,"publicationDate":"2017-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80389047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Implementation of the Symbol Table for Object-Oriented Programming Language","authors":"Yangsun Lee","doi":"10.14257/ijdta.2017.10.7.03","DOIUrl":"https://doi.org/10.14257/ijdta.2017.10.7.03","url":null,"abstract":"The symbol table used in the existing compiler stores one symbol information into a plurality of sub tables, and the abstract syntax tree necessary for generating symbols has a binary tree structure composed of a single data structure node. This structure increases the source code complexity of modules that generate symbols and modules that reference symbol tables, and when designing a compiler for a new language, it is necessary to newly design an abstract syntax tree and a symbol table structure considering the characteristics of the language. In this paper, we apply the object-oriented principle and visitor pattern to improve the abstract syntax tree structure and design and implement the symbol table for the object oriented language. The design of AST (abstract syntax trees) with object-oriented principles and Visitor patterns reduces the time and cost of redesign because it makes it easy to add features of the language without the need to redesign the AST (abstract syntax tree) for the new object-oriented language. In addition, it is easy to create a symbol through the Visitor pattern. 
Symbol tables using the open-close principle and the dependency inversion principle can improve the code reusability of the source code that creates and refer to the table and improve the readability of the code.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"2 1 1","pages":"27-40"},"PeriodicalIF":0.0,"publicationDate":"2017-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78284669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated Processes of SDR Data for Real-time Processing","authors":"Sang-Young Lee","doi":"10.14257/IJDTA.2017.10.7.05","DOIUrl":"https://doi.org/10.14257/IJDTA.2017.10.7.05","url":null,"abstract":"In this paper, relative data is classified using STDC which is an efficient classification process using the ontology technique. Classified data are saved at the storage according to its SDR type. Integrated processes are used to reuse the saved SDR data. Thus, relative data is constructed in a systematic reuse system applying total architecture. This overcomes the disadvantage of the past processes that required numerous joint computation when handling question and answer. SDTC Technique solves the weakness of old methods which required multiple join calculation that caused functional decline and allows normalized type of classification task.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"3 1","pages":"55-64"},"PeriodicalIF":0.0,"publicationDate":"2017-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75121544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrated Design Solution for Distributed Databases Using Genetic Algorithms","authors":"Sukkyu Song","doi":"10.14257/IJDTA.2017.10.6.02","DOIUrl":"https://doi.org/10.14257/IJDTA.2017.10.6.02","url":null,"abstract":"The design of distributed database systems has prompted many research problems. Among others, the issue of interdependency and interaction associated with data fragmentation, data allocation, and distributed query optimization still remains unanswered. These problems have been proven to be NP-complete or NP-hard, so most previous studies have addressed these problems in isolation by making simplified assumptions. However, these problems are interdependent and hence solving them independently results in inefficient solution overall. In this research, we develop an integrated distributed database design solution for three problems: partitioning data sets, allocating partitioned data sets among the sites of a network, and allocating operations as a problem of distributed query optimization. We use a transaction-based approach, wherein most important transactions are considered in determining the effective design of distributed database, and consider two types of transactions: OLTP (on-line transaction processing) and DSS (decision support system), for reflecting various distributed database design objectives such as total time minimization, response time minimization, and minimization of a combination of both. We employ genetic algorithms as searching methods for the best distributed database design solution. The integrated design solutions are determined by analyzing interactions between the problems in four stages: 1) between vertical fragmentation and operation allocation, 2) between vertical fragmentation and data allocation, 3) between data allocation and operation allocation, and 4) integration of all three problems, with the objectives of cost minimization and load balancing. 
Our integrated approach resulted in a cost effective distributed database design compared to the designs considering the problems in isolation.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"44 1","pages":"13-34"},"PeriodicalIF":0.0,"publicationDate":"2017-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77838296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Impact of Some Socio-Economic Factors on Academic Performance: A Fuzzy Mining Decision Support System","authors":"O. Oladipupo, A. I. Ehigbochie","doi":"10.14257/IJDTA.2017.10.6.06","DOIUrl":"https://doi.org/10.14257/IJDTA.2017.10.6.06","url":null,"abstract":"Due to the reported impacts of some socio-economic factors on academic performance and nations’ education value, there is need for strong awareness to assist students in making the right decision. To this effect, this study proposes and designs student decision support system for determining the extent to which different levels of some socioeconomic factors involvement can jointly affect academic performance. The factors are: Student’s interest, Relationship status, Entrepreneurial activities, Peer influence, Health and family background. The traditional decision support system architecture was extended in this study by introducing two components: Fuzzy engine and Mining Engine. Fuzzy engine was introduced to capture intra uncertainties in students' judgment about the data gathered and Mining engine to extract hidden and previously unknown interesting patterns from the dataset. The predictive model was established using fuzzy association rule mining technique. The dataset was gathered using one-on-one questionnaire interaction with students from 4 Universities in Nigeria. The system evaluates students' linguistic levels of involvement and predicts the possible class of honours for them with explicit interpretation of the fired patterns. 
This system will assist the students in decision making as to the extent they can be involved in some socioeconomic activities relative to their family and health status in order to have their desired classes of honour.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"35 1","pages":"71-86"},"PeriodicalIF":0.0,"publicationDate":"2017-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80215841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Effective Approach for Non-Numeric Relational Database Verification","authors":"L. Camara, Demba Coulibaly, Ali Hamadou, Junyi Li","doi":"10.14257/IJDTA.2017.10.6.03","DOIUrl":"https://doi.org/10.14257/IJDTA.2017.10.6.03","url":null,"abstract":"With the large distribution of digital data, protecting their integrity becomes necessary and digital watermarking has been proposed as a solution for protecting the content of relational databases. Previous watermarking techniques mainly focus on numeric database authentication by inserting watermark bits in digital data, which may greatly degrade the data quality. In this paper, we present a distortion-free approach to verify the integrity of a combined numeric and non-numeric relational database. The technique first partitions the database into different groups of square matrices, then the ASCII codes of the non-numeric data of group attributes are computed and used to generate the watermark. Security analyses and experiments demonstrated that the proposed technique is resilient against malicious attacks and moreover the tampering can be detected up to group level.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"134 1","pages":"35-46"},"PeriodicalIF":0.0,"publicationDate":"2017-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77386687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Query Optimization for Databases in Cloud Environment: A Survey","authors":"Archana Bachhav, V. Kharat, M. Shelar","doi":"10.14257/IJDTA.2017.10.6.01","DOIUrl":"https://doi.org/10.14257/IJDTA.2017.10.6.01","url":null,"abstract":"Nowadays, cloud computing plays an important role in the field of service-oriented technologies. The main aim of cloud computing is to let people compute and store resources easily and efficiently. Recent work focuses on data expression and searching. Improving performance in the cloud requires optimizing data processing time. Our study gives a comprehensive survey of numerous models and approaches used for query optimization to minimize execution time and to improve resource utilization. We have reviewed various research works on query optimization for conventional SQL and MapReduce platforms.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"4 1","pages":"1-12"},"PeriodicalIF":0.0,"publicationDate":"2017-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76260086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Criminal Profiling Utilizing Structured and Unstructured Data","authors":"Yonghoon Kim, Mokdong Chung","doi":"10.14257/ijdta.2017.10.6.04","DOIUrl":"https://doi.org/10.14257/ijdta.2017.10.6.04","url":null,"abstract":"In general, the structured data knows the meaning of the sentence and unstructured data refers to an unknown means. Although the quantity of structured information in the entire data and within organizations is increasing, the majority of information remains available only in unstructured data. While different in form, both unstructured and structured information sources provide information about entities in the world and their properties and relations. Due to the recent rapid changes in society and wide spread of information devices, diverse digital information is utilized in a variety of economic and social analysis. Information related to the crime statistics by type of crime has been used as a major factor in crime. However, statistical analysis using only the structured data has the difficulty in the investigation by providing limited information to investigators and users. In this paper, structured data and unstructured data are analyzed by applying Korean Natural Language Processing (Ko-NLP) and the Latent Semantic Analysis (LSA) technique. 
It will provide a crime profile optimum system that can be applied to the crime profiling system or statistical analysis [1].","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"1 1","pages":"47-60"},"PeriodicalIF":0.0,"publicationDate":"2017-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85144430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}