ONTOCUBO: cube-based ontology construction and exploration
Carlos Garcia-Alvarado, C. Ordonez
Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, 2014-06-18
DOI: https://doi.org/10.1145/2588555.2594521
Citations: 2
Abstract
One of the major challenges of big data analytics is its diverse information content, which has no pre-defined structure or classification. This contrasts with the well-designed structure of a database specified by an ER model. Ontologies are a standard mechanism for understanding the structure of documents and the interrelationships among them. With this motivation, we present a system that enables data management and querying of documents based on ontologies by leveraging the functionality of the DBMS. In this paper, we present ONTOCUBO, a novel system based on our research on text summarization using ontologies and on automatic extraction of concepts for building ontologies using Online Analytical Processing (OLAP) cubes. ONTOCUBO is a database-centric approach that performs well because its SQL-based summarization phase makes a single pass over the original data set, computing values such as keyword frequency, standard deviation, and lift. This approach is complemented by a set of algorithms, implemented as user-defined functions (UDFs), that analyze the summarization results for concepts and their interrelationships. Finally, we show in detail our application, which extracts and builds an ontology, supports concept summarization, and lets domain experts explore and modify the resulting ontology.
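The single-pass summarization idea can be illustrated with a minimal sketch. This is not the authors' SQL or UDF implementation. It is a Python approximation that assumes documents arrive already tokenized, that standard deviation means the population standard deviation of a keyword's per-document count, and that lift for a keyword pair is P(a, b) / (P(a) P(b)) over document-level co-occurrence. The function name `summarize` and all variable names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def summarize(docs):
    """Scan tokenized documents once, accumulating sufficient statistics.

    Assumption: each document is a list of keyword strings.
    Returns (stats, lift), where stats maps keyword -> frequency/mean/std
    and lift maps a sorted keyword pair -> lift value.
    """
    n = 0
    total = Counter()      # sum of per-document counts per keyword
    total_sq = Counter()   # sum of squared per-document counts per keyword
    doc_freq = Counter()   # number of documents containing the keyword
    pair_freq = Counter()  # number of documents containing both keywords

    for tokens in docs:  # the only pass over the data
        n += 1
        counts = Counter(tokens)
        for word, c in counts.items():
            total[word] += c
            total_sq[word] += c * c
            doc_freq[word] += 1
        for pair in combinations(sorted(counts), 2):
            pair_freq[pair] += 1

    stats = {}
    for word in total:
        mean = total[word] / n
        # Population variance from the accumulated sums: E[x^2] - E[x]^2.
        variance = max(total_sq[word] / n - mean * mean, 0.0)
        stats[word] = {
            "frequency": total[word],
            "mean": mean,
            "std": sqrt(variance),
        }

    # Lift compares observed co-occurrence with what independence predicts.
    lift = {
        (a, b): (joint / n) / ((doc_freq[a] / n) * (doc_freq[b] / n))
        for (a, b), joint in pair_freq.items()
    }
    return stats, lift


if __name__ == "__main__":
    sample = [
        ["cube", "olap", "cube"],
        ["cube", "sql"],
        ["olap", "sql"],
        ["cube", "olap"],
    ]
    stats, lift = summarize(sample)
    print(stats["cube"])
    print(lift[("cube", "olap")])
```

Because only running sums and counts are kept, the statistics are derived after the scan without revisiting the documents, which is the property the abstract attributes to the summarization phase. In a DBMS the same sums would typically be computed with aggregate queries rather than an in-memory loop.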