{"title":"基于在线评论主题建模的旅游知识模型","authors":"Valentinus Roby Hananto, U. Serdült, V. Kryssanov","doi":"10.1145/3456172.3456211","DOIUrl":null,"url":null,"abstract":"Ontologies and knowledge models have gained more recognition because of their extensive use in recommender systems. The lack of automatic approaches in ontology engineering, however, becomes a challenge to fulfill increasing needs for such knowledge models in the field of tourism. In this study, a system for building tourism knowledge models from online reviews is proposed. The main contribution of the study is the application of topic modeling to build a knowledge model that, in turn, allows for an automated labeling process to train classifiers. Given a collection of unlabeled tourism online reviews, Latent Dirichlet Allocation (LDA) is applied to automatically label each document. Each topic discovered by LDA is labeled with one specific category, representing its semantic meaning based on an existing general ontology as a reference. These automatically labeled documents are used for classification, and the result is compared with manual annotation. Experiments on Indonesian tourism datasets showed that the automatic labeling approach using LDA provides for a precision score of 70%. In classification tasks, this approach can achieve comparable or even better classification performance than the manual labeling. The results obtained suggest that the developed system is capable of building a tourism knowledge model and providing acceptable-quality training data for the development of tourism recommender systems.","PeriodicalId":133908,"journal":{"name":"Proceedings of the 2021 7th International Conference on Computing and Data Engineering","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Tourism Knowledge Model through Topic Modeling from Online Reviews\",\"authors\":\"Valentinus Roby Hananto, U. Serdült, V. Kryssanov\",\"doi\":\"10.1145/3456172.3456211\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Ontologies and knowledge models have gained more recognition because of their extensive use in recommender systems. The lack of automatic approaches in ontology engineering, however, becomes a challenge to fulfill increasing needs for such knowledge models in the field of tourism. In this study, a system for building tourism knowledge models from online reviews is proposed. The main contribution of the study is the application of topic modeling to build a knowledge model that, in turn, allows for an automated labeling process to train classifiers. Given a collection of unlabeled tourism online reviews, Latent Dirichlet Allocation (LDA) is applied to automatically label each document. Each topic discovered by LDA is labeled with one specific category, representing its semantic meaning based on an existing general ontology as a reference. These automatically labeled documents are used for classification, and the result is compared with manual annotation. Experiments on Indonesian tourism datasets showed that the automatic labeling approach using LDA provides for a precision score of 70%. In classification tasks, this approach can achieve comparable or even better classification performance than the manual labeling. The results obtained suggest that the developed system is capable of building a tourism knowledge model and providing acceptable-quality training data for the development of tourism recommender systems.\",\"PeriodicalId\":133908,\"journal\":{\"name\":\"Proceedings of the 2021 7th International Conference on Computing and Data Engineering\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-01-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2021 7th International Conference on Computing and Data Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3456172.3456211\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2021 7th International Conference on Computing and Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3456172.3456211","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Tourism Knowledge Model through Topic Modeling from Online Reviews
Ontologies and knowledge models have gained more recognition because of their extensive use in recommender systems. The lack of automatic approaches in ontology engineering, however, becomes a challenge to fulfill increasing needs for such knowledge models in the field of tourism. In this study, a system for building tourism knowledge models from online reviews is proposed. The main contribution of the study is the application of topic modeling to build a knowledge model that, in turn, allows for an automated labeling process to train classifiers. Given a collection of unlabeled tourism online reviews, Latent Dirichlet Allocation (LDA) is applied to automatically label each document. Each topic discovered by LDA is labeled with one specific category, representing its semantic meaning based on an existing general ontology as a reference. These automatically labeled documents are used for classification, and the result is compared with manual annotation. Experiments on Indonesian tourism datasets showed that the automatic labeling approach using LDA provides for a precision score of 70%. In classification tasks, this approach can achieve comparable or even better classification performance than the manual labeling. The results obtained suggest that the developed system is capable of building a tourism knowledge model and providing acceptable-quality training data for the development of tourism recommender systems.