Shang-En Yang, Hung-Yuan Chen, Vorakit Vorakitphan, Yao-Chung Fan
2016 International Computer Symposium (ICS), published 2016-12-01
DOI: 10.1109/ICS.2016.0061 (https://doi.org/10.1109/ICS.2016.0061)
Learning Term Taxonomy Relationship from a Large Collection of Plain Text
In this paper, we present a heuristic for assigning a taxonomy label to a given term. Specifically, for a given term, our goal is to construct a model for determining an "is-a" relationship between the term and an inferred concept. This term-labeling problem is not new, but existing solutions either require semi-supervised training, e.g., supervised LDA, or rely on lexicographers, e.g., WordNet. The resulting model construction cost becomes a burden when employing such semantic understanding capability in various emerging applications. To address these issues, we present a lightweight approach with the following features. First, the proposed approach is unsupervised and takes only plain text as input. Second, the proposed approach allows incremental model construction. Third, the proposed approach is simple but effective, and computationally efficient in comparison with existing solutions. We demonstrate these results through experiments that compare our approach with DBpedia, using popular search terms as the test set. The experimental results show that the proposed approach achieves a 30 percent improvement in accuracy.
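The abstract does not describe the heuristic itself, so the following is only an illustrative sketch of what an unsupervised, incrementally built "is-a" labeler over plain text could look like. It swaps in a simple lexical pattern ("X is a Y") with co-occurrence counts, which is an assumption and not the authors' method; the names `IsAModel`, `update`, and `label` are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumed lexical pattern: "<term> is a <concept>" or "<term> is an <concept>".
# This is a stand-in heuristic, not the one proposed in the paper.
IS_A = re.compile(r"\b([a-z]+) is an? ([a-z]+)\b")


class IsAModel:
    """Counts (term, concept) pairs seen in plain text; no labels required."""

    def __init__(self):
        # term -> Counter of candidate concepts
        self.counts = defaultdict(Counter)

    def update(self, text):
        """Incrementally fold a new chunk of plain text into the model."""
        for term, concept in IS_A.findall(text.lower()):
            self.counts[term][concept] += 1

    def label(self, term):
        """Return the most frequently observed concept for a term, or None."""
        candidates = self.counts.get(term.lower())
        if not candidates:
            return None
        return candidates.most_common(1)[0][0]


if __name__ == "__main__":
    model = IsAModel()
    # Text can arrive in batches; each call only adds to the counts.
    model.update("Python is a language. Python is a snake.")
    model.update("Python is a language used for scripting.")
    print(model.label("Python"))
```

Because `update` only adds to the counts, new text can be folded in at any time without rebuilding the model, which mirrors the incremental construction property claimed in the abstract.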