Automatic acquisition of concepts from domain texts
Janardhana Punuru, Jianhua Chen
2006 IEEE International Conference on Granular Computing, published 2006-05-10
DOI: 10.1109/GRC.2006.1635831 (https://doi.org/10.1109/GRC.2006.1635831)
Citations: 8
Abstract
Domain-specific concept extraction is a key component in ontology construction for Semantic Web applications. Manual concept extraction is costly in both time and labor. In this paper, we present several heuristic methods for automatic concept extraction from domain texts. These methods aim to improve precision and recall over word-frequency-based techniques. Precision is improved by eliminating irrelevant terms using word sense information. Recall is enhanced by adding new concepts formed by composing relevant words. Our methods are domain independent and can be applied to the concept extraction task in a fully automatic way. Experimental results on electronic voting domain texts (from the New York Times) are presented, which show the promise of the proposed methods.
Index Terms: Concept extraction, ontology engineering, text processing, WordNet, WordNet senses.
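The abstract names three ingredients: a word-frequency baseline, a word-sense filter that removes irrelevant terms, and composition of relevant words into new concepts. The following is a minimal sketch of how such a pipeline could fit together; it is not the authors' implementation. The function names, the thresholds, the stop list, and the `SENSE_COUNTS` table are all hypothetical. The sense counts are made-up stand-ins for a WordNet lookup, and the rule "a word with many senses is too general" is an assumption used only for illustration.

```python
import re
from collections import Counter

# Hypothetical stop list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "are", "for", "on", "by"}

# Made-up sense counts standing in for a WordNet query (illustrative values only).
SENSE_COUNTS = {"voting": 1, "machine": 6, "county": 2, "make": 49}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def frequent_terms(tokens, min_count=2):
    """Frequency baseline: keep non-stopwords seen at least min_count times."""
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return {t for t, c in counts.items() if c >= min_count}


def filter_by_senses(terms, max_senses=10):
    """Precision step (assumed heuristic): drop words with many senses.

    Words missing from SENSE_COUNTS are kept, treated as having one sense.
    """
    return {t for t in terms if SENSE_COUNTS.get(t, 1) <= max_senses}


def compose_concepts(tokens, relevant, min_count=2):
    """Recall step: join adjacent relevant words that co-occur repeatedly."""
    pairs = Counter(
        (a, b) for a, b in zip(tokens, tokens[1:]) if a in relevant and b in relevant
    )
    return {f"{a} {b}" for (a, b), c in pairs.items() if c >= min_count}


def extract_concepts(text):
    """Run the baseline, the sense filter, and the composition step in order."""
    tokens = tokenize(text)
    relevant = filter_by_senses(frequent_terms(tokens))
    return relevant | compose_concepts(tokens, relevant)
```

Calling `extract_concepts` on a short passage returns a set of single-word and two-word candidate concepts. In this sketch the sense filter runs before composition, so a compound can only be built from words that survived filtering; whether the paper orders the steps this way is not stated in the abstract.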