{"title":"Domain-Specific Term Extraction for Concept Identification in Ontology Construction","authors":"Kiruparan Balachandran, Surangika Ranathunga","doi":"10.1109/WI.2016.0016","journal":{"name":"2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI)","volume":"28 1","pages":"34-41"},"publicationDate":"2016-10-01","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"10","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/WI.2016.0016"}
Domain-Specific Term Extraction for Concept Identification in Ontology Construction
An ontology is a formal, explicit specification of a shared conceptualization. Manually constructing a domain ontology does not adequately meet the requirements of new applications, which need a more dynamic ontology and the ability to manage more concepts than humans can handle alone. Researchers have proposed ontology learning as a way to overcome the problems of manual ontology construction. Ontology learning is an automatic or semi-automatic process that applies methods for building an ontology from scratch, or for enriching or adapting an existing one. This research focuses on improving term extraction for identifying concepts in ontology learning. Existing approaches to term extraction are limited in several ways, including: (1) they obtain domain-specific terms from a domain expert as seed words instead of discovering them automatically from the corpus, and (2) they use corpora unsuitably when discovering domain-specific terms for multiple domains. Our study uses linguistic analysis and statistical calculations to extract domain-specific simple and complex terms, which overcomes the first limitation. To eliminate the second limitation, we use multiple contrastive corpora, which reduces the bias introduced by relying on a single contrastive corpus. Evaluations show that our system extracts terms better than previous research that used the same corpora.
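
The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of contrasting a domain corpus against several contrastive corpora. The function names (`smoothed_rel_freq`, `specificity`, `rank_terms`), the add-one smoothing, the choice of the minimum ratio as the aggregate, and the toy corpora are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def smoothed_rel_freq(term, counts, smoothing=1.0):
    """Add-k smoothed relative frequency of `term` in one corpus (hypothetical helper)."""
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserves probability mass for unseen terms
    return (counts[term] + smoothing) / (total + smoothing * vocab)


def specificity(term, domain_counts, contrastive_counts, smoothing=1.0):
    """Assumed score: the smallest ratio of domain frequency to contrastive frequency.

    Taking the minimum means a term must stand out against every contrastive
    corpus, so no single contrastive corpus decides the outcome on its own.
    """
    domain_rf = smoothed_rel_freq(term, domain_counts, smoothing)
    return min(
        domain_rf / smoothed_rel_freq(term, counts, smoothing)
        for counts in contrastive_counts
    )


def rank_terms(domain_tokens, contrastive_token_lists):
    """Rank every domain-corpus token by the assumed specificity score, highest first."""
    domain_counts = Counter(domain_tokens)
    contrastive_counts = [Counter(tokens) for tokens in contrastive_token_lists]
    scored = [
        (term, specificity(term, domain_counts, contrastive_counts))
        for term in domain_counts
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy corpora, invented purely for this example.
domain = "ontology learning extracts terms from the corpus and the ontology stores concepts".split()
news = "the minister said the budget and the vote were delayed".split()
sports = "the team won the match and the fans cheered".split()

for term, score in rank_terms(domain, [news, sports]):
    print(f"{term}\t{score:.3f}")
```

In this toy run, a score above 1 means the term is relatively more frequent in the domain corpus than in every contrastive corpus, while common function words such as "the" fall below 1. A real system would first select candidate simple and complex terms through linguistic analysis (for example, part-of-speech patterns) and would score those candidates, not raw tokens.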