GeTCo: an ontology-based approach for patent classification search
Authors: Hoang-Minh Nguyen, Cong-Phuoc Phan, Hong-Quang Nguyen
Published in: Proceedings of the 18th International Conference on Information Integration and Web-based Applications and Services, 2016-11-28
DOI: https://doi.org/10.1145/3011141.3011205
Citations: 2
Abstract
The main contribution of this paper is a method for creating a Graph-Embedded-Tree-based ontology (GeTCo) that uses domain knowledge from a patent classification scheme to support patent classification. Our contribution is twofold. First, we propose a novel definition of the GeTCo ontology, which consists of four types of concept: Class, Document, Phrase, and Term. For each pair of concepts, we define semantic information based on the relationship between them, which gives our classifier better reasoning capability whenever semantic ambiguity arises. Second, we propose a novel method for constructing the ontology from the United States Patent Classification (USPC) scheme without relying on rule-based concept extraction, which avoids the intensive manual effort of traditional ontology construction. To evaluate the proposed ontology, we developed a prototype application on top of the Rocchio classifier, called the GeTCo-enabled Rocchio classifier. Our experiments on a filtered set of 9,703 single-class patents showed that the GeTCo-enabled Rocchio classifier, backed by our proposed directed-graph ontology, yields a higher F1-score (+7%) than the original Rocchio classifier without GeTCo support.
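For readers unfamiliar with the baseline named in the abstract, a Rocchio classifier can be sketched as a nearest-centroid classifier: each class is represented by the mean vector of its training documents, and a new document is assigned to the class whose centroid is most similar to it. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. It uses raw term frequencies and cosine similarity, it does not include any of the GeTCo ontology support, and the function names, toy corpus, and class labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tf_vector(text):
    """Raw term-frequency vector for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def train_rocchio(labeled_docs):
    """Compute one centroid (mean term-frequency vector) per class."""
    sums = defaultdict(Counter)
    counts = Counter()
    for text, label in labeled_docs:
        sums[label].update(tf_vector(text))
        counts[label] += 1
    return {
        label: {term: total / counts[label] for term, total in vec.items()}
        for label, vec in sums.items()
    }


def classify(centroids, text):
    """Return the class whose centroid is most similar to the document."""
    vec = tf_vector(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical toy corpus; the labels are made up and are not USPC classes.
training = [
    ("rotor blade turbine engine", "mechanical"),
    ("gear shaft bearing engine", "mechanical"),
    ("compiler parser syntax tree", "software"),
    ("cache memory processor parser", "software"),
]
centroids = train_rocchio(training)
predicted = classify(centroids, "turbine shaft bearing")
```

In this baseline, a document that shares no terms with a class centroid has zero similarity to that class. That dependence on exact term overlap is the kind of limitation that extra semantic information about terms, phrases, and classes is meant to ease.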