Leonardo Rigutini, E. Iorio, M. Ernandes, Marco Maggini
{"title":"利用Web对数据进行语义标注","authors":"Leonardo Rigutini, E. Iorio, M. Ernandes, Marco Maggini","doi":"10.1109/WI-IATW.2006.118","DOIUrl":null,"url":null,"abstract":"This paper proposes a system for automatically categorizing terms or lexical entities into a predefined set of semantic domains. We present an approach that exploits the knowledge available in the Web to create a model of each term or entity (entity context lexicons - ECLs). Each profile is simply a list of terms (similar to the bag-of-words representation in text categorization) and it is composed primarily by the words often appearing in the same contexts of the entity. These profiles model the contexts in which the entity usually appears and they can be subsequently processed by an automatic classifier. Moreover, we propose and validate a profile-based categorization model developed for this particular task which uses the ECLs of the training entities to build a profile for each class (class context lexicon - CCL). Finally, we propose a technique for dealing with multi-label classification based on a decision module that exploits a neural network. We show the effectiveness of the proposed approach on a term categorization task using a standard benchmark composed of a set of domain-specific lexicons (WordNetDomains)","PeriodicalId":358971,"journal":{"name":"2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Workshops","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Semantic Labeling of Data by Using the Web\",\"authors\":\"Leonardo Rigutini, E. Iorio, M. Ernandes, Marco Maggini\",\"doi\":\"10.1109/WI-IATW.2006.118\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper proposes a system for automatically categorizing terms or lexical entities into a predefined set of semantic domains. We present an approach that exploits the knowledge available in the Web to create a model of each term or entity (entity context lexicons - ECLs). Each profile is simply a list of terms (similar to the bag-of-words representation in text categorization) and it is composed primarily by the words often appearing in the same contexts of the entity. These profiles model the contexts in which the entity usually appears and they can be subsequently processed by an automatic classifier. Moreover, we propose and validate a profile-based categorization model developed for this particular task which uses the ECLs of the training entities to build a profile for each class (class context lexicon - CCL). Finally, we propose a technique for dealing with multi-label classification based on a decision module that exploits a neural network. We show the effectiveness of the proposed approach on a term categorization task using a standard benchmark composed of a set of domain-specific lexicons (WordNetDomains)\",\"PeriodicalId\":358971,\"journal\":{\"name\":\"2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Workshops\",\"volume\":\"18 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2006-12-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Workshops\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/WI-IATW.2006.118\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WI-IATW.2006.118","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
This paper proposes a system for automatically categorizing terms or lexical entities into a predefined set of semantic domains. We present an approach that exploits the knowledge available in the Web to create a model of each term or entity (entity context lexicons - ECLs). Each profile is simply a list of terms (similar to the bag-of-words representation in text categorization) and it is composed primarily by the words often appearing in the same contexts of the entity. These profiles model the contexts in which the entity usually appears and they can be subsequently processed by an automatic classifier. Moreover, we propose and validate a profile-based categorization model developed for this particular task which uses the ECLs of the training entities to build a profile for each class (class context lexicon - CCL). Finally, we propose a technique for dealing with multi-label classification based on a decision module that exploits a neural network. We show the effectiveness of the proposed approach on a term categorization task using a standard benchmark composed of a set of domain-specific lexicons (WordNetDomains)