Dynamically Constructing a Global Schema for Web Entities
Xiuxing Xu, Qingzhong Li, Yongquan Dong, Yanhui Ding
2010 Seventh Web Information Systems and Applications Conference, published 2010-08-20
DOI: 10.1109/WISA.2010.32 (https://doi.org/10.1109/WISA.2010.32)
With the rapid growth of the Internet, popular entities have more and more instances on the Web. Two observations make schema construction hard. First, instances of the same Web entity often contain different attributes, and different instances often use different labels for the same attribute. Second, new instances that introduce new attributes and labels keep appearing on the Web. As a result, it is difficult to dynamically construct a global schema for the Web entities of a given entity type, even though such a schema is highly desirable for detecting, extracting, and integrating Web entity instances. In this paper, we propose a novel approach to dynamically constructing a global schema for the Web entities of a given entity type. First, an SVM (support vector machine) classification model is built from Web entity instances that have been extracted from related Web pages. Then, based on this model, a global schema discovery approach dynamically constructs the global schema for the target entity type. Experimental results on Chinese Web sites show that the approach is general and effective.
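The abstract does not give the features, the training procedure, or the discovery algorithm, so the following is only an illustrative sketch of the general idea of growing a global schema as new instances arrive. It is not the authors' method: a simple string-similarity matcher stands in for the trained SVM classifier, and the function names, the 0.8 threshold, and the sample book records are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two attribute labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify_label(label, schema, threshold=0.8):
    """Stand-in for the trained classifier (assumption, not an SVM).

    Returns the global attribute whose known labels best match `label`,
    or None when no match reaches the threshold.
    """
    best_attr, best_score = None, 0.0
    for attr, known_labels in schema.items():
        score = max(similarity(label, known) for known in known_labels)
        if score > best_score:
            best_attr, best_score = attr, score
    return best_attr if best_score >= threshold else None


def update_schema(schema, instance, threshold=0.8):
    """Merge the labels of one extracted instance into the global schema."""
    for label in instance:
        attr = classify_label(label, schema, threshold)
        if attr is None:
            # Unseen attribute: it becomes a new global attribute.
            schema[label] = {label}
        else:
            # Known attribute under a different label: record the synonym.
            schema[attr].add(label)
    return schema


# Hypothetical extracted instances of a "book" entity type.
instances = [
    {"Title": "Book A", "Author": "X", "Price": "10"},
    {"title": "Book B", "Authors": "Y", "Publisher": "P"},
]

schema = {}
for instance in instances:
    update_schema(schema, instance)

for attr in sorted(schema):
    print(attr, sorted(schema[attr]))
```

In this sketch the schema maps each global attribute to the set of labels observed for it, so both variation in labels and newly appearing attributes are absorbed one instance at a time. A real implementation along the lines of the abstract would replace `classify_label` with a classifier trained on previously extracted instances.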