Saravadee Sae Tan, E. Tang, Bali Ranaivo-Malançon, G. Sodhy
2011 International Conference on Semantic Technology and Information Retrieval, 2011-06-28. DOI: 10.1109/STAIR.2011.5995787 (https://doi.org/10.1109/STAIR.2011.5995787)
Modeling semantic correspondence in heterogeneous structured document collection
On the web, most structured document collections consist of documents that come from different sources and are marked up with different types of structures. This diversity of structures has led to the emergence of heterogeneous structured documents. The heterogeneity of structured documents is one of the reasons for query-document mismatch in structured document retrieval. In structured document retrieval, a user is assumed to have intimate knowledge of the document structures and to be able to specify contextual constraints in their queries. However, a user cannot know every structure in a heterogeneous structured document collection. In this paper, we propose to include similar correspondence relations in the representation model for structured document retrieval. The similar correspondences make the relations between similar contents explicit in order to improve the effectiveness of structured document retrieval. We introduce a generic and flexible structured document model that represents heterogeneous structured documents as well as the similar correspondences in the document collection. We also illustrate how the proposed model can be used in structured document retrieval.
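As a rough illustration of the idea only, and not the model defined in the paper, the sketch below shows how explicit similar correspondences between element paths could let a structurally constrained query match documents that use different markup. The toy collection, the path names (`article/author`, `paper/creator`, and so on), and the functions `build_similar_index` and `search` are all hypothetical and chosen purely for this example.

```python
from collections import defaultdict

# Hypothetical toy collection: each document is a list of (element_path, text) pairs.
DOCS = {
    "d1": [("article/author", "Tan"), ("article/title", "Semantic correspondence")],
    "d2": [("paper/creator", "Tan"), ("paper/heading", "Document retrieval")],
}

# Hypothetical similar correspondences: pairs of element paths that are
# assumed (for this example) to hold similar content.
CORRESPONDENCES = [
    ("article/author", "paper/creator"),
    ("article/title", "paper/heading"),
]


def build_similar_index(pairs):
    """Map each path to the set of paths it corresponds to.

    The relation is treated as symmetric and each path is included in its
    own set; transitive closure is deliberately not computed here.
    """
    index = defaultdict(set)
    for a, b in pairs:
        index[a].update({a, b})
        index[b].update({a, b})
    return index


def search(docs, similar, path, term):
    """Return sorted ids of documents where `term` occurs under `path`
    or under any path recorded as similar to `path`."""
    targets = similar.get(path, {path})
    term = term.lower()
    hits = []
    for doc_id, elements in docs.items():
        if any(p in targets and term in text.lower() for p, text in elements):
            hits.append(doc_id)
    return sorted(hits)


if __name__ == "__main__":
    similar = build_similar_index(CORRESPONDENCES)
    # With correspondences, a query phrased against one structure reaches both.
    print(search(DOCS, similar, "article/author", "tan"))
    # Without correspondences, only the document with that exact structure matches.
    print(search(DOCS, {}, "article/author", "tan"))
```

In this sketch the user only needs to know one structure (`article/author`); the correspondence index supplies the other. How correspondences are actually identified and represented in the proposed model is not specified in the abstract, so that part is assumed here.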