Melkamu Beyene, P. Portier, Solomon Atnafu, S. Calabretto
Proceedings of the 8th International Conference on Management of Digital EcoSystems, 2016-11-01
DOI: 10.1145/3012071.3012090 (https://doi.org/10.1145/3012071.3012090)
Dataset linking in a multilingual linked open data context
Although the syntactic and structural heterogeneities among inter-language linked open data (LOD) sources bring many challenges, entity co-reference resolution in a multilingual linked open data (MLOD) setting is not well studied. In this research, a three-phase approach is proposed. First, statistical relational learning (SRL) with factorization of a three-way tensor is used to compute the structural similarity between entities. Second, textual data from the Web of documents is associated with entities in order to increase our knowledge of them. Through latent Dirichlet allocation (LDA), the entities' textual data is projected into a cross-lingual topic space, which is then used to find textual similarities between entities. Third, a belief aggregation strategy is used to combine the structural and textual similarity results into a global similarity score. We show experimentally that our algorithm outperforms state-of-the-art approaches based on tensor decomposition for the task of entity co-reference resolution in an MLOD setting.
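The third phase can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper. The latent vectors stand in for the output of the tensor factorization, the topic vectors stand in for LDA topic distributions in the cross-lingual space, and a weighted average replaces the belief aggregation strategy, which the abstract does not specify. All entity names, values, and function names below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def global_similarity(struct_a, struct_b, topics_a, topics_b, weight=0.5):
    """Combine structural and textual similarity into one score.

    ASSUMPTION: a weighted average is used here only as a stand-in for the
    belief aggregation strategy mentioned in the abstract.
    """
    structural = cosine(struct_a, struct_b)
    textual = cosine(topics_a, topics_b)
    return weight * structural + (1.0 - weight) * textual


# Hypothetical latent entity vectors (stand-in for tensor factorization output).
latent = {
    "en:Addis_Ababa": [0.9, 0.1, 0.3],
    "fr:Addis-Abeba": [0.8, 0.2, 0.3],
    "fr:Lyon": [0.1, 0.9, 0.2],
}

# Hypothetical topic distributions (stand-in for the cross-lingual LDA topic space).
topics = {
    "en:Addis_Ababa": [0.7, 0.2, 0.1],
    "fr:Addis-Abeba": [0.6, 0.3, 0.1],
    "fr:Lyon": [0.1, 0.2, 0.7],
}


def best_match(source, candidates, weight=0.5):
    """Return the candidate with the highest global similarity to the source entity."""
    return max(
        candidates,
        key=lambda c: global_similarity(
            latent[source], latent[c], topics[source], topics[c], weight
        ),
    )


if __name__ == "__main__":
    print(best_match("en:Addis_Ababa", ["fr:Addis-Abeba", "fr:Lyon"]))
```

In this toy setup the structural and textual signals agree, so the choice of `weight` does not change the match. The combination step matters most when the two signals disagree, for example when two entities share few links across language editions but are described by similar text.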