A privacy preserving efficient protocol for semantic similarity join using long string attributes

Bilal Hawashin, F. Fotouhi, T. Truta
DOI: 10.1145/1971690.1971696 (https://doi.org/10.1145/1971690.1971696)
Venue: International Conference on Pattern Analysis and Intelligent Systems
Publication date: 2011-03-25
Citations: 17

Abstract

During a similarity join, one or more sources may not allow their whole data to be shared with the other sources. In that case, a privacy-preserving similarity join is required. We showed in our previous work [4] that using long attributes, such as paper abstracts, movie summaries, product descriptions, and user feedback, can improve similarity join accuracy under supervised learning. However, the existing secure protocols for similarity join cannot be used to join tables on these long attributes. Moreover, the majority of existing privacy-preserving protocols do not consider semantic similarity during the join. In this paper, we introduce a secure, efficient protocol that semantically joins tables when the join attributes are long attributes. Furthermore, instead of using machine learning methods, which are not always applicable, we use similarity thresholds to decide which pairs match. Results show that our protocol can efficiently join tables on long attributes by considering the semantic relationships among the long string values. Therefore, it improves overall secure similarity join performance.
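The threshold-based matching mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' protocol: the abstract does not describe its secure or semantic steps, so the example below uses plain term-count cosine similarity with no privacy protection. The function names (`term_vector`, `cosine`, `threshold_join`) and the sample strings are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text, split it into alphanumeric tokens, and count them."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def threshold_join(left, right, threshold):
    """Return (i, j, score) for every pair whose similarity meets the threshold.

    Assumption: a pair counts as a match when score >= threshold; no learned
    classifier is involved, mirroring the threshold idea in the abstract.
    """
    left_vecs = [term_vector(t) for t in left]
    right_vecs = [term_vector(t) for t in right]
    matches = []
    for i, u in enumerate(left_vecs):
        for j, v in enumerate(right_vecs):
            score = cosine(u, v)
            if score >= threshold:
                matches.append((i, j, score))
    return matches


if __name__ == "__main__":
    # Hypothetical long-string values from two tables.
    left = ["secure similarity join over long string attributes",
            "a recipe for tomato soup"]
    right = ["similarity join using long string attributes",
             "weather report for tuesday"]
    for i, j, score in threshold_join(left, right, threshold=0.5):
        print(i, j, round(score, 3))
```

Raising the threshold trades recall for precision. A semantic variant would replace the raw term counts with vectors from a semantic representation, which this sketch deliberately leaves out.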