Authors: Malek Athamnah, Anis Alazzawe, K. Kant
Published in: Proceedings of the 19th International Conference on Distributed Computing and Networking, 2018-01-04
DOI: https://doi.org/10.1145/3154273.3154352
Collaborative Similarity Search Across Multi-party Repositories
The expanding role of online data collection and analytics from a variety of sources in the operation of emerging cyber and cyber-physical systems raises two crucial issues: (a) collaboration across multiple parties that generate and own parts of the data and grant only limited access rights to others, and (b) the need to efficiently identify suitable patterns in the data in order to drive decision making. In this paper, we examine such scenarios, assuming that all collected data is organized in the form of a database and that the relevant patterns are those that concern similarities across the entities represented by the data. An entity of interest is either a physical or a logical item with multiple attributes (e.g., a shipped product with price and size as attributes, or traffic sensors measuring the volume of traffic and weather conditions at intersections). We assume that all data regarding the entities is maintained in a standard relational form so that queries on it can be described precisely. Similarities are then considered in terms of attribute values. In some cases, the attributes of an entity may themselves be partitioned across parties and thus stored on different nodes. We consider queries in this environment that must comply with access rules across parties and that seek entities similar to a given entity in terms of their attributes. We propose efficient methods for retrieving similar entities across multiple attributes when the similarity threshold may vary across searches. Through extensive experimentation, we show that our mechanism is significantly more efficient than a direct search through the entire dataset.
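To make the query model concrete, the direct-search baseline mentioned above can be pictured roughly as follows. This is only a minimal sketch and not the authors' mechanism: the `Party` class, the `readers` access rule, the absolute-difference similarity test, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    """Hypothetical party that owns some attributes of every entity."""
    name: str
    # attribute name -> {entity id -> numeric value}
    columns: dict[str, dict[int, float]]
    # names of other parties allowed to query this party's attributes
    readers: set[str] = field(default_factory=set)

    def matches(self, requester: str, attr: str, target: float,
                threshold: float) -> set[int]:
        """Scan one attribute and return ids within `threshold` of `target`."""
        if requester != self.name and requester not in self.readers:
            raise PermissionError(f"{requester} may not read from {self.name}")
        return {eid for eid, value in self.columns[attr].items()
                if abs(value - target) <= threshold}


def similar_entities(requester: str, parties: list[Party],
                     query: dict[str, float],
                     thresholds: dict[str, float]) -> set[int]:
    """Direct search: scan every queried attribute at its owner, then intersect."""
    owners = {attr: p for p in parties for attr in p.columns}
    result = None
    for attr, target in query.items():
        ids = owners[attr].matches(requester, attr, target, thresholds[attr])
        result = ids if result is None else result & ids
    return result or set()


# Assumed sample data: price and size are stored by different parties.
parties = [
    Party("shipper", {"price": {1: 10.0, 2: 10.5, 3: 20.0}}, {"analyst"}),
    Party("warehouse", {"size": {1: 5.0, 2: 9.0, 3: 5.2}}, {"analyst"}),
]
query = {"price": 10.0, "size": 5.0}

tight = similar_entities("analyst", parties, query, {"price": 1.0, "size": 0.5})
loose = similar_entities("analyst", parties, query, {"price": 1.0, "size": 4.0})
```

Here the threshold is supplied per search, so the same data answers both a tight and a loose query. Every call scans each queried attribute in full, which is the cost that a more efficient mechanism would aim to avoid.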