Scalable and Efficient Spatial-Aware Parallelization Strategies for Multimedia Retrieval

Guilherme Andrade, George Teodoro, R. Ferreira
2020 IEEE 32nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), published 2020-09-01
DOI: 10.1109/SBAC-PAD49847.2020.00027 (https://doi.org/10.1109/SBAC-PAD49847.2020.00027)

Abstract

Similarity search is a key operation in several multimedia applications, including online Content-Based Multimedia Retrieval (CBMR) services. These applications must handle very large databases and are subject to high query rates. In this context, scalability on distributed-memory systems is critical for assembling the required computing power and memory space. However, we have identified that the Data Equal Split (DES) parallelization and its associated data partition strategy, employed by related work in the domain, have limitations in terms of efficiency and scalability. Therefore, in this paper, we develop and implement a framework for executing similarity search on distributed-memory machines and propose a novel class of data partition strategies that take the spatial organization of the data into account when distributing it. This approach reduces communication traffic and the cost of processing each task in the local searches carried out on the distributed machine. Our approach attained a speedup of 2.4× over DES in the baseline case (5 nodes); it also achieved higher scalability efficiency and was 14.5× faster when 160 nodes were used. In fact, our novel data organization led to superlinear scalability in all configurations evaluated.
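
To make the contrast in the abstract concrete, the following is a minimal Python sketch, not the authors' framework. It assumes a toy 2-D dataset, a round-robin split standing in for DES, and a nearest-pivot assignment standing in for a spatial-aware partition. All function names, the pivot scheme, and the `probes` parameter are hypothetical and introduced only for illustration. The point it shows is that an equal split must send every query to every node, while a spatially organized split can route a query to only the partitions closest to it.

```python
import math


def knn(points, query, k):
    """Exact k-nearest neighbors by brute force (Euclidean distance)."""
    return sorted(points, key=lambda p: math.dist(p, query))[:k]


def equal_split(points, num_nodes):
    """DES-style partition (assumed round-robin): ignores where points lie in space."""
    parts = [[] for _ in range(num_nodes)]
    for i, p in enumerate(points):
        parts[i % num_nodes].append(p)
    return parts


def spatial_split(points, pivots):
    """Illustrative spatial-aware partition: each point goes to its nearest pivot."""
    parts = [[] for _ in pivots]
    for p in points:
        j = min(range(len(pivots)), key=lambda i: math.dist(p, pivots[i]))
        parts[j].append(p)
    return parts


def search_equal_split(parts, query, k):
    """Neighbors may live on any node, so every node runs a local search."""
    candidates = []
    for part in parts:
        candidates.extend(knn(part, query, k))
    return knn(candidates, query, k), len(parts)  # (result, nodes contacted)


def search_spatial(parts, pivots, query, k, probes=1):
    """Contact only the `probes` partitions whose pivots are closest to the query.

    With probes < len(pivots) the answer may be approximate; this is an
    assumption of the sketch, not a statement about the paper's method.
    """
    order = sorted(range(len(pivots)), key=lambda i: math.dist(query, pivots[i]))
    candidates = []
    for j in order[:probes]:
        candidates.extend(knn(parts[j], query, k))
    return knn(candidates, query, k), probes
```

In this sketch the number of nodes contacted per query is the proxy for communication traffic: it equals the node count under the equal split and only `probes` under the spatial split. Each contacted node also searches a region that is more likely to contain the true neighbors.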