Parallel Dynamic Batch Loading in the M-tree

Jakub Lokoč
Published in: 2009 Second International Workshop on Similarity Search and Applications, 2009-08-29. DOI: 10.1109/SISAP.2009.27 (https://doi.org/10.1109/SISAP.2009.27)
Citations: 5

Abstract

Although metric access methods (MAMs) have proved capable of efficient similarity search, the extreme growth of data volumes calls for further performance improvements. Since multi-core processors have become widely available, it is justified to exploit parallelism. However, in light of Gustafson's law, it is necessary to find tasks suitable for parallelization. One such task could be M-tree construction. Unfortunately, parallelism within a single object insertion in hierarchical index structures is limited by node capacity. Running several independent insertions in parallel is much less restrictive. However, synchronization problems occur whenever a node is about to split. In this paper we present a new technique for M-tree construction. The technique postpones the splitting of overfull nodes and thus allows simple parallelization of M-tree construction. We also utilize an adaptation of the recently introduced re-inserting technique for the M-tree. Our experiments confirm that the new technique yields a significant speedup of M-tree construction and also improves the quality of the index.
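
The idea of postponing splits can be illustrated with a small sketch. This is not the paper's algorithm, only a simplified illustration under several assumptions: the index is flattened to a single level of leaves, objects are 2-D points under the Euclidean metric, and the names `Leaf`, `insert_batch`, `resolve_overfull`, and `CAPACITY` are hypothetical. Re-inserting is not modeled. During the parallel phase the set of leaves never changes, so each insertion only needs the lock of the leaf it lands in; overfull leaves are split afterwards in a sequential pass.

```python
import math
import threading
from concurrent.futures import ThreadPoolExecutor

CAPACITY = 4  # hypothetical leaf capacity, chosen only for illustration


def dist(a, b):
    """Euclidean metric on points (any metric could be substituted)."""
    return math.dist(a, b)


class Leaf:
    def __init__(self, pivot):
        self.pivot = pivot          # routing object, fixed after creation
        self.radius = 0.0           # covering radius around the pivot
        self.objects = []
        self.lock = threading.Lock()

    def add(self, obj):
        # Only this leaf is locked; the leaf may grow beyond CAPACITY.
        with self.lock:
            self.objects.append(obj)
            self.radius = max(self.radius, dist(self.pivot, obj))


def insert_batch(leaves, batch, workers=4):
    """Phase 1: independent insertions in parallel, with no splitting."""
    def insert(obj):
        nearest = min(leaves, key=lambda leaf: dist(leaf.pivot, obj))
        nearest.add(obj)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(insert, batch))


def split(leaf):
    """Split one leaf around two far-apart pivots; None if impossible."""
    p1 = max(leaf.objects, key=lambda o: dist(leaf.pivot, o))
    p2 = max(leaf.objects, key=lambda o: dist(p1, o))
    if dist(p1, p2) == 0.0:
        return None  # all objects coincide, nothing to separate
    a, b = Leaf(p1), Leaf(p2)
    for o in leaf.objects:
        (a if dist(p1, o) <= dist(p2, o) else b).add(o)
    return a, b


def resolve_overfull(leaves):
    """Phase 2: sequentially split every leaf that exceeds CAPACITY."""
    pending, result = list(leaves), []
    while pending:
        leaf = pending.pop()
        halves = split(leaf) if len(leaf.objects) > CAPACITY else None
        if halves is None:
            result.append(leaf)
        else:
            pending.extend(halves)
    return result


# Usage: two initial leaves, one batch of 40 distinct points.
leaves = [Leaf((0.0, 0.0)), Leaf((20.0, 5.0))]
batch = [(float(i), float(i * i % 11)) for i in range(40)]
insert_batch(leaves, batch)
leaves = resolve_overfull(leaves)
```

Because CPython threads share a global interpreter lock, this sketch shows the synchronization structure rather than any actual speedup; the point is that no insertion ever has to wait for a structural change to the index.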