在Cell/B.E.上对大型字典进行高速字符串搜索处理器

2008 IEEE International Symposium on Parallel and Distributed Processing Pub Date : 2008-04-14 DOI:10.1109/IPDPS.2008.4536300

D. Scarpazza, Oreste Villa, F. Petrini

{"title":"在Cell/B.E.上对大型字典进行高速字符串搜索处理器","authors":"D. Scarpazza, Oreste Villa, F. Petrini","doi":"10.1109/IPDPS.2008.4536300","DOIUrl":null,"url":null,"abstract":"Our digital universe is growing, creating exploding amounts of data which need to be searched, filtered and protected. String searching is at the core of the tools we use to curb this explosion, such as search engines, network intrusion detection systems, spam filters, and anti-virus programs. But as communication speed grows, our capability to perform string searching in real-time seems to fall behind. Multi-core architectures promise enough computational power to cope with the incoming challenge, but it is still unclear which algorithms and programming models to use to unleash this power. We have parallelized a popular string searching algorithm, Aho-Corasick, on the IBM Cell/B.E. processor, with the goal of performing exact string matching against large dictionaries. In this article we propose a novel approach to fully exploit the DMA-based communication mechanisms of the Cell/B.E. to provide an unprecedented level of aggregate performance with irregular access patterns. We have discovered that memory congestion plays a crucial role in determining the performance of this algorithm. We discuss three aspects of congestion: memory pressure, layout issues and hot spots, and we present a collection of algorithmic solutions to alleviate these problems and achieve quasi-optimal performance. The implementation of our algorithm provides a worst- case throughput of 2.5 Gbps, and a typical throughput between 3.3 and 4.4 Gbps, measured on realistic scenarios with a two-processor Cell/B.E. system.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"43","resultStr":"{\"title\":\"High-speed string searching against large dictionaries on the Cell/B.E. Processor\",\"authors\":\"D. Scarpazza, Oreste Villa, F. Petrini\",\"doi\":\"10.1109/IPDPS.2008.4536300\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Our digital universe is growing, creating exploding amounts of data which need to be searched, filtered and protected. String searching is at the core of the tools we use to curb this explosion, such as search engines, network intrusion detection systems, spam filters, and anti-virus programs. But as communication speed grows, our capability to perform string searching in real-time seems to fall behind. Multi-core architectures promise enough computational power to cope with the incoming challenge, but it is still unclear which algorithms and programming models to use to unleash this power. We have parallelized a popular string searching algorithm, Aho-Corasick, on the IBM Cell/B.E. processor, with the goal of performing exact string matching against large dictionaries. In this article we propose a novel approach to fully exploit the DMA-based communication mechanisms of the Cell/B.E. to provide an unprecedented level of aggregate performance with irregular access patterns. We have discovered that memory congestion plays a crucial role in determining the performance of this algorithm. We discuss three aspects of congestion: memory pressure, layout issues and hot spots, and we present a collection of algorithmic solutions to alleviate these problems and achieve quasi-optimal performance. The implementation of our algorithm provides a worst- case throughput of 2.5 Gbps, and a typical throughput between 3.3 and 4.4 Gbps, measured on realistic scenarios with a two-processor Cell/B.E. system.\",\"PeriodicalId\":162608,\"journal\":{\"name\":\"2008 IEEE International Symposium on Parallel and Distributed Processing\",\"volume\":\"35 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-04-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"43\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 IEEE International Symposium on Parallel and Distributed Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPDPS.2008.4536300\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 IEEE International Symposium on Parallel and Distributed Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2008.4536300","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 43

摘要

我们的数字宇宙在不断增长，产生了大量需要搜索、过滤和保护的数据。字符串搜索是我们用来遏制这种爆炸的工具的核心，例如搜索引擎、网络入侵检测系统、垃圾邮件过滤器和反病毒程序。但随着通信速度的增长，我们执行实时字符串搜索的能力似乎落后了。多核架构承诺有足够的计算能力来应对即将到来的挑战，但目前尚不清楚使用哪种算法和编程模型来释放这种能力。我们在IBM Cell/B.E.上并行化了一种流行的字符串搜索算法Aho-Corasick处理器，目标是对大型字典执行精确的字符串匹配。在本文中，我们提出了一种新的方法来充分利用基于dma的Cell/B.E.通信机制以不规则访问模式提供前所未有的聚合性能。我们发现，内存拥塞在决定该算法的性能方面起着至关重要的作用。我们讨论了拥塞的三个方面:内存压力、布局问题和热点，并提出了一系列算法解决方案来缓解这些问题并实现准最优性能。我们算法的实现提供了2.5 Gbps的最坏情况吞吐量，以及3.3和4.4 Gbps之间的典型吞吐量，在双处理器Cell/B.E.的实际场景中测量系统。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

High-speed string searching against large dictionaries on the Cell/B.E. Processor

Our digital universe is growing, creating exploding amounts of data which need to be searched, filtered and protected. String searching is at the core of the tools we use to curb this explosion, such as search engines, network intrusion detection systems, spam filters, and anti-virus programs. But as communication speed grows, our capability to perform string searching in real-time seems to fall behind. Multi-core architectures promise enough computational power to cope with the incoming challenge, but it is still unclear which algorithms and programming models to use to unleash this power. We have parallelized a popular string searching algorithm, Aho-Corasick, on the IBM Cell/B.E. processor, with the goal of performing exact string matching against large dictionaries. In this article we propose a novel approach to fully exploit the DMA-based communication mechanisms of the Cell/B.E. to provide an unprecedented level of aggregate performance with irregular access patterns. We have discovered that memory congestion plays a crucial role in determining the performance of this algorithm. We discuss three aspects of congestion: memory pressure, layout issues and hot spots, and we present a collection of algorithmic solutions to alleviate these problems and achieve quasi-optimal performance. The implementation of our algorithm provides a worst- case throughput of 2.5 Gbps, and a typical throughput between 3.3 and 4.4 Gbps, measured on realistic scenarios with a two-processor Cell/B.E. system.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2008 IEEE International Symposium on Parallel and Distributed Processing

自引率

0.00%

发文量