High-Throughput Subset Matching on Commodity GPU-Based Systems

Proceedings of the Twelfth European Conference on Computer Systems, Pub Date: 2017-04-23, DOI: 10.1145/3064176.3064190

Daniele Rogora, M. Papalini, Koorosh Khazaei, Alessandro Margara, Antonio Carzaniga, G. Cugola


Citations: 2

Abstract


Large-scale information processing often relies on subset matching for data classification and routing. Examples are publish/subscribe and stream processing systems, database systems, social media, and information-centric networking. For instance, an advanced Twitter-like messaging service, where users might follow specific publishers as well as specific topics encoded as tag sets, must join a stream of published messages with the users and their preferred tag sets so that the user tag set is a subset of the message tags. Subset matching is an old but also notoriously difficult problem. We present TagMatch, a system that solves this problem by taking advantage of a hybrid CPU/GPU stream processing architecture. TagMatch targets large-scale applications with thousands of matching operations per second against hundreds of millions of tag sets. We evaluate TagMatch on an advanced message streaming application, with very positive results both in absolute terms and in comparison with existing systems. As a notable example, our experiments demonstrate that TagMatch running on a single commodity machine with two GPUs can easily sustain the traffic throughput of Twitter, even when augmented with expressive tag-based selection.
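To make the matching condition concrete, the following is a minimal brute-force sketch of subset matching, not TagMatch's actual algorithm or data structures. The function name `matching_users`, the `subscriptions` layout, and the sample tags are all hypothetical, chosen only for illustration. A user matches a message when at least one of the user's tag sets is a subset of the message's tags.

```python
from typing import Dict, FrozenSet, Iterable, List


def matching_users(
    message_tags: Iterable[str],
    subscriptions: Dict[str, List[FrozenSet[str]]],
) -> List[str]:
    """Return users with at least one subscribed tag set contained in the message tags.

    Hypothetical helper for illustration; it scans every subscription linearly.
    """
    tags = frozenset(message_tags)
    matched = []
    for user, tag_sets in subscriptions.items():
        # A tag set matches when every one of its tags appears in the message.
        if any(tag_set <= tags for tag_set in tag_sets):
            matched.append(user)
    return matched


# Hypothetical subscriptions: each user follows one or more tag sets.
subscriptions = {
    "alice": [frozenset({"gpu", "systems"})],
    "bob": [frozenset({"databases"}), frozenset({"gpu"})],
    "carol": [frozenset({"gpu", "networking"})],
}

print(matching_users({"gpu", "systems", "eurosys"}, subscriptions))
# prints ['alice', 'bob']
```

This linear scan visits every stored tag set for every message, so its cost grows with the number of subscriptions. At the scale the abstract describes (hundreds of millions of tag sets, thousands of matches per second), that cost is presumably what motivates a specialized CPU/GPU design.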
