Filtered

The Great Firewall of China, Pub Date: 2018-12-07, DOI: 10.4324/9781315225869

Emma Felton


Citations: 5

Abstract


Filtered

High performance is a crucial consideration when executing a complex analytic query on a massive semantic graph. In a semantic graph, vertices and edges carry “attributes” of various types. Analytic queries on semantic graphs typically depend on the values of these attributes; thus, the computation must either view the graph through a filter that passes only those individual vertices and edges of interest, or else must first materialize a subgraph or subgraphs consisting of only the vertices and edges of interest. The filtered approach is superior due to its generality, ease of use, and memory efficiency, but may carry a performance cost. In the Knowledge Discovery Toolbox (KDT), a Python library for parallel graph computations, the user writes filters in a high-level language, but those filters result in relatively low performance due to the bottleneck of having to call into the Python interpreter for each edge. In this work, we use the Selective Embedded JIT Specialization (SEJITS) approach to automatically translate filters defined by programmers into a lower-level efficiency language, bypassing the upcall into Python. We evaluate our approach by comparing it with the high-performance C++/MPI Combinatorial BLAS engine, and show that the productivity gained by using a high-level filtering language comes without sacrificing performance.
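The two options the abstract contrasts, filtering on the fly versus materializing a subgraph first, can be illustrated with a small pure-Python sketch. This is not KDT's API: the adjacency-list representation, the `kind` edge attribute, and the names `bfs_filtered`, `materialize`, `accept_all`, and `is_follow` are all assumptions made for illustration only.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

# Hypothetical toy representation (not KDT's): vertex -> list of (neighbor, attributes).
Edge = Tuple[int, Dict[str, object]]
Graph = Dict[int, List[Edge]]
EdgeFilter = Callable[[Dict[str, object]], bool]


def bfs_filtered(graph: Graph, source: int, keep: EdgeFilter) -> Dict[int, int]:
    """Breadth-first search that views the graph through `keep`, testing each edge on the fly."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, attrs in graph.get(u, []):
            # One Python-level call per edge: the kind of per-edge overhead
            # the abstract identifies as the bottleneck.
            if keep(attrs) and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def materialize(graph: Graph, keep: EdgeFilter) -> Graph:
    """Build a separate subgraph that stores only the edges passing `keep`."""
    return {u: [(v, a) for v, a in edges if keep(a)] for u, edges in graph.items()}


def accept_all(_attrs: Dict[str, object]) -> bool:
    """Trivial filter used once the subgraph has already been materialized."""
    return True


def is_follow(attrs: Dict[str, object]) -> bool:
    """Example filter on an assumed `kind` edge attribute."""
    return attrs["kind"] == "follows"


graph: Graph = {
    0: [(1, {"kind": "follows"}), (2, {"kind": "retweet"})],
    1: [(3, {"kind": "retweet"})],
    2: [(3, {"kind": "follows"})],
    3: [],
}

# Option 1: filter while traversing; no extra copy of the graph is stored.
on_the_fly = bfs_filtered(graph, 0, is_follow)

# Option 2: pay the memory cost of a filtered copy, then traverse it unfiltered.
via_subgraph = bfs_filtered(materialize(graph, is_follow), 0, accept_all)
```

Both traversals reach the same vertices. The first avoids storing a second copy of the graph but calls the filter once per edge visited, which is the cost that translating filters into a lower-level language is meant to remove.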
