Filtered
Emma Felton
The Great Firewall of China, published 2018-12-07
DOI: https://doi.org/10.4324/9781315225869

High performance is a crucial consideration when executing a complex analytic query on a massive semantic graph. In a semantic graph, vertices and edges carry "attributes" of various types. Analytic queries on semantic graphs typically depend on the values of these attributes; thus, the computation must either view the graph through a filter that passes only those individual vertices and edges of interest, or else must first materialize a subgraph or subgraphs consisting of only the vertices and edges of interest. The filtered approach is superior due to its generality, ease of use, and memory efficiency, but may carry a performance cost. In the Knowledge Discovery Toolbox (KDT), a Python library for parallel graph computations, the user writes filters in a high-level language, but those filters result in relatively low performance due to the bottleneck of having to call into the Python interpreter for each edge. In this work, we use the Selective Embedded JIT Specialization (SEJITS) approach to automatically translate filters defined by programmers into a lower-level efficiency language, bypassing the upcall into Python. We evaluate our approach by comparing it with the high-performance C++/MPI Combinatorial BLAS engine, and show that the productivity gained by using a high-level filtering language comes without sacrificing performance.
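
The two strategies contrasted above, filtering on the fly versus materializing a subgraph first, can be illustrated with a minimal pure-Python sketch. This is not the KDT API and it does not show SEJITS translation; the toy graph, the attribute names (`kind`, `weight`), and the functions `is_follow`, `bfs_filtered`, and `bfs_materialized` are all hypothetical names chosen for illustration. The sketch only shows where the per-edge predicate call sits in each approach.

```python
from collections import deque

# Hypothetical toy semantic graph: each edge carries an attribute dict.
EDGES = [
    (0, 1, {"kind": "follows", "weight": 3}),
    (1, 2, {"kind": "retweet", "weight": 1}),
    (0, 2, {"kind": "retweet", "weight": 5}),
    (2, 3, {"kind": "follows", "weight": 2}),
]


def is_follow(attrs):
    """Example filter predicate (an assumption): keep only 'follows' edges."""
    return attrs["kind"] == "follows"


def build_adjacency(edges):
    """Build a directed adjacency map: vertex -> list of (neighbor, attrs)."""
    adj = {}
    for src, dst, attrs in edges:
        adj.setdefault(src, []).append((dst, attrs))
        adj.setdefault(dst, [])
    return adj


def bfs_filtered(adj, source, keep):
    """Filtered approach: call the predicate on each edge as it is examined."""
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, attrs in adj[u]:
            # One predicate call per examined edge; no second graph is stored.
            if keep(attrs) and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def bfs_materialized(edges, source, keep):
    """Materialized approach: copy the passing edges first, then traverse."""
    sub = build_adjacency([e for e in edges if keep(e[2])])
    sub.setdefault(source, [])  # the source may have no surviving edges
    return bfs_filtered(sub, source, lambda attrs: True)


ADJ = build_adjacency(EDGES)
REACH_FILTERED = bfs_filtered(ADJ, 0, is_follow)
REACH_MATERIALIZED = bfs_materialized(EDGES, 0, is_follow)
```

Both traversals reach the same vertex set. The filtered version avoids storing a second copy of the graph but pays for an interpreted predicate call on every examined edge, which is the cost the abstract attributes to the upcall into Python.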