PPAA '14: latest articles

Maximal clique enumeration for large graphs on Hadoop framework

PPAA '14 Pub Date : 2014-02-16 DOI: 10.1145/2567634.2567640

N. Dasari, D. Ranjan, M. Zubair

Abstract: The maximal clique enumeration (MCE) problem for very large graphs appears in many critical applications, such as community detection in social networks, aligning 3D protein sequences, finding motifs in genomic data, identifying co-expressed genes, and data analytics in communication networks. Graphs with billions of nodes and edges are not unusual in these applications. The MCE problem is NP-hard, but a number of sequential and parallel algorithms have been proposed that work efficiently on real graphs. Besides the large size of the input graphs, MCE algorithms generally produce large intermediate data, which makes efficient processing even more challenging. A recently proposed approach, referred to as pbitMCE, has been shown to outperform or match the existing approaches. It relies on a degeneracy ordering of the vertices, which plays a vital role in the performance of the algorithm. A degeneracy ordering can be generated in linear time, but finding it in a distributed environment is challenging because it requires extensive communication between the nodes. In some cases generating the ordering takes a significant amount of time, and a different ordering, such as ordering by degree, can be a better choice. In this paper we experimentally study the impact of various vertex orderings on the performance of an MCE algorithm in the context of the MapReduce framework. We present an implementation of pbitMCE using MapReduce that takes a large graph and an ordering of vertices as input and enumerates all the maximal cliques. To support the study, we present experimental results on various graphs using different orderings. The results show that the degree ordering performs comparably to the degeneracy ordering in most cases, while it performs worse on large graphs.
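The role of the vertex ordering described in the abstract can be illustrated with a small sequential sketch. This is not pbitMCE and not a MapReduce implementation: it is a plain Bron-Kerbosch enumeration with pivoting, written under the assumption that the graph fits in memory as a dict mapping each vertex to a set of neighbours. The function names (`degeneracy_ordering`, `degree_ordering`, `maximal_cliques`) are hypothetical and chosen only for this example. The bucket-based degeneracy routine is kept simple and is not tuned to meet the linear-time bound mentioned above.

```python
def degeneracy_ordering(adj):
    """Order vertices by repeatedly removing a vertex of minimum remaining degree."""
    degree = {v: len(ns) for v, ns in adj.items()}
    max_deg = max(degree.values(), default=0)
    # buckets[d] holds the not-yet-removed vertices whose remaining degree is d
    buckets = [set() for _ in range(max_deg + 1)]
    for v, d in degree.items():
        buckets[d].add(v)
    order, removed = [], set()
    for _ in range(len(adj)):
        d = next(i for i, b in enumerate(buckets) if b)  # smallest non-empty bucket
        v = buckets[d].pop()
        order.append(v)
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                buckets[degree[u]].discard(u)
                degree[u] -= 1
                buckets[degree[u]].add(u)
    return order


def degree_ordering(adj):
    """Cheaper alternative: sort vertices by their degree in the input graph."""
    return sorted(adj, key=lambda v: (len(adj[v]), v))


def _expand(adj, R, P, X, out):
    """Bron-Kerbosch with pivoting: R is the current clique, P candidates, X excluded."""
    if not P and not X:
        out.append(frozenset(R))
        return
    pivot = max(P | X, key=lambda u: len(P & adj[u]))
    for v in list(P - adj[pivot]):
        _expand(adj, R | {v}, P & adj[v], X & adj[v], out)
        P = P - {v}
        X = X | {v}


def maximal_cliques(adj, order):
    """Enumerate every maximal clique once, rooted at its earliest vertex in `order`."""
    pos = {v: i for i, v in enumerate(order)}
    out = []
    for v in order:
        later = {u for u in adj[v] if pos[u] > pos[v]}
        earlier = {u for u in adj[v] if pos[u] < pos[v]}
        _expand(adj, {v}, later, earlier, out)
    return out
```

Any total order of the vertices yields the same set of cliques; the ordering only changes how large each per-vertex subproblem (`later`) becomes, which is the cost the paper measures when it compares degree and degeneracy orderings.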

Citations: 8

High-speed graph analytics with the Galois system

PPAA '14 Pub Date : 2014-02-16 DOI: 10.1145/2567634.2567648

K. Pingali

Citations: 8

Graphs & networks: computing and analytics at Lincoln Laboratory

PPAA '14 Pub Date : 2014-02-16 DOI: 10.1145/2567634.2567647

R. Bond

Citations: 0

Rigorous specification and low-latency implementation of technical market indicators

PPAA '14 Pub Date : 2014-02-16 DOI: 10.1145/2567634.2567636

K. Bakanov, I. Spence, H. Vandierendonck, C. Gillan

Citations: 0

Future directions in analytic applications

PPAA '14 Pub Date : 2014-02-16 DOI: 10.1145/2567634.2567645

E. Baranoski

Citations: 0

Load balanced clustering coefficients

PPAA '14 Pub Date : 2014-02-16 DOI: 10.1145/2567634.2567635

Oded Green, Lluís-Miquel Munguía, David A. Bader

Citations: 20

Active workflow system for near real-time extreme-scale science

PPAA '14 Pub Date : 2014-02-16 DOI: 10.1145/2567634.2567637

Yanwei Zhang, Qing Liu, S. Klasky, M. Wolf, K. Schwan, G. Eisenhauer, J. Choi, N. Podhorszki

Abstract: In recent years, streaming-based data processing has been gaining substantial traction for dealing with the overwhelming data generated by real-time applications, from both enterprise sources and scientific computing. In this work, however, we look at an emerging class of scientific data with a Near Real-Time (NRT) requirement, in which data is typically generated in a bursty fashion, with the near real-time constraints applied primarily between bursts rather than within a stream. A key challenge for this type of data source is that the processing time per data element is not uniform and is not always feasible to predict. Given the increasing unpredictability of compute load and system dynamics, this work looks to adapt the streaming-based approach to this new class of large experiments and simulations that have complex run-time control and analysis issues. In particular, we deploy a novel two-tier scheme for handling the increasing unpredictability of runtime behaviors: instead of relying only on determining what and where to run the scientific workflows beforehand or partially dynamically, the decision is also adaptively enhanced online according to system runtime status. This is enabled by embedding the workflow along with the data streams. Specifically, we break data outputs generated from experiments or simulations into multiple self-describing "chunks", which we call active data objects. If a transient hotspot is observed, a data object with an unfinished workflow pipeline can break its previous schedule and search for the least loaded location to continue execution. Our preliminary experimental results, based on synthetic workloads, show the proposed active workflow system to be a very promising solution: it outperforms state-of-the-art semi-dynamic workflow schedulers with an improved workflow completion time, and it scales well.

Citations: 3

Cognitive computing journey

PPAA '14 Pub Date : 2014-02-16 DOI: 10.1145/2567634.2567646

D. Nahamoo

Abstract: Building intelligent machines has long been a dream of humanity. While the journey has been difficult and slow, progress in Machine Learning and Optimization Techniques and advances in Deep Belief Networks offer promising ways to engineer cognitive systems. The science behind cognitive computing seeks to develop systems that emulate human brain functions such as perception, knowledge accumulation, goal planning, and logical inference. Cognitive systems will operate at a speed and an informational capacity that far exceed human capability. They will act as advisor, partner, helpmate, and co-creator to humans, collaborating on human terms. Cognitive computing is a fundamentally new computing paradigm for tackling real-world problems, exploiting enormous amounts of information using massively parallel machines that interact with humans and other cognitive systems. Cognitive systems will bring human-like reasoning to the problems of Big Data, and will also permit us to expand into the white space of domains that require human-like cognition but that either exceed human capacity or are impossible for a live human presence. In this talk, I will review the past progress and discuss the future challenges. I will address the architectural challenges of building a general-purpose system of systems that can learn, reason, and interact in a natural, human way.

Citations: 7

A performance evaluation of open source graph databases

PPAA '14 Pub Date : 2014-02-16 DOI: 10.1145/2567634.2567638

R. McColl, David Ediger, Jason A. Poovey, D. Campbell, David A. Bader

Citations: 94