Proceedings of the 2nd International Workshop on Network Data Analytics: latest articles

SCAN-XP: Parallel Structural Graph Clustering Algorithm on Intel Xeon Phi Coprocessors

Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068949

Tomokatsu Takahashi, Hiroaki Shiokawa, H. Kitagawa

Abstract: The structural graph clustering method SCAN, proposed by Xu et al., is successfully used in many applications because it not only detects densely connected nodes as clusters but also extracts sparsely connected nodes as hubs or outliers. However, it is difficult to apply SCAN to large-scale graphs because SCAN needs to evaluate the density for all adjacent nodes in the given graph. To address this problem, we present a novel algorithm, SCAN-XP, that runs on Intel Xeon Phi. We designed SCAN-XP to make the best use of the hardware potential of Intel Xeon Phi through the following approaches. First, SCAN-XP avoids the bottlenecks that arise from parallel graph computations by providing good load balance among the cores of the Intel Xeon Phi. Second, SCAN-XP effectively exploits the 512-bit SIMD instructions implemented in the Intel Xeon Phi to speed up the density evaluations. As a result, SCAN-XP detects clusters, hubs, and outliers in large-scale graphs with much shorter computation time than SCAN. Specifically, SCAN-XP runs approximately 100 times faster than SCAN; for graphs with 100 million edges, SCAN-XP finishes in a few seconds. Extensive evaluations on real-world graphs demonstrate the performance superiority of SCAN-XP over existing approaches.
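
The density evaluation mentioned in the abstract can be illustrated with the structural similarity from the original SCAN formulation by Xu et al. The sketch below is a minimal sequential Python illustration under that assumption. It is not SCAN-XP's parallel or SIMD implementation, and the names `structural_similarity`, `find_cores`, `eps`, and `mu`, as well as the toy graph, are hypothetical choices made for this example.

```python
from math import sqrt


def structural_similarity(adj, u, v):
    """sigma(u, v) = |N[u] & N[v]| / sqrt(|N[u]| * |N[v]|), using closed neighborhoods."""
    nu = adj[u] | {u}  # closed neighborhood: neighbors plus the node itself
    nv = adj[v] | {v}
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def find_cores(adj, eps, mu):
    """Return nodes whose eps-neighborhood (including themselves) has at least mu members."""
    cores = set()
    for u in adj:
        # sigma(u, u) = 1, so u always counts toward its own eps-neighborhood
        similar = 1 + sum(
            1 for v in adj[u] if structural_similarity(adj, u, v) >= eps
        )
        if similar >= mu:
            cores.add(u)
    return cores


# Hypothetical toy graph: a triangle {0, 1, 2} with a pendant node 3 attached to node 2
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cores = find_cores(adj, eps=0.8, mu=3)
```

Every adjacent pair needs one set intersection, which is why the cost grows with the number of edges and why the abstract singles this step out as the bottleneck on large graphs.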

Citations: 26

Proceedings of the 2nd International Workshop on Network Data Analytics

Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943

Akhil Arora, Shourya Roy, A. Bhattacharya

Abstract: We are delighted to present the papers from the 2nd NDA Workshop on Network Data Analytics, which took place on 19 May 2017, co-located with the ACM SIGMOD conference in Chicago, Illinois, USA.

Networks are prevalent in today's electronic world in a wide variety of domains ranging from engineering to social sciences, life sciences, physical sciences, and so on. Researchers and practitioners have studied networks in multiple ways, such as defining network metrics, providing theoretical results, and examining problems like pattern mining and link prediction. The NDA workshop is a forum for exchanging ideas and methods for mining, querying, and learning with real-world networks, developing new common understandings of the problems at hand, sharing data sets where applicable, and leveraging existing knowledge from different disciplines. The purpose of this workshop is to bring together researchers from academia, industry, and government to create a forum for discussing recent advances in (large-scale) graph analysis, as well as to propose and discuss novel methods and techniques for addressing domain-specific challenges and handling noise in real-world graphs.

Citations: 0

Construction of Structured Heterogeneous Networks from Massive Text Data: Extended Abstract

Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068944

Jiawei Han

Citations: 0

Repairing Noisy Graphs

Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068945

D. Srivastava

Abstract: Graphs are a flexible way to represent data in a variety of applications, with nodes representing domain-specific entities (e.g., records in record linkage, products and types in an ontology) and edges capturing a variety of relationships between these entities (e.g., an equivalence relationship between records in record linkage, a type-subtype relationship between types in an ontology). Often, the edges in this graph are noisy, in that some edges are missing (i.e., real-world relationships that do not have corresponding edges in the graph) and some edges are spurious (i.e., edges in the graph that do not have corresponding real-world relationships). Directly analyzing noisy graphs can lead to undesirable outcomes, making it important to repair noisy graphs. In this talk, we describe an approach that takes advantage of properties of real-world relationships and their estimated probabilities to ask oracle queries (an abstraction of crowdsourcing) to efficiently repair the noisy graphs. We illustrate this approach for the case of graphs that are unions of cliques (which is the case for record linkage) and graphs that are trees (which is the case for ontologies), and present theoretical and empirical results for these cases. This is joint work with Donatella Firmani, Sainyam Galhotra and Barna Saha.

Citations: 0

Performance Prediction for Graph Queries

Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068947

M. Namaki, K. Sasani, Yinghui Wu, A. Gebremedhin

Abstract: Query performance prediction has shown benefits to query optimization and resource allocation for relational databases. Emerging applications are leading to search scenarios where workloads with heterogeneous, structure-less analytical queries are processed over large-scale graph and network data. This calls for effective models to predict the performance of graph analytical queries, which are often more involved than their relational counterparts. In this paper, we study and evaluate predictive techniques for graph query performance prediction. We make several contributions. (1) We propose a general learning framework that makes use of practical and computationally efficient statistics from query scenarios and employs regression models. (2) We instantiate the framework with two routinely issued query classes, namely, reachability and graph pattern matching, that exhibit different query complexity. We develop modeling and learning algorithms for both query classes. (3) We show that our prediction models readily apply to resource-bounded querying, by providing a learning-based workload optimization strategy. Given a query workload and a time bound, the models select queries to be processed with a maximized query profit and a total cost within the bound. Using real-world graphs, we experimentally demonstrate the efficacy of our framework in terms of accuracy and the effectiveness of workload optimization.
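
The resource-bounded selection described in contribution (3) has the shape of a 0/1 knapsack problem: pick queries so that total predicted cost stays within the time bound while total profit is maximized. The abstract does not say how the authors solve it, so the sketch below is only a textbook dynamic program under the assumption that predicted costs are non-negative integers (for example, rounded milliseconds). The function name `select_queries` and the sample numbers are hypothetical.

```python
def select_queries(costs, profits, time_bound):
    """0/1 knapsack over integer predicted costs; returns (best_profit, chosen_indices)."""
    n = len(costs)
    # best[i][t] = maximum profit using only the first i queries within time budget t
    best = [[0] * (time_bound + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        c, p = costs[i - 1], profits[i - 1]
        for t in range(time_bound + 1):
            best[i][t] = best[i - 1][t]  # option 1: skip query i - 1
            if c <= t and best[i - 1][t - c] + p > best[i][t]:
                best[i][t] = best[i - 1][t - c] + p  # option 2: run query i - 1
    # Walk back through the table to recover which queries were selected
    chosen, t = [], time_bound
    for i in range(n, 0, -1):
        if best[i][t] != best[i - 1][t]:
            chosen.append(i - 1)
            t -= costs[i - 1]
    return best[n][time_bound], sorted(chosen)


# Hypothetical workload: predicted costs and profits for three queries, time bound of 5
profit, chosen = select_queries(costs=[3, 4, 2], profits=[4, 5, 3], time_bound=5)
```

In this toy workload the first and third queries fit the bound together and beat the single most profitable query, which is the kind of trade-off a cost predictor makes possible before any query is run.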

Citations: 4

Using Graphical Features To Improve Demographic Prediction From Smart Phone Data

Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068948

S. Akter, L. Holder

Abstract: Demographic information such as gender, age, ethnicity, level of education, disabilities, employment, and socio-economic status is important in the areas of social science, surveys, and marketing. However, it is difficult to obtain demographic information from users because of their reluctance to participate and low response rates. Through automated demographic prediction from smart phone sensor data, researchers can obtain this valuable information in a nonintrusive and cost-effective manner. We approach the problem of demographic prediction, namely, classification of gender, age group, and job type, through the use of a graphical-feature-based framework. The framework represents information collected from sensor networks as graphs, extracts useful and relevant graphical features, and predicts demographic information. We evaluated our approach on the Nokia Mobile Phone dataset for the three classification tasks: gender, age group, and job type. Our approach produced results comparable to most state-of-the-art methods, with the additional advantage of general applicability to sensor networks without sophisticated and application-specific feature generation techniques, background knowledge, or special techniques to address class imbalance.

Citations: 5

Graph Mining to Characterize Competition for Employment

Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068946

A. Toulis, Lukasz Golab

Citations: 6

Web and Social Media Analytics towards Enhancing Urban Transportations: A Case for Bangalore

Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068950

Manjira Sinha, P. Varma, Tridib Mukherjee

Abstract: Cities today are typically plagued by multiple issues such as traffic jams, garbage, transit overload, public safety, and drainage. Citizens tend to discuss these issues widely in public forums, social media, and web blogs. Given that issues related to public transportation are the most actively reported across web-based sources, we present a holistic framework for the collection, categorization, aggregation, and visualization of urban public transportation issues. The primary challenges in deriving useful insights from web-based sources stem from (a) the number of reports; (b) incomplete or implicit spatio-temporal context; and (c) the unstructured nature of text in these reports. This paper provides text categorization techniques that can be adopted to address these specific challenges. The work starts from the formal complaint data of the largest public transportation agency in Bangalore, complemented by complaint reports from web-based and social media sources. An easy-to-navigate and well-organized dashboard is developed for efficient visualization. The dashboard is currently being piloted with the largest transportation agency in Bangalore.

Citations: 2