Proceedings of the 2nd International Workshop on Network Data Analytics: Latest Articles

SCAN-XP: Parallel Structural Graph Clustering Algorithm on Intel Xeon Phi Coprocessors
Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068949
Tomokatsu Takahashi, Hiroaki Shiokawa, H. Kitagawa
{"title":"SCAN-XP: Parallel Structural Graph Clustering Algorithm on Intel Xeon Phi Coprocessors","authors":"Tomokatsu Takahashi, Hiroaki Shiokawa, H. Kitagawa","doi":"10.1145/3068943.3068949","DOIUrl":"https://doi.org/10.1145/3068943.3068949","url":null,"abstract":"The structural graph clustering method SCAN, proposed by Xu et al., is successfully used in many applications because it not only detects densely connected nodes as clusters but also extracts sparsely connected nodes as hubs or outliers. However, it is difficult to apply SCAN to large-scale graphs since SCAN needs to evaluate the density for all adjacent nodes included in the given graphs. In this paper, to address this problem, we present a novel algorithm, SCAN-XP, that runs on the Intel Xeon Phi. We designed SCAN-XP to make the best use of the hardware potential of the Intel Xeon Phi by employing the following approaches: First, SCAN-XP avoids the bottlenecks that arise from parallel graph computations by providing good load balance among cores on the Intel Xeon Phi. Second, SCAN-XP effectively exploits the 512-bit SIMD instructions implemented in the Intel Xeon Phi to speed up the density evaluations. As a result, SCAN-XP detects clusters, hubs, and outliers from large-scale graphs with much shorter computation time than SCAN. Specifically, SCAN-XP runs approximately 100 times faster than SCAN; for graphs with 100 million edges, SCAN-XP finishes in a few seconds. 
In this paper, extensive evaluations on real-world graphs demonstrate the performance superiority of SCAN-XP over existing approaches.","PeriodicalId":345682,"journal":{"name":"Proceedings of the 2nd International Workshop on Network Data Analytics","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114716419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 26
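The SCAN-XP abstract above refers to evaluating "the density for all adjacent nodes" without giving the formula. As a rough illustration only, the sketch below uses the structural similarity commonly attributed to the original SCAN method (shared closed neighbors divided by the geometric mean of the two closed-neighborhood sizes) together with the usual core-node test. The function names, the toy graph, and the parameter values `eps` and `mu` are assumptions made for this example. It is a sequential sketch and does not reflect the load balancing or SIMD techniques of SCAN-XP.

```python
import math


def structural_similarity(adj, u, v):
    """Similarity of u and v based on their closed neighborhoods.

    adj maps each node to the set of its neighbors (undirected graph).
    The closed neighborhood of a node includes the node itself.
    """
    gu = adj[u] | {u}
    gv = adj[v] | {v}
    return len(gu & gv) / math.sqrt(len(gu) * len(gv))


def is_core(adj, u, eps, mu):
    """True if u has at least mu nodes (itself included) with similarity >= eps."""
    closed = adj[u] | {u}
    similar = sum(1 for v in closed if structural_similarity(adj, u, v) >= eps)
    return similar >= mu


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2},
}

for node in sorted(adj):
    print(node, is_core(adj, node, eps=0.7, mu=3))
```

In this toy graph the triangle nodes pass the core test while the pendant node does not, which mirrors the abstract's distinction between densely connected clusters and sparsely connected hubs or outliers.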
Proceedings of the 2nd International Workshop on Network Data Analytics
Akhil Arora, Shourya Roy, A. Bhattacharya
{"title":"Proceedings of the 2nd International Workshop on Network Data Analytics","authors":"Akhil Arora, Shourya Roy, A. Bhattacharya","doi":"10.1145/3068943","DOIUrl":"https://doi.org/10.1145/3068943","url":null,"abstract":"We are delighted to present the papers from the 2nd NDA Workshop on Network Data Analytics, which took place on 19th May, 2017 co-located with the ACM SIGMOD conference in Chicago, Illinois, USA. Networks are prevalent in today's electronic world in a wide variety of domains ranging from engineering to social sciences, life sciences, physical sciences, and so on. Researchers and practitioners have studied networks in multiple ways like defining network metrics, providing theoretical results and examining problems like pattern mining, link prediction, etc. The NDA workshop is a forum for exchanging ideas and methods for mining, querying and learning with real-world networks, developing new common understandings of the problems at hand, sharing of data sets where applicable, and leveraging existing knowledge from different disciplines. 
The purpose of this workshop is to bring together researchers from academia, industry, and government, to create a forum for discussing recent advances in (large-scale) graph analysis, as well as propose and discuss novel methods and techniques towards addressing domain specific challenges and handling noise in real-world graphs.","PeriodicalId":345682,"journal":{"name":"Proceedings of the 2nd International Workshop on Network Data Analytics","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114907488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Construction of Structured Heterogeneous Networks from Massive Text Data: Extended Abstract
Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068944
Jiawei Han
{"title":"Construction of Structured Heterogeneous Networks from Massive Text Data: Extended Abstract","authors":"Jiawei Han","doi":"10.1145/3068943.3068944","DOIUrl":"https://doi.org/10.1145/3068943.3068944","url":null,"abstract":"Network data analytics is important, powerful, and exciting. How big a role may network data analytics play in the real world? Much real-world data is unstructured, in the form of natural language text. A grand challenge in big data research is to develop effective and scalable methods to turn such massive text data into actionable knowledge. In order to turn such massive unstructured, text-rich, but interconnected data into knowledge, we propose a data-to-network-to-knowledge (D2N2K) paradigm, that is, first transform data into relatively structured heterogeneous information networks, and then mine such text-rich and structure-rich heterogeneous networks to generate useful knowledge. We argue that such a paradigm represents a promising direction and network data analytics will play an essential role in transforming data to knowledge. However, a critical bottleneck in this game is mining structures from text data. We present our recent progress on developing effective methods for mining structures from massive text data and constructing structured heterogeneous information networks.","PeriodicalId":345682,"journal":{"name":"Proceedings of the 2nd International Workshop on Network Data Analytics","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127391264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Repairing Noisy Graphs
Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068945
D. Srivastava
{"title":"Repairing Noisy Graphs","authors":"D. Srivastava","doi":"10.1145/3068943.3068945","DOIUrl":"https://doi.org/10.1145/3068943.3068945","url":null,"abstract":"Graphs are a flexible way to represent data in a variety of applications, with nodes representing domain-specific entities (e.g., records in record linkage, products and types in an ontology) and edges capturing a variety of relationships between these entities (e.g., an equivalence relationship between records in record linkage, a type-subtype relationship between types in an ontology). Often, the edges in this graph are noisy, in that some edges are missing (i.e., real-world relationships that do not have corresponding edges in the graph) and some edges are spurious (i.e., edges in the graph that do not have corresponding real-world relationships). Directly analyzing noisy graphs can lead to undesirable outcomes, making it important to repair noisy graphs. In this talk, we describe an approach that takes advantage of properties of real-world relationships and their estimated probabilities to ask oracle queries (an abstraction of crowdsourcing) to efficiently repair the noisy graphs. We illustrate this approach for the case of graphs that are unions of cliques (which is the case for record linkage) and graphs that are trees (which is the case for ontologies), and present theoretical and empirical results for these cases. 
This is joint work with Donatella Firmani, Sainyam Galhotra and Barna Saha.","PeriodicalId":345682,"journal":{"name":"Proceedings of the 2nd International Workshop on Network Data Analytics","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129437534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Performance Prediction for Graph Queries
Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068947
M. Namaki, K. Sasani, Yinghui Wu, A. Gebremedhin
{"title":"Performance Prediction for Graph Queries","authors":"M. Namaki, K. Sasani, Yinghui Wu, A. Gebremedhin","doi":"10.1145/3068943.3068947","DOIUrl":"https://doi.org/10.1145/3068943.3068947","url":null,"abstract":"Query performance prediction has shown benefits to query optimization and resource allocation for relational databases. Emerging applications are leading to search scenarios where workloads with heterogeneous, structure-less analytical queries are processed over large-scale graph and network data. This calls for effective models to predict the performance of graph analytical queries, which are often more involved than their relational counterparts. In this paper, we study and evaluate predictive techniques for graph query performance prediction. We make several contributions. (1) We propose a general learning framework that makes use of practical and computationally efficient statistics from query scenarios and employs regression models. (2) We instantiate the framework with two routinely issued query classes, namely, reachability and graph pattern matching, that exhibit different query complexity. We develop modeling and learning algorithms for both query classes. (3) We show that our prediction models readily apply to resource-bounded querying, by providing a learning-based workload optimization strategy. Given a query workload and a time bound, the models select queries to be processed with a maximized query profit and a total cost within the bound. 
Using real-world graphs, we experimentally demonstrate the efficacy of our framework in terms of accuracy and the effectiveness of workload optimization.","PeriodicalId":345682,"journal":{"name":"Proceedings of the 2nd International Workshop on Network Data Analytics","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128097124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
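The abstract above describes choosing queries from a workload so that total profit is maximized while total predicted cost stays within a time bound. The abstract does not say how the selection is computed. Purely as an illustration of the problem statement, the sketch below treats it as a textbook 0/1 knapsack solved by dynamic programming. The function name, the integer cost units, and the example workload are all hypothetical, and this is not claimed to be the authors' strategy.

```python
def select_queries(queries, time_bound):
    """Pick a subset of queries with maximum total profit within time_bound.

    queries: list of (name, predicted_cost, profit), with predicted_cost a
             positive integer number of time units (an assumption made here).
    Returns (best_profit, list_of_selected_names).
    """
    # best[t] holds the best (profit, names) achievable with total cost <= t.
    best = [(0, [])] * (time_bound + 1)
    for name, cost, profit in queries:
        # Go from high to low budgets so each query is used at most once.
        for t in range(time_bound, cost - 1, -1):
            candidate = best[t - cost][0] + profit
            if candidate > best[t][0]:
                best[t] = (candidate, best[t - cost][1] + [name])
    return best[time_bound]


# Hypothetical workload: (query name, predicted cost, profit).
workload = [("q1", 4, 10), ("q2", 3, 7), ("q3", 2, 6), ("q4", 5, 12)]
profit, chosen = select_queries(workload, time_bound=7)
print(profit, chosen)
```

In practice the predicted costs would come from the learned regression models the abstract mentions. Here they are fixed numbers so the selection step can be read in isolation.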
Using Graphical Features To Improve Demographic Prediction From Smart Phone Data
Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068948
S. Akter, L. Holder
{"title":"Using Graphical Features To Improve Demographic Prediction From Smart Phone Data","authors":"S. Akter, L. Holder","doi":"10.1145/3068943.3068948","DOIUrl":"https://doi.org/10.1145/3068943.3068948","url":null,"abstract":"Demographic information such as gender, age, ethnicity, level of education, disabilities, employment, and socio-economic status is important in the areas of social science, surveys, and marketing. However, it is difficult to obtain demographic information from users because of their reluctance to participate and low response rates. Through automated demographics prediction from smart phone sensor data, researchers can obtain this valuable information in a nonintrusive and cost-effective manner. We approach the problem of demographic prediction, namely, classification of gender, age group and job type, through the use of a graphical feature based framework. The framework represents information collected from sensor networks as graphs, extracts useful and relevant graphical features, and predicts demographic information. We evaluated our approach on the Nokia Mobile Phone dataset for the three classification tasks: gender, age-group and job-type. 
Our approach produced comparable results with most of the state of the art methods while having the additional advantage of general applicability to sensor networks without using sophisticated and application-specific feature generation techniques, background knowledge and special techniques to address class imbalance.","PeriodicalId":345682,"journal":{"name":"Proceedings of the 2nd International Workshop on Network Data Analytics","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126994089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Graph Mining to Characterize Competition for Employment
Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068946
A. Toulis, Lukasz Golab
{"title":"Graph Mining to Characterize Competition for Employment","authors":"A. Toulis, Lukasz Golab","doi":"10.1145/3068943.3068946","DOIUrl":"https://doi.org/10.1145/3068943.3068946","url":null,"abstract":"In this paper, we discuss a novel application of graph analytics to characterize competition in the workforce. We propose a methodology that relies on finding communities in a graph representing prospective employees (with edges connecting people who interviewed for the same job) and communities in a graph representing available jobs (with edges connecting jobs that interviewed the same person). We then apply the proposed methodology to a real dataset corresponding to cooperative internships offered to undergraduate students at a North American post-secondary institution, illustrating the benefits of our approach.","PeriodicalId":345682,"journal":{"name":"Proceedings of the 2nd International Workshop on Network Data Analytics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121310049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
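The abstract above describes two graphs: one over prospective employees, with edges connecting people who interviewed for the same job, and one over jobs, with edges connecting jobs that interviewed the same person. A minimal sketch of how such graphs could be built from interview records follows. The record format, the names, and the `project` helper are assumptions made for this example. The community-finding step mentioned in the abstract is not shown.

```python
from collections import defaultdict
from itertools import combinations


def project(pairs):
    """Connect left-hand items that share a right-hand item.

    pairs: iterable of (left, right) records.
    Returns a set of undirected edges (a, b) with a < b.
    """
    groups = defaultdict(set)
    for left, right in pairs:
        groups[right].add(left)
    edges = set()
    for members in groups.values():
        # Every pair of members of the same group gets one edge.
        for a, b in combinations(sorted(members), 2):
            edges.add((a, b))
    return edges


# Hypothetical interview records: (person, job).
interviews = [
    ("ann", "j1"), ("bob", "j1"),
    ("bob", "j2"), ("cy", "j2"),
    ("dee", "j3"),
]

# People linked by a shared job, and jobs linked by a shared person.
people_edges = project(interviews)
job_edges = project([(job, person) for person, job in interviews])
print(sorted(people_edges))
print(sorted(job_edges))
```

Either edge set could then be passed to a community detection routine of the reader's choice. The abstract does not specify which one the authors used.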
Web and Social Media Analytics towards Enhancing Urban Transportations: A Case for Bangalore
Proceedings of the 2nd International Workshop on Network Data Analytics Pub Date : 2017-05-14 DOI: 10.1145/3068943.3068950
Manjira Sinha, P. Varma, Tridib Mukherjee
{"title":"Web and Social Media Analytics towards Enhancing Urban Transportations: A Case for Bangalore","authors":"Manjira Sinha, P. Varma, Tridib Mukherjee","doi":"10.1145/3068943.3068950","DOIUrl":"https://doi.org/10.1145/3068943.3068950","url":null,"abstract":"Cities today are typically plagued by multiple issues such as traffic jams, garbage, transit overload, public safety, drainage, etc. Citizens today tend to discuss these issues widely in public forums, social media, and web blogs. Given that issues related to public transportation are most actively reported across web-based sources, we present a holistic framework for collection, categorization, aggregation and visualization of urban public transportation issues. The primary challenges in deriving useful insights from web-based sources stem from: (a) the number of reports; (b) incomplete or implicit spatio-temporal context; and (c) the unstructured nature of text in these reports. This paper provides the text categorization techniques that can be adopted to specifically address these challenges. The work begins with the formal complaint data from the largest public transportation agency in Bangalore, complemented by complaint reports from web-based and social media sources. An easy-to-navigate and well-organized dashboard is developed for efficient visualization. 
The dashboard is currently being piloted with the largest transportation agency in Bangalore.","PeriodicalId":345682,"journal":{"name":"Proceedings of the 2nd International Workshop on Network Data Analytics","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128342918","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2