Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining最新文献

筛选
英文 中文
Network-Wide Traffic States Imputation Using Self-interested Coalitional Learning 基于自感兴趣联合学习的全网络流量状态归算
Huiling Qin, Xianyuan Zhan, Yuanxun Li, Xiaodu Yang, Yu Zheng
{"title":"Network-Wide Traffic States Imputation Using Self-interested Coalitional Learning","authors":"Huiling Qin, Xianyuan Zhan, Yuanxun Li, Xiaodu Yang, Yu Zheng","doi":"10.1145/3447548.3467424","DOIUrl":"https://doi.org/10.1145/3447548.3467424","url":null,"abstract":"Accurate network-wide traffic state estimation is vital to many transportation operations and urban applications. However, existing methods often suffer from the scalability issue when performing real-time inference at the city-level, or not robust enough under limited data. Currently, GPS trajectory data from probe vehicles has become a popular data source for many transportation applications. GPS trajectory data has large coverage area, which is ideal for network-wide applications, but also has the disadvantage of being sparse and highly heterogeneous among different time and locations. In this study, we focus on developing a robust and interpretable network-wide traffic state imputation framework using partially observed traffic information. We introduce a new learning strategy, called self-interested coalitional learning (SCL), which forges cooperation between a main self-interested semi-supervised learning task and a discriminator as a critic to facilitate main task training while providing interpretability on the results. In our detailed model, we use a temporal graph convolutional variational autoencoder (TG-VAE) as the reconstructor, which models the complex spatio-temporal pattern in data and solves the main traffic state imputation task. A discriminator is introduced to output interpretable imputation confidence on the estimated results and also help to enhance the performance of the reconstructor. The framework is evaluated using a large GPS trajectory dataset from taxis in Jinan, China. Extensive experiments against the state-of-the-art baselines demonstrate the effectiveness and robustness of the proposed method for network-wide traffic state estimation.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116380950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 19
Physical Equation Discovery Using Physics-Consistent Neural Network (PCNN) Under Incomplete Observability 不完全可观测条件下物理一致神经网络(PCNN)的物理方程发现
Haoran Li, Yang Weng
{"title":"Physical Equation Discovery Using Physics-Consistent Neural Network (PCNN) Under Incomplete Observability","authors":"Haoran Li, Yang Weng","doi":"10.1145/3447548.3467448","DOIUrl":"https://doi.org/10.1145/3447548.3467448","url":null,"abstract":"Deep neural networks (DNNs) have been extensively applied to various fields, including physical-system monitoring and control. However, the requirement of a high confidence level in physical systems made system operators hard to trust black-box type DNNs. For example, while DNN can perform well at both training data and testing data, but when the physical system changes its operation points at a completely different range, never appeared in the history records, DNN can fail. To open the black box as much as possible, we propose a Physics-Consistent Neural Network (PCNN) for physical systems with the following properties: (1) PCNN can be shrunk to physical equations for sub-areas with full observability, (2) PCNN reduces unobservable areas into some virtual nodes, leading to a reduced network. Thus, for such a network, PCNN can also represent its underlying physical equation via a specifically designed deep-shallow hierarchy, and (3) PCNN is theoretically proved that the shallow NN in the PCNN is convex with respect to physical variables, leading to a set of convex optimizations to seek for the physics-consistent initial guess for the PCNN. We also develop a physical rule-based approach for initial guesses, significantly shortening the searching time for large systems. Comprehensive experiments on diversified systems are implemented to illustrate the outstanding performance of our PCNN.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122852043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Analysis of Faces in a Decade of US Cable TV News 十年来美国有线电视新闻的面孔分析
James Hong, Will Crichton, Haotian Zhang, Daniel Y. Fu, Jacob Ritchie, Jeremy Barenholtz, Ben Hannel, Xinwei Yao, Michaela Murray, Geraldine Moriba, Maneesh Agrawala, K. Fatahalian
{"title":"Analysis of Faces in a Decade of US Cable TV News","authors":"James Hong, Will Crichton, Haotian Zhang, Daniel Y. Fu, Jacob Ritchie, Jeremy Barenholtz, Ben Hannel, Xinwei Yao, Michaela Murray, Geraldine Moriba, Maneesh Agrawala, K. Fatahalian","doi":"10.1145/3447548.3467134","DOIUrl":"https://doi.org/10.1145/3447548.3467134","url":null,"abstract":"Cable (TV) news reaches millions of US households each day. News stakeholders such as communications researchers, journalists, and media monitoring organizations are interested in the visual content of cable news, especially who is on-screen. Manual analysis, however, is labor intensive and limits the size of prior studies. We conduct a large-scale, quantitative analysis of the faces in a decade of cable news video from the top three US cable news networks (CNN, FOX, and MSNBC), totaling 244,038 hours between January 2010 and July 2019. Our work uses technologies such as automatic face and gender recognition to measure the \"screen time\" of faces and to enable visual analysis and exploration at scale. Our analysis method gives insight into a broad set of socially relevant topics. For instance, male-presenting faces receive much more screen time than female-presenting faces (2.4x in 2010, 1.9x in 2019). To make our dataset and annotations accessible, we release a public interface at https://tvnews.stanford.edu that allows the general public to write queries and to perform their own analyses.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123031749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
A Hyper-surface Arrangement Model of Ranking Distributions 排序分布的超曲面排列模型
S. Kaji, Akira Horiguchi, T. Abe, Yohsuke Watanabe
{"title":"A Hyper-surface Arrangement Model of Ranking Distributions","authors":"S. Kaji, Akira Horiguchi, T. Abe, Yohsuke Watanabe","doi":"10.1145/3447548.3467253","DOIUrl":"https://doi.org/10.1145/3447548.3467253","url":null,"abstract":"A distribution on the permutations over a fixed finite set is called a ranking distribution. Modelling ranking distributions is one of the major topics in preference learning as such distributions appear as the ranking data produced by many judges. In this paper, we propose a geometric model for ranking distributions. Our idea is to use hyper-surface arrangements in a metric space as the representation space, where each component cut out by hyper-surfaces corresponds to a total ordering, and its volume is proportional to the probability. In this setting, the union of components corresponds to a partial ordering and its probability is also estimated by the volume. Similarly, the probability of a partial ordering conditioned by another partial ordering is estimated by the ratio of volumes. We provide a simple iterative algorithm to fit our model to a given dataset. We show our model can represent the distribution of a real-world dataset faithfully and can be used for prediction and visualisation purposes.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129499050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Color-blind 3-Approximation for Chromatic Correlation Clustering and Improved Heuristics 色相关聚类的色盲3-逼近及改进启发式
Nicolas Klodt, Lars Seifert, Arthur Zahn, Katrin Casel, Davis Issac, T. Friedrich
{"title":"A Color-blind 3-Approximation for Chromatic Correlation Clustering and Improved Heuristics","authors":"Nicolas Klodt, Lars Seifert, Arthur Zahn, Katrin Casel, Davis Issac, T. Friedrich","doi":"10.1145/3447548.3467446","DOIUrl":"https://doi.org/10.1145/3447548.3467446","url":null,"abstract":"Chromatic Correlation Clustering (CCC) models clustering of objects with categorical pairwise relationships. The model can be viewed as clustering the vertices of a graph with edge-labels (colors). Bonchi et al. [KDD 2012] introduced it as a natural generalization of the well studied problem Correlation Clustering (CC), motivated by real-world applications from data-mining, social networks and bioinformatics. We give theoretical as well as practical contributions to the study of CCC. Our main theoretical contribution is an alternative analysis of the famous Pivot algorithm for CC. We show that, when simply run color-blind, Pivot is also a linear time 3-approximation for CCC. The previous best theoretical results for CCC were a 4-approximation with a high-degree polynomial runtime and a linear time 11-approximation, both by Anava et al. [WWW 2015]. While this theoretical result justifies Pivot as a baseline comparison for other heuristics, its blunt color-blindness performs poorly in practice. We develop a color-sensitive, practical heuristic we call Greedy Expansion that empirically outperforms all heuristics proposed for CCC so far, both on real-world and synthetic instances. Further, we propose a novel generalization of CCC allowing for multi-labelled edges. We argue that it is more suitable for many of the real-world applications and extend our results to this model.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129687125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
2021 KDD Workshop on Understanding Public Perceptions for Applied Data Science: How Important is it to Engage Society in Technology Development? 了解公众对应用数据科学的看法:让社会参与技术开发有多重要?
ChanGhee Koh, SoYoung Kim, Nathaniel Tan
{"title":"2021 KDD Workshop on Understanding Public Perceptions for Applied Data Science: How Important is it to Engage Society in Technology Development?","authors":"ChanGhee Koh, SoYoung Kim, Nathaniel Tan","doi":"10.1145/3447548.3469459","DOIUrl":"https://doi.org/10.1145/3447548.3469459","url":null,"abstract":"The International Workshop on \"Understanding Public Perceptions for Applied Data Science\" (UPP4DS'21) presents interdisciplinary contributions that address AI trustworthiness from the perspective of the public as end-users. With healthcare as the context for discussion, the workshop will bring together leading experts in applied data science, healthcare, public policy, and social sciences to examine the role of public trust and public engagement in the development of healthcare AI applications. The workshop will also feature original research from the social sciences with significant implications in the adoption of healthcare technologies and invite open discussions on the role of society in the development of acceptable technologies.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128216040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
FedRS
Xin-Chun Li, De-chuan Zhan
{"title":"FedRS","authors":"Xin-Chun Li, De-chuan Zhan","doi":"10.1145/3447548.3467254","DOIUrl":"https://doi.org/10.1145/3447548.3467254","url":null,"abstract":"Federated Learning (FL) aims to generate a global shared model via collaborating decentralized clients with privacy considerations. Unlike standard distributed optimization, FL takes multiple optimization steps on local clients and then aggregates the model updates via a parameter server. Although this significantly reduces communication costs, the non-iid property across heterogeneous devices could make the local update diverge a lot, posing a fundamental challenge to aggregation. In this paper, we focus on a special kind of non-iid scene, i.e., label distribution skew, where each client can only access a partial set of the whole class set. Considering top layers of neural networks are more task-specific, we advocate that the last classification layer is more vulnerable to the shift of label distribution. Hence, we in-depth study the classifier layer and point out that the standard softmax will encounter several problems caused by missing classes. As an alternative, we propose \"Restricted Softmax\" to limit the update of missing classes' weights during the local procedure. Our proposed FedRS is very easy to implement with only a few lines of code. We investigate our methods on both public datasets and a real-world service awareness application. Abundant experimental results verify the superiorities of our methods.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128680741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 53
Context-aware Outstanding Fact Mining from Knowledge Graphs 从知识图中挖掘上下文感知的杰出事实
Yueji Yang, Yuchen Li, Panagiotis Karras, A. Tung
{"title":"Context-aware Outstanding Fact Mining from Knowledge Graphs","authors":"Yueji Yang, Yuchen Li, Panagiotis Karras, A. Tung","doi":"10.1145/3447548.3467272","DOIUrl":"https://doi.org/10.1145/3447548.3467272","url":null,"abstract":"An Outstanding Fact (OF) is an attribute that makes a target entity stand out from its peers. The mining of OFs has important applications, especially in Computational Journalism, such as news promotion, fact-checking, and news story finding. However, existing approaches to OF mining: (i) disregard the context in which the target entity appears, hence may report facts irrelevant to that context; and (ii) require relational data, which are often unavailable or incomplete in many application domains. In this paper, we introduce the novel problem of mining Context-aware Outstanding Facts (COFs) for a target entity under a given context specified by a context entity. We propose FMiner, a context-aware mining framework that leverages knowledge graphs (KGs) for COF mining. FMiner generates COFs in two steps. First, it discovers top-k relevant relationships between the target and the context entity from a KG. We propose novel optimizations and pruning techniques to expedite this operation, as this process is very expensive on large KGs due to its exponential complexity. Second, for each derived relationship, we find the attributes of the target entity that distinguish it from peer entities that have the same relationship with the context entity, yielding the top-l COFs. As such, the mining process is modeled as a top-(k,l) search problem. Context-awareness is ensured by relying on the relevant relationships with the context entity to derive peer entities for COF extraction. Consequently, FMiner can effectively navigate the search to obtain context-aware OFs by incorporating a context entity. We conduct extensive experiments, including a user study, to validate the efficiency and the effectiveness of FMiner.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129333706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
MixGCF
Tinglin Huang, Yuxiao Dong, Ming Ding, Zhen Yang, Wenzheng Feng, Xinyu Wang, Jie Tang
{"title":"MixGCF","authors":"Tinglin Huang, Yuxiao Dong, Ming Ding, Zhen Yang, Wenzheng Feng, Xinyu Wang, Jie Tang","doi":"10.1145/3447548.3467408","DOIUrl":"https://doi.org/10.1145/3447548.3467408","url":null,"abstract":"Graph neural networks (GNNs) have recently emerged as state-of-the-art collaborative filtering (CF) solution. A fundamental challenge of CF is to distill negative signals from the implicit feedback, but negative sampling in GNN-based CF has been largely unexplored. In this work, we propose to study negative sampling by leveraging both the user-item graph structure and GNNs' aggregation process. We present the MixGCF method---a general negative sampling plugin that can be directly used to train GNN-based recommender systems. In MixGCF, rather than sampling raw negatives from data, we design the hop mixing technique to synthesize hard negatives. Specifically, the idea of hop mixing is to generate the synthetic negative by aggregating embeddings from different layers of raw negatives' neighborhoods. The layer and neighborhood selection process are optimized by a theoretically-backed hard selection strategy. Extensive experiments demonstrate that by using MixGCF, state-of-the-art GNN-based recommendation models can be consistently and significantly improved, e.g., 26% for NGCF and 22% for LightGCN in terms of NDCG@20.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130233884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 83
Graph Deep Factors for Forecasting with Applications to Cloud Resource Allocation 深度因子图在云资源分配预测中的应用
Hongjie Chen, Ryan A. Rossi, K. Mahadik, Sungchul Kim, Hoda Eldardiry
{"title":"Graph Deep Factors for Forecasting with Applications to Cloud Resource Allocation","authors":"Hongjie Chen, Ryan A. Rossi, K. Mahadik, Sungchul Kim, Hoda Eldardiry","doi":"10.1145/3447548.3467357","DOIUrl":"https://doi.org/10.1145/3447548.3467357","url":null,"abstract":"Deep probabilistic forecasting techniques have recently been proposed for modeling large collections of time-series. However, these techniques explicitly assume either complete independence (local model) or complete dependence (global model) between time-series in the collection. This corresponds to the two extreme cases where every time-series is disconnected from every other time-series in the collection or likewise, that every time-series is related to every other time-series resulting in a completely connected graph. In this work, we propose a deep hybrid probabilistic graph-based forecasting framework called Graph Deep Factors (GraphDF) that goes beyond these two extremes by allowing nodes and their time-series to be connected to others in an arbitrary fashion. GraphDF is a hybrid forecasting framework that consists of a relational global and relational local model. In particular, we propose a relational global model that learns complex non-linear time-series patterns globally using the structure of the graph to improve both forecasting accuracy and computational efficiency. Similarly, instead of modeling every time-series independently, we learn a relational local model that not only considers its individual time-series but also the time-series of nodes that are connected in the graph. The experiments demonstrate the effectiveness of the proposed deep hybrid graph-based forecasting model compared to the state-of-the-art methods in terms of its forecasting accuracy, runtime, and scalability. Our case study reveals that GraphDF can successfully generate cloud usage forecasts and opportunistically schedule workloads to increase cloud cluster utilization by 47.5% on average.","PeriodicalId":421090,"journal":{"name":"Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129073654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信