Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining: Latest Papers, Page 2

Network-Wide Traffic States Imputation Using Self-interested Coalitional Learning

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-14 DOI: 10.1145/3447548.3467424

Huiling Qin, Xianyuan Zhan, Yuanxun Li, Xiaodu Yang, Yu Zheng

Abstract: Accurate network-wide traffic state estimation is vital to many transportation operations and urban applications. However, existing methods often suffer from scalability issues when performing real-time inference at the city level, or are not robust enough under limited data. GPS trajectory data from probe vehicles has become a popular data source for many transportation applications. Such data covers a large area, which is ideal for network-wide applications, but it is also sparse and highly heterogeneous across times and locations. In this study, we focus on developing a robust and interpretable network-wide traffic state imputation framework using partially observed traffic information. We introduce a new learning strategy, called self-interested coalitional learning (SCL), which forges cooperation between a main self-interested semi-supervised learning task and a discriminator that acts as a critic, facilitating main-task training while providing interpretability of the results. In our detailed model, we use a temporal graph convolutional variational autoencoder (TG-VAE) as the reconstructor, which models the complex spatio-temporal patterns in the data and solves the main traffic state imputation task. A discriminator is introduced to output an interpretable imputation confidence for the estimated results and to help enhance the performance of the reconstructor. The framework is evaluated using a large GPS trajectory dataset from taxis in Jinan, China. Extensive experiments against state-of-the-art baselines demonstrate the effectiveness and robustness of the proposed method for network-wide traffic state estimation.

Citations: 19

Physical Equation Discovery Using Physics-Consistent Neural Network (PCNN) Under Incomplete Observability

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-14 DOI: 10.1145/3447548.3467448

Haoran Li, Yang Weng

Abstract: Deep neural networks (DNNs) have been extensively applied to various fields, including physical-system monitoring and control. However, the high confidence level required in physical systems makes it hard for system operators to trust black-box DNNs. For example, a DNN can perform well on both training and testing data yet fail when the physical system shifts its operating points to a completely different range that never appeared in the historical records. To open the black box as much as possible, we propose a Physics-Consistent Neural Network (PCNN) for physical systems with the following properties: (1) PCNN can be shrunk to physical equations for sub-areas with full observability; (2) PCNN reduces unobservable areas into some virtual nodes, leading to a reduced network, so that PCNN can also represent the underlying physical equation of such a network via a specifically designed deep-shallow hierarchy; and (3) the shallow NN in the PCNN is theoretically proven to be convex with respect to physical variables, leading to a set of convex optimizations that seek a physics-consistent initial guess for the PCNN. We also develop a physical rule-based approach for initial guesses, significantly shortening the search time for large systems. Comprehensive experiments on diverse systems illustrate the outstanding performance of our PCNN.

Citations: 7

Analysis of Faces in a Decade of US Cable TV News

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-14 DOI: 10.1145/3447548.3467134

James Hong, Will Crichton, Haotian Zhang, Daniel Y. Fu, Jacob Ritchie, Jeremy Barenholtz, Ben Hannel, Xinwei Yao, Michaela Murray, Geraldine Moriba, Maneesh Agrawala, K. Fatahalian

Abstract: Cable (TV) news reaches millions of US households each day. News stakeholders such as communications researchers, journalists, and media monitoring organizations are interested in the visual content of cable news, especially who is on-screen. Manual analysis, however, is labor intensive and limits the size of prior studies. We conduct a large-scale, quantitative analysis of the faces in a decade of cable news video from the top three US cable news networks (CNN, FOX, and MSNBC), totaling 244,038 hours between January 2010 and July 2019. Our work uses technologies such as automatic face and gender recognition to measure the "screen time" of faces and to enable visual analysis and exploration at scale. Our analysis method gives insight into a broad set of socially relevant topics. For instance, male-presenting faces receive much more screen time than female-presenting faces (2.4x in 2010, 1.9x in 2019). To make our dataset and annotations accessible, we release a public interface at https://tvnews.stanford.edu that allows the general public to write queries and to perform their own analyses.

Citations: 8

A Hyper-surface Arrangement Model of Ranking Distributions

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-14 DOI: 10.1145/3447548.3467253

S. Kaji, Akira Horiguchi, T. Abe, Yohsuke Watanabe

Citations: 0

A Color-blind 3-Approximation for Chromatic Correlation Clustering and Improved Heuristics

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-14 DOI: 10.1145/3447548.3467446

Nicolas Klodt, Lars Seifert, Arthur Zahn, Katrin Casel, Davis Issac, T. Friedrich

Abstract: Chromatic Correlation Clustering (CCC) models clustering of objects with categorical pairwise relationships. The model can be viewed as clustering the vertices of a graph with edge labels (colors). Bonchi et al. [KDD 2012] introduced it as a natural generalization of the well-studied Correlation Clustering (CC) problem, motivated by real-world applications from data mining, social networks and bioinformatics. We make both theoretical and practical contributions to the study of CCC. Our main theoretical contribution is an alternative analysis of the famous Pivot algorithm for CC. We show that, when simply run color-blind, Pivot is also a linear-time 3-approximation for CCC. The previous best theoretical results for CCC were a 4-approximation with a high-degree polynomial runtime and a linear-time 11-approximation, both by Anava et al. [WWW 2015]. While this theoretical result justifies Pivot as a baseline comparison for other heuristics, its blunt color-blindness performs poorly in practice. We develop a color-sensitive, practical heuristic we call Greedy Expansion that empirically outperforms all heuristics proposed for CCC so far, on both real-world and synthetic instances. Further, we propose a novel generalization of CCC allowing for multi-labelled edges. We argue that it is more suitable for many of the real-world applications and extend our results to this model.
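The color-blind Pivot idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `color_blind_pivot`, the edge representation (a dict from `frozenset({u, v})` to a color), and in particular the rule for coloring each cluster (the most common color on the pivot's edges into its cluster) are assumptions made for illustration, since the abstract does not spell these details out. The graph is assumed to have no self-loops.

```python
import random
from collections import Counter


def color_blind_pivot(vertices, colored_edges, seed=0):
    """Cluster vertices with Pivot while ignoring edge colors.

    colored_edges maps frozenset({u, v}) -> color label (hypothetical format).
    Returns a list of (member_set, cluster_color) pairs.
    """
    rng = random.Random(seed)

    # Build adjacency from the edges, discarding the colors ("color-blind").
    neighbors = {v: set() for v in vertices}
    for edge in colored_edges:
        u, v = tuple(edge)
        neighbors[u].add(v)
        neighbors[v].add(u)

    order = list(vertices)
    rng.shuffle(order)  # visit pivots in random order
    unclustered = set(vertices)
    clusters = []
    for pivot in order:
        if pivot not in unclustered:
            continue
        # Pivot step: the pivot plus all of its still-unclustered neighbors.
        members = {pivot} | (neighbors[pivot] & unclustered)
        unclustered -= members
        # Assumption: pick the most common color among the pivot's edges
        # into the cluster; singleton clusters get no color.
        colors = Counter(
            colored_edges[frozenset((pivot, m))] for m in members if m != pivot
        )
        color = colors.most_common(1)[0][0] if colors else None
        clusters.append((members, color))
    return clusters


example_edges = {
    frozenset(("a", "b")): "red",
    frozenset(("a", "c")): "red",
    frozenset(("b", "c")): "blue",
}
for members, color in color_blind_pivot(["a", "b", "c", "d"], example_edges):
    print(sorted(members), color)
```

Each vertex and each edge is touched a bounded number of times, which is consistent with the linear-time behavior the abstract attributes to Pivot; the approximation guarantee itself comes from the paper's analysis and is not demonstrated by this sketch.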

Citations: 5

2021 KDD Workshop on Understanding Public Perceptions for Applied Data Science: How Important is it to Engage Society in Technology Development?

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-14 DOI: 10.1145/3447548.3469459

ChanGhee Koh, SoYoung Kim, Nathaniel Tan

Citations: 0

FedRS

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-14 DOI: 10.1145/3447548.3467254

Xin-Chun Li, De-chuan Zhan

Abstract: Federated Learning (FL) aims to generate a global shared model through collaboration among decentralized clients under privacy considerations. Unlike standard distributed optimization, FL takes multiple optimization steps on local clients and then aggregates the model updates via a parameter server. Although this significantly reduces communication costs, the non-iid property across heterogeneous devices can make the local updates diverge considerably, posing a fundamental challenge to aggregation. In this paper, we focus on a special kind of non-iid scenario, namely label distribution skew, where each client can only access a subset of the whole class set. Since the top layers of neural networks are more task-specific, we argue that the last classification layer is more vulnerable to shifts in label distribution. Hence, we study the classifier layer in depth and point out that the standard softmax encounters several problems caused by missing classes. As an alternative, we propose "Restricted Softmax" to limit the update of missing classes' weights during the local procedure. Our proposed FedRS is very easy to implement with only a few lines of code. We investigate our methods on both public datasets and a real-world service awareness application. Abundant experimental results verify the superiority of our methods.
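To make the "Restricted Softmax" idea more concrete, here is a minimal pure-Python sketch. The abstract does not give the formula, so the sketch assumes the restriction is realised by scaling the logits of locally missing classes by a factor `alpha` in [0, 1] before the softmax; the names `restricted_softmax`, `cross_entropy_logit_grads`, `observed`, and `alpha` are all hypothetical.

```python
import math


def restricted_softmax(logits, observed, alpha=0.5):
    """Softmax in which logits of classes absent on this client are scaled.

    observed: set of class indices present in the client's local data.
    alpha: assumed scaling factor in [0, 1] for missing classes.
    """
    scaled = [z if c in observed else alpha * z for c, z in enumerate(logits)]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_logit_grads(logits, label, observed, alpha=0.5):
    """Gradient of the cross-entropy loss with respect to the raw logits."""
    probs = restricted_softmax(logits, observed, alpha)
    grads = []
    for c, p in enumerate(probs):
        scale = 1.0 if c in observed else alpha
        # Chain rule through the scaling: missing classes get damped gradients.
        grads.append(scale * (p - (1.0 if c == label else 0.0)))
    return grads


# Class 2 is missing on this client, so its gradient is damped by alpha.
print(cross_entropy_logit_grads([2.0, 1.0, 3.0], label=0, observed={0, 1}))
```

Under this assumption, `alpha = 1` recovers the standard softmax, while `alpha = 0` stops the local loss from pushing on the missing classes' logits at all, which matches the stated goal of limiting updates to missing classes' weights during local training.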

Citations: 53

Context-aware Outstanding Fact Mining from Knowledge Graphs

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-14 DOI: 10.1145/3447548.3467272

Yueji Yang, Yuchen Li, Panagiotis Karras, A. Tung

Abstract: An Outstanding Fact (OF) is an attribute that makes a target entity stand out from its peers. The mining of OFs has important applications, especially in Computational Journalism, such as news promotion, fact-checking, and news story finding. However, existing approaches to OF mining: (i) disregard the context in which the target entity appears, hence may report facts irrelevant to that context; and (ii) require relational data, which are often unavailable or incomplete in many application domains. In this paper, we introduce the novel problem of mining Context-aware Outstanding Facts (COFs) for a target entity under a given context specified by a context entity. We propose FMiner, a context-aware mining framework that leverages knowledge graphs (KGs) for COF mining. FMiner generates COFs in two steps. First, it discovers top-k relevant relationships between the target and the context entity from a KG. We propose novel optimizations and pruning techniques to expedite this operation, as this process is very expensive on large KGs due to its exponential complexity. Second, for each derived relationship, we find the attributes of the target entity that distinguish it from peer entities that have the same relationship with the context entity, yielding the top-l COFs. As such, the mining process is modeled as a top-(k,l) search problem. Context-awareness is ensured by relying on the relevant relationships with the context entity to derive peer entities for COF extraction. Consequently, FMiner can effectively navigate the search to obtain context-aware OFs by incorporating a context entity. We conduct extensive experiments, including a user study, to validate the efficiency and the effectiveness of FMiner.
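The two-step top-(k,l) search described in this abstract can be illustrated with a toy Python sketch. This is not FMiner: relationships are restricted to direct edges between target and context, and both scoring rules (relations with fewer peers rank higher; an attribute is more outstanding the fewer peers share its value) are assumptions invented for this example, as are the function name `mine_cofs` and the data layout.

```python
def mine_cofs(triples, attributes, target, context, k=2, l=2):
    """Toy top-(k, l) search over (head, relation, tail) triples.

    attributes maps entity -> {attribute_name: value} (hypothetical layout).
    Returns {relation: [(score, attribute_name, value), ...]}.
    """
    def peers_of(rel):
        # Peers: other entities holding the same relation to the context.
        return {h for h, r, t in triples if r == rel and t == context and h != target}

    # Step 1 (simplified): only direct relations from target to context.
    relations = {r for h, r, t in triples if h == target and t == context}
    # Assumed relevance: relations with fewer peers are treated as more specific.
    top_relations = sorted(relations, key=lambda r: (len(peers_of(r)), r))[:k]

    results = {}
    for rel in top_relations:
        peers = peers_of(rel)
        scored = []
        for attr, value in attributes.get(target, {}).items():
            same = sum(1 for p in peers if attributes.get(p, {}).get(attr) == value)
            # Assumed outstandingness: share of peers whose value differs.
            score = 1.0 - same / len(peers) if peers else 0.0
            scored.append((score, attr, value))
        scored.sort(key=lambda s: (-s[0], s[1]))
        results[rel] = scored[:l]  # Step 2: keep the top-l attributes
    return results


triples = [
    ("Alice", "plays_for", "TeamX"),
    ("Bob", "plays_for", "TeamX"),
    ("Carol", "plays_for", "TeamX"),
    ("Alice", "captain_of", "TeamX"),
]
attributes = {
    "Alice": {"position": "goalkeeper", "nationality": "FR"},
    "Bob": {"position": "striker", "nationality": "FR"},
    "Carol": {"position": "striker", "nationality": "FR"},
}
print(mine_cofs(triples, attributes, target="Alice", context="TeamX"))
```

The point of the sketch is the role of the context entity: peers are derived from the relationship with `TeamX`, so Alice's position stands out among her teammates while her nationality does not. The pruning and optimization techniques the abstract mentions for large KGs are not represented here.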

Citations: 2

MixGCF

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-14 DOI: 10.1145/3447548.3467408

Tinglin Huang, Yuxiao Dong, Ming Ding, Zhen Yang, Wenzheng Feng, Xinyu Wang, Jie Tang

Citations: 83

Graph Deep Factors for Forecasting with Applications to Cloud Resource Allocation

Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining Pub Date : 2021-08-14 DOI: 10.1145/3447548.3467357

Hongjie Chen, Ryan A. Rossi, K. Mahadik, Sungchul Kim, Hoda Eldardiry

Abstract: Deep probabilistic forecasting techniques have recently been proposed for modeling large collections of time-series. However, these techniques explicitly assume either complete independence (local model) or complete dependence (global model) between time-series in the collection. These correspond to two extreme cases: either every time-series is disconnected from every other time-series in the collection, or every time-series is related to every other time-series, resulting in a completely connected graph. In this work, we propose a deep hybrid probabilistic graph-based forecasting framework called Graph Deep Factors (GraphDF) that goes beyond these two extremes by allowing nodes and their time-series to be connected to others in an arbitrary fashion. GraphDF is a hybrid forecasting framework that consists of a relational global and a relational local model. In particular, we propose a relational global model that learns complex non-linear time-series patterns globally, using the structure of the graph to improve both forecasting accuracy and computational efficiency. Similarly, instead of modeling every time-series independently, we learn a relational local model that considers not only a node's individual time-series but also the time-series of the nodes connected to it in the graph. The experiments demonstrate the effectiveness of the proposed deep hybrid graph-based forecasting model compared to state-of-the-art methods in terms of forecasting accuracy, runtime, and scalability. Our case study reveals that GraphDF can successfully generate cloud usage forecasts and opportunistically schedule workloads to increase cloud cluster utilization by 47.5% on average.

Citations: 8