Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining: Latest Articles, Page 9

Reducing Negative Effects of the Biases of Language Models in Zero-Shot Setting

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining Pub Date : 2023-02-27 DOI: 10.1145/3539597.3570382

Xiaosu Wang, Yun Xiong, Beichen Kang, Yao Zhang, P. Yu, Yangyong Zhu

Abstract: Pre-trained language models (PLMs) such as GPTs have been shown to be biased towards certain target classes because of the prompt and the model's intrinsic biases. In the fully supervised scenario, a large number of costly labeled samples can be used to fine-tune model parameters and correct for these biases; in the zero-shot setting, no labeled samples are available. We argue that the key to calibrating the biases of a PLM on a target task in the zero-shot setting lies in detecting and estimating the biases, which remains a challenge. In this paper, we first construct probing samples from randomly generated token sequences, which are simple but effective inputs for stimulating GPTs to show their biases, and we study in depth the plausibility of using class scores for the probing samples to reflect and estimate the biases of GPTs on a downstream target task. Furthermore, to use the probing samples effectively and thus reduce the negative effects of the biases of GPTs, we propose a lightweight model, the Calibration Adapter (CA), along with a self-guided training strategy that carries out distribution-level optimization. This enables us to use the probing samples to fine-tune and select only the proposed CA while keeping the PLM encoder frozen. To demonstrate the effectiveness of our study, we have conducted extensive experiments. The results indicate that the calibration ability acquired by CA on the probing samples can be successfully transferred to reduce the negative effects of the biases of GPTs on a downstream target task, and that our approach can yield better performance than state-of-the-art (SOTA) models in zero-shot settings.

Citations: 1
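
As a rough illustration of the probing idea in the abstract above, the sketch below estimates a per-class bias by averaging class scores over content-free probing inputs and then subtracts that estimate from the scores of a real input. This is a plain mean-subtraction stand-in, not the paper's Calibration Adapter or its self-guided training strategy; the function names and all toy scores are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def estimate_bias(probe_logits):
    """Average the class scores over all probing samples.

    Assumption: the probes carry no task content, so any systematic
    preference in their scores is treated as model or prompt bias.
    """
    n_probes = len(probe_logits)
    n_classes = len(probe_logits[0])
    return [sum(row[c] for row in probe_logits) / n_probes
            for c in range(n_classes)]


def calibrate(logits, bias):
    """Subtract the estimated bias, then renormalise with softmax."""
    return softmax([x - b for x, b in zip(logits, bias)])


# Hypothetical class scores for three random-token probes (two classes).
probe_logits = [[2.0, 0.0], [1.0, 0.0], [1.5, 0.0]]
bias = estimate_bias(probe_logits)

# Hypothetical class scores for one real input.
raw = [1.0, 0.4]
uncalibrated = softmax(raw)
calibrated = calibrate(raw, bias)

predicted_before = uncalibrated.index(max(uncalibrated))  # class 0
predicted_after = calibrated.index(max(calibrated))       # class 1
```

In this toy case the uncalibrated prediction follows the biased class, and the prediction flips once the estimated bias is removed. Whether such a simple correction transfers to a real task is exactly the question the paper studies.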

Towards an Event-Aware Urban Mobility Prediction System

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining Pub Date : 2023-02-27 DOI: 10.1145/3539597.3575783

Zhaonan Wang, Renhe Jiang, Z. Fan, Xuan Song, R. Shibasaki

Abstract: Thanks to the rapidly developing mobile and sensor networks in IoT (Internet of Things) systems, spatio-temporal big data are constantly being generated. They offer a data-driven way to sense and understand crowd mobility on a city scale. A fundamental task for next-generation mobility services, such as Intelligent Transportation Systems (ITS) and Mobility-as-a-Service (MaaS), is spatio-temporal predictive modeling of geo-sensory signals. A recent line of research leverages deep learning techniques to boost forecasting performance on such tasks. While simulating the regularity of mobility behaviors (e.g., routines, periodicity) in increasingly sophisticated ways, existing studies ignore an important part of urban activities: events. Urban events, including holidays, extreme weather, pandemics, and accidents, happen from time to time and cause non-stationary phenomena, which by nature make the spatio-temporal forecasting task challenging. We therefore envision an event-aware urban mobility prediction model that can adapt quickly and make reliable predictions in different scenarios, which is crucial to decision making for emergency response and urban resilience.

Citations: 0

MMBench: The Match Making Benchmark

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining Pub Date : 2023-02-27 DOI: 10.1145/3539597.3573023

Yongsheng Liu, Yanxing Qi, Jiangwei Zhang, Connie Kou, Qiaolin Chen

Abstract: Video gaming has gained huge popularity over the last few decades; reportedly, there are about 2.9 billion gamers globally. Among all genres, competitive games are among the most popular. Matchmaking is a core problem for competitive games: it determines player satisfaction and hence influences a game's success. Most matchmaking systems group queuing players into opposing teams with similar skill levels. The key challenge is to rate players' skills accurately based on their match performance. There has been increasing effort on developing rating systems such as Elo and Glicko. However, games with different gameplay may have different game modes, which may require extensive effort to customize the rating system. Even though there are many rating systems to choose from and various customization strategies, there is a clear lack of a systematic framework with which different rating systems can be analysed and compared against each other. Such a framework could help game developers identify the bottlenecks of their matchmaking systems and improve their performance. To bridge this gap, we present MMBench, the first benchmark framework for evaluating different rating systems. It serves as a fair means of comparison and enables a deeper understanding of different rating systems. In this paper, we present how MMBench can benchmark three major rating systems (Elo, Glicko, and TrueSkill) in the battle modes of 1 vs 1, n vs n, battle royale, and teamed battle royale over both real and synthetic datasets.

Citations: 0
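
For readers unfamiliar with the rating systems named in the abstract above, the following is a minimal sketch of the standard Elo update for a 1 vs 1 match. The K-factor of 32 and the 400-point scale are conventional defaults chosen here for illustration, not values taken from the paper, and the function names are hypothetical; MMBench's own interfaces are not shown in this listing.

```python
def expected_score(rating_a, rating_b):
    """Probability-like expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return both players' new ratings after one match.

    score_a is 1.0 if A wins, 0.5 for a draw, 0.0 if A loses.
    k (assumed default 32) controls how far a single result moves a rating.
    """
    expected_a = expected_score(rating_a, rating_b)
    expected_b = 1.0 - expected_a
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - expected_b)
    return new_a, new_b


# Two equally rated players; A wins.
new_a, new_b = elo_update(1500.0, 1500.0, 1.0)
# new_a == 1516.0, new_b == 1484.0
```

Because the update is zero-sum, the total rating in the pair stays constant. Extending the idea to team and battle royale modes requires the kind of customization the abstract mentions.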

Incorporating Fairness in Large Scale NLU Systems

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining Pub Date : 2023-02-27 DOI: 10.1145/3539597.3575785

Rahul Gupta, Lisa Bauer, Kai-Wei Chang, J. Dhamala, A. Galstyan, Palash Goyal, Qian Hu, Avni Khatri, Rohit Parimi, Charith S. Peris, Apurv Verma, R. Zemel, Premkumar Natarajan

Citations: 0

Responsible AI for Trusted AI-powered Enterprise Platforms

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining Pub Date : 2023-02-27 DOI: 10.1145/3539597.3575784

S. Hoi

Citations: 5

Simultaneous Linear Multi-view Attributed Graph Representation Learning and Clustering

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining Pub Date : 2023-02-27 DOI: 10.1145/3539597.3570367

Chakib Fettal, Lazhar Labiod, M. Nadif

Abstract: Over the last few years, various multi-view graph clustering methods have shown promising performance. However, we argue that these methods can have limitations. In particular, they are often unnecessarily complex, leading to scalability problems that make them prohibitive for most real-world graph applications. Furthermore, many of them can handle only specific types of multi-view graphs. Another limitation is that the process of learning graph representations is separated from the clustering process, and in some cases these methods do not even learn a graph representation, which severely restricts their flexibility and usefulness. In this paper, we propose a simple yet effective linear model that addresses the dual tasks of multi-view attributed graph representation learning and clustering in a unified framework. The model starts by performing a first-order neighborhood smoothing step for each individual view, then gives each view a weight corresponding to its importance. Finally, an iterative process of simultaneous clustering and representation learning is performed with respect to the importance of each view, yielding a consensus embedding and partition of the graph. Our model is generic and can deal with any type of multi-view graph. We show through extensive experimentation that this simple model consistently achieves competitive performance with respect to state-of-the-art multi-view attributed graph clustering models, while having shorter training times, in some cases by orders of magnitude.

Citations: 2
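
The first step described in the abstract above, first-order neighborhood smoothing of a single view, can be sketched as follows. The sketch assumes a self-loop plus row-normalised averaging (each node's features are replaced by the mean over itself and its direct neighbours); the abstract does not state which normalisation the authors use, and the view weighting and joint clustering steps are not covered. The function name and toy graph are hypothetical.

```python
def smooth_features(adjacency, features):
    """Average each node's features with those of its direct neighbours.

    adjacency: square 0/1 matrix (list of lists) for one view.
    features:  one feature vector per node.
    Assumption: a self-loop is added, so isolated nodes keep their features.
    """
    n_nodes = len(adjacency)
    smoothed = []
    for i in range(n_nodes):
        # Node i itself plus every node it shares an edge with.
        neighbourhood = [j for j in range(n_nodes)
                         if j == i or adjacency[i][j]]
        dim = len(features[i])
        smoothed.append([
            sum(features[j][d] for j in neighbourhood) / len(neighbourhood)
            for d in range(dim)
        ])
    return smoothed


# Toy view: nodes 0 and 1 are connected, node 2 is isolated.
adjacency = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
features = [[1.0, 0.0], [3.0, 2.0], [5.0, 5.0]]

smoothed = smooth_features(adjacency, features)
# smoothed == [[2.0, 1.0], [2.0, 1.0], [5.0, 5.0]]
```

Connected nodes end up with similar representations, which is what makes a subsequent clustering step easier. In a multi-view setting this would be applied once per view before the views are weighted and combined.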

Adversarial Autoencoder for Unsupervised Time Series Anomaly Detection and Interpretation

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining Pub Date : 2023-02-27 DOI: 10.1145/3539597.3570371

Xuanhao Chen, Liwei Deng, Yan Zhao, Kaiyu Zheng

Abstract: In many complex systems, devices are typically monitored and generate massive multivariate time series. However, due to complex patterns and the scarcity of useful labeled data, detecting anomalies in these time series is a great challenge. Existing methods either rely on little regularization or require a large amount of labeled data, leading to poor accuracy in anomaly detection. To overcome these limitations, we propose an adversarial autoencoder anomaly detection and interpretation framework named DAEMON, which performs robustly across various datasets. The key idea is to use two discriminators to adversarially train an autoencoder to learn the normal pattern of multivariate time series, and thereafter to use the reconstruction error to detect anomalies. The robustness of DAEMON is guaranteed by regularizing the hidden variables and the reconstructed data using the adversarial generation method. An unsupervised approach to detecting anomalies is proposed. Moreover, to help operators better diagnose anomalies, DAEMON provides anomaly interpretation by computing the gradients of anomalous data. An extensive empirical study on real data offers evidence that the framework is capable of outperforming state-of-the-art methods in terms of the overall F1-score and interpretation accuracy for time series anomaly detection.

Citations: 1
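
The detection step mentioned in the abstract above, using reconstruction error to flag anomalies, can be illustrated independently of how the autoencoder is trained. In the sketch below the "reconstructed" series is a hand-written stand-in for a trained model's output, the mean squared error per time step is one common choice of error, and the threshold is arbitrary; none of these details come from the paper, and the adversarial training with two discriminators is not shown.

```python
def reconstruction_errors(series, reconstructed):
    """Mean squared error per time step, averaged over all variables."""
    return [
        sum((x - r) ** 2 for x, r in zip(observed, rebuilt)) / len(observed)
        for observed, rebuilt in zip(series, reconstructed)
    ]


def flag_anomalies(errors, threshold):
    """Indices of time steps whose error exceeds the threshold."""
    return [t for t, err in enumerate(errors) if err > threshold]


# Toy multivariate series: three time steps, two variables each.
series = [[1.0, 2.0], [1.0, 2.0], [5.0, 9.0]]

# Hypothetical model output: a model that has learned the normal
# pattern reproduces it even when the input deviates.
reconstructed = [[1.0, 2.0], [1.5, 2.0], [1.0, 2.0]]

errors = reconstruction_errors(series, reconstructed)
anomalies = flag_anomalies(errors, threshold=1.0)
# errors == [0.0, 0.125, 32.5]; anomalies == [2]
```

In practice the threshold would be chosen from the error distribution on data assumed to be normal rather than fixed by hand.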

Learning to Distinguish Multi-User Coupling Behaviors for TV Recommendation

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining Pub Date : 2023-02-27 DOI: 10.1145/3539597.3570374

Jiarui Qin, Jiachen Zhu, Yankai Liu, Junchao Gao, J. Ying, Chaoxiong Liu, Ding Wang, Junlan Feng, Chao Deng, Xiaozheng Wang, Jian Jiang, Cong Liu, Yong Yu, Haitao Zeng, Weinan Zhang

Abstract: This paper is concerned with TV recommendation, where one major challenge is the coupling behavior issue: the behaviors of multiple users are coupled together and not directly distinguishable because the users share the same account. Failing to identify the current viewer and using the coupled behaviors directly can lead to sub-optimal recommendation results due to the noise introduced by the behaviors of other users. Most existing methods deal with this issue either with unsupervised clustering algorithms or with latent user representation learning under strong assumptions. However, they neglect to model the current session behaviors in detail, even though these carry information about user identity. Another critical limitation of existing models is the lack of a supervision signal for distinguishing behaviors, because they depend solely on the final click label, which is insufficient to provide effective supervision. To address these problems, we propose the Coupling Sequence Model (COSMO) for TV recommendation. In COSMO, we design a session-aware co-attention mechanism that uses both the candidate item and the session behaviors as the query to attend to the historical behaviors in a fine-grained manner. Furthermore, we propose to use data from accounts with multiple devices (e.g., families with several TV sets), where the behaviors of one account are generated on different devices. We regard the device information as weak supervision and propose a novel pair-wise attention loss for learning to distinguish the coupling behaviors. Extensive offline experiments and online A/B tests at a commercial TV service provider demonstrate the efficacy of COSMO compared to existing models.

Citations: 1
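
To make the phrase "uses both the candidate item and the session behaviors as the query" in the abstract above more concrete, here is a deliberately simplified attention sketch. It assumes the query is the candidate embedding plus the mean session embedding and uses ordinary scaled dot-product attention over the history; COSMO's actual co-attention design and its pair-wise attention loss are not specified in this listing, so every name and number below is hypothetical.

```python
import math


def session_aware_attention(candidate, session, history):
    """Weight historical behaviours by similarity to a combined query.

    candidate: embedding of the candidate item.
    session:   embeddings of the current session's behaviours.
    history:   embeddings of the account's historical behaviours.
    Assumption: query = candidate + mean(session).
    """
    dim = len(candidate)
    session_mean = [sum(v[d] for v in session) / len(session)
                    for d in range(dim)]
    query = [c + s for c, s in zip(candidate, session_mean)]

    # Scaled dot-product scores, then a numerically stable softmax.
    scores = [sum(q * h for q, h in zip(query, item)) / math.sqrt(dim)
              for item in history]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted summary of the history.
    pooled = [sum(w * item[d] for w, item in zip(weights, history))
              for d in range(dim)]
    return weights, pooled


candidate = [1.0, 0.0]
session = [[1.0, 0.0], [1.0, 0.0]]
history = [[1.0, 0.0], [0.0, 1.0]]  # second behaviour resembles another viewer

weights, pooled = session_aware_attention(candidate, session, history)
```

Here the historical behaviour that matches the current session receives the larger weight, which is the intuition behind down-weighting behaviours that likely belong to other users of the same account.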

Can Pre-trained Language Models Understand Chinese Humor?

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining Pub Date : 2023-02-27 DOI: 10.1145/3539597.3570431

Yuyan Chen, Zhixu Li, Jiaqing Liang, Yanghua Xiao, Bang Liu, Yunwen Chen

Citations: 0

UnCommonSense in Action! Informative Negations for Commonsense Knowledge Bases

Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining Pub Date : 2023-02-27 DOI: 10.1145/3539597.3573027

Hiba Arnaout, Tuan-Phong Nguyen, S. Razniewski, G. Weikum

Citations: 0