Hussein Al-Bazzaz, Muhammad Azam, Manar Amayri, Nizar Bouguila
{"title":"Explainable finite mixture of mixtures of bounded asymmetric generalized Gaussian and Uniform distributions learning for energy demand management","authors":"Hussein Al-Bazzaz, Muhammad Azam, Manar Amayri, Nizar Bouguila","doi":"10.1145/3653980","DOIUrl":"https://doi.org/10.1145/3653980","url":null,"abstract":"<p>We introduce a mixture of mixtures of bounded asymmetric generalized Gaussian and uniform distributions. Based on this framework, we propose model-based classification and model-based clustering algorithms. We develop an objective function for the minimum message length (MML) model selection criterion to discover the optimal number of clusters for the unsupervised approach of our proposed model. Given the crucial attention received by Explainable AI (XAI) in recent years, we introduce a method to interpret the predictions obtained from the proposed model in both learning settings by defining their boundaries in terms of the crucial features. Integrating explainability into our proposed algorithm increases the credibility of its predictions, since they can be explained to the user through simple If-Then statements using a small binary decision tree. 
In this paper, the proposed algorithm proves its reliability and superiority to several state-of-the-art machine learning algorithms within the following real-world applications: fault detection and diagnosis (FDD) in chillers, occupancy estimation and categorization of residential energy consumers.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140315532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chenhao Zhang, Weitong Chen, Wei Emma Zhang, Miao Xu
{"title":"Mitigating the Impact of Inaccurate Feedback in Dynamic Learning-to-Rank: A Study of Overlooked Interesting Items","authors":"Chenhao Zhang, Weitong Chen, Wei Emma Zhang, Miao Xu","doi":"10.1145/3653983","DOIUrl":"https://doi.org/10.1145/3653983","url":null,"abstract":"<p>Dynamic Learning-to-Rank (DLTR) is a method of updating a ranking policy in real-time based on user feedback, which may not always be accurate. Although previous DLTR work has achieved fair and unbiased DLTR under inaccurate feedback, they face the trade-off between fairness and user utility and also have limitations in the setting of feeding items. Existing DLTR works improve ranking utility by eliminating bias from inaccurate feedback on observed items, but the impact of another pervasive form of inaccurate feedback, overlooked or ignored interesting items, remains unclear. For example, users may browse the rankings too quickly to catch interesting items or miss interesting items because the snippets are not optimized enough. This phenomenon raises two questions: i) <i>Will overlooked interesting items affect the ranking results?</i> ii) <i>Is it possible to improve utility without sacrificing fairness if these effects are eliminated?</i> These questions are particularly relevant for small and medium-sized retailers who are just starting out and may have limited data, leading to the use of inaccurate feedback to update their models. In this paper, we find that inaccurate feedback in the form of overlooked interesting items has a negative impact on DLTR performance in terms of utility. To address this, we treat the overlooked interesting items as noise and propose a novel DLTR method, the Co-teaching Rank (CoTeR), that has good utility and fairness performance when inaccurate feedback is present in the form of overlooked interesting items. 
Our solution incorporates a co-teaching-based component with a customized loss function and data sampling strategy, as well as a mean pooling strategy to further accommodate newly added products without historical data. Through experiments, we demonstrate that CoTeR not only enhances utilities but also preserves ranking fairness, and can smoothly handle newly introduced items.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140316893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jinwei Zeng, Guozhen Zhang, Jian Yuan, Yong Li, Depeng Jin
{"title":"Empowering Predictive Modeling by GAN-based Causal Information Learning","authors":"Jinwei Zeng, Guozhen Zhang, Jian Yuan, Yong Li, Depeng Jin","doi":"10.1145/3652610","DOIUrl":"https://doi.org/10.1145/3652610","url":null,"abstract":"<p>Generally speaking, we can easily specify many causal relationships in the prediction tasks of ubiquitous computing, such as human activity prediction, mobility prediction, and health prediction. However, most of the existing methods in these fields failed to take advantage of this prior causal knowledge. They typically make predictions only based on correlations in the data, which hinders the prediction performance in real-world scenarios because a distribution shift between training data and testing data generally exists. To fill in this gap, we proposed a <underline>G</underline>AN-based <underline>C</underline>ausal <underline>I</underline>nformation <underline>L</underline>earning prediction framework (GCIL), which can effectively leverage causal information to improve the prediction performance of existing ubiquitous computing deep learning models. Specifically, faced with a unique challenge that the treatment variable, referring to the intervention that influences the target in a causal relationship, is generally continuous in ubiquitous computing, the framework employs a representation learning approach with a GAN-based deep learning model. By projecting all variables except the treatment into a latent space, it effectively minimizes confounding bias and leverages the learned latent representation for accurate predictions. In this way, it deals with the continuous treatment challenge, and in the meantime, it can be easily integrated with existing deep learning models to lift their prediction performance in practical scenarios with causal information. Extensive experiments on two large-scale real-world datasets demonstrate its superior performance over multiple state-of-the-art baselines. 
We also propose an analytical framework together with extensive experiments to empirically show that our framework achieves better performance gain under two conditions: when the distribution differences between the training data and the testing data are more significant and when the treatment effects are larger. Overall, this work suggests that learning causal information is a promising way to improve the prediction performance of ubiquitous computing tasks. We open both our dataset and code and call for more research attention in this area.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140172409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaojin Zhang, Yan Kang, Lixin Fan, Kai Chen, Qiang Yang
{"title":"A Meta-learning Framework for Tuning Parameters of Protection Mechanisms in Trustworthy Federated Learning","authors":"Xiaojin Zhang, Yan Kang, Lixin Fan, Kai Chen, Qiang Yang","doi":"10.1145/3652612","DOIUrl":"https://doi.org/10.1145/3652612","url":null,"abstract":"<p>Trustworthy Federated Learning (TFL) typically leverages protection mechanisms to guarantee privacy. However, protection mechanisms inevitably introduce utility loss or efficiency reduction while protecting data privacy. Therefore, protection mechanisms and their parameters should be carefully chosen to strike an optimal trade-off between <i>privacy leakage</i>, <i>utility loss</i>, and <i>efficiency reduction</i>. To this end, federated learning practitioners need tools to measure the three factors and optimize the trade-off between them to choose the protection mechanism that is most appropriate to the application at hand. Motivated by this requirement, we propose a framework that (1) formulates TFL as a problem of finding a protection mechanism to optimize the trade-off between privacy leakage, utility loss, and efficiency reduction and (2) formally defines bounded measurements of the three factors. We then propose a meta-learning algorithm to approximate this optimization problem and find optimal protection parameters for representative protection mechanisms, including Randomization, Homomorphic Encryption, Secret Sharing, and Compression. 
We further design estimation algorithms to quantify these found optimal protection parameters in a practical horizontal federated learning setting and provide a theoretical analysis of the estimation error.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140152199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rongchang Li, Tianyang Xu, Xiao-Jun Wu, Zhongwei Shen, Josef Kittler
{"title":"Perceiving Actions via Temporal Video Frame Pairs","authors":"Rongchang Li, Tianyang Xu, Xiao-Jun Wu, Zhongwei Shen, Josef Kittler","doi":"10.1145/3652611","DOIUrl":"https://doi.org/10.1145/3652611","url":null,"abstract":"<p>Video action recognition aims to classify the action category in given videos. In general, semantic-relevant video frame pairs reflect significant action patterns such as object appearance variation and abstract temporal concepts like speed, rhythm, etc. However, existing action recognition approaches tend to holistically extract spatiotemporal features. Though effective, there is still a risk of neglecting the crucial action features occurring across frames with a long-term temporal span. Motivated by this, in this paper, we propose to perceive actions via frame pairs directly and devise a novel Nest Structure with frame pairs as basic units. Specifically, we decompose a video sequence into all possible frame pairs and hierarchically organize them according to temporal frequency and order, thus transforming the original video sequence into a Nest Structure. Through naturally decomposing actions, the proposed structure can flexibly adapt to diverse action variations such as speed or rhythm changes. Next, we devise a Temporal Pair Analysis module (TPA) to extract discriminative action patterns based on the proposed Nest Structure. The designed TPA module consists of a pair calculation part to calculate the pair features and a pair fusion part to hierarchically fuse the pair features for recognizing actions. The proposed TPA can be flexibly integrated into existing backbones, serving as a side branch to capture various action patterns from multi-level features. 
Extensive experiments show that the proposed TPA module can achieve consistent improvements over several typical backbones, reaching or updating CNN-based SOTA results on several challenging action recognition benchmarks.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140156575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ensuring Fairness and Gradient Privacy in Personalized Heterogeneous Federated Learning","authors":"Cody Lewis, Vijay Varadharajan, Nasimul Noman, Uday Tupakula","doi":"10.1145/3652613","DOIUrl":"https://doi.org/10.1145/3652613","url":null,"abstract":"<p>With the increasing tension between conflicting requirements of the availability of large amounts of data for effective machine learning based analysis, and for ensuring their privacy, the paradigm of federated learning has emerged, a distributed machine learning setting where the clients provide only the machine learning model updates to the server rather than the actual data for decision making. However, the distributed nature of federated learning raises specific challenges related to fairness in a heterogeneous setting. This motivates the focus of our paper, on the heterogeneity of client devices having different computational capabilities and their impact on fairness in federated learning. Furthermore, our aim is to achieve fairness in heterogeneity while ensuring privacy. As far as we are aware there are no existing works that address all these three aspects of fairness, device heterogeneity and privacy simultaneously in federated learning. In this paper, we propose a novel federated learning algorithm with personalization in the context of heterogeneous devices while maintaining compatibility with the gradient privacy preservation techniques of secure aggregation. We analyze the proposed federated learning algorithm under different environments with different datasets, and show that it achieves performance close to or greater than the state-of-the-art in heterogeneous device personalized federated learning. 
We also provide theoretical proofs for the fairness and convergence properties of our proposed algorithm.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140124512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Dialogue Systems via Capturing Context-aware Dependencies and Ordinal Information of Semantic Elements","authors":"Weidong He, Zhi Li, Hao Wang, Tong Xu, Zhefeng Wang, Baoxing Huai, Nicholas Jing Yuan, Enhong Chen","doi":"10.1145/3645099","DOIUrl":"https://doi.org/10.1145/3645099","url":null,"abstract":"<p>The topic of multimodal conversation systems has recently garnered significant attention across various industries, including travel, retail, and others. While pioneering works in this field have shown promising performance, they often focus solely on context information at the utterance level, overlooking the context-aware dependencies of multimodal semantic elements like words and images. Furthermore, the ordinal information of images, which indicates the relevance between visual context and users’ demands, remains underutilized during the integration of visual content. Additionally, how to effectively utilize the corresponding attributes provided by users when searching for desired products remains largely unexplored. To address these challenges, we propose a Position-aware Multimodal diAlogue system with semanTic Elements, abbreviated as PMATE. Specifically, to obtain semantic representations at the element level, we first unfold the multimodal historical utterances and devise a position-aware multimodal element-level encoder. This component considers all images that may be relevant to the current turn and introduces a novel position-aware image selector to choose related images before fusing the information from the two modalities. Finally, we present a knowledge-aware two-stage decoder and an attribute-enhanced image searcher for the tasks of generating textual responses and selecting image responses, respectively. 
We extensively evaluate our model on two large-scale multimodal dialog datasets, and the results of our experiments demonstrate that our approach outperforms several baseline methods.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140152192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shenghai Zhong, Shu Guo, Jing Liu, Hongren Huang, Lihong Wang, Jianxin Li, Chen Li, Yiming Hei
{"title":"Self-supervised Bipartite Graph Representation Learning: A Dirichlet Max-margin Matrix Factorization Approach","authors":"Shenghai Zhong, Shu Guo, Jing Liu, Hongren Huang, Lihong Wang, Jianxin Li, Chen Li, Yiming Hei","doi":"10.1145/3645098","DOIUrl":"https://doi.org/10.1145/3645098","url":null,"abstract":"<p>Bipartite graph representation learning aims to obtain node embeddings by compressing sparse vectorized representations of interactions between two types of nodes, e.g., users and items. Incorporating structural attributes among homogeneous nodes, such as user communities, improves the identification of similar interaction preferences, namely, user/item embeddings, for downstream tasks. However, existing methods often fail to proactively discover and fully utilize these latent structural attributes. Moreover, the manual collection and labeling of structural attributes is always costly. In this paper, we propose a novel approach called Dirichlet Max-margin Matrix Factorization (DM3F), which adopts a self-supervised strategy to discover latent structural attributes and model discriminative node representations. Specifically, in self-supervised learning, our approach generates pseudo group labels (i.e., structural attributes) as a supervised signal using the Dirichlet process without relying on manual collection and labeling, and employs them in a max-margin classification. Additionally, we introduce a Variational Markov Chain Monte Carlo algorithm (Variational MCMC) to effectively update the parameters. The experimental results on six real datasets demonstrate that, in the majority of cases, the proposed method outperforms existing approaches based on matrix factorization and neural networks. 
Furthermore, the modularity analysis confirms the effectiveness of our model in capturing structural attributes to produce high-quality user embeddings.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140070486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deconfounded Cross-modal Matching for Content-based Micro-video Background Music Recommendation","authors":"Jing Yi, Zhenzhong Chen","doi":"10.1145/3650042","DOIUrl":"https://doi.org/10.1145/3650042","url":null,"abstract":"<p>Object-oriented micro-video background music recommendation is a complicated task where the matching degree between videos and background music is a major issue. However, music selections in user-generated content (UGC) are prone to selection bias caused by historical preferences of uploaders. Since historical preferences are not fully reliable and may reflect obsolete behaviors, over-reliance on them should be avoided as knowledge and interests dynamically evolve. In this paper, we propose a Deconfounded Cross-Modal (DecCM) matching model to mitigate such bias. Specifically, uploaders’ personal preferences of music genres are identified as confounders that spuriously correlate music embeddings and background music selections, causing the learned system to over-recommend music from majority groups. To resolve such confounders, backdoor adjustment is utilized to deconfound the spurious correlation between music embeddings and prediction scores. We further utilize Monte Carlo (MC) estimator with batch-level average as the approximations to avoid integrating the entire confounder space calculated by the adjustment. Furthermore, we design a teacher-student network to utilize the matching of music videos, which is professionally-generated content (PGC) with specialized matching, to better recommend content-matching background music. The PGC data is modeled by a teacher network to guide the matching of uploader-selected UGC data of student network by Kullback-Leibler-based knowledge transfer. Extensive experiments on the TT-150k-genre dataset demonstrate the effectiveness of the proposed method. 
The code is publicly available on: https://github.com/jing-1/DecCM.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140046964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani
{"title":"Analysing Utterances in LLM-based User Simulation for Conversational Search","authors":"Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani","doi":"10.1145/3650041","DOIUrl":"https://doi.org/10.1145/3650041","url":null,"abstract":"<p>Clarifying the underlying user information need by asking clarifying questions is an important feature of modern conversational search systems. However, evaluation of such systems through answering prompted clarifying questions requires significant human effort, which can be time-consuming and expensive. In our recent work, we proposed an approach to tackle these issues with a user simulator, <i>USi</i>. Given a description of an information need, <i>USi</i> is capable of automatically answering clarifying questions about the topic throughout the search session. However, while the answers generated by <i>USi</i> are both in line with the underlying information need and in natural language, a deeper understanding of such utterances is lacking. Thus, in this work, we explore utterance formulation of large language model (LLM) based user simulators. To this end, we first analyze the differences between <i>USi</i>, based on GPT-2, and the next generation of generative LLMs, such as GPT-3. Then, to gain a deeper understanding of LLM-based utterance generation, we compare the generated answers to the recently proposed set of patterns of human-based query reformulations. 
Finally, we discuss potential applications, as well as limitations, of LLM-based user simulators and outline promising directions for future work on the topic.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140036123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}