{"title":"Personalized Fashion Recommendations for Diverse Body Shapes and Local Preferences with Contrastive Multimodal Cross-Attention Network","authors":"Jianghong Ma, Huiyue Sun, Dezhao Yang, Haijun Zhang","doi":"10.1145/3637217","DOIUrl":"https://doi.org/10.1145/3637217","url":null,"abstract":"<p>Fashion recommendation has become a prominent focus in the realm of online shopping, with various tasks being explored to enhance the customer experience. Recent research has particularly emphasized fashion recommendation based on body shapes, yet a critical aspect of incorporating multimodal data relevance has been overlooked. In this paper, we present the Contrastive Multimodal Cross-Attention Network, a novel approach specifically designed for fashion recommendation catering to diverse body shapes. By incorporating multimodal representation learning and leveraging contrastive learning techniques, our method effectively captures both inter- and intra-sample relationships, resulting in improved accuracy in fashion recommendations tailored to individual body types. Additionally, we propose a locality-aware cross-attention module to align and understand the local preferences between body shapes and clothing items, thus enhancing the matching process. 
Experimental results conducted on a diverse dataset demonstrate the state-of-the-art performance achieved by our approach, reinforcing its potential to significantly enhance the personalized online shopping experience for consumers with varying body shapes and preferences.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"28 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138572071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EMG-Based Automatic Gesture Recognition Using Lipschitz-Regularized Neural Networks","authors":"Ana NEACȘU, Jean-Christophe Pesquet, Corneliu Burileanu","doi":"10.1145/3635159","DOIUrl":"https://doi.org/10.1145/3635159","url":null,"abstract":"<p>This paper introduces a novel approach for building a robust Automatic Gesture Recognition system based on Surface Electromyographic (sEMG) signals, acquired at the forearm level. Our main contribution is to propose new constrained learning strategies that ensure robustness against adversarial perturbations by controlling the Lipschitz constant of the classifier. We focus on nonnegative neural networks for which accurate Lipschitz bounds can be derived, and we propose different spectral norm constraints offering robustness guarantees from a theoretical viewpoint. Experimental results on four publicly available datasets highlight that a good trade-off in terms of accuracy and performance is achieved. We then demonstrate the robustness of our models, compared to standard trained classifiers in four scenarios, considering both white-box and black-box attacks.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"1 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138553238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explainable Product Classification for Customs","authors":"Eunji Lee, Sihyeon Kim, Sundong Kim, Soyeon Jung, Heeja Kim, Meeyoung Cha","doi":"10.1145/3635158","DOIUrl":"https://doi.org/10.1145/3635158","url":null,"abstract":"<p>The task of assigning internationally accepted commodity codes (aka HS codes) to traded goods is a critical function of customs offices. Like court decisions made by judges, this task follows the doctrine of precedent and can be nontrivial even for experienced officers. Together with the Korea Customs Service (KCS), we propose a first-ever explainable decision supporting model that suggests the most likely subheadings (i.e., the first six digits) of the HS code. The model also provides reasoning for its suggestion in the form of a document that is interpretable by customs officers. We evaluated the model using 5,000 cases that recently received a classification request. The results showed that the top-3 suggestions made by our model had an accuracy of 93.9% when classifying 925 challenging subheadings. A user study with 32 customs experts further confirmed that our algorithmic suggestions accompanied by explainable reasonings, can substantially reduce the time and effort taken by customs officers for classification reviews.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"10 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138529966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey on Graph Representation Learning Methods","authors":"Shima Khoshraftar, Aijun An","doi":"10.1145/3633518","DOIUrl":"https://doi.org/10.1145/3633518","url":null,"abstract":"<p>Graph representation learning has been a very active research area in recent years. The goal of graph representation learning is to generate graph representation vectors that capture the structure and features of large graphs accurately. This is especially important because the quality of the graph representation vectors will affect the performance of these vectors in downstream tasks such as node classification, link prediction and anomaly detection. Many techniques have been proposed for generating effective graph representation vectors, which generally fall into two categories: traditional graph embedding methods and graph neural nets (GNN) based methods. These methods can be applied to both static and dynamic graphs. A static graph is a single fixed graph, while a dynamic graph evolves over time and its nodes and edges can be added or deleted from the graph. In this survey, we review the graph embedding methods in both traditional and GNN-based categories for both static and dynamic graphs and include the recent papers published until the time of submission. In addition, we summarize a number of limitations of GNNs and the proposed solutions to these limitations. Such a summary has not been provided in previous surveys. 
Finally, we explore some open and ongoing research directions for future work.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"15 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138529997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"E2Storyline: Visualizing the Relationship with Triplet Entities and Event Discovery","authors":"Yunchao Wang, Guodao Sun, Zihao Zhu, Tong Li, Ling Chen, Ronghua Liang","doi":"10.1145/3633519","DOIUrl":"https://doi.org/10.1145/3633519","url":null,"abstract":"<p>The narrative progression of events, evolving into a cohesive story, relies on the entity-entity relationships. Among the plethora of visualization techniques, storyline visualization has gained significant recognition for its effectiveness in offering an overview of story trends, revealing entity relationships, and facilitating visual communication. However, existing methods for storyline visualization often fall short in accurately depicting the specific relationships between entities. In this study, we present <i>E</i><sup>2</sup>Storyline, a novel approach that emphasizes simplicity and aesthetics of layout while effectively conveying entity-entity relationships to users. To achieve this, we begin by extracting entity-entity relationships from textual data and representing them as subject-predicate-object (SPO) triplets, thereby obtaining structured data. By considering three types of design requirements, we establish new optimization objectives and model the layout problem using multi-objective optimization (MOO) techniques. The aforementioned SPO triplets, together with time and event information, are incorporated into the optimization model to ensure a straightforward and easily comprehensible storyline layout. Through a qualitative user study, we determine that a pixel-based view is the most suitable method for displaying the relationships between entities. Finally, we apply <i>E</i><sup>2</sup>Storyline to real-world data, including movie synopses and live text commentaries. 
Through comprehensive case studies, we demonstrate that <i>E</i><sup>2</sup>Storyline enables users to better extract information from stories and comprehend the relationships between entities.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"215 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138529995","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Responsible Recommendation Services with Blockchain Empowered Asynchronous Federated Learning","authors":"Waqar Ali, Rajesh Kumar, Xiangmin Zhou, Jie Shao","doi":"10.1145/3633520","DOIUrl":"https://doi.org/10.1145/3633520","url":null,"abstract":"<p>Privacy and trust are highly demanding in practical recommendation engines. Although Federated Learning (FL) has significantly addressed privacy concerns, commercial operators are still worried about several technical challenges while bringing FL into production. Additionally, classical FL has several intrinsic operational limitations such as single-point failure, data and model tampering, and heterogenic clients participating in the FL process. To address these challenges in practical recommenders, we propose a responsible recommendation generation framework based on blockchain-empowered asynchronous FL that can be adopted for any model-based recommender system. In standard FL settings, we build an additional aggregation layer in which multiple trusted nodes guided by a mediator component perform gradient aggregation to achieve an optimal model locally in a parallel fashion. The mediator partitions users into <i>K</i> clusters, and each cluster is represented by a cluster head. Once a cluster gets semi-global convergence, the cluster head transmits model gradients to the FL server for global aggregation. Additionally, the trusted cluster heads are responsible to submit the converged semi-global model to a blockchain to ensure tamper resilience. In our settings, an additional mediator component works like an independent observer that monitors the performance of each cluster head, updates a reward score, and records it into a digital ledger. 
Finally, evaluation results on three diversified benchmarks illustrate that the recommendation performance on selected measures is considerably comparable with the standard and federated version of a well-known neural collaborative filtering recommender.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"26 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138529969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Pruning of Deep Ensembles with Focal Diversity","authors":"Yanzhao Wu, Ka-Ho Chow, Wenqi Wei, Ling Liu","doi":"10.1145/3633286","DOIUrl":"https://doi.org/10.1145/3633286","url":null,"abstract":"<p>Deep neural network ensembles combine the wisdom of multiple deep neural networks to improve the generalizability and robustness over individual networks. It has gained increasing popularity to study and apply deep ensemble techniques in the deep learning community. Some mission-critical applications utilize a large number of deep neural networks to form deep ensembles to achieve desired accuracy and resilience, which introduces high time and space costs for ensemble execution. However, it still remains a critical challenge whether a small subset of the entire deep ensemble can achieve the same or better generalizability and how to effectively identify these small deep ensembles for improving the space and time efficiency of ensemble execution. This paper presents a novel deep ensemble pruning approach, which can efficiently identify smaller deep ensembles and provide higher ensemble accuracy than the entire deep ensemble of a large number of member networks. Our hierarchical ensemble pruning approach (HQ) leverages three novel ensemble pruning techniques. First, we show that the focal ensemble diversity metrics can accurately capture the complementary capacity of the member networks of an ensemble team, which can guide ensemble pruning. Second, we design a focal ensemble diversity based hierarchical pruning approach, which will iteratively find high quality deep ensembles with low cost and high accuracy. Third, we develop a focal diversity consensus method to integrate multiple focal diversity metrics to refine ensemble pruning results, where smaller deep ensembles can be effectively identified to offer high accuracy, high robustness and high ensemble execution efficiency. 
Evaluated using popular benchmark datasets, we demonstrate that the proposed hierarchical ensemble pruning approach can effectively identify high quality deep ensembles with better classification generalizability while being more time and space efficient in ensemble decision making. We have released the source codes on GitHub at https://github.com/git-disl/HQ-Ensemble.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"7 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138542210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Weights and Prior Reward in Policy Fusion for Compound Agent Learning","authors":"Meng Xu, Yechao She, Yang Jin, Jianping Wang","doi":"10.1145/3623405","DOIUrl":"https://doi.org/10.1145/3623405","url":null,"abstract":"<p>In the Deep Reinforcement Learning (DRL) domain, a compound learning task is often decomposed into several sub-tasks in a divide-and-conquer manner, each trained separately and then fused concurrently to achieve the original task, referred to as policy fusion. However, the state-of-the-art (SOTA) policy fusion methods treat the importance of sub-tasks equally throughout the task process, eliminating the possibility of the agent relying on different sub-tasks at various stages. To address this limitation, we propose a generic policy fusion approach, referred to as Policy Fusion Learning with Dynamic Weights and Prior Reward (PFLDWPR), to automate the time-varying selection of sub-tasks. Specifically, PFLDWPR produces a time-varying one-hot vector for sub-tasks to dynamically select a suitable sub-task and mask the rest throughout the entire task process, enabling the fused strategy to optimally guide the agent in executing the compound task. The sub-tasks with the dynamic one-hot vector are then aggregated to obtain the action policy for the original task. Moreover, we collect sub-tasks’ rewards at the pre-training stage as a prior reward, which, along with the current reward, is used to train the policy fusion network. Thus, this approach reduces fusion bias by leveraging prior experience. 
Experimental results under three popular learning tasks demonstrate that the proposed method significantly improves three SOTA policy fusion methods in terms of task duration, episode reward, and score difference.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"35 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138529964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What Your Next Check-in Might Look Like: Next Check-in Behavior Prediction","authors":"Heli Sun, Chen Cao, Xuguang Chu, Tingting Hu, Junzhi Lu, Liang He, Zhi Wang, Hui He, Hui Xiong","doi":"10.1145/3625234","DOIUrl":"https://doi.org/10.1145/3625234","url":null,"abstract":"<p>In recent years, the next-POI recommendation has become a trending research topic in the field of trajectory data mining. For protection of user privacy, users’ complete GPS trajectories are difficult to obtain. The check-in information posted by users on social networks has become an important data source for Spatio-temporal Trajectory research. However, state-of-the-art methods neglect the social meaning and the information dissemination function of check-in behavior. The social meaning is an important reason why users are willing to post check-ins on social networks, and the information dissemination function means that users can affect each other’s behavior through check-ins. The above characteristics of the check-in behavior make it different from the visiting behavior. We consider a new problem of predicting the next check-in behavior including the check-in time, the POI (point-of-interest) where the check-in is located, functional semantics of the POI, and so on. To solve the proposed problem, we build a multi-task learning model called DPMTM, and a pre-training module is designed to extract dynamic social semantics of check-in behaviors. 
Our results show that the DPMTM model works well in the check-in behavior problem.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"23 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138529971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning with Euler Collaborative Representation for Robust Pattern Analysis","authors":"Jianhang Zhou, Guancheng Wang, Shaoning Zeng, Bob Zhang","doi":"10.1145/3625235","DOIUrl":"https://doi.org/10.1145/3625235","url":null,"abstract":"<p>The Collaborative Representation (CR) framework has provided various effective and efficient solutions to pattern analysis. By leveraging between discriminative coefficient coding (l<sub>2</sub> regularization) and the best reconstruction quality (collaboration), the CR framework can exploit discriminative patterns efficiently in high-dimensional space. Due to the limitations of its linear representation mechanism, the CR must sacrifice its superior efficiency for capturing the non-linear information with the kernel trick. Besides this, even if the coding is indispensable, there is no mechanism designed to keep the CR free from inevitable noise brought by real-world information systems. In addition, the CR only emphasizes exploiting discriminative patterns on coefficients rather than on the reconstruction. To tackle the problems of primitive CR with a unified framework, in this article we propose the Euler Collaborative Representation (E-CR) framework. Inferred from the Euler formula, in the proposed method, we map the samples to a complex space to capture discriminative and non-linear information without the high-dimensional hidden kernel space. Based on the proposed E-CR framework, we form two specific classifiers: the Euler Collaborative Representation based Classifier (E-CRC) and the Euler Probabilistic Collaborative Representation based Classifier (E-PROCRC). Furthermore, we specifically designed a robust algorithm for E-CR (termed as <i>R-E-CR</i>) to deal with the inevitable noises in real-world systems. Robust iterative algorithms have been specially designed for solving E-CRC and E-PROCRC. 
We correspondingly present a series of theoretical proofs to ensure the completeness of the theory for the proposed robust algorithms. We evaluated E-CR and R-E-CR with various experiments to show its competitive performance and efficiency.</p>","PeriodicalId":48967,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology","volume":"52 1","pages":""},"PeriodicalIF":5.0,"publicationDate":"2023-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138529994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}