Daniel Glake, Fabian Panse, Ulfia A. Lenfers, T. Clemen, N. Ritter
{"title":"Spatio-temporal Trajectory Learning using Simulation Systems","authors":"Daniel Glake, Fabian Panse, Ulfia A. Lenfers, T. Clemen, N. Ritter","doi":"10.1145/3511808.3557457","DOIUrl":"https://doi.org/10.1145/3511808.3557457","url":null,"abstract":"Spatio-temporal trajectories are essential factors for systems used in public transport, social ecology, and many other disciplines where movement is a relevant dynamic process. Each trajectory describes multiple state changes over time, induced by individual decision-making, based on psychological and social factors with physical constraints. Since a crucial factor of such systems is to reason about the potential trajectories in a closed environment, the primary problem is the realistic replication of individual decision making. Mental factors are often uncertain, not available or cannot be observed in reality. Thus, models for data generation must be derived from abstract studies using probabilities. To solve these problems, we present Multi-Agent-Trajectory-Learning (MATL), a state transition model to learn and generate human-like spatio-temporal trajectory data. MATL combines Generative Adversarial Imitation Learning (GAIL) with a simulation system that uses constraints given by an agent-based model (ABM). We use GAIL to learn policies in conjunction with the ABM, resulting in a novel concept of individual decision making. Experiments with standard trajectory predictions show that our approach produces similar results to real-world observations.","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133923163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kaiping Zheng, Thao Nguyen, Changshuo Liu, C. E. Goh, B. Ooi
{"title":"eDental: Managing Your Dental Care in Diet Diaries","authors":"Kaiping Zheng, Thao Nguyen, Changshuo Liu, C. E. Goh, B. Ooi","doi":"10.1145/3511808.3557215","DOIUrl":"https://doi.org/10.1145/3511808.3557215","url":null,"abstract":"The demand for satisfactory dental care management has attracted a great deal of attention from both dentists and patients. Reviews of existing systems and approaches reveal that they either fail to take into account patients' daily diets that are a significant risk factor for dental decay, or are too complicated for patients. To facilitate patients' tracking and management of their dietary risk factors for dental decay, and improve dentists' identification of decay-related dietary patterns, we develop a system called eDental, in collaboration with dentists and oral surgeons, as a mechanism to record users' detailed daily diet diaries by snapping food photos. The system identifies the food using a state-of-the-art deep learning model and analyzes patients' dental care conditions and potential dental risks. eDental is a full-fledged oral care system with easy-to-use user interfaces. In this demonstration, we showcase eDental's key functionalities for managing patients' dental care via diet diaries.","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"36 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131753274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph-based Weakly Supervised Framework for Semantic Relevance Learning in E-commerce","authors":"Zhiyuan Zeng, Yuzhi Huang, Tianshu Wu, Hongbo Deng, Jian Xu, Bo Zheng","doi":"10.1145/3511808.3557143","DOIUrl":"https://doi.org/10.1145/3511808.3557143","url":null,"abstract":"Product search is fundamental in online e-commerce systems; it needs to quickly and accurately find the products that users require. Relevance is essential for e-commerce search: its role is to avoid displaying products that do not match the search intent and to optimize the user experience. Measuring semantic relevance is necessary because distributional biases between search queries and product titles may lead to large lexical differences between relevant textual expressions. Several problems limit the performance of semantic relevance learning, including extremely long-tail product distribution and low-quality labeled data. Recent works attempt to conduct relevance learning through user behaviors. However, noisy user behavior can easily cause inadequate semantic modeling. Therefore, it is valuable but challenging to utilize user behavior in relevance learning. In this paper, we first propose a weakly supervised contrastive learning framework that focuses on how to provide effective semantic supervision and generate reasonable representation. We utilize topology structure information contained in a user behavior heterogeneous graph to design a semantically aware data construction strategy. Besides, we propose a contrastive learning framework suitable for e-commerce scenarios with targeted improvements in data augmentation and training objectives. For relevance calculation, we propose a novel hybrid method that combines fine-tuning and transfer learning. It eliminates the negative impacts caused by distributional bias and guarantees semantic matching capabilities. Extensive experiments and analyses show the promising performance of the proposed methods in relevance learning.","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132777121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lili Zhao, Linan Yue, Yanqing An, Yuren Zhang, Jun Yu, Qi Liu, Enhong Chen
{"title":"CPEE: Civil Case Judgment Prediction centering on the Trial Mode of Essential Elements","authors":"Lili Zhao, Linan Yue, Yanqing An, Yuren Zhang, Jun Yu, Qi Liu, Enhong Chen","doi":"10.1145/3511808.3557273","DOIUrl":"https://doi.org/10.1145/3511808.3557273","url":null,"abstract":"Civil Case Judgment Prediction (CCJP) is a fundamental task in the legal intelligence of the civil law system, which aims to automatically predict the judgment results on each plea of the plaintiff. Existing studies mainly focus on making judgment predictions only on a certain civil cause (e.g., the divorce dispute) by utilizing the fact descriptions and pleas of the plaintiff, which still suffer from the various causes and complicated legal essential elements in the real court. Thus, in this paper, we formalize CCJP as a multi-task learning problem and propose a CCJP method centering on the trial mode of essential elements, CPEE, which explores the practical judicial process and analyzes comprehensive legal essential elements to make judgment predictions. Specifically, we first construct three tasks (i.e., the predictions on the civil causes, law articles, and the final judgment on each plea) necessary for CCJP, that follow the judgment process and exploit the results of intermediate subtasks to make judgment predictions. Then we design a logic-enhanced network to predict the results of three tasks and conduct a comprehensive study of civil cases. Finally, owing to the interlinked and dependent relationships among each task, we adopt the cause prediction result to help predict law articles and incorporate them into final judgment prediction through a gate mechanism. Furthermore, since the existing dataset fails to provide sufficient case information, we construct a real-world CCJP dataset that contains various causes and comprehensive legal elements. Extensive experimental results on the dataset validate the effectiveness of our method.","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130947070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SmartQuery: An Active Learning Framework for Graph Neural Networks through Hybrid Uncertainty Reduction","authors":"Xiaoting Li, Yuhang Wu, Vineeth Rakesh, Yusan Lin, Hao Yang, Fei Wang","doi":"10.1145/3511808.3557701","DOIUrl":"https://doi.org/10.1145/3511808.3557701","url":null,"abstract":"Graph neural networks have achieved significant success in representation learning. However, the performance gains come at a cost; acquiring comprehensive labeled data for training can be prohibitively expensive. Active learning mitigates this issue by searching the unexplored data space and prioritizing the selection of data to maximize the model's performance gain. In this paper, we propose SMARTQUERY, a novel framework to learn a graph neural network with very few labeled nodes using a hybrid uncertainty reduction function. This is achieved using two key steps: (a) design a multi-stage active graph learning framework by exploiting diverse explicit graph information and (b) introduce label propagation to efficiently exploit known labels to assess the implicit embedding information. Using a comprehensive set of experiments on three network datasets, we demonstrate the competitive performance of our method against state-of-the-art methods with very few labeled data (up to 5 labeled nodes per class).","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131000739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Aggregator Time-Warping Heterogeneous Graph Neural Network for Personalized Micro-Video Recommendation","authors":"Jinkun Han, Wei Li, Zhipeng Cai, Yingshu Li","doi":"10.1145/3511808.3557403","DOIUrl":"https://doi.org/10.1145/3511808.3557403","url":null,"abstract":"Micro-video recommendation is attracting global attention and becoming a popular daily service for people of all ages. Recently, Graph Neural Networks-based micro-video recommendation has displayed performance improvement for many kinds of recommendation tasks. However, the existing works fail to fully consider the characteristics of micro-videos, such as the high timeliness of news nature micro-video recommendation and sequential interactions of frequently changing interests. In this paper, a novel Multi-aggregator Time-warping Heterogeneous Graph Neural Network (MTHGNN) is proposed for personalized news nature micro-video recommendation based on sequential sessions, where characteristics of micro-videos are comprehensively studied, users' preference is mined via multi-aggregator, the temporal and dynamic changes of users' preference are captured, and timeliness is considered. Through comparison with state-of-the-art methods, the experimental results validate the superiority of our MTHGNN model.","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133583275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Subspace Co-clustering with Two-Way Graph Convolution","authors":"Chakib Fettal, Lazhar Labiod, M. Nadif","doi":"10.1145/3511808.3557706","DOIUrl":"https://doi.org/10.1145/3511808.3557706","url":null,"abstract":"Subspace clustering aims to cluster high dimensional data lying in a union of low-dimensional subspaces. It has shown good results on the task of image clustering but text clustering, using document-term matrices, proved more impervious to advances based on this approach. We hypothesize that this is because, compared to image data, text data is generally higher dimensional and sparser. This renders subspace clustering impractical in such a context. Here, we leverage subspace clustering for text by addressing these issues. We first extend the concept of subspace clustering to co-clustering, which has been extensively used on document-term matrices due to the resulting interplay between the document and term representations. We then address the sparsity problem through a two-way graph convolution, which promotes the grouping effect that has been credited for the effectiveness of some subspace clustering models. The proposed formulation results in an algorithm that is efficient both in terms of computational and spatial complexity. We show the competitiveness of our model w.r.t. the state of the art on document-term attributed graph datasets in terms of performance and efficiency.","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"356 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133283973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"℘-MinHash Algorithm for Continuous Probability Measures: Theory and Application to Machine Learning","authors":"Ping Li, Xiaoyun Li, G. Samorodnitsky","doi":"10.1145/3511808.3557413","DOIUrl":"https://doi.org/10.1145/3511808.3557413","url":null,"abstract":"This paper studies the scale-invariant \"probability Jaccard\" (ProbJ), denoted as ℐ℘, which is another variant of weighted Jaccard similarity. The standard and commonly used Jaccard index is not invariant to data scaling. Thus, the probability Jaccard can be a potentially useful extension to probability distributions. Before our paper, the problem of hashing ℐ℘ for continuous probability measures was open, and rigorous definitions and analysis were still absent in the literature. In our work, we solve this problem systematically and completely. Specifically, we formalize the definition of ℐ℘ in continuous measure space, and propose a general ℘-MinHash sampling algorithm which generates samples following any target distribution, and preserves ℐ℘ between two distributions by the hash collision. In addition, a refined early stopping rule is proposed under a practical boundedness assumption. We validate the theory through simulation and experiments, and demonstrate the application of our method in machine learning problems.","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115368562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abhisek Tiwari, Manisimha Manthena, S. Saha, P. Bhattacharyya, Minakshi Dhar, Sarbajeet Tiwari
{"title":"Dr. Can See: Towards a Multi-modal Disease Diagnosis Virtual Assistant","authors":"Abhisek Tiwari, Manisimha Manthena, S. Saha, P. Bhattacharyya, Minakshi Dhar, Sarbajeet Tiwari","doi":"10.1145/3511808.3557296","DOIUrl":"https://doi.org/10.1145/3511808.3557296","url":null,"abstract":"Artificial Intelligence-based clinical decision support is gaining ever-growing popularity and demand in both the research and industry communities. One such manifestation is automatic disease diagnosis, which aims to assist clinicians in conducting symptom investigations and disease diagnoses. When we consult with doctors, we often report and describe our health conditions with visual aids. Moreover, many people are unacquainted with several symptoms and medical terms, such as mouth ulcer and skin growth. Therefore, a visual form of symptom reporting is a necessity. Motivated by the efficacy of a visual form of symptom reporting, we propose and build a novel end-to-end Multi-modal Disease Diagnosis Virtual Assistant (MDD-VA) using a reinforcement learning technique. In conversation, users' responses are heavily influenced by the ongoing dialogue context, and multi-modal responses appear to be no different. We also propose and incorporate a Context-aware Symptom Image Identification module that leverages discourse context in addition to the symptom image for identifying symptoms effectively. Furthermore, we first curate a multi-modal conversational medical dialogue corpus in English that is annotated with intent, symptoms, and visual information. The proposed MDD-VA outperforms multiple uni-modal baselines in both automatic and human evaluation, which firmly establishes the critical role of symptom information provided by visuals. The dataset and code are available at https://github.com/NLP-RL/DrCanSee","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115756053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LCD: Adaptive Label Correction for Denoising Music Recommendation","authors":"Quanyu Dai, Yalei Lv, Jieming Zhu, Junjie Ye, Zhenhua Dong, Rui Zhang, Shutao Xia, Ruiming Tang","doi":"10.1145/3511808.3557625","DOIUrl":"https://doi.org/10.1145/3511808.3557625","url":null,"abstract":"Music recommendation is usually modeled as a Click-Through Rate (CTR) prediction problem, which estimates the probability of a user listening to a recommended song. CTR prediction can be formulated as a binary classification problem where the played songs are labeled as positive samples and the skipped songs are labeled as negative samples. However, such naively defined labels are noisy and biased in practice, causing inaccurate model predictions. In this work, we first identify serious label noise issues in an industrial music App, and then propose an adaptive Label Correction method for Denoising (LCD) music recommendation by ensembling the noisy labels and the model outputs to encourage a consensus prediction. Extensive offline experiments are conducted to evaluate the effectiveness of LCD on both industrial and public datasets. Furthermore, in a one-week online A/B test, LCD also significantly increases both the music play count and time per user by 1% to 5%.","PeriodicalId":389624,"journal":{"name":"Proceedings of the 31st ACM International Conference on Information & Knowledge Management","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114480752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}