{"title":"Causal Discovery from Temporal Data: An Overview and New Perspectives","authors":"Chang Gong, Chuzhe Zhang, Di Yao, Jingping Bi, Wenbin Li, YongJun Xu","doi":"10.1145/3705297","DOIUrl":"https://doi.org/10.1145/3705297","url":null,"abstract":"Temporal data, representing chronological observations of complex systems, has always been a typical data structure that can be widely generated by many domains, such as industry, finance, healthcare and climatology. Analyzing the underlying structures, <jats:italic>i.e.</jats:italic> , the causal relations, could be extremely valuable for various applications. Recently, causal discovery from temporal data has been considered as an interesting yet critical task and attracted much research attention. According to the nature and structure of temporal data, existing causal discovery works can be divided into two highly correlated categories <jats:italic>i.e.</jats:italic> , multivariate time series causal discovery, and event sequence causal discovery. However, most previous surveys are only focused on the multivariate time series causal discovery but ignore the second category. In this paper, we specify the similarity between the two categories and provide an overview of existing solutions. Furthermore, we provide public datasets, evaluation metrics and new perspectives for temporal data causal discovery.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"184 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142694336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Naeem Ullah, Javed Ali Khan, Ivanoe De Falco, Giovanna Sannino
{"title":"Explainable Artificial Intelligence: Importance, Use Domains, Stages, Output Shapes, and Challenges","authors":"Naeem Ullah, Javed Ali Khan, Ivanoe De Falco, Giovanna Sannino","doi":"10.1145/3705724","DOIUrl":"https://doi.org/10.1145/3705724","url":null,"abstract":"There is an urgent need in many application areas for eXplainable ArtificiaI Intelligence (XAI) approaches to boost people’s confidence and trust in Artificial Intelligence methods. Current works concentrate on specific aspects of XAI and avoid a comprehensive perspective. This study undertakes a systematic survey of importance, approaches, methods, and application domains to address this gap and provide a comprehensive understanding of the XAI domain. Applying the Systematic Literature Review approach has resulted in finding and discussing 155 papers, allowing a wide discussion on the strengths, limitations, and challenges of XAI methods and future research directions.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"27 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142694355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Seyma Yucer, Furkan Tektas, Noura Al Moubayed, Toby Breckon
{"title":"Racial Bias within Face Recognition: A Survey","authors":"Seyma Yucer, Furkan Tektas, Noura Al Moubayed, Toby Breckon","doi":"10.1145/3705295","DOIUrl":"https://doi.org/10.1145/3705295","url":null,"abstract":"Facial recognition is one of the most academically studied and industrially developed areas within computer vision where we readily find associated applications deployed globally. This widespread adoption has uncovered significant performance variation across subjects of different racial profiles leading to focused research attention on racial bias within face recognition spanning both current causation and future potential solutions. In support, this study provides an extensive taxonomic review of research on racial bias within face recognition exploring every aspect and stage of the associated facial processing pipeline. Firstly, we discuss the problem definition of racial bias, starting with race definition, grouping strategies, and the societal implications of using race or race-related groupings. Secondly, we divide the common face recognition processing pipeline into four stages: image acquisition, face localisation, face representation, face verification and identification, and review the relevant corresponding literature associated with each stage. The overall aim is to provide comprehensive coverage of the racial bias problem with respect to each and every stage of the face recognition processing pipeline whilst also highlighting the potential pitfalls and limitations of contemporary mitigation strategies that need to be considered within future research endeavours or commercial applications alike.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"115 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142690743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey of Machine Learning for Urban Decision Making: Applications in Planning, Transportation, and Healthcare","authors":"Yu Zheng, Qianyue Hao, Jingwei Wang, Changzheng Gao, Jinwei Chen, Depeng Jin, Yong Li","doi":"10.1145/3695986","DOIUrl":"https://doi.org/10.1145/3695986","url":null,"abstract":"Developing smart cities is vital for ensuring sustainable development and improving human well-being. One critical aspect of building smart cities is designing intelligent methods to address various decision-making problems that arise in urban areas. As machine learning techniques continue to advance rapidly, a growing body of research has been focused on utilizing these methods to achieve intelligent urban decision making. In this survey, we conduct a systematic literature review on the application of machine learning methods in urban decision making, with a focus on planning, transportation, and healthcare. First, we provide a taxonomy based on typical applications of machine learning methods for urban decision making. We then present background knowledge on these tasks and the machine learning techniques that have been adopted to solve them. Next, we examine the challenges and advantages of applying machine learning in urban decision making, including issues related to urban complexity, urban heterogeneity and computational cost. Afterward and primarily, we elaborate on the existing machine learning methods that aim to solve urban decision making tasks in planning, transportation, and healthcare, highlighting their strengths and limitations. Finally, we discuss open problems and the future directions of applying machine learning to enable intelligent urban decision making, such as developing foundation models and combining reinforcement learning algorithms with human feedback. We hope this survey can help researchers in related fields understand the recent progress made in existing works, and inspire novel applications of machine learning in smart cities.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"15 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142691069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Xuanhe Zhou, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Guoliang Li, Zhiyuan Liu, Maosong Sun
{"title":"Tool Learning with Foundation Models","authors":"Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Xuanhe Zhou, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Guoliang Li, Zhiyuan Liu, Maosong Sun","doi":"10.1145/3704435","DOIUrl":"https://doi.org/10.1145/3704435","url":null,"abstract":"Humans possess an extraordinary ability to create and utilize tools. With the advent of foundation models, artificial intelligence systems have the potential to be equally adept in tool use as humans. This paradigm, which is dubbed as <jats:italic>tool learning with foundation models</jats:italic> , combines the strengths of tools and foundation models to achieve enhanced accuracy, efficiency, and automation in problem-solving. This paper presents a systematic investigation and comprehensive review of tool learning. We first introduce the background of tool learning, including its cognitive origins, the paradigm shift of foundation models, and the complementary roles of tools and models. Then we recapitulate existing tool learning research and formulate a general framework: starting from understanding the user instruction, models should learn to decompose a complex task into several subtasks, dynamically adjust their plan through reasoning, and effectively conquer each sub-task by selecting appropriate tools. We also discuss how to train models for improved tool-use capabilities and facilitate generalization in tool learning. Finally, we discuss several open problems that require further investigation, such as ensuring trustworthy tool use, enabling tool creation with foundation models, and addressing personalization challenges. Overall, we hope this paper could inspire future research in integrating tools with foundation models.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"15 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142684949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
David Jin, Niclas Kannengießer, Sascha Rank, Ali Sunyaev
{"title":"Collaborative Distributed Machine Learning","authors":"David Jin, Niclas Kannengießer, Sascha Rank, Ali Sunyaev","doi":"10.1145/3704807","DOIUrl":"https://doi.org/10.1145/3704807","url":null,"abstract":"Various collaborative distributed machine learning (CDML) systems, including federated learning systems and swarm learning systems, with different key traits were developed to leverage resources for the development and use of machine learning (ML) models in a confidentiality-preserving way. To meet use case requirements, suitable CDML systems need to be selected. However, comparison between CDML systems to assess their suitability for use cases is often difficult. To support comparison of CDML systems and introduce scientific and practical audiences to the principal functioning and key traits of CDML systems, this work presents a CDML system conceptualization and CDML archetypes.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"14 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142678439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Motivations, Challenges, Best Practices, and Benefits for Bots and Conversational Agents in Software Engineering: A Multivocal Literature Review","authors":"Stefano Lambiase, Gemma Catolino, Fabio Palomba, Filomena Ferrucci","doi":"10.1145/3704806","DOIUrl":"https://doi.org/10.1145/3704806","url":null,"abstract":"<jats:italic> Bots </jats:italic> are software systems designed to support users by automating specific processes, tasks, or activities. When these systems implement a conversational component to interact with users, they are also known as <jats:italic> conversational agents </jats:italic> or <jats:italic>chatbots</jats:italic> . Bots—particularly in their conversation-oriented version and AI-powered—have seen increased adoption over time for software development and engineering purposes. Despite their exciting potential, which has been further enhanced by the advent of Generative AI and Large Language Models, bots still face challenges in terms of development and integration into the development cycle, as practitioners report that bots can add difficulties rather than provide improvements. In this work, we aim to provide a taxonomy for characterizing bots, as well as a series of challenges for their adoption in software engineering, accompanied by potential mitigation strategies. To achieve our objectives, we conducted a <jats:italic>multivocal literature review</jats:italic> , examining both research and practitioner literature. Through such an approach, we hope to contribute to both researchers and practitioners by providing (i) a series of future research directions to pursue, (ii) a list of strategies to adopt for improving the use of bots for software engineering purposes, and (iii) fostering technology and knowledge transfer from the research field to practice—one of the primary goals of multivocal literature reviews.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"23 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142678491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Corinne Allaart, Saba Amiri, Henri Bal, Adam Belloum, Leon Gommans, Aart van Halteren, Sander Klous
{"title":"Private and Secure Distributed Deep Learning: A Survey","authors":"Corinne Allaart, Saba Amiri, Henri Bal, Adam Belloum, Leon Gommans, Aart van Halteren, Sander Klous","doi":"10.1145/3703452","DOIUrl":"https://doi.org/10.1145/3703452","url":null,"abstract":"Traditionally, deep learning practitioners would bring data into a central repository for model training and inference. Recent developments in distributed learning, such as federated learning and deep learning as a service (DLaaS) do not require centralized data and instead push computing to where the distributed datasets reside. These decentralized training schemes, however, introduce additional security and privacy challenges. This survey first structures the field of distributed learning into two main paradigms and then provides an overview of the recently published protective measures for each. This work highlights both secure training methods as well as private inference measures. Our analyses show that recent publications while being highly dependent on the problem definition, report progress in terms of security, privacy, and efficiency. Nevertheless, we also identify several current issues within the private and secure distributed deep learning (PSDDL) field that require more research. We discuss these issues and provide a general overview of how they might be resolved.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"165 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142642913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shaobo Zhang, Yimeng Pan, Qin Liu, Zheng Yan, Kim-Kwang Raymond Choo, Guojun Wang
{"title":"Backdoor Attacks and Defenses Targeting Multi-Domain AI Models: A Comprehensive Review","authors":"Shaobo Zhang, Yimeng Pan, Qin Liu, Zheng Yan, Kim-Kwang Raymond Choo, Guojun Wang","doi":"10.1145/3704725","DOIUrl":"https://doi.org/10.1145/3704725","url":null,"abstract":"Since the emergence of security concerns in artificial intelligence (AI), there has been significant attention devoted to the examination of backdoor attacks. Attackers can utilize backdoor attacks to manipulate model predictions, leading to significant potential harm. However, current research on backdoor attacks and defenses in both theoretical and practical fields still has many shortcomings. To systematically analyze these shortcomings and address the lack of comprehensive reviews, this paper presents a comprehensive and systematic summary of both backdoor attacks and defenses targeting multi-domain AI models. Simultaneously, based on the design principles and shared characteristics of triggers in different domains and the implementation stages of backdoor defense, this paper proposes a new classification method for backdoor attacks and defenses. We use this method to extensively review backdoor attacks in the fields of computer vision and natural language processing, and also examine the current applications of backdoor attacks in audio recognition, video action recognition, multimodal tasks, time series tasks, generative learning, and reinforcement learning, while critically analyzing the open problems of various backdoor attack techniques and defense strategies. Finally, this paper builds upon the analysis of the current state of AI security to further explore potential future research directions for backdoor attacks and defenses.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"5 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142642616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anton Danholt Lautrup, Tobias Hyrup, Arthur Zimek, Peter Schneider-Kamp
{"title":"Systematic Review of Generative Modelling Tools and Utility Metrics for Fully Synthetic Tabular Data","authors":"Anton Danholt Lautrup, Tobias Hyrup, Arthur Zimek, Peter Schneider-Kamp","doi":"10.1145/3704437","DOIUrl":"https://doi.org/10.1145/3704437","url":null,"abstract":"Sharing data with third parties is essential for advancing science, but it is becoming more and more difficult with the rise of data protection regulations, ethical restrictions, and growing fear of misuse. Fully synthetic data, which transcends anonymisation, may be the key to unlocking valuable untapped insights stored away in secured data vaults. This review examines current synthetic data generation methods and their utility measurement. We found that more traditional generative models such as Classification and Regression Tree models alongside Bayesian Networks remain highly relevant and are still capable of surpassing deep learning alternatives like Generative Adversarial Networks. However, our findings also display the same lack of agreement on metrics for evaluation, uncovered in earlier reviews, posing a persistent obstacle to advancing the field. We propose a tool for evaluating the utility of synthetic data and illustrate how it can be applied to three synthetic data generation models. By streamlining evaluation and promoting agreement on metrics, researchers can explore novel methods and generate compelling results that will convince data curators and lawmakers to embrace synthetic data. Our review emphasises the potential of synthetic data and highlights the need for greater collaboration and standardisation to unlock its full potential.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"21 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142637686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}