{"title":"Adversarial Attacks of Vision Tasks in the Past 10 Years: A Survey","authors":"Chiyu Zhang, Lu Zhou, Xiaogang Xu, Jiafei Wu, Zhe Liu","doi":"10.1145/3743126","DOIUrl":"https://doi.org/10.1145/3743126","url":null,"abstract":"With the advent of Large Vision-Language Models (LVLMs), new attack vectors, such as cognitive bias, prompt injection, and jailbreaking, have emerged. Understanding these attacks helps improve system robustness and demystify neural networks. However, existing surveys often focus on attack taxonomy and lack in-depth analysis, such as 1) unified insights into adversariality, transferability, and generalization; 2) a detailed evaluation framework; 3) motivation-driven attack categorizations; and 4) an integrated perspective on both traditional and LVLM attacks. This article addresses these gaps by offering a thorough summary of traditional and LVLM adversarial attacks, emphasizing their connections and distinctions, and providing actionable insights for future research.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"85 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144237154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs","authors":"Shreyas Chaudhari, Pranjal Aggarwal, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan, Ameet Deshpande, Bruno Castro da Silva","doi":"10.1145/3743127","DOIUrl":"https://doi.org/10.1145/3743127","url":null,"abstract":"A significant challenge in training large language models (LLMs) as effective assistants is aligning them with human preferences. Reinforcement learning from human feedback (RLHF) has emerged as a promising solution. However, our understanding of RLHF is often limited to initial design choices. This paper analyzes RLHF through reinforcement learning principles, focusing on the reward model. It examines modeling choices and function approximation caveats, highlighting assumptions about reward expressivity and revealing limitations like incorrect generalization, model misspecification, and sparse feedback. A categorical review of current literature provides insights for researchers to understand the challenges of RLHF and build upon existing methods.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"17 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2025-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144228644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Message Brokers for Generative AI: Survey, Challenges, and Opportunities","authors":"Alaa Saleh, Roberto Morabito, Schahram Dustdar, Sasu Tarkoma, Susanna Pirttikangas, Lauri Lovén","doi":"10.1145/3742891","DOIUrl":"https://doi.org/10.1145/3742891","url":null,"abstract":"In today’s digital world, GenAI is becoming increasingly prevalent by enabling unparalleled content generation capabilities for a wide range of advanced applications. This surge in adoption has sparked a significant increase in demand for data-centric GenAI models spanning the distributed edge-cloud continuum, placing increasing demands on communication infrastructures, highlighting the necessity for robust communication solutions. Central to this need are message brokers, which serve as essential channels for data transfer within various system components. This survey aims to delve into a comprehensive analysis of traditional and modern message brokers based on a variety of criteria, highlighting their critical role in enabling efficient data exchange in distributed AI systems. Furthermore, we explore the intrinsic constraints that the design and operation of each message broker might impose, highlighting their impact on real-world applicability. Finally, this study explores the enhancement of message broker mechanisms tailored to GenAI environments. 
It considers key factors such as scalability, semantic communication, and distributed inference that can guide future innovations and infrastructure advancements in the realm of GenAI data communication.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"25 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2025-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144228643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph Deep Learning for Time Series Forecasting","authors":"Andrea Cini, Ivan Marisca, Daniele Zambon, Cesare Alippi","doi":"10.1145/3742784","DOIUrl":"https://doi.org/10.1145/3742784","url":null,"abstract":"Graph deep learning methods have become popular tools to process collections of correlated time series. Unlike traditional multivariate forecasting methods, graph-based predictors leverage pairwise relationships by conditioning forecasts on graphs spanning the time series collection. The conditioning takes the form of architectural inductive biases on the forecasting architecture, resulting in a family of models called spatiotemporal graph neural networks. These biases allow for training global forecasting models on large collections of time series while localizing predictions w.r.t. each element in the set (nodes) by accounting for correlations among them (edges). Recent advances in graph neural networks and deep learning for time series forecasting make the adoption of such processing framework appealing and timely. However, most studies focus on refining existing architectures by exploiting modern deep-learning practices. Conversely, foundational and methodological aspects have not been subject to systematic investigation. To fill this void, this tutorial paper aims to introduce a comprehensive methodological framework formalizing the forecasting problem and providing design principles for graph-based predictors, as well as methods to assess their performance. 
In addition, together with an overview of the field, we provide design guidelines and best practices, as well as an in-depth discussion of open challenges and future directions.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"41 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144210783","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Benchmarking Relaxed Differential Privacy in Private Learning: A Comparative Survey","authors":"Zhaolong Zheng, Lin Yao, Haibo Hu, Guowei Wu","doi":"10.1145/3729216","DOIUrl":"https://doi.org/10.1145/3729216","url":null,"abstract":"Differential privacy (DP), a rigorously quantifiable privacy preservation technique, has found widespread application within the domain of machine learning. As DP techniques are implemented in machine learning algorithms, a significant and intricate trade-off between privacy and utility emerges, garnering extensive attention from researchers. In the pursuit of striking a delicate equilibrium between safeguarding sensitive data and optimizing its utility, researchers have introduced various variants of Relaxed Differential Privacy (RDP) definitions. These nuanced formulations, however, exhibit substantial diversity in their underlying principles and interpretations of the core concept of DP, thereby engendering a current void in the comprehensive synthesis of these related works. The principal objective of this article is twofold. Firstly, it aims to provide a comprehensive summary of pertinent research endeavors pertaining to RDP within the realm of machine learning. Secondly, it endeavors to empirically assess the impact on both privacy and utility stemming from machine learning algorithms founded upon these RDP definitions. 
Additionally, this article undertakes a systematic analysis of the foundational principles underpinning distinct variants of relaxed definitions, culminating in the development of a taxonomy that categorizes these RDP definitions.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"8 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144210784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey on Employing Large Language Models for Text-to-SQL Tasks","authors":"Liang Shi, Zhengju Tang, Nan Zhang, Xiaotong Zhang, Zhi Yang","doi":"10.1145/3737873","DOIUrl":"https://doi.org/10.1145/3737873","url":null,"abstract":"With the development of Large Language Models (LLMs), a wide range of LLM-based Text-to-SQL (Text2SQL) methods have emerged. This survey provides a comprehensive review of LLM-based Text2SQL studies. We first enumerate classic benchmarks and evaluation metrics. For the two mainstream methods, prompt engineering and finetuning, we introduce a comprehensive taxonomy and offer practical insights into each subcategory. We present an overall analysis of the above methods and various models evaluated on well-known datasets and extract some characteristics. Finally, we discuss the challenges and future directions in this field.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"54 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144210833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey of Program Analysis for Distributed Software Systems","authors":"Haipeng Cai","doi":"10.1145/3742900","DOIUrl":"https://doi.org/10.1145/3742900","url":null,"abstract":"Distributed software systems are pervasive today and they are increasingly developed/deployed to meet the growing needs for scalable computing. Given their critical roles in modern information infrastructures, assuring the quality of distributed software is crucial. As a fundamental methodology for software quality assurance in general, program analysis underlies a range of techniques and tools for constructing and assuring distributed systems. Yet to date there remains a lack of systematic understanding of what has been done and how far we have come in the field of program analysis for distributed systems. To gain a comprehensive and coherent view of this area, and hence inform relevant future research, this paper provides a systematic literature review of the (1) technical approaches, including analysis methodology, modality, underlying representation, algorithmic design, data utilized, and scope, (2) applications, with respect to the quality aspects served, and (3) evaluation, including the datasets and metrics considered, of various program analyses in the domain of distributed software in the past 30 years (1995–2024). 
In addition to knowledge systematization, we also extend our insights into the limitations of and challenges faced by current technique and evaluation designs, which shed light on potentially promising future research directions.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"31 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144210832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating EEG Microstate Analysis in Cognitive Software Engineering Tasks: A Systematic Mapping Study and Taxonomy","authors":"Willian Bolzan, Kleinner Farias","doi":"10.1145/3742899","DOIUrl":"https://doi.org/10.1145/3742899","url":null,"abstract":"Performing software engineering (SE) tasks requires the activation of software developers’ brain neural networks. Electroencephalography (EEG) microstate analysis emerges as a promising neurophysiological method to investigate the spatiotemporal dynamics of brain networks at high temporal resolution. An EEG microstate represents a unique topography of electric potentials over the multichannel EEG records. However, academia has neglected classifying published studies on EEG microstate analysis related to SE. Hence, a careful understanding of state-of-the-art studies remains limited and inconclusive. This article aims to classify studies on the EEG microstate analysis in cognitive SE tasks. We conducted a systematic mapping study following well-established guidelines to answer ten research questions. After careful filtering, 54 primary studies (out of 1,545) were selected from 8 electronic databases. The main results are that most primary studies focus on revealing brain dynamics, exploring a wide range of EEG microstate application contexts and experimental tasks, running empirical studies in a controlled environment, using K-means as a clustering method, and applying an ICA-based strategy to filter artifacts such as muscle activity and eye blinks. However, no study has applied EEG microstate analysis to SE, highlighting a significant gap and the need for further research. 
Finally, this article presents a classification taxonomy and identifies critical challenges and future research directions.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"62 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144210782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Survey on Factuality in Large Language Models","authors":"Cunxiang Wang, Xiaoze Liu, Yuanhao Yue, Qipeng Guo, Xiangkun Hu, Xiangru Tang, Tianhang Zhang, Cheng Jiayang, Yunzhi Yao, Xuming Hu, Zehan Qi, Wenyang Gao, Yidong Wang, Linyi Yang, Jindong Wang, Xing Xie, Zheng Zhang, Yue Zhang","doi":"10.1145/3742420","DOIUrl":"https://doi.org/10.1145/3742420","url":null,"abstract":"This survey addresses the crucial issue of factuality in Large Language Models (LLMs). As LLMs find applications across diverse domains, the reliability and accuracy of their outputs become vital. We define the “factuality issue” as the probability of LLMs producing content inconsistent with established facts. We first delve into the implications of these inaccuracies. Subsequently, we analyze the mechanisms through which LLMs store and process facts, seeking the primary causes of factual errors. Our discussion then transitions to methodologies for evaluating LLM factuality, emphasizing key metrics, benchmarks, and studies. We further explore strategies for enhancing LLM factuality. Our survey offers a structured guide for researchers aiming to fortify the factual reliability of LLMs. We consistently maintain and update the related open-source materials at https://github.com/wangcunxiang/LLM-Factuality-Survey.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"6 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2025-06-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144201711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An End-to-End Pipeline Perspective on Video Streaming in Best-Effort Networks: A Survey and Tutorial","authors":"Leonardo Peroni, Sergey Gorinsky","doi":"10.1145/3742472","DOIUrl":"https://doi.org/10.1145/3742472","url":null,"abstract":"Remaining a dominant force in Internet traffic, video streaming captivates end users, service providers, and researchers. This paper takes a pragmatic approach to reviewing recent advances in the field by focusing on the prevalent streaming paradigm that involves delivering long-form two-dimensional videos over the best-effort Internet with client-side adaptive bitrate (ABR) algorithms and assistance from content delivery networks (CDNs). To enhance accessibility, we supplement the survey with tutorial material. Unlike existing surveys that offer fragmented views, our work provides a holistic perspective on the entire end-to-end streaming pipeline, from video capture by a camera-equipped device to playback by the end user. Our novel perspective covers the ingestion, processing, and distribution stages of the pipeline and addresses key challenges such as video compression, upload, transcoding, ABR algorithms, CDN support, and quality of experience. We review over 200 papers and classify streaming designs by problem-solving methodology, whether based on intuition, theory, or machine learning. The survey further refines these methodology-based categories and characterizes each design by additional traits such as compatible codecs. We connect the reviewed research to real-world applications by discussing the practices of commercial streaming platforms. 
Finally, the survey highlights prominent current trends and outlines future directions in video streaming.","PeriodicalId":50926,"journal":{"name":"ACM Computing Surveys","volume":"38 1","pages":""},"PeriodicalIF":16.6,"publicationDate":"2025-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144189017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}