ACM Computing Surveys: Latest Articles

Influence Maximization on Signed Social Networks: A Survey

IF 16.6, Zone 1 (Computer Science)

ACM Computing Surveys Pub Date : 2025-10-16 DOI: 10.1145/3772061

Shashank Sheshar Singh, Vishal Srivastava, Avadh Kishor, Jayendra Barua, Neeraj Kumar

Citations: 0

AI Alignment: A Contemporary Survey

IF 16.6, Zone 1 (Computer Science)

ACM Computing Surveys Pub Date : 2025-10-15 DOI: 10.1145/3770749

Jiaming Ji, Tianyi Qiu, Boyuan Chen, Jiayi Zhou, Borong Zhang, Donghai Hong, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Lukas Vierling, Zhaowei Zhang, Fanzhi Zeng, Juntao Dai, Xuehai Pan, Hua Xu, Aidan O'Gara, Kwan Ng, Brian Tse, Jie Fu, Stephen McAleer, Yanfeng Wang, Mingchuan Yang, Yunhuai Liu, Yizhou Wang, Song-Chun Zhu, Yike Guo, Yaodong Yang, Wen Gao

Abstract: AI alignment aims to make AI systems behave in line with human intentions and values. As AI systems grow more capable, so do risks from misalignment. To provide a comprehensive and up-to-date overview of the alignment field, this survey examines the core concepts, methodology, and practice of alignment. First, we identify four principles as the key objectives of AI alignment: Robustness, Interpretability, Controllability, and Ethicality (RICE). Guided by these four principles, we outline the landscape of current alignment research and decompose it into two key components: forward alignment and backward alignment. The former aims to make AI systems aligned via alignment training, while the latter aims to gain evidence about the systems’ alignment and govern them appropriately to avoid exacerbating misalignment risks. On forward alignment, we discuss techniques for learning from feedback and learning under distribution shift. Specifically, we survey traditional preference modeling methods and reinforcement learning from human feedback, and further discuss potential frameworks to reach scalable oversight for tasks where effective human oversight is hard to obtain. Within learning under distribution shift, we also cover data distribution interventions, such as adversarial training that helps expand the distribution of training data, and algorithmic interventions to combat goal misgeneralization. On backward alignment, we discuss assurance techniques and governance practices. Specifically, we survey assurance methods of AI systems throughout their lifecycle, covering safety evaluation, interpretability, and human value compliance. We discuss current and prospective governance practices adopted by governments, industry actors, and other third parties, aimed at managing existing and future AI risks. This survey aims to provide a comprehensive yet beginner-friendly review of alignment research topics. Based on this, we also release and continually update the website www.alignmentsurvey.com, which features tutorials, collections of papers, blog posts, and other resources.

Citations: 0

Digital Privacy Under Attack: Challenges and Enablers

IF 16.6, Zone 1 (Computer Science)

ACM Computing Surveys Pub Date : 2025-10-15 DOI: 10.1145/3770853

Baobao Song, Shiva Raj Pokhrel, Mengyue Deng, Qiujun Lan, Robin Ram Doss, Tianqing Zhu, Gang Li

Citations: 0

Remote Respiration Measurement with RGB Cameras: A Review and Benchmark

IF 16.6, Zone 1 (Computer Science)

ACM Computing Surveys Pub Date : 2025-10-14 DOI: 10.1145/3771763

Giuseppe Boccignone, Vittorio Cuculo, Alessandro D'Amelio, Giuliano Grossi, Raffaella Lanzarotti, Sabrina Patania

Abstract: Remote measurement of respiratory behaviour through RGB cameras has gained significant attention in the last couple of decades. Unlike traditional contact-based methods that may cause discomfort and require specialised equipment, contactless physiological measurement techniques offer a non-invasive way to monitor vital signs. In this survey paper, we comprehensively review the literature and techniques related to estimating respiratory information from RGB cameras. We categorise the approaches into three main groups: methods utilising respiration-induced body movements, methods extracting respiratory information from blood volume pulse signals obtained via remote photoplethysmography, and deep learning-based techniques for direct respiratory signal extraction. To evaluate these approaches, we perform a comparative assessment using publicly available datasets. As a result, we uncover emerging trends while identifying strengths and weaknesses in the field. Our contributions include a detailed review of the literature, a benchmark of representative methods on multiple datasets, and the introduction of a new Python package called resPyre that implements the benchmarked approaches, making them accessible to the research community. This survey aims to promote reproducibility, facilitate further research, and guide the development of more accurate and practical methods for remote respiration measurement using RGB cameras.

Citations: 0

A Primer on Temporal Graph Learning

IF 16.6, Zone 1 (Computer Science)

ACM Computing Surveys Pub Date : 2025-10-13 DOI: 10.1145/3771693

Aniq Ur Rahman, Ahmed A. Elhag, Justin P. Coon

Citations: 0

Structure-Based Drug Design with Geometric Deep Learning: A Comprehensive Survey

IF 16.6, Zone 1 (Computer Science)

ACM Computing Surveys Pub Date : 2025-10-13 DOI: 10.1145/3769677

Zaixi Zhang, Jiaxian Yan, Yining Huang, Qi Liu, Enhong Chen, Mengdi Wang, Marinka Zitnik

Abstract: Structure-based drug design (SBDD) leverages the three-dimensional geometry of proteins to identify potential drug candidates. Traditional approaches, rooted in physicochemical modeling and domain expertise, are often resource-intensive. Recent advancements in geometric deep learning, which effectively integrate and process 3D geometric data, alongside breakthroughs in accurate protein structure predictions from tools like AlphaFold, have significantly propelled the field forward. This paper systematically reviews the state of the art in geometric deep learning for SBDD. We begin by outlining foundational tasks in SBDD, discussing prevalent 3D protein representations, and highlighting representative predictive and generative models. Next, we provide an in-depth review of key tasks, including binding site prediction, binding pose generation, de novo molecule generation, linker design, protein pocket generation, and binding affinity prediction. For each task, we present formal problem definitions, key methods, datasets, evaluation metrics, and performance benchmarks. Lastly, we explore current challenges and future opportunities in SBDD. Challenges include oversimplified problem formulations, limited out-of-distribution generalization, biosecurity concerns related to the misuse of structural data, insufficient evaluation metrics and large-scale benchmarks, and the need for experimental validation and enhanced model interpretability. Opportunities lie in integrating biomedical AI agents, leveraging multimodal datasets, developing comprehensive benchmarks, establishing criteria aligned with clinical outcomes, and designing foundation models to expand the scope of design tasks. We also curate https://github.com/zaixizhang/Awesome-SBDD, reflecting ongoing contributions and new datasets in SBDD.

Citations: 0

Video is Worth a Thousand Images: Exploring the Latest Trends in Long Video Generation

IF 16.6, Zone 1 (Computer Science)

ACM Computing Surveys Pub Date : 2025-10-10 DOI: 10.1145/3771724

Faraz Waseem, Muhammad Shahzad

Abstract: An image may convey a thousand words, but a video, composed of hundreds or thousands of image frames, tells a more intricate story. Despite significant progress in multimodal large language models (MLLMs), generating extended videos remains a formidable challenge. As of this writing, OpenAI’s Sora [1], the current state-of-the-art system, is still limited to producing videos of up to one minute in length. This limitation stems from the complexity of long video generation, which requires more than generative AI techniques for approximating density functions. Critical elements, such as planning, narrative construction, and spatiotemporal continuity, pose significant challenges. Integrating generative AI with a divide-and-conquer approach could improve scalability for longer videos while offering greater control. In this survey, we examine the current landscape of long video generation, covering foundational techniques such as GANs and diffusion models, video generation strategies, large-scale training datasets, quality metrics for evaluating long videos, and future research areas to address the limitations of existing video generation capabilities. We believe this survey will serve as a comprehensive foundation, offering extensive information to guide future advancements and research in the field of long video generation.

Citations: 0

Representation Learning in Complex Logical Query Answering on Knowledge Graphs: A Survey

IF 16.6, Zone 1 (Computer Science)

ACM Computing Surveys Pub Date : 2025-10-10 DOI: 10.1145/3771692

Chau D. M. Nguyen, Tim French, Michael Stewart, Melinda Hodkiewicz, Wei Liu

Citations: 0

A Comprehensive Survey of Transformers in Text Recognition: Techniques, Challenges, and Future Directions

IF 16.6, Zone 1 (Computer Science)

ACM Computing Surveys Pub Date : 2025-10-09 DOI: 10.1145/3771273

Ali Afkari-Fahandari, Elham Shabaninia, Fatemeh Asadi-Zeydabadi, Hossein Nezamabadi-Pour

Abstract: Optical character recognition is a rapidly evolving field within pattern recognition, enabling the automatic conversion of printed or handwritten text images into machine-readable formats. This technology plays a critical role across various sectors, including banking, healthcare, government, and education. While optical character recognition systems encompass multiple stages such as text detection, segmentation, and post-processing, this paper focuses on text recognition as a core and technically challenging component. In particular, we provide an in-depth review of recent advances driven by Transformer-based models, which have significantly advanced the state of the art. To contextualize these advancements, a detailed comparative analysis of Transformer-based techniques is presented against earlier deep learning approaches, highlighting their respective limitations and the improvements introduced by Transformers, including parallel sequence processing, global context modeling, better handling of long-range dependencies, and enhanced robustness to irregular or noisy text layouts. We also examine widely used benchmark datasets in the literature and provide a detailed discussion of the performance achieved by recent state-of-the-art methods. Finally, this survey outlines open research challenges and potential future directions. It aims to serve as a comprehensive reference for both novice and experienced researchers by summarizing the latest developments in text recognition, including architectures, datasets, evaluation metrics, and practical considerations in model performance trade-offs and deployment.

Citations: 0

A Survey on Proactive Deepfake Defense: Disruption and Watermarking

IF 16.6, Zone 1 (Computer Science)

ACM Computing Surveys Pub Date : 2025-10-08 DOI: 10.1145/3771296

Hong-Hanh Nguyen-Le, Van-Tuan Tran, Thuc Nguyen, Nhien-An Le-Khac

Abstract: The rapid proliferation of generative AI has led to unprecedented capabilities in synthesizing realistic deepfakes (DFs) across multiple modalities. This raises significant concerns regarding privacy, security, and copyright protection. Unlike passive detection approaches that operate after DFs have been created and distributed, proactive defense mechanisms aim to prevent the generation of malicious synthetic content at its source. This paper provides a comprehensive survey of current proactive DF defense strategies, including Disruption and Watermarking. Disruption approaches protect individuals’ data by introducing imperceptible perturbations that prevent unauthorized exploitation by generative models, while watermarking approaches embed verifiable messages into data or models to enable content authentication and attribution. We also analyze proactive approaches across various evaluation metrics (imperceptibility, protectability/detectability, transferability, traceability, and robustness), and examine their effectiveness in real-world settings. Furthermore, we review the evolution of DF generation techniques, highlighting their rapid development. Finally, we identify key challenges and promising future research directions to enhance proactive defense mechanisms.

Citations: 0