Video diffusion generation: comprehensive review and open problems
Authors: Wenping Ma, Xiaoting Yang, Licheng Jiao, Lingling Li, Xu Liu, Fang Liu, Puhua Chen, Yuting Yang, Mengru Ma, Long Sun, Ruohan Zhang, Xueli Geng, Yuwei Guo, Shuyuan Yang, Zhixi Feng
Journal: Artificial Intelligence Review, volume 58, issue 11 (published 2025-08-20)
DOI: 10.1007/s10462-025-11331-6
URL: https://link.springer.com/article/10.1007/s10462-025-11331-6
Video generation has become an increasingly important component of AI-generated content (AIGC), owing to its rich semantic expressiveness and growing application potential. Among the various generative paradigms, diffusion models have recently gained prominence due to their strong controllability, competitive visual quality, and compatibility with multimodal inputs. However, most existing surveys provide limited coverage of diffusion-based video generation and often lack systematic analysis and comprehensive comparisons. To address this gap, this paper presents a thorough and structured review of diffusion models for video generation. We first outline the theoretical foundations and core architectures of diffusion models, and then introduce the key design principles of representative methods for video generation. We propose a unified taxonomy that categorizes over two hundred methods and analyze their key characteristics, strengths, and limitations. In addition, we compare the performance of classical methods and summarize the datasets and evaluation metrics commonly used in this field to ease model benchmarking and selection. Finally, we discuss open problems and future research directions, aiming to provide a valuable reference for both academic research and practical development.
About the journal:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.