Yuang Wang, Siyeop Yoon, Pengfei Jin, Matthew Tivnan, Sifan Song, Zhennong Chen, Rui Hu, Li Zhang, Quanzheng Li, Zhiqiang Chen, Dufan Wu
Pattern Recognition, Volume 165, Article 111627. Published 2025-04-04. DOI: 10.1016/j.patcog.2025.111627
Impact factor: 7.5. JCR: Q1 (Computer Science, Artificial Intelligence).
https://www.sciencedirect.com/science/article/pii/S0031320325002870
Implicit Image-to-Image Schrödinger Bridge for image restoration
Diffusion-based models have demonstrated remarkable effectiveness in image restoration tasks; however, their iterative denoising process, which starts from Gaussian noise, often leads to slow inference speeds. The Image-to-Image Schrödinger Bridge (I²SB) offers a promising alternative by initializing the generative process from corrupted images while leveraging training techniques from score-based diffusion models. In this paper, we introduce the Implicit Image-to-Image Schrödinger Bridge (I³SB) to further accelerate the generative process of I²SB. I³SB restructures the generative process into a non-Markovian framework by incorporating the initial corrupted image at each generative step, effectively preserving and utilizing its information. To enable direct use of pretrained I²SB models without additional training, we ensure consistency in marginal distributions. Extensive experiments across many image corruptions (including noise, low resolution, JPEG compression, and sparse sampling) and multiple image modalities (such as natural, human face, and medical images) demonstrate the acceleration benefits of I³SB. Compared to I²SB, I³SB achieves the same perceptual quality with fewer generative steps, while maintaining or improving fidelity to the ground truth.
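The idea of a non-Markovian step that reuses the corrupted image while keeping the marginal distributions unchanged can be illustrated with a toy sketch. This is not the paper's update rule: it assumes a unit Brownian bridge between a clean value x0 and a corrupted value x1, so that q(x_t | x0, x1) has mean (1 - t) x0 + t x1 and variance t (1 - t), and it uses a DDIM-style step. The names `bridge_coeffs`, `implicit_step`, `check_marginals`, and the noise-scale parameter `eta` are hypothetical, and the true x0 stands in for what would be a network prediction in practice.

```python
import numpy as np


def bridge_coeffs(t):
    """Weights on x0 and x1, and the std, of q(x_t | x0, x1).

    Assumed schedule: a unit Brownian bridge, chosen only for illustration.
    """
    return 1.0 - t, t, np.sqrt(t * (1.0 - t))


def implicit_step(x_t, x0_hat, x1, t, s, eta, rng):
    """One hypothetical non-Markovian step from time t to s, with 0 <= s < t < 1.

    The corrupted sample x1 enters every step explicitly.
    eta in [0, 1] scales the fresh noise; eta = 0 gives a deterministic step.
    """
    a_t, b_t, std_t = bridge_coeffs(t)
    a_s, b_s, std_s = bridge_coeffs(s)
    # Normalized residual noise carried by x_t, given x0_hat and x1.
    eps_hat = (x_t - a_t * x0_hat - b_t * x1) / std_t
    fresh = rng.standard_normal(np.shape(x_t))
    # Reused noise and fresh noise are weighted so the total variance is std_s**2.
    return (a_s * x0_hat + b_s * x1
            + std_s * np.sqrt(1.0 - eta**2) * eps_hat
            + std_s * eta * fresh)


def check_marginals(eta, n=200_000, seed=0):
    """Empirical mean and std of x_s after one step, using the true x0 as x0_hat."""
    rng = np.random.default_rng(seed)
    x0, x1, t, s = 2.0, -1.0, 0.8, 0.3
    a_t, b_t, std_t = bridge_coeffs(t)
    x_t = a_t * x0 + b_t * x1 + std_t * rng.standard_normal(n)
    x_s = implicit_step(x_t, x0, x1, t, s, eta, rng)
    return x_s.mean(), x_s.std()


if __name__ == "__main__":
    for eta in (0.0, 0.5, 1.0):
        mean, std = check_marginals(eta)
        print(f"eta={eta}: mean={mean:.3f}, std={std:.3f}")
```

Under these assumptions, every choice of `eta` should give an empirical mean close to 0.7 * 2 + 0.3 * (-1) = 1.1 and a standard deviation close to sqrt(0.21), which is the bridge marginal at s = 0.3. That invariance is the property that lets a model trained on the original marginals be reused with a different sampling step.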
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.