Authors: Hao Xin, M. Zhu
Venue: 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA)
Publication date: 2022-12-01
DOI: https://doi.org/10.1109/ICMLA55696.2022.00056
Score-based Image-to-Image Regression with Synchronized Diffusion
Image-to-image regression is an important computer vision task. In this paper, we propose a novel image-to-image regression model that follows the recent trend in generative modeling of employing Stochastic Differential Equations (SDEs) and score matching. We first apply diffusion processes to the regression data using designed SDEs, and then perform inference by gradually reversing these processes. In particular, our method uses synchronized diffusion, which applies diffusion to the input and response images simultaneously in order to stabilize both the diffusion and the subsequent parameter learning. Furthermore, we develop an effective prediction algorithm based on the Expectation-Maximization (EM) algorithm. We implement the proposed model as a conditional U-Net architecture with a pre-trained DenseNet encoder and refer to it as DenseScore. Our model can generate diverse outcomes for image colorization, and the proposed prediction algorithm achieves close to state-of-the-art performance on high-resolution monocular depth estimation.
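
The abstract does not give the form of the designed SDEs, so the following is only a minimal sketch of one possible reading of synchronized diffusion: the input and response images are perturbed at the same diffusion time. It assumes the standard variance-preserving (VP) SDE with a linear noise schedule, which is a swapped-in choice and not necessarily the authors' design. The names `vp_marginal`, `perturb`, `synchronized_diffuse`, the `beta_min`/`beta_max` defaults, the independent noise draws, and the flat-list toy "images" are all hypothetical.

```python
import math
import random

def vp_marginal(t, beta_min=0.1, beta_max=20.0):
    """Mean scale and std of the VP-SDE perturbation kernel at time t in [0, 1].

    Assumes a linear schedule beta(t) = beta_min + t * (beta_max - beta_min).
    """
    # Closed-form integral of beta(s) from 0 to t.
    integral = beta_min * t + 0.5 * (beta_max - beta_min) * t ** 2
    mean_scale = math.exp(-0.5 * integral)
    std = math.sqrt(1.0 - math.exp(-integral))
    return mean_scale, std

def perturb(pixels, mean_scale, std, rng):
    """Sample x_t = mean_scale * x_0 + std * noise, pixel by pixel."""
    return [mean_scale * p + std * rng.gauss(0.0, 1.0) for p in pixels]

def synchronized_diffuse(x0, y0, t, rng):
    """Perturb input x0 and response y0 at the SAME diffusion time t.

    Assumption: each image gets its own independent Gaussian noise.
    """
    mean_scale, std = vp_marginal(t)
    return perturb(x0, mean_scale, std, rng), perturb(y0, mean_scale, std, rng)

# Toy usage: two 16-pixel "images" with values in [-1, 1].
rng = random.Random(0)
x0 = [rng.uniform(-1, 1) for _ in range(16)]  # input image (e.g. grayscale)
y0 = [rng.uniform(-1, 1) for _ in range(16)]  # response image (e.g. depth)
x_t, y_t = synchronized_diffuse(x0, y0, t=0.5, rng=rng)
```

Under this assumed kernel both images share the same signal-to-noise ratio at every time step, which is one way the two processes could be kept in step. A score network would then be trained on such pairs, and inference would reverse the process.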