Authors: Xiaoting Fan, Long Sun, Zhong Zhang
Journal: Engineering Science and Technology, an International Journal (JESTECH), volume 64, Article 102029
Publication date: 2025-03-11
Publication type: Journal Article
DOI: 10.1016/j.jestch.2025.102029
URL: https://www.sciencedirect.com/science/article/pii/S2215098625000849
Impact factor: 5.1 (JCR Q1, Engineering, Multidisciplinary)
Depth-Induced Intra-to-Inter Transformer network for stereoscopic image retargeting
With the advancement of three-dimensional visual applications, stereoscopic image editing technologies have become widely used in both industry and entertainment. In this paper, we focus on a fundamental stereoscopic image content editing problem, i.e., stereoscopic image retargeting, which aims to adaptively transform stereoscopic images to a specific resolution with a prescribed aspect ratio. Because stereoscopic images carry additional binocular information between the left and right views, CNN-based stereoscopic image retargeting methods have obvious limitations in capturing long-range dependencies. To address these issues, we present a depth-induced intra-to-inter Transformer network (DITrans-Net) for stereoscopic image retargeting, which learns long-range dependencies within and across views through an intra-to-inter feature extraction module and aggregates the depth information of the left and right views through a depth-induced feature integration module. Specifically, the intra-to-inter feature extraction module first uses intra-to-inter Transformer blocks to extract long-range dependency information. The depth-induced feature integration module then employs a disparity attention learning mechanism to learn stereo correspondence and enhance the consistency of disparity variation. Finally, a hybrid loss function is applied to improve the stereoscopic image retargeting quality. Extensive experiments demonstrate that the proposed DITrans-Net achieves significant improvements and outperforms state-of-the-art methods both quantitatively and qualitatively on various benchmark datasets.
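To make the idea of learning stereo correspondence through attention more concrete, here is a minimal sketch of cross-view attention along a single image row. It is not the DITrans-Net implementation, whose details the abstract does not give. It assumes a rectified stereo pair (so matching pixels lie on the same row) and plain scaled dot-product attention; the function names `softmax` and `row_cross_attention` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def row_cross_attention(left_row, right_row):
    """Attend from each left-view pixel to every right-view pixel on the same row.

    left_row, right_row: lists of feature vectors (lists of floats), one per
    pixel along the row. Assumption: the pair is rectified, so candidate
    matches are restricted to the same row.

    Returns (attention, warped), where attention[i] is the weight
    distribution of left pixel i over right pixels, and warped[i] is the
    attention-weighted mix of right-view features for left pixel i.
    """
    dim = len(left_row[0])
    scale = 1.0 / math.sqrt(dim)  # standard scaled dot-product factor
    attention, warped = [], []
    for q in left_row:
        # Similarity between this left pixel and every right pixel.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in right_row]
        weights = softmax(scores)
        attention.append(weights)
        # Aggregate right-view features using the attention weights.
        warped.append(
            [sum(w * k[c] for w, k in zip(weights, right_row)) for c in range(dim)]
        )
    return attention, warped


# Toy example: the right row is the left row shifted by one pixel.
left = [[10.0, 0.0], [0.0, 10.0]]
right = [[0.0, 10.0], [10.0, 0.0]]
att, warped = row_cross_attention(left, right)
best_match = [row.index(max(row)) for row in att]
print(best_match)  # prints [1, 0]
```

In this toy case the position of the strongest weight for each left pixel gives its match in the right view, and the offset between the two positions plays the role of disparity. A learned model would apply the same kind of operation to feature maps produced by the network rather than to hand-written vectors.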
About the journal:
Engineering Science and Technology, an International Journal (JESTECH) (formerly Technology) is a peer-reviewed quarterly engineering journal that publishes high-quality theoretical and experimental papers of permanent interest, not previously published in journals, in the field of engineering and applied science. It aims to promote the theory and practice of technology and engineering. In addition to peer-reviewed original research papers, the Editorial Board welcomes original research reports, state-of-the-art reviews and communications in the broadly defined field of engineering science and technology.
The scope of JESTECH covers a wide spectrum of subjects, including:
-Electrical/Electronics and Computer Engineering (Biomedical Engineering and Instrumentation; Coding, Cryptography, and Information Protection; Communications, Networks, Mobile Computing and Distributed Systems; Compilers and Operating Systems; Computer Architecture, Parallel Processing, and Dependability; Computer Vision and Robotics; Control Theory; Electromagnetic Waves, Microwave Techniques and Antennas; Embedded Systems; Integrated Circuits, VLSI Design, Testing, and CAD; Microelectromechanical Systems; Microelectronics, and Electronic Devices and Circuits; Power, Energy and Energy Conversion Systems; Signal, Image, and Speech Processing)
-Mechanical and Civil Engineering (Automotive Technologies; Biomechanics; Construction Materials; Design and Manufacturing; Dynamics and Control; Energy Generation, Utilization, Conversion, and Storage; Fluid Mechanics and Hydraulics; Heat and Mass Transfer; Micro-Nano Sciences; Renewable and Sustainable Energy Technologies; Robotics and Mechatronics; Solid Mechanics and Structure; Thermal Sciences)
-Metallurgical and Materials Engineering (Advanced Materials Science; Biomaterials; Ceramic and Inorganic Materials; Electronic-Magnetic Materials; Energy and Environment; Materials Characterization; Metallurgy; Polymers and Nanocomposites)