中国图象图形学报 | Pub Date: 2023-01-01 | DOI: 10.11834/jig.220459
Ji Qingbo, Chen Kuicheng, Hou Changbo, Li Ziqi, Qi Yufei
{"title":"Infrared target tracking algorithm based on attention mechanism enhancement and target model update","authors":"Ji Qingbo, Chen Kuicheng, Hou Changbo, Li Ziqi, Qi Yufei","doi":"10.11834/jig.220459","DOIUrl":"https://doi.org/10.11834/jig.220459","url":null,"abstract":"目的 多数以深度学习为基础的红外目标跟踪方法在对比度弱、噪声多的红外场景下,缺少对目标细节信息的利用,而且当跟踪场景中有相似目标且背景杂乱时,大部分跟踪器无法对跟踪的目标进行有效的更新,导致长期跟踪时鲁棒性较差。为解决这些问题,提出一种基于注意力和目标模型自适应更新的红外目标跟踪算法。方法 首先以无锚框算法为基础,加入针对红外跟踪场景设计的快速注意力增强模块以并行处理红外图像,在不损失原信息的前提下提高红外目标与背景的差异性并增强目标的细节信息,然后将提取的特征融合到主干网络的中间层,最后利用目标模型自适应更新网络,学习红外目标的特征变化趋势,同时对目标的中高层特征进行动态更新。结果 本文方法在 4 个红外目标跟踪评估基准上与其他先进算法进行了比较,在 LSOTB-TIR(large-scale thermalinfrared object tracking benchmark)数据集上的精度为 79.0%,归一化精度为 71.5%,成功率为 66.2%,较第 2 名在精度和成功率上分别高出 4.0%和 4.6%;在 PTB-TIR(thermal infrared pedestrian tracking benchmark)数据集上的精度为85.1%,成功率为 66.9%,较第 2 名分别高出 1.3% 和 3.6%;在 VOT-TIR2015(thermal infrared visual object tracking)和VOT-TIR2017 数据集上的期望平均重叠与精确度分别为 0.344、0.73 和 0.276、0.71,本文算法在前 3 个数据集的测评结果均达到最优。同时,在 LSOTB-TIR 数据集上的消融实验结果显示,本文方法对基线跟踪器有着明显的增益作用。结论 本文算法提高了对红外目标特征的捕捉能力,解决了红外目标跟踪易受干扰的问题,能够提升红外目标长期跟踪的精度和成功率。;Objective Most target tracking algorithms are designed based on visible sight scenes.However, in some cases, infrared target tracking has advantages that visible light does not have.Infrared equipment uses the radiation of an object itself to image and does not require additional lighting sources.It can display the target in weak light or dark scenes and has a certain penetration ability.However, infrared images have defects, such as unclear boundaries between targets and backgrounds, blurred images, and cluttered backgrounds.Moreover, some infrared dataset images are rough, negatively impacting the training of data-driven-based deep learning algorithms to a certain extent.Infrared tracking algorithms can be divided into traditional methods and deep learning methods.Traditional methods generally take the idea of correlation filtering as the core.Deep learning methods are mainly divided into the method of a neural network providing target features for correlation filters and the method of calculating the similarity of the image area with the framework of the Siamese network.The feature extraction ability of traditional methods for infrared targets is far inferior to that of deep learning methods.Moreover, the filters trained online cannot adapt to fast-moving or blurred targets, resulting in poor tracking accuracy in scenes with complex backgrounds.At present, most deep-learning-based infrared target tracking methods still lack the use of detailed information on infrared targets in infrared scenes with weak contrast and noise.Most trackers cannot effectively update the tracked target when the tracking scene has similar targets and cluttered background.This scenario results in poor robustness in long-term tracking.Therefore, an infrared target tracking algorithm based on attention and template adaptive update is proposed to solve the problems mentioned.Method The Siamese network tracking algorithm takes the target in the first frame as the template and performs similarity calculation on the search area of the subsequent frames to obtain the position of the target with the maximum response.The method has a simple structure and high tracking efficiency.","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135601335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
中国图象图形学报 | Pub Date: 2023-01-01 | DOI: 10.11834/jig.220562
Zeng Qingwang, Dong Zhangyu, Yang Xuezhi, Chong Fating
{"title":"Interferometric phase denoising combining global context and fused attention","authors":"Zeng Qingwang, Dong Zhangyu, Yang Xuezhi, Chong Fating","doi":"10.11834/jig.220562","DOIUrl":"https://doi.org/10.11834/jig.220562","url":null,"abstract":"目的 干涉相位去噪是合成孔径雷达干涉测量(interferometric synthetic aperture radar,InSAR)技术中的关键环节,其效果对测量精度具有重要影响。针对现有的干涉相位去噪方法大多关注局部特征以及在特征提取方面的局限性,同时为了平衡去噪和结构保持两者之间的关系,提出了一种结合全局上下文与融合注意力的相位去噪网络 GCFA-PDNet (global context and fused attention phase denoising network)。方法 将干涉相位分离为实部和虚部依次输入到网络,先从噪声相位中提取浅层特征,再将其映射到由全局上下文提取模块和融合注意力模块组成的特征增强模块,最后通过全局残差学习生成去噪图像。全局上下文提取模块能提取全局上下文信息,具有非局部方法的优势;融合注意力模块既强调关键特征,又能高效提取隐藏在复杂背景中的噪声信息。结果 所提出的方法与对比方法中性能最优者相比,在模拟数据结果的平均峰值信噪比(peak signal to noise ratio,PSNR)和结构相似性(struc-tural similarity,SSIM)指标分别提高了 5.72% 和 2.94%,在真实数据结果的平均残差点减少百分比(percentage ofresidual point reduction,PRR)和相位标准偏差(phase standard deviation,PSD)指标分别提高了 2.01% 和 3.57%。结合定性与定量分析,所提出的方法优于其他 5 种不同类型的相位去噪方法。结论 提出的去噪网络较其他方法具有更强大的特征提取能力,此外由于关注全局上下文信息和强调关键特征,网络能够在增强去噪能力的同时保持原始相位细节。;Objective Interferometric phase noise is introduced by three types of inherent factors:1)system noise, such as thermal noise and synthetic aperture radar(SAR)speckle noise;2)decoherence problems, including baseline, temporal, and spatial decoherence;3)signal processing errors, such as misregistration.The existence of noise increases the difficulty of phase unwrapping and even causes the process to fail, thereby seriously interfering with the final interferometric result.Therefore, interferometric phase denoising is a key link in interferometric SAR(InSAR)technology.Its effect has an important influence on the accuracy of measurement results.The existing interferometric phase denoising algorithms still have many defects.First is the insufficient ability to capture global contextual information.Some algorithms ignore global context information or only focus on local context information derived from a few pixels.They also lack global context information.This feature is manifested as unstable detail preservation ability in denoising results.Second, many researchers only pay attention to the influence of the spatial dimension or channel dimension of the image on the denoising result to improve the performance of denoising networks.However, they do not use spatial and channel dimensions in combination.Third, the high-level features extracted from the deep layers of the convolutional neural network have rich semantic information and ambiguous spatial details.In comparison, the low-level features extracted from the shallow layers of the network contain considerable pixel-level noise information.However, these features are isolated from one another;thus, they cannot be fully used.Method Most of the existing interferometric phase denoising methods focus on local features, and they have many limitations in feature extraction.A phase denoising network called GCFA-PDNet is proposed to solve these problems while balancing the relationship between denoising and structure preservation.This proposed phase denoising network combines global context and fused attention.The method separates the interference phase into real and imaginary parts and inputs them into the network.First, the shallow features are extracted from th","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135601338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
中国图象图形学报 | Pub Date: 2023-01-01 | DOI: 10.11834/jig.230536
Yan Hao, Liu Yuliang, Jin Lianwen, Bai Xiang
{"title":"The development,application,and future of LLM similar to ChatGPT","authors":"Yan Hao, Liu Yuliang, Jin Lianwen, Bai Xiang","doi":"10.11834/jig.230536","DOIUrl":"https://doi.org/10.11834/jig.230536","url":null,"abstract":"生成式人工智能技术自 ChatGPT 发布以来,不断突破瓶颈,吸引了资本规模投入、多领域革命和政府重点关注。本文首先分析了大模型的发展动态、应用现状和前景,然后从以下 3 个方面对大模型相关技术进行了简要介绍:1)概述了大模型相关构造技术,包括构造流程、研究现状和优化技术;2)总结了 3 类当前主流图像-文本的大模型多模态技术;3)介绍了根据评估方式不同而划分的 3 类大模型评估基准。参数优化与数据集构建是大模型产品普及与技术迭代的核心问题;多模态能力是大模型重要发展方向之一;设立评估基准是比较与约束大模型的关键方法。此外,本文还讨论了现有相关技术面临的挑战与未来可能的发展方向。现阶段的大模型产品已有强大的理解能力和创造能力,在教育、医疗和金融等领域已展现出广阔的应用前景。但同时,它们也存在训练部署困难、专业知识不足和安全隐患等问题。因此,完善参数优化、优质数据集构建、多模态等技术,并建立统一、全面、便捷的评估基准,将成为大模型突破现有局限的关键。;Generative artificial intelligence(AI)technology has achieved remarkable breakthroughs and advances in its intelligence level since the release of ChatGPT several months ago, especially in terms of its scope, automation, and intelligence.The rising popularity of generative AI attracts capital inflows and promotes the innovation of various fields.Moreover, governments worldwide pay considerable attention to generative AI and hold different attitudes toward it.The US government maintains a relatively relaxed attitude to stay ahead in the global technological arena, while European countries are conservative and are concerned about data privacy in large language models(LLMs).The Chinese government attaches great importance to AI and LLMs but also emphasizes the regulatory issues.With the growing influence of ChatGPT and its competitors and the rapid development of generative AI technology, conducting a deep analysis of them becomes necessary.This paper first provides an in-depth analysis of the development, application, and prospects of generative AI.Various types of LLMs have emerged as a series of remarkable technological products that have demonstrated versatile capabilities across multiple domains, such as education, medicine, finance, law, programming, and paper writing.These models are usually fine-tuned on the basis of general LLMs, with the aim of endowing the large models with additional domainspecific knowledge and enhanced adaptability to a specific domain.LLMs(e.g., GPT-4)have achieved rapid improvements in the past few months in terms of professional knowledge, reasoning, coding, credibility, security, transferability, and multimodality.Then, the technical contribution of generative AI technology is briefly introduced from four aspects:1) we review the related work on LLMs, such as GPT-4, PaLM2, ERNIE Bot, and their construction pipeline, which involves the training of base and assistant models.The base models store a large amount of linguistic knowledge, while the assistant models acquire stronger comprehension and generation capabilities after a series of fine-tuning.2)We outline a series of public LLMs based on LLaMA, a framework for building lightweight and memory-efficient LLMs, including Alpaca, Vicuna, Koala, and Baize, as well as the key technologies for building LLMs with low memory and computation requirements, consisting of low-rank adaptation, Self-instruct, and automatic prompt engineer.3)We summarize three types of existing mainstream image -text multimodal techniques:training additional adaptation la","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"123 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135650157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
中国图象图形学报 | Pub Date: 2023-01-01 | DOI: 10.11834/jig.220671
Zhang Lei, Chen Wen, Wang Yuehuan
{"title":"Key sub-region feature fusion network for fine-grained ship detection and recognition in remote sensing images","authors":"Zhang Lei, Chen Wen, Wang Yuehuan","doi":"10.11834/jig.220671","DOIUrl":"https://doi.org/10.11834/jig.220671","url":null,"abstract":"目的 遥感图像中的舰船目标细粒度检测与识别在港口海域监视以及情报搜集等应用中有很高的实际应用价值,但遥感图像中不同种类的舰船目标整体颜色、形状与纹理特征相近,分辨力不足,导致舰船细粒度识别困难。针对该问题,提出了一种端到端的基于关键子区域特征的舰船细粒度检测与识别方法。方法 为了获得更适于目标细粒度识别的特征,提出多层次特征融合识别网络,按照整体、局部子区域两个层次从检测网络得到的候选目标区域中提取特征。然后结合候选目标中所有子区域的信息计算每个子区域的判别性显著度,对含有判别性组件的关键子区域进行挖掘。最后基于判别性显著度将子区域特征与整体特征进行自适应融合,形成表征能力更强的特征,对舰船目标进行细粒度识别。整个检测与识别网络采用端到端一体化设计,所有候选目标特征提取过程只需要经过一次骨干网络的计算,提高了计算效率。结果 在公开的带有细粒度类别标签的 HRSC2016(high resolu-tion ship collection)数据集 L3 任务上,本文方法平均准确率为 77.3%,相较于不采用多层次特征融合识别网络提升了 6.3%;在自建的包含 45 类舰船目标的 FGSAID(fine-grained ships in aerial images dataset)数据集上,本文方法平均准确率为 71.5%。结论 本文方法有效挖掘并融合了含有判别性组件的子区域的特征,解决了目标整体特征分辨力不足导致的细粒度目标识别困难问题,相较于现有的遥感图像舰船目标检测与识别算法准确性有明显提升。;Objective The ocean has great economic and military value.The development of human society increases the impact of ocean activities on the development of a country.The sea is an important carrier of marine activities.Thus, the recognition and monitoring of ship targets in key sea areas through remote sensing images are crucial to the national defense and development of the economy.Fine-grained ship detection and recognition in high-resolution remote sensing images refer to the identification of specific types of ships based on ship detection.A precise and detailed classification is valuable in practical application fields, such as sea surveillance and intelligence gathering.Instead of coarse-grained classification categories, such as warcraft and merchant ships, specific ship types, such as Arleigh Burke-class destroyer, Nimitz-class aircraft carrier, container, and car carrier, are necessary.However, the overall color, shape, and texture of different types of ship targets are similar.The structures of ships belong to different types, but their uses are similar.Moreover, the coating color of military ships is monotonous.These characteristics complicate the classification of these targets.The existing ship detectors are designed to focus on locating targets.The design of the classification branch of these detectors is relatively simple.They only use the features of whole targets for classification, significantly decreasing the performance in the fine-grained labeled datasets.The existing ship classification methods, which mainly classify targets on the pre-cropped image patches, are separated from the detection process.This approach is unsatisfactory for practical applications for two reasons:1)the whole backbone of these methods based on neural networks must be executed on every proposal to extract features.The remote sensing images of the harbor usually include several ships;thus, the computation cost increases sharply.2)The detection and classification networks are optimized separately, and the parameters of both networks are optimized to the best.The whole process cannot obtain the optimal solution because the locations of proposals obtained by detection methods vary with the pre-cropped image patches.utilize prior knowledge of ships and propose the key sub-region feature fusion network(KSFFN), whi","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135650158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
中国图象图形学报 | Pub Date: 2023-01-01 | DOI: 10.11834/jig.220026
Zhang Yao, Lu Huanzhang, Wang Jue, Zhang Luping, Hu Moufa
{"title":"Short-term memory and CenterTrack based vehicle-related multi-target tracking method","authors":"Zhang Yao, Lu Huanzhang, Wang Jue, Zhang Luping, Hu Moufa","doi":"10.11834/jig.220026","DOIUrl":"https://doi.org/10.11834/jig.220026","url":null,"abstract":"目的 车辆多目标跟踪是智能交通领域关键技术,其性能对车辆轨迹分析和异常行为鉴别有显著影响。然而,车辆多目标跟踪常受外部光照、道路环境因素影响,车辆远近尺度变化以及相互遮挡等干扰,导致远处车辆漏检或车辆身份切换(ID switch,IDs)问题。本文提出短时记忆与CenterTrack的车辆多目标跟踪,提升车辆多目标跟踪准确度(multiple object tracking accuracy,MOTA),改善算法的适应性。方法 利用小样本扩增增加远处小目标车辆训练样本数;通过增加的样本重新训练CenterTrack确定车辆位置及车辆在相邻帧之间的中心位移量;当待关联轨迹与检测目标匹配失败时通过轨迹运动信息预测将来的位置;利用短时记忆将待关联轨迹按丢失时间长短分级与待匹配检测关联以减少跟踪车辆IDs。结果 在交通监控车辆多目标跟踪数据集UA-DETRAC (University at Albany detection and tracking)构建的5个测试序列数据中,本文方法在维持CenterTrack优势的同时,对其表现不佳的场景获得近30%的提升,与YOLOv4-DeepSort(you only look once—simple online and realtime tracking with deep association metric)相比,4种场景均获得近10%的提升,效果显著。Sherbrooke数据集的测试结果,本文方法同样获得了性能提升。结论 本文扩增了远处小目标车辆训练样本,缓解了远处小目标与近处大目标存在的样本不均衡,提高了算法对远处小目标车辆的检测能力,同时短时记忆维持关联失败的轨迹运动信息并分级匹配检测目标,降低了算法对跟踪车辆的IDs,综合提高了MOTA。;Objective The task of multi-object tracking is often focused on estimating the number,location or other related properties of objects in the scene. Specifically,it is required to be estimated accurately and consistently over a period of time. Vehicle-related multi-target tracking can be as a key technique for such domain like intelligent transportation,and its performance has a significant impact on vehicle trajectory analysis and abnormal behavior identification to some extent. Vehicle-related multi-target tracking is also recognized as a key branch of multi-target tracking and a potential technique for autonomous driving and intelligent traffic surveillance systems. For vehicle-related multi-target tracking,temporal-based motion status of vehicles in traffic scenes can be automatically obtained,which is beneficial to analyze traffic conditions and implement decisions-making quickly for transportation administrations,as well as the automatic driving system. However,to resolve missed detection of distant vehicles or vehicle ID switch(IDs) problems,such factors are often to be dealt with in relevance to external illumination,road environment factors,changes in the scale of the vehicle near and far,and mutual occlusion. We develop an integrated short-term memory and CenterTrack ability to improve the vehicle multi-target tracking accuracy(multiple object tracking accuracy(MOTA)),and its adaptability of the algorithm can be optimized further. Method From the analysis of a large number of traffic monitoring video data,it can be seen the reasons for the unbalanced samples in the training samples. On the one hand,due to the fast speed of the captured vehicle target,the identified distant small target vehicle can be preserved temperorily,and it lacks of more consistent frames. On the other hand,the amount of apparent feature information is lower derived from small target vehicle itself,and the amount of neural networkextracted feature information is disappeared quickly many times. The relative number of distant small targets in the field of view is relatively small. 
After downsampling as a training sample,the feature quantity is disappeared very fast,resulting in an extensive reduction in the number of effectiv","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135057030","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
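The tiered re-association step can be made concrete with a small sketch. Assumptions to flag: the constant-velocity prediction, the greedy nearest-center matching, and the `max_dist`/`max_lost` thresholds are all placeholders for whatever the paper actually uses; only the overall scheme (predict lost tracks forward, match fresher tracks first) comes from the abstract.

```python
from dataclasses import dataclass

@dataclass
class Track:
    track_id: int
    center: tuple          # (x, y) last known position
    velocity: tuple        # (dx, dy) per frame, from CenterTrack-style offsets
    lost_frames: int = 0   # how long the track has gone unmatched

    def predict(self):
        """Constant-velocity forecast of where the vehicle should be now."""
        x, y = self.center
        dx, dy = self.velocity
        return (x + dx * (self.lost_frames + 1), y + dy * (self.lost_frames + 1))

def associate_short_term(tracks, detections, max_dist=50.0, max_lost=30):
    """Tiered matching: recently lost tracks get priority over tracks
    lost for longer, so a briefly occluded vehicle reclaims its ID
    before a stale track can steal the detection."""
    unmatched = set(range(len(detections)))
    matches = []
    for track in sorted(tracks, key=lambda t: t.lost_frames):  # freshest first
        if track.lost_frames > max_lost:
            continue                       # short-term memory expired
        px, py = track.predict()
        best, best_d = None, max_dist
        for j in unmatched:
            dx, dy = detections[j][0] - px, detections[j][1] - py
            d = (dx * dx + dy * dy) ** 0.5
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            unmatched.discard(best)
            matches.append((track.track_id, best))
    return matches, unmatched

tracks = [Track(1, (100, 50), (4, 0), lost_frames=2), Track(2, (300, 80), (-3, 1))]
matches, leftover = associate_short_term(tracks, [(112, 50), (297, 81)])
print(matches)  # [(2, 1), (1, 0)]
```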
{"title":"Binocular rivalry-based stereoscopic images quality assessment relevant to its asymmetric and distorted contexts","authors":"Tang Yiling, Jiang Shunliang, Xu Shaoping, Xiao Jian, Chen Xiaojun","doi":"10.11834/jig.220309","DOIUrl":"https://doi.org/10.11834/jig.220309","url":null,"abstract":"目的 现有方法存在特征提取时间过长、非对称失真图像预测准确性不高的问题,同时少有工作对非对称失真与对称失真立体图像的分类进行研究,为此提出了基于双目竞争的非对称失真立体图像质量评价方法。方法 依据双目竞争的视觉现象,利用非对称失真立体图像两个视点的图像质量衰减程度的不同,生成单目图像特征的融合系数,融合从左右视点图像中提取的灰度空间特征与HSV (hue-saturation-value)彩色空间特征。同时,量化两个视点图像在结构、信息量和质量衰减程度等多方面的差异,获得双目差异特征。并且将双目融合特征与双目差异特征级联为一个描述能力更强的立体图像质量感知特征向量,训练基于支持向量回归的特征—质量映射模型。此外,还利用双目差异特征训练基于支持向量分类模型的对称失真与非对称失真立体图像分类模型。结果 本文提出的质量预测模型在4个数据库上的SROCC (Spearman rank order correlation coefficient)和PLCC (Pearson linear correlation coefficient)均达到0.95以上,在3个非对称失真数据库上的均方根误差(root of mean square error,RMSE)取值均优于对比算法。在LIVE-II(LIVE 3D image quality database phase II)、IVC-I(Waterloo-IVC 3D image qualityassessment database phase I)和IVC-II (Waterloo-IVC 3D image quality assessment database phase II)这3个非对称失真立体图像测试数据库上的失真类型分类测试中,对称失真立体图像的分类准确率分别为89.91%、94.76%和98.97%,非对称失真立体图像的分类准确率分别为95.46%,92.64%和96.22%。结论 本文方法依据双目竞争的视觉现象融合左右视点图像的质量感知特征用于立体图像质量预测,能够提升非对称失真立体图像的评价准确性和鲁棒性。所提取双目差异性特征还能够用于将对称失真与非对称失真立体图像进行有效分类,分类准确性高。;Objective Computer vision-related stereoscopic image quality assessment(SIQA) is focused on recently. It is essential for parameter setting and system optimizing for such domains of multiple stereoscopic image applications like image storage,compression,transmission,and display. Stereoscopic images can be segmented into two sorts of distorted images:symmetrically and asymmetrically distorted,in terms of the degree of degradation between the left and right views. For symmetric-based distorted stereoscopic images,the distortion type and degree occurred in the left and right views are basically in consistency. Early SIQA methods were effective in evaluating symmetrically distorted images by averaging scores or features derived from the two views. However,in practice,the stereoscopic images are often asymmetrically distorted,where the distortion type and level of the two views are different. Simply averaging the quality values of the two views cannot accurately simulate the binocular fusion process and the binocular rivalry phenomena in relevance to the human visual system. Consequently,the evaluation accuracy of these methods will be down to severe lower when the quality of asymmetrically distorted stereoscopic images is estimated. Previous studies have shown that when the left and right views of a stereoscopic image exhibit varying levels or types of distortion,binocular rivalry is primarily driven by one of the views. Specially,in the process of evaluating the quality of a stereoscopic image,the visual quality of one view has a greater impact on the stereopair quality evaluation than the other view. To address this issue,some methods have simulated the binocular rivalry phenomenon in human visual system,and used a weighted average method to fuse the visual information in the two views of stereo-pairs as well. However,existing methods are still challenged for its lower prediction accuracy of asymmetrically distorted images,and its feature extraction process is also time-consuming. 
To optimize the evaluation accuracy of asymmet","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135102863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
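To make the rivalry-driven weighting concrete, here is a minimal sketch under loose assumptions: plain gradient energy stands in for the paper's (unspecified here) per-view degradation measure, and a simple absolute difference stands in for its binocular difference features. Function names and the example images are hypothetical.

```python
import numpy as np

def rivalry_weights(left: np.ndarray, right: np.ndarray, eps: float = 1e-8):
    """Illustrative fusion coefficients: the view with more surviving
    local detail (gradient energy as a crude degradation proxy)
    dominates the binocular combination."""
    def energy(img):
        gy, gx = np.gradient(img.astype(np.float64))
        return np.mean(gx * gx + gy * gy)
    e_l, e_r = energy(left), energy(right)
    w_l = (e_l + eps) / (e_l + e_r + 2 * eps)
    return w_l, 1.0 - w_l

def fuse_features(feat_l: np.ndarray, feat_r: np.ndarray, w_l: float, w_r: float):
    """Weighted monocular-feature fusion plus a simple difference cue,
    concatenated into one quality-aware vector."""
    fused = w_l * feat_l + w_r * feat_r
    difference = np.abs(feat_l - feat_r)   # crude binocular difference feature
    return np.concatenate([fused, difference])

left = np.random.rand(64, 64)
right = left * 0.5 + 0.25                  # hypothetical flatter, more degraded right view
w_l, w_r = rivalry_weights(left, right)
vec = fuse_features(np.random.rand(32), np.random.rand(32), w_l, w_r)
print(round(w_l, 2), vec.shape)            # ~0.8: left view dominates; (64,)
```

The resulting vector would then feed a regressor (the paper uses support vector regression) to map features to a quality score.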
中国图象图形学报 | Pub Date: 2023-01-01 | DOI: 10.11834/jig.230035
Huafeng Liu, Jingjing Chen, Lin Liang, Bingkun Bao, Zechao Li, Jiaying Liu, Liqiang Nie
{"title":"Cross-modal representation learning and generation","authors":"Huafeng Liu, Jingjing Chen, Lin Liang, Bingkun Bao, Zechao Li, Jiaying Liu, Liqiang Nie","doi":"10.11834/jig.230035","DOIUrl":"https://doi.org/10.11834/jig.230035","url":null,"abstract":": Nowadays , with the booming of multimedia data , the character of multi - source and multi - modality of data has become a challenging problem in multimedia research. Its representation and generation can be as two key factors in cross - modal learning research. Cross - modal representation studies feature learning and information integration methods using","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74315111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
中国图象图形学报 | Pub Date: 2023-01-01 | DOI: 10.11834/jig.220894
Yang Chen, Du Jun, Xue Mobai, Jianshu Zhang
{"title":"An encoder-decoder based generation model for online handwritten mathematical expressions","authors":"Yang Chen, Du Jun, Xue Mobai, Jianshu Zhang","doi":"10.11834/jig.220894","DOIUrl":"https://doi.org/10.11834/jig.220894","url":null,"abstract":"","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"7 3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72377733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
中国图象图形学报 | Pub Date: 2023-01-01 | DOI: 10.11834/jig.220284
Xing Suxia, Ju Zihan, Liu Zijiao, Yu Wang, Fan Fuqiang
{"title":"Multi-label classification of chest X-ray images with pre-trained vision Transformer model","authors":"Xing Suxia, Ju Zihan, Liu Zijiao, Yu Wang, Fan Fuqiang","doi":"10.11834/jig.220284","DOIUrl":"https://doi.org/10.11834/jig.220284","url":null,"abstract":"","PeriodicalId":36336,"journal":{"name":"中国图象图形学报","volume":"20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81572838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}