{"title":"Enhancing transparent object matting using predicted definite foreground and background","authors":"Yihui Liang, Qian Fu, Zou Kun, Guisong Liu, Han Huang","doi":"10.1109/tcsvt.2024.3452512","DOIUrl":"https://doi.org/10.1109/tcsvt.2024.3452512","url":null,"abstract":"","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"78 1","pages":""},"PeriodicalIF":8.4,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-Scene Hyperspectral Image Classification With Consistency-Aware Customized Learning","authors":"Kexin Ding, Ting Lu, Wei Fu, Leyuan Fang","doi":"10.1109/tcsvt.2024.3452135","DOIUrl":"https://doi.org/10.1109/tcsvt.2024.3452135","url":null,"abstract":"","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"12 1","pages":""},"PeriodicalIF":8.4,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Generative Steganography Based on Image Mapping","authors":"Qinghua Zhang;Fangjun Huang","doi":"10.1109/TCSVT.2024.3451620","DOIUrl":"10.1109/TCSVT.2024.3451620","url":null,"abstract":"Coverless steganography requires no modification of the cover image and can effectively resist steganalysis, which has received widespread attention from researchers in recent years. However, existing coverless image steganographic methods are achieved by constructing a mapping between the secret information and images in a known dataset. This image dataset needs to be sent to the receiver, which consumes substantial resources and poses a risk of information leakage. In addition, existing methods cannot achieve high-accuracy extraction when facing various attacks. To address the aforementioned issues, we propose a robust generative steganography based on image mapping (GSIM). This method establishes prompts based on the topic and quantity requirements first and then generate the candidate image database according to the prompts, which can be independently generated by both the sender and receiver without the need for transmission. In order to improve the robustness of the algorithm, our proposed GSIM utilizes prompts and fractional-order Chebyshev-Fourier moments (FrCHFMs) to construct the mapping between the generated images and the predefined binary sequences, as well as uses speeded-up robust features (SURFs) as auxiliary features in the information extraction phase. The experimental results show that GSIM is superior to existing coverless image steganographic methods in terms of capacity, security, and robustness.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"34 12","pages":"13543-13555"},"PeriodicalIF":8.3,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diffusion Patch Attack With Spatial–Temporal Cross-Evolution for Video Recognition","authors":"Jian Yang;Zhiyu Guan;Jun Li;Zhiping Shi;Xianglong Liu","doi":"10.1109/TCSVT.2024.3452475","DOIUrl":"10.1109/TCSVT.2024.3452475","url":null,"abstract":"Deep neural networks (DNNs) have demonstrated excellent performance across various domains. However, recent studies have shown that deep neural networks are vulnerable to adversarial examples, including DNN-based video action recognition models. While much of the existing research on adversarial attacks against video models focuses on perturbation-based attacks, there is limited research on patch-based black-box attacks. Existing patch-based attack algorithms suffer from the problem of a large search space of optimization algorithms and use patches with simple content, leading to suboptimal attack performance or requiring a large number of queries. To address these challenges, we propose the “Diffusion Patch Attack (DPA) with Spatial-Temporal Cross-Evolution (STCE) for Video Recognition,” a novel approach that integrates the excellent properties of the diffusion model into video black-box adversarial attacks for the first time. This integration significantly narrows the parameter search space while enhancing the adversarial content of patches. Moreover, we introduce the spatial-temporal cross-evolutionary algorithm to adapt to the narrowed search space. Specifically, we separate the spatial and temporal parameters and then employ an alternate evolutionary strategy for each parameter type. Extensive experiments conducted on three widely used video action recognition models (C3D, NL, and TPN) and two benchmark datasets (UCF-101 and HMDB-51) demonstrate the superior performance of our approach compared to other state-of-the-art black-box patch attack algorithms.","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"34 12","pages":"13190-13200"},"PeriodicalIF":8.3,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Vision-Language Foundation Model for Novel Object Captioning","authors":"Jianjie Luo, Yehao Li, Yingwei Pan, Ting Yao, Jianlin Feng, Hongyang Chao, Tao Mei","doi":"10.1109/tcsvt.2024.3452437","DOIUrl":"https://doi.org/10.1109/tcsvt.2024.3452437","url":null,"abstract":"","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"42 1","pages":""},"PeriodicalIF":8.4,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CVT-Track: Concentrating on Valid Tokens for One-Stream Tracking","authors":"Jianan Li, Xiaoying Yuan, Haolin Qin, Ying Wang, Xincong Liu, Tingfa Xu","doi":"10.1109/tcsvt.2024.3452231","DOIUrl":"https://doi.org/10.1109/tcsvt.2024.3452231","url":null,"abstract":"","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"1 1","pages":""},"PeriodicalIF":8.4,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint lesion detection and classification of breast ultrasound video via a clinical knowledge-aware framework","authors":"Minglei Li, Wushuang Gong, Pengfei Yan, Xiang Li, Yuchen Jiang, Hao Luo, Hang Zhou, Shen Yin","doi":"10.1109/tcsvt.2024.3452497","DOIUrl":"https://doi.org/10.1109/tcsvt.2024.3452497","url":null,"abstract":"","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"27 1","pages":""},"PeriodicalIF":8.4,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142223514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MoBox: Enhancing Video Object Segmentation with Motion-Augmented Box Supervision","authors":"Xiaomin Li, Qinghe Wang, Dezhuang Li, Mengmeng Ge, Xu Jia, You He, Huchuan Lu","doi":"10.1109/tcsvt.2024.3451981","DOIUrl":"https://doi.org/10.1109/tcsvt.2024.3451981","url":null,"abstract":"","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"107 1","pages":""},"PeriodicalIF":8.4,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Globally Deformable Information Selection Transformer for Underwater Image Enhancement","authors":"Junbin Zhuang, Yan Zheng, Baolong Guo, Yunyi Yan","doi":"10.1109/tcsvt.2024.3451553","DOIUrl":"https://doi.org/10.1109/tcsvt.2024.3451553","url":null,"abstract":"","PeriodicalId":13082,"journal":{"name":"IEEE Transactions on Circuits and Systems for Video Technology","volume":"78 1","pages":""},"PeriodicalIF":8.4,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142177192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}