Title: Multi-Models from Computer Vision to Natural Language Processing for Cheapfakes Detection
Authors: Thanh-Son Nguyen, Minh-Triet Tran
DOI: 10.1109/ICMEW59549.2023.00023 (https://doi.org/10.1109/ICMEW59549.2023.00023)
Published in: 2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)
Publication date: 2023-07-01
Citations: 0
Abstract
Cheapfakes can compromise the integrity of information and erode trust in multimedia content, making their detection critical. Identifying out-of-context misuse of media is essential to prevent the spread of misinformation and to ensure that news and information are presented accurately and ethically. In this paper, we focus on Task 1 of the Grand Challenge on Detecting Cheapfakes at ICME 2023, which involves classifying triplets, each consisting of an image and two captions, as out-of-context or not. We propose a new, robust approach for detecting cheapfakes, which are instances of image reuse with different captions. Our approach combines multiple models from computer vision and natural language processing, namely named entity recognition, image captioning, and natural language inference. In our experiments, the proposed multi-model method achieves 78.6% accuracy, the highest among the candidates on the hidden test set. Overall, our approach demonstrates a promising solution for detecting cheapfakes and safeguarding the integrity of multimedia content. Our source code is publicly available at https://github.com/thanhson28/icme2023.git.
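The decision logic the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation (see their repository for that): the NER and NLI components below are toy rule-based stand-ins, and `extract_entities`, `nli_contradicts`, and `is_out_of_context` are hypothetical names chosen for illustration. In the paper's actual pipeline, an image-captioning model would also generate a caption from the image itself, to be compared against each candidate caption.

```python
# Hypothetical sketch of an out-of-context decision over a (image, caption1,
# caption2) triplet. Real systems would use trained NER and NLI models; here
# both are stubbed with simple heuristics so the control flow is clear.

def extract_entities(caption: str) -> set[str]:
    """Toy NER stand-in: treat capitalized tokens as named entities."""
    return {tok.strip(".,") for tok in caption.split() if tok[:1].isupper()}

def nli_contradicts(premise: str, hypothesis: str) -> bool:
    """Toy NLI stand-in: flag a contradiction when a negation word
    appears on one side but not the other."""
    neg = {"not", "no", "never"}
    return (neg & set(premise.lower().split())) != (neg & set(hypothesis.lower().split()))

def is_out_of_context(caption1: str, caption2: str) -> bool:
    """Flag the pair as out-of-context when the two captions name
    disjoint entity sets or contradict each other."""
    e1, e2 = extract_entities(caption1), extract_entities(caption2)
    disjoint_entities = bool(e1) and bool(e2) and not (e1 & e2)
    return disjoint_entities or nli_contradicts(caption1, caption2)
```

For example, `is_out_of_context("Obama speaks in Berlin.", "Trump visits Paris.")` returns `True` (disjoint entity sets), while two captions naming the same people and places with no contradiction return `False`. Swapping the stubs for a transformer NER tagger and an NLI classifier preserves the same structure.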