{"title":"从计算机视觉到自然语言处理的多模型廉价假货检测","authors":"Thanh-Son Nguyen, Minh-Triet Tran","doi":"10.1109/ICMEW59549.2023.00023","DOIUrl":null,"url":null,"abstract":"Cheapfakes can compromise the integrity of information and erode trust in multimedia content, making their detection critical. Identifying Out of Context misuse of media is essential to prevent the spread of misinformation and to ensure that news and information are presented accurately and ethically. In this paper, we focus our efforts on Task 1 of the Grand Challenge on Detecting Cheapfakes in ICME2023, which involves detecting triplets consisting of an image and two captions as Out of Context. We propose a new robust approach for detecting Cheapfakes, which are instances of image reuse with different captions. Our proposed approach leverages multi-models in Computer vision and Natural language processing, such as Named entity recognition, Image captioning, and Natural language inference. In our experiments, the proposed multi-models method achieves an impressive accuracy of 78.6%, the highest accuracy among the candidates on the hidden test set. Overall, our approach demonstrates a promising solution for detecting Cheapfakes and safeguarding the integrity of multimedia content. Our source code is public on https://github.com/thanhson28/icme2023.git.","PeriodicalId":111482,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-Models from Computer Vision to Natural Language Processing for Cheapfakes Detection\",\"authors\":\"Thanh-Son Nguyen, Minh-Triet Tran\",\"doi\":\"10.1109/ICMEW59549.2023.00023\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Cheapfakes can compromise the integrity of information and erode trust in multimedia content, making their detection critical. Identifying Out of Context misuse of media is essential to prevent the spread of misinformation and to ensure that news and information are presented accurately and ethically. In this paper, we focus our efforts on Task 1 of the Grand Challenge on Detecting Cheapfakes in ICME2023, which involves detecting triplets consisting of an image and two captions as Out of Context. We propose a new robust approach for detecting Cheapfakes, which are instances of image reuse with different captions. Our proposed approach leverages multi-models in Computer vision and Natural language processing, such as Named entity recognition, Image captioning, and Natural language inference. In our experiments, the proposed multi-models method achieves an impressive accuracy of 78.6%, the highest accuracy among the candidates on the hidden test set. Overall, our approach demonstrates a promising solution for detecting Cheapfakes and safeguarding the integrity of multimedia content. 
Our source code is public on https://github.com/thanhson28/icme2023.git.\",\"PeriodicalId\":111482,\"journal\":{\"name\":\"2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)\",\"volume\":\"53 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMEW59549.2023.00023\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMEW59549.2023.00023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multi-Models from Computer Vision to Natural Language Processing for Cheapfakes Detection
Cheapfakes can compromise the integrity of information and erode trust in multimedia content, making their detection critical. Identifying out-of-context (OOC) misuse of media is essential to prevent the spread of misinformation and to ensure that news and information are presented accurately and ethically. In this paper, we focus on Task 1 of the Grand Challenge on Detecting Cheapfakes at ICME 2023, which involves classifying triplets consisting of an image and two captions as out of context or not. We propose a new, robust approach for detecting cheapfakes, i.e., instances of an image being reused with different captions. Our approach combines multiple models from computer vision and natural language processing, namely named entity recognition, image captioning, and natural language inference. In our experiments, the proposed multi-model method achieves an accuracy of 78.6%, the highest among the candidates on the hidden test set. Overall, our approach demonstrates a promising solution for detecting cheapfakes and safeguarding the integrity of multimedia content. Our source code is publicly available at https://github.com/thanhson28/icme2023.git.
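As a rough illustration of how such a multi-model pipeline could be wired together, the sketch below combines named entity recognition, image captioning, and natural language inference with off-the-shelf Hugging Face models. It is a minimal sketch under assumed components, not the authors' released implementation: the model names, the thresholds, and the final decision rule are illustrative assumptions (see the repository above for the actual code).

# Hedged sketch of a multi-model out-of-context (OOC) check.
# The Hugging Face models, thresholds, and decision heuristic below are
# illustrative assumptions, not the paper's released pipeline.
from transformers import pipeline

ner = pipeline("ner", aggregation_strategy="simple")                # named entity recognition
captioner = pipeline("image-to-text",
                     model="nlpconnect/vit-gpt2-image-captioning")  # image captioning
nli = pipeline("text-classification", model="roberta-large-mnli")   # natural language inference


def entities(text: str) -> set:
    """Return the set of named-entity surface forms found in `text`."""
    return {e["word"].strip().lower() for e in ner(text)}


def is_out_of_context(image_path: str, caption1: str, caption2: str) -> bool:
    """Classify an (image, caption1, caption2) triplet as OOC or not.

    Heuristic (assumed, not the paper's exact rule): captions that
    contradict each other, or that share no named entities, or that are
    both unsupported by a generated image caption, are treated as OOC.
    """
    # 1) NLI between the two captions, passed as a premise/hypothesis pair.
    verdict = nli({"text": caption1, "text_pair": caption2})[0]
    if verdict["label"] == "CONTRADICTION" and verdict["score"] > 0.5:
        return True

    # 2) NER overlap: disjoint entity sets suggest the captions describe
    #    different events, hence likely OOC reuse of the same image.
    e1, e2 = entities(caption1), entities(caption2)
    if e1 and e2 and not (e1 & e2):
        return True

    # 3) Image captioning: check that at least one given caption is
    #    entailed by what the image itself appears to show.
    generated = captioner(image_path)[0]["generated_text"]
    consistent = any(
        nli({"text": generated, "text_pair": c})[0]["label"] == "ENTAILMENT"
        for c in (caption1, caption2)
    )
    return not consistent

For example, is_out_of_context("sample.jpg", "Flood hits Jakarta in 2020", "Protesters gather in Paris") would likely be flagged because the captions share no named entities, whereas two paraphrases of the same event would pass the NLI and NER checks.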