Proceedings of the 1st International Workshop on Trustworthy AI for Multimedia Computing: Latest Publications

Hierarchical Semantic Enhanced Directional Graph Network for Visual Commonsense Reasoning
Mingyan Wu, Shuhan Qi, Jun Rao, Jia-jia Zhang, Qing Liao, Xuan Wang, Xinxin Liao
DOI: 10.1145/3475731.3484957 | Published: 2021-10-22
Abstract: The visual commonsense reasoning (VCR) task aims to advance research on cognition-level correlation reasoning. It requires not only a thorough understanding of the correlated details of a scene but also the ability to infer correlations using related commonsense knowledge. Existing approaches use region-word affinity to perform semantic alignment between the vision and language domains, neglecting the implicit correspondences (e.g., word-scene, region-phrase, and phrase-scene) among visual concepts and linguistic words. Although previous work has delivered promising results, these methods still face challenges when it comes to interpretable reasoning. To this end, we present a novel hierarchical semantic enhanced directional graph network. Specifically, we design a Modality Interaction Unit (MIU) module that captures high-order cross-modal alignment by aggregating hierarchical vision-language relationships. We then propose a direction clue-aware graph reasoning (DCGR) module, in which valuable entities are dynamically selected at each reasoning step according to their importance, leading to a more interpretable reasoning procedure. Finally, heterogeneous graph attention is introduced to filter out the parts irrelevant to the final answers. Extensive experiments on the VCR benchmark dataset demonstrate that our method achieves competitive results and better interpretability compared with several state-of-the-art baselines.
Citations: 0
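The abstract describes the architecture only at a high level. As a rough, non-authoritative illustration of two ingredients it names, region-word affinity for cross-modal alignment and importance-based entity selection in the DCGR step, here is a minimal PyTorch sketch; all function names, tensor shapes, and the top-k selection rule are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only -- not the authors' MIU/DCGR implementation.
# Shows (a) region-word affinity for cross-modal alignment and
# (b) importance-based entity selection, two ideas named in the abstract.
import torch
import torch.nn.functional as F

def region_word_affinity(regions, words):
    """regions: (R, d) visual region features; words: (W, d) token features.
    Returns an (R, W) row-normalized affinity matrix."""
    affinity = regions @ words.t()                      # raw dot-product affinity
    return F.softmax(affinity / regions.size(-1) ** 0.5, dim=-1)

def select_important_entities(entity_feats, query, k=3):
    """Keep the k entities most relevant to the query vector (a DCGR-like step)."""
    scores = entity_feats @ query                       # (N,) importance scores
    topk = torch.topk(scores, k=min(k, entity_feats.size(0))).indices
    return entity_feats[topk], topk

if __name__ == "__main__":
    torch.manual_seed(0)
    regions, words = torch.randn(6, 128), torch.randn(10, 128)
    A = region_word_affinity(regions, words)            # (6, 10) alignment weights
    query = words.mean(dim=0)                           # toy question representation
    selected, idx = select_important_entities(regions, query, k=3)
    print(A.shape, idx.tolist())
```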
An Empirical Study of Uncertainty Gap for Disentangling Factors
Jiantao Wu, Shentong Mo, Lin Wang
DOI: 10.1145/3475731.3484954 | Published: 2021-10-22
Abstract: Disentangling factors has proven crucial for building interpretable AI systems: disentangled generative models expose explanatory input variables, which increases trustworthiness and robustness. Previous works apply a progressive disentanglement learning regime in which the ground-truth factors are disentangled in a fixed order, but they do not explain why that order matters. In this work, we propose a novel metric, the Uncertainty Gap, to evaluate how the uncertainty of a generative model changes given input variables. We generalize the Uncertainty Gap to image reconstruction tasks using BCE and MSE losses. Extensive experiments on three commonly used benchmarks demonstrate the effectiveness of the Uncertainty Gap in evaluating both the informativeness and the redundancy of given variables. We empirically find that the significant factor with the largest Uncertainty Gap should be disentangled before insignificant factors, indicating that a suitable order of disentangling factors improves performance.
Citations: 0
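The abstract does not define the Uncertainty Gap precisely. A minimal sketch, assuming the gap for a latent variable is measured as the increase in reconstruction loss (MSE or BCE) when that variable is masked, might look as follows; the masking scheme, the toy decoder, and all names here are illustrative assumptions rather than the paper's definition.

```python
# Illustrative sketch only -- the paper's exact Uncertainty Gap definition is not
# given in the abstract. Here the gap for latent variable i is taken (as an
# assumption) to be the increase in reconstruction loss (MSE or BCE) when that
# variable is zeroed out, relative to using the full latent code.
import torch
import torch.nn.functional as F

def uncertainty_gap(decoder, z, x, loss="mse"):
    """decoder maps latent z -> reconstruction of x; z: (d,) latent; x: target image.
    Returns a (d,) tensor: per-variable loss increase when the variable is masked."""
    criterion = F.mse_loss if loss == "mse" else F.binary_cross_entropy
    base = criterion(decoder(z), x)
    gaps = []
    for i in range(z.numel()):
        z_masked = z.clone()
        z_masked[i] = 0.0                      # remove the information carried by z_i
        gaps.append(criterion(decoder(z_masked), x) - base)
    return torch.stack(gaps)

if __name__ == "__main__":
    torch.manual_seed(0)
    W = torch.randn(16, 4)                     # toy linear "decoder" for the demo
    decoder = lambda z: torch.sigmoid(W @ z)
    z, x = torch.randn(4), torch.rand(16)
    gaps = uncertainty_gap(decoder, z, x, loss="bce")
    # Larger gap => the variable is more informative; disentangle it first.
    print(gaps)
```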
Dataset Diversity: Measuring and Mitigating Geographical Bias in Image Search and Retrieval
Abhishek Mandal, Susan Leavy, S. Little
DOI: 10.1145/3475731.3484956 | Published: 2021-10-22
Abstract: Many popular visual datasets used to train deep neural networks for computer vision applications, especially for facial analytics, are created by retrieving images from the internet, often via search engines. However, due to localisation and personalisation of search results by the search engines, along with the image indexing methods they use, the resulting images overrepresent the demographics of the region from which they were queried. As most visual datasets are created in Western countries, they tend to have a Western-centric bias, and deep neural networks trained on them tend to inherit these biases. Researchers studying bias in visual datasets have focused on its racial aspects; we approach it from a geographical perspective. In this paper, we 1) study how linguistic variations in search queries and geographical variations in the querying region affect the social and cultural aspects of retrieved images, focusing on facial analytics, 2) explore how geographical bias in image search and retrieval can cause racial, cultural and stereotypical bias in visual datasets, and 3) propose methods to mitigate such biases.
Citations: 11
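The abstract does not spell out the measurement or mitigation methods. As a hedged illustration of the general idea of quantifying geographical skew and reducing over-representation, the sketch below uses normalized entropy over regions of origin and a simple per-region cap; both choices are assumptions for illustration, not the paper's proposed methods.

```python
# Illustrative sketch only -- the paper's own bias measures and mitigation methods
# are not detailed in the abstract. This shows one generic way to quantify
# geographical skew (normalized entropy of the region-of-origin distribution)
# and a simple mitigation that caps per-region counts when building a dataset.
from collections import Counter
import math

def geo_diversity(regions):
    """Normalized entropy in [0, 1]: 1 = uniform over regions, 0 = single region."""
    counts = Counter(regions)
    n, k = len(regions), len(counts)
    if k <= 1:
        return 0.0
    probs = [c / n for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(k)

def cap_per_region(samples, region_of, cap):
    """Drop samples beyond `cap` per region to reduce over-representation."""
    kept, seen = [], Counter()
    for s in samples:
        r = region_of(s)
        if seen[r] < cap:
            kept.append(s)
            seen[r] += 1
    return kept

if __name__ == "__main__":
    origins = ["US"] * 70 + ["IE"] * 20 + ["IN"] * 10
    print(round(geo_diversity(origins), 3))        # well below 1.0: skewed dataset
    balanced = cap_per_region(list(range(100)), lambda i: origins[i], cap=20)
    print(round(geo_diversity([origins[i] for i in balanced]), 3))
```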
Patch Replacement: A Transformation-based Method to Improve Robustness against Adversarial Attacks
Hanwei Zhang, Yannis Avrithis, T. Furon, L. Amsaleg
DOI: 10.1145/3475731.3484955 | Published: 2021-10-22
Abstract: Deep Neural Networks (DNNs) are robust against intra-class variability of images, pose variations and random noise, but vulnerable to imperceptible adversarial perturbations that are well-crafted precisely to mislead. While random noise even of relatively large magnitude can hardly affect predictions, adversarial perturbations of very small magnitude can make a classifier fail completely. To enhance robustness, we introduce a new adversarial defense called patch replacement, which transforms both the input images and their intermediate features at early layers to make adversarial perturbations behave similarly to random noise. We decompose images/features into small patches and quantize them according to a codebook learned from legitimate training images. This maintains the semantic information of legitimate images, while removing as much as possible the effect of adversarial perturbations. Experiments show that patch replacement improves robustness against both white-box and gray-box attacks, compared with other transformation-based defenses. It has a low computational cost since it does not require training or fine-tuning the network. Importantly, in the white-box scenario, it increases robustness, while other transformation-based defenses do not.
Citations: 1
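Based only on the abstract, the core transformation can be sketched as patch-wise vector quantization against a codebook learned from clean images. The simplified Python sketch below covers the input-image case only (the paper also transforms early-layer features); the patch size, codebook size, and k-means codebook learning are illustrative assumptions, not the authors' exact procedure.

```python
# Simplified sketch of the patch-replacement idea from the abstract: split an image
# into small patches, snap each patch to its nearest codeword in a codebook learned
# from legitimate images, and reassemble. Parameters here are arbitrary choices.
import numpy as np
from sklearn.cluster import KMeans

def to_patches(img, p):
    """Split an (h, w) image into non-overlapping p x p patches, flattened."""
    h, w = img.shape
    return img.reshape(h // p, p, w // p, p).swapaxes(1, 2).reshape(-1, p * p)

def from_patches(patches, h, w, p):
    """Inverse of to_patches: reassemble flattened patches into an (h, w) image."""
    return patches.reshape(h // p, w // p, p, p).swapaxes(1, 2).reshape(h, w)

def learn_codebook(clean_images, p=4, k=64):
    """Learn a patch codebook from legitimate images via k-means."""
    patches = np.concatenate([to_patches(x, p) for x in clean_images])
    return KMeans(n_clusters=k, n_init=4, random_state=0).fit(patches)

def patch_replace(img, codebook, p=4):
    """Replace every patch of img by its nearest codeword."""
    patches = to_patches(img, p)
    codes = codebook.predict(patches)                 # nearest codeword per patch
    return from_patches(codebook.cluster_centers_[codes], *img.shape, p)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = [rng.random((32, 32)) for _ in range(20)]
    cb = learn_codebook(clean, p=4, k=32)
    adv = clean[0] + 0.03 * rng.standard_normal((32, 32))   # stand-in perturbation
    defended = patch_replace(adv, cb, p=4)
    print(defended.shape)
```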