Fusing Visual and Textual Information to Determine Content Safety

2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA) Pub Date : 2019-12-01 DOI:10.1109/ICMLA.2019.00324

Rodrigo Leonardo, Amber Hu, M. Uzair, Qiujing Lu, Iris Fu, Keishin Nishiyama, Sooraj Mangalath Subrahmannian, D. Ravichandran


Citations: 2

Abstract

In advertising, identifying the content safety of web pages is a significant concern, since advertisers do not want their brands to be associated with threatening content. At the same time, publishers would like to maximize the number of web pages on which they can place ads. A fine balance must therefore be struck when classifying content safety in order to satisfy both advertisers and publishers. In this paper, we propose a multimodal machine learning framework that fuses visual and textual information from web pages to improve current predictions of content safety. The primary focus is on late fusion, which combines the final model outputs of separate modalities, such as images and text, to arrive at a single decision. This paper presents a fully automated machine learning framework that performs binary and multilabel classification using late fusion techniques. We also introduce additional work in early fusion, which extracts and fuses intermediate features from the two separate models. Our algorithms are applied to data extracted from relevant web pages in the advertising industry. Both our late and early fusion methods obtain significant improvements over algorithms currently in use.
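
The distinction between late and early fusion drawn in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule the authors use, so the weighted average below, along with the function names `late_fusion` and `early_fusion` and the parameters `image_weight` and `threshold`, are assumptions chosen for illustration only, not the paper's method.

```python
from typing import List, Sequence


def late_fusion(
    image_probs: Sequence[float],
    text_probs: Sequence[float],
    image_weight: float = 0.5,  # hypothetical weight, not from the paper
    threshold: float = 0.5,     # hypothetical decision threshold
) -> List[bool]:
    """Late fusion: combine the final per-label outputs of two models.

    Each input holds one probability per label (a single entry for the
    binary case, several for the multilabel case). The weighted average
    is an assumed fusion rule used only to illustrate the idea.
    """
    if len(image_probs) != len(text_probs):
        raise ValueError("both models must score the same set of labels")
    fused = [
        image_weight * p_img + (1.0 - image_weight) * p_txt
        for p_img, p_txt in zip(image_probs, text_probs)
    ]
    # One decision per label: True means the label is flagged.
    return [p >= threshold for p in fused]


def early_fusion(
    image_features: Sequence[float],
    text_features: Sequence[float],
) -> List[float]:
    """Early fusion: join intermediate features from the two models.

    Concatenation is one common way to fuse features; the joined vector
    would then be passed to a single downstream classifier (not shown).
    """
    return list(image_features) + list(text_features)


# Example: two labels scored by an image model and a text model.
decisions = late_fusion([0.9, 0.2], [0.7, 0.1])
joint = early_fusion([1.0, 2.0], [3.0])
```

In this sketch, late fusion needs only the two models' output scores, so each model can be trained and replaced independently, whereas early fusion requires access to intermediate features and a further classifier trained on the joined vector.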
