Rodrigo Leonardo, Amber Hu, M. Uzair, Qiujing Lu, Iris Fu, Keishin Nishiyama, Sooraj Mangalath Subrahmannian, D. Ravichandran
{"title":"Fusing Visual and Textual Information to Determine Content Safety","authors":"Rodrigo Leonardo, Amber Hu, M. Uzair, Qiujing Lu, Iris Fu, Keishin Nishiyama, Sooraj Mangalath Subrahmannian, D. Ravichandran","doi":"10.1109/ICMLA.2019.00324","DOIUrl":null,"url":null,"abstract":"In advertising, identifying the content safety of web pages is a significant concern since advertisers do not want brands to be associated with threatening content. At the same time, publishers would like to maximize the number of web pages on which they can place ads. Thus, a fine balance must be achieved while classifying content safety in order to satisfy both advertisers and publishers. In this paper, we propose a multimodal machine learning framework that fuses visual and textual information from web pages to improve current predictions of content safety. The primary focus is on late fusion, which involves combining final model outputs of separate modalities, such as images and text, to arrive at a single decision. This paper presents a fully automated machine learning framework that performs binary and multilabel classification using late fusion techniques. We also introduce additional work in early fusion, which involves extracting and fusing intermediate features from the two separate models. Our algorithms are applied to data extracted from relevant web pages in the advertising industry. Both of our late and early fusion methods obtain significant improvements over algorithms currently in use.","PeriodicalId":436714,"journal":{"name":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","volume":"410 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2019.00324","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
In advertising, identifying the content safety of web pages is a significant concern since advertisers do not want brands to be associated with threatening content. At the same time, publishers would like to maximize the number of web pages on which they can place ads. Thus, a fine balance must be achieved while classifying content safety in order to satisfy both advertisers and publishers. In this paper, we propose a multimodal machine learning framework that fuses visual and textual information from web pages to improve current predictions of content safety. The primary focus is on late fusion, which involves combining final model outputs of separate modalities, such as images and text, to arrive at a single decision. This paper presents a fully automated machine learning framework that performs binary and multilabel classification using late fusion techniques. We also introduce additional work in early fusion, which involves extracting and fusing intermediate features from the two separate models. Our algorithms are applied to data extracted from relevant web pages in the advertising industry. Both of our late and early fusion methods obtain significant improvements over algorithms currently in use.