Learning to name faces: a multimodal learning scheme for search-based face annotation

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval. Pub Date: 2013-07-28. DOI: 10.1145/2484028.2484040

Dayong Wang, S. Hoi, Pengcheng Wu, Jianke Zhu, Ying He, C. Miao


Citations: 24

Abstract

Automated face annotation aims to automatically detect human faces in a photo and then name each face with the corresponding person's name. In this paper, we tackle this open problem by investigating a search-based face annotation (SBFA) paradigm for mining the large amounts of web facial images freely available on the WWW. Given a query facial image for annotation, the idea of SBFA is to first search for the top-n similar facial images in a web facial image database and then exploit these top-ranked similar facial images and their weak labels to name the query facial image. To fully mine this information, this paper proposes a novel framework, Learning to Name Faces (L2NF), a unified multimodal learning approach for search-based face annotation, which consists of the following major components: (i) we enhance the weak labels of top-ranked similar images by exploiting the "label smoothness" assumption; (ii) we construct multimodal representations of a facial image by extracting different types of features; (iii) we optimize the distance measure for each type of feature using distance metric learning techniques; and finally (iv) we learn the optimal combination of multiple modalities for annotation through a learning-to-rank scheme. We conduct a set of extensive empirical studies on two real-world facial image databases, in which encouraging results show that the proposed algorithms significantly boost the naming accuracy of the search-based face annotation task.
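The basic SBFA idea in the abstract (retrieve the top-n similar faces, then name the query from their weak labels) can be illustrated with a minimal sketch. This is not the L2NF framework itself: it assumes faces are already encoded as fixed-length feature vectors, uses plain Euclidean distance in place of a learned metric, and combines weak labels by similarity-weighted voting. The function name `annotate_by_search`, the weighting scheme, and the toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def annotate_by_search(query, db_features, db_labels, top_n=3):
    """Rank candidate names for a query face by similarity-weighted voting
    over the weak labels of its top-n nearest database faces."""
    # Euclidean distance from the query to every database face
    # (an assumption; the paper learns a distance metric per feature type).
    dists = np.linalg.norm(db_features - query, axis=1)
    # Indices of the top-n most similar faces (smallest distance first).
    nearest = np.argsort(dists)[:top_n]
    scores = {}
    for i in nearest:
        # Closer faces contribute a larger vote to their weak label.
        weight = 1.0 / (1.0 + dists[i])
        scores[db_labels[i]] = scores.get(db_labels[i], 0.0) + weight
    # Candidate names sorted by accumulated vote, best first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy database: 2-D features with hypothetical weak labels.
db_features = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.0, 0.2]])
db_labels = ["alice", "alice", "bob", "carol"]
query = np.array([0.05, 0.0])

ranked = annotate_by_search(query, db_features, db_labels, top_n=3)
print(ranked[0][0])
```

In this toy setup the two faces closest to the query share one weak label, so that name receives the largest accumulated vote. Components (i) to (iv) of the abstract would refine, respectively, the labels, the feature representations, the distance function, and the way several modalities are combined.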


