Salient Time Slice Pruning and Boosting for Person-Scene Instance Search in TV Series
Z. Wang, Fan Yang, S. Satoh
Proceedings of the ACM Multimedia Asia, 2019-12-15
DOI: https://doi.org/10.1145/3338533.3366594
Citations: 3
Abstract
TV audiences often want to quickly browse the scenes in a TV series that feature particular actors. Since 2016, the TREC Video Retrieval Evaluation (TRECVID) Instance Search (INS) task has focused on retrieving shots in which a target person appears in a target scene. In this paper, we call this task Person-Scene Instance Search (P-S INS). To find P-S instances, most approaches search for the person and the scene separately and then combine the two results directly by addition or multiplication. However, we find that the person and scene INS modules are not always effective at the same time, and in some situations they suppress each other, so aggregating the results shot by shot is not a good choice. Fortunately, the video shots of a TV series are arranged in chronological order. We therefore extend our focus from a time point (a single video shot) to a time slice (multiple consecutive video shots) on the timeline. By detecting salient time slices, we prune the data; by evaluating the importance of each salient time slice, we boost the aggregation results. Extensive experiments on the large-scale TRECVID INS dataset demonstrate the effectiveness of the proposed method.
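
The abstract does not state how salient time slices are detected or how their importance is computed, so the following Python sketch only illustrates the general idea of moving from per-shot fusion to slice-level pruning and boosting. Everything in it is an assumption made for illustration, not the authors' method: the function names `find_salient_slices` and `prune_and_boost`, the threshold on the stronger of the two scores, the minimum slice length, and the importance measure (best person score times best scene score within the slice) are all hypothetical.

```python
from typing import List, Tuple


def find_salient_slices(
    activity: List[float], threshold: float, min_len: int = 2
) -> List[Tuple[int, int]]:
    """Return [start, end) ranges of consecutive shots with activity >= threshold.

    Hypothetical detection rule: runs shorter than min_len are discarded.
    """
    slices = []
    start = None
    for i, value in enumerate(activity):
        if value >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                slices.append((start, i))
            start = None
    # Close a run that extends to the last shot.
    if start is not None and len(activity) - start >= min_len:
        slices.append((start, len(activity)))
    return slices


def prune_and_boost(
    person: List[float],
    scene: List[float],
    threshold: float = 0.5,
    min_len: int = 2,
) -> List[float]:
    """Fuse per-shot person and scene scores using slice-level evidence.

    Shots are assumed to be in chronological order. Shots outside every
    salient slice are pruned (score 0.0); shots inside a slice receive the
    per-shot product plus a slice-level importance bonus.
    """
    if len(person) != len(scene):
        raise ValueError("person and scene must have one score per shot")

    # Baseline per-shot fusion by multiplication, as described in the abstract.
    fused = [p * s for p, s in zip(person, scene)]

    # Assumption: a shot is "active" if either module responds strongly,
    # since the two modules are not always effective at the same time.
    activity = [max(p, s) for p, s in zip(person, scene)]

    result = [0.0] * len(fused)
    for start, end in find_salient_slices(activity, threshold, min_len):
        # Assumption: slice importance combines the best person evidence and
        # the best scene evidence found anywhere in the slice.
        importance = max(person[start:end]) * max(scene[start:end])
        for i in range(start, end):
            result[i] = fused[i] + importance
    return result


if __name__ == "__main__":
    person_scores = [0.9, 0.1, 0.8, 0.1, 0.1, 0.2]
    scene_scores = [0.2, 0.9, 0.7, 0.1, 0.2, 0.9]
    print(prune_and_boost(person_scores, scene_scores))
```

In this toy input, the second shot has a weak person score but sits between shots with strong person evidence, so per-shot multiplication alone would rank it low, while the slice-level bonus keeps it competitive. The isolated last shot forms a run shorter than `min_len` and is pruned.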