L. Kennedy, R. V. Zwol, Nicolas Torzec, Belle L. Tseng
Learning crop regions for content-aware generation of thumbnail images
Proceedings of the 1st ACM International Conference on Multimedia Retrieval, 2011-04-18
DOI: 10.1145/1991996.1992026 (https://doi.org/10.1145/1991996.1992026)
Citations: 4
Abstract
We propose a model for automatically cropping images based on a diverse set of content and spatial features. We approach this by extracting pixel-level features and aggregating them over possible crop regions. We then learn a regression model that predicts, from these input features, the quality of each crop region, measured as the degree to which it overlaps with human-provided crops. Candidate images can then be cropped based on an exhaustive sweep over candidate crop regions, where each region is scored and the highest-scoring region is retained. The system is unique in its ability to incorporate a variety of pixel-level importance cues when arriving at a final cropping recommendation. We test the system on a set of human-cropped images with a large set of features. We find that the system outperforms baseline approaches, particularly when the aspect ratio of the image is very different from the target thumbnail region.
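The sweep described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the single aggregated feature (fraction of total importance inside the crop), the fixed crop size, the default scoring function, and the use of intersection-over-union as the overlap measure are all assumptions made for illustration, and every function name here is hypothetical. A summed-area table is used so that each candidate region is aggregated in constant time.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height); hypothetical convention


def integral_image(importance: List[List[float]]) -> List[List[float]]:
    """Summed-area table with an extra leading row and column of zeros."""
    h, w = len(importance), len(importance[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += importance[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def box_sum(sat: List[List[float]], box: Box) -> float:
    """Sum of the pixel-level values inside `box`, in O(1) per query."""
    left, top, width, height = box
    right, bottom = left + width, top + height
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


def overlap(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes (assumed overlap measure)."""
    ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0


def best_crop(
    importance: List[List[float]],
    crop_w: int,
    crop_h: int,
    score: Optional[Callable[[List[float]], float]] = None,
) -> Box:
    """Score every crop of the given size and keep the highest-scoring one."""
    h, w = len(importance), len(importance[0])
    if crop_w > w or crop_h > h:
        raise ValueError("crop is larger than the image")
    if score is None:
        # Stand-in for a learned regressor: use the single feature directly.
        score = lambda feats: feats[0]
    sat = integral_image(importance)
    total = box_sum(sat, (0, 0, w, h)) or 1.0  # avoid division by zero
    best: Optional[Box] = None
    best_score = float("-inf")
    for top in range(h - crop_h + 1):
        for left in range(w - crop_w + 1):
            box = (left, top, crop_w, crop_h)
            feats = [box_sum(sat, box) / total]  # aggregated region feature
            s = score(feats)
            if s > best_score:
                best, best_score = box, s
    assert best is not None
    return best
```

In a setup closer to the one the abstract describes, `score` would be a regression model trained to predict the overlap between a candidate region and the human-provided crop, the feature vector would combine several pixel-level cues rather than one, and the sweep would presumably also cover multiple crop sizes.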