Generating text description from content-based annotated image
Yan Zhu, Hui Xiang, Wenjuan Feng
2012 International Conference on Systems and Informatics (ICSAI2012), 2012-05-19
DOI: https://doi.org/10.1109/ICSAI.2012.6223132
Citations: 0
Abstract
This paper proposes a statistical generative model for generating sentences from an annotated picture. Images are segmented into regions using graph-based algorithms, and features are computed over each region. Given a training set of annotated images, we parse each image to obtain position information. We use an SVM to estimate the probabilities of label-preposition combinations, which yields the data-to-text set. A standard semantic representation expresses the image content, and sentences are finally generated from the resulting XML report. Focusing on landscape pictures, we ran experiments on a dataset that we collected and annotated ourselves, and obtained good results.
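
The pipeline described in the abstract (position information, label-preposition pairs, XML report, sentence) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the region labels and bounding boxes are invented, the XML element names (`image`, `relation`, `subject`, `preposition`, `object`) are hypothetical, and a simple geometric rule stands in for the SVM that the paper uses to score label-preposition combinations.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated regions: a label plus a bounding box
# (x_min, y_min, x_max, y_max). The origin is the top-left corner of the
# image, so a smaller y value means higher up in the picture.
REGIONS = [
    {"label": "sky", "box": (0, 0, 100, 40)},
    {"label": "mountain", "box": (10, 30, 90, 70)},
    {"label": "lake", "box": (0, 65, 100, 100)},
]


def centre(box):
    """Return the (x, y) centre of a bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def preposition(box_a, box_b):
    """Pick a preposition for region A relative to region B.

    Assumption: this rule compares box centres only. It is a stand-in for
    the SVM step in the paper, not a reproduction of it.
    """
    (ax, ay), (bx, by) = centre(box_a), centre(box_b)
    dx, dy = ax - bx, ay - by
    if abs(dy) >= abs(dx):
        return "above" if dy < 0 else "below"
    return "to the left of" if dx < 0 else "to the right of"


def build_report(regions):
    """Encode the relation between each pair of consecutive regions as XML."""
    root = ET.Element("image")
    for a, b in zip(regions, regions[1:]):
        rel = ET.SubElement(root, "relation")
        ET.SubElement(rel, "subject").text = a["label"]
        ET.SubElement(rel, "preposition").text = preposition(a["box"], b["box"])
        ET.SubElement(rel, "object").text = b["label"]
    return root


def generate_sentences(report):
    """Fill a fixed sentence template from each <relation> element."""
    sentences = []
    for rel in report.findall("relation"):
        subject = rel.findtext("subject")
        prep = rel.findtext("preposition")
        obj = rel.findtext("object")
        sentences.append(f"The {subject} is {prep} the {obj}.")
    return sentences


report = build_report(REGIONS)
sentences = generate_sentences(report)
for sentence in sentences:
    print(sentence)
```

Keeping the XML report as an explicit intermediate step mirrors the separation in the abstract between the semantic representation and sentence generation, so the relation classifier could be swapped for a trained model without touching the template code.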