Image understanding for converting images into natural language text sentences
N. Bourbakis
Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering (NLPKE-2010), 2010-09-30
DOI: 10.1109/NLPKE.2010.5587864 (https://doi.org/10.1109/NLPKE.2010.5587864)
Citation count: 1
Abstract
The efficient processing, association, and understanding of multimedia-based events or multimodal information is an important research field with a wide variety of applications, such as knowledge discovery, document understanding, and human-computer interaction. One promising approach is to develop a common platform that converts different modalities (such as images and text) into the same medium and associates them for efficient processing and understanding. This talk presents a methodology capable of automatically converting images into natural language (NL) text sentences: image processing and analysis methods, together with attributed graphs, are used for object recognition and image understanding, and the resulting graph representations are then converted into NL text sentences. It also presents a methodology for transforming NL sentences into graph representations and then into stochastic Petri net (SPN) descriptions, in order to offer a common model for representing multimodal information and, at the same time, a way of associating “activities or changes” across image frames for event representation and interpretation. The SPN graph model was selected for its ability to represent both structural and functional knowledge efficiently, which other models cannot do. Simple illustrative examples are provided as a proof of concept.
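
The pipeline described in the abstract (attributed graph, then NL sentences, then a Petri-net description of changes between frames) can be illustrated with a minimal Python sketch. This is not the author's implementation: the names `AttributedGraph`, `graph_to_sentences`, `Transition`, and `frames_to_spn`, the fixed sentence template, and the placeholder firing rate are all assumptions made for illustration, and the image-analysis step that would produce the graph is omitted entirely.

```python
from dataclasses import dataclass, field


@dataclass
class AttributedGraph:
    """Hypothetical scene graph: objects are nodes with attributes,
    spatial relations are labeled (subject, relation, object) edges."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_object(self, name, **attributes):
        self.nodes[name] = attributes

    def relate(self, subject, relation, obj):
        self.edges.append((subject, relation, obj))


def describe(graph, name):
    """Render a node as a noun phrase from its attribute values."""
    words = list(graph.nodes[name].values()) + [name]
    return "the " + " ".join(words)


def graph_to_sentences(graph):
    """Turn every edge into one template sentence (assumed template)."""
    return [
        f"{describe(graph, s).capitalize()} is {rel} {describe(graph, o)}."
        for s, rel, o in graph.edges
    ]


@dataclass
class Transition:
    """A timed transition; `rate` stands in for an exponential firing rate."""
    name: str
    inputs: list
    outputs: list
    rate: float


def place(edge):
    """Name a Petri-net place after the relation it represents."""
    s, rel, o = edge
    return f"{s} {rel} {o}"


def frames_to_spn(before, after, rate=1.0):
    """Model each relation that changes between two frames as a transition.

    Places holding a token (marking 1) are the relations true in `before`.
    The rate is a placeholder, not an estimated value.
    """
    old = {(s, o): rel for s, rel, o in before.edges}
    new = {(s, o): rel for s, rel, o in after.edges}
    marking = {place(e): 1 for e in before.edges}
    transitions = []
    for (s, o), rel in old.items():
        if (s, o) in new and new[(s, o)] != rel:
            src = place((s, rel, o))
            dst = place((s, new[(s, o)], o))
            marking.setdefault(dst, 0)
            transitions.append(Transition(f"{src} -> {dst}", [src], [dst], rate))
    return marking, transitions


# Two hypothetical frames: the ball moves from on the table to under it.
frame1 = AttributedGraph()
frame1.add_object("ball", color="red")
frame1.add_object("table", material="wooden")
frame1.relate("ball", "on", "table")

frame2 = AttributedGraph()
frame2.add_object("ball", color="red")
frame2.add_object("table", material="wooden")
frame2.relate("ball", "under", "table")

print(graph_to_sentences(frame1))
print(frames_to_spn(frame1, frame2))
```

In this sketch a place stands for a relation that currently holds and a transition stands for an "activity or change" between frames, which is one simple way to read the abstract's description; the paper itself may define places, transitions, and rates differently.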