Mojdeh Hashemi-Namin, M. Jahed-Motlagh, Adel Torkaman Rahmani
{"title":"Recognition of visual scene elements from a story text in Persian natural language","authors":"Mojdeh Hashemi-Namin, M. Jahed-Motlagh, Adel Torkaman Rahmani","doi":"10.1017/s1351324922000390","DOIUrl":null,"url":null,"abstract":"Abstract Text-to-scene conversion systems map natural language text to formal representations required for visual scenes. The difficulty involved in this mapping is one of the most critical challenges for developing these systems. The current study mapped Persian natural language text as the headmost system to a conceptual scene model. This conceptual scene model is an intermediate semantic representation between natural language and the visual scene and contains descriptions of visual elements of the scene. It will be used to produce meaningful animation based on an input story in this ongoing study. The mapping task was modeled as a sequential labeling problem, and a conditional random field (CRF) model was trained and tested for sequential labeling of scene model elements. To the best of the authors’ knowledge, no dataset for this task exists; thus, the required dataset was collected for this task. The lack of required off-the-shelf natural language processing modules and a significant error rate in the available corpora were important challenges to dataset collection. Some features of the dataset were manually annotated. The results were evaluated using standard text classification metrics, and an average accuracy of 85.7% was obtained, which is satisfactory.","PeriodicalId":49143,"journal":{"name":"Natural Language Engineering","volume":"29 1","pages":"693 - 719"},"PeriodicalIF":2.3000,"publicationDate":"2022-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Natural Language Engineering","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1017/s1351324922000390","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Abstract Text-to-scene conversion systems map natural language text to formal representations required for visual scenes. The difficulty involved in this mapping is one of the most critical challenges for developing these systems. The current study mapped Persian natural language text as the headmost system to a conceptual scene model. This conceptual scene model is an intermediate semantic representation between natural language and the visual scene and contains descriptions of visual elements of the scene. It will be used to produce meaningful animation based on an input story in this ongoing study. The mapping task was modeled as a sequential labeling problem, and a conditional random field (CRF) model was trained and tested for sequential labeling of scene model elements. To the best of the authors’ knowledge, no dataset for this task exists; thus, the required dataset was collected for this task. The lack of required off-the-shelf natural language processing modules and a significant error rate in the available corpora were important challenges to dataset collection. Some features of the dataset were manually annotated. The results were evaluated using standard text classification metrics, and an average accuracy of 85.7% was obtained, which is satisfactory.
期刊介绍:
Natural Language Engineering meets the needs of professionals and researchers working in all areas of computerised language processing, whether from the perspective of theoretical or descriptive linguistics, lexicology, computer science or engineering. Its aim is to bridge the gap between traditional computational linguistics research and the implementation of practical applications with potential real-world use. As well as publishing research articles on a broad range of topics - from text analysis, machine translation, information retrieval and speech analysis and generation to integrated systems and multi modal interfaces - it also publishes special issues on specific areas and technologies within these topics, an industry watch column and book reviews.