Efficient approach for complex video description into english text
V. Wankhede, R. Kagalkar
2017 International Conference on Intelligent Computing and Control (I2C2), published 2017-06-23
DOI: 10.1109/I2C2.2017.8321778 (https://doi.org/10.1109/I2C2.2017.8321778)
Citations: 3
Abstract
Human activity and role recognition play an important part in complex event understanding. This paper presents a system that automatically generates natural language descriptions from complex videos. The system consists of two main parts: training and testing. In the training part, complex videos are processed and their features, subject, verb, objects, tense (past, present, or future), and actual description are stored in a database. In the testing part, a test video is taken as input, passed through the same video processing steps, and classified using an SVM classifier; a grammatically correct description is then generated using natural language processing (NLP). Video frames are processed with Gaussian filtering and Canny edge detection, and the features of every frame are then detected using the SIFT algorithm.
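
The abstract does not say how the final sentence is assembled from the predicted subject, verb, object, and tense. As an illustration only, the sketch below uses a simple template-based realizer in Python; this is a stand-in technique, not the authors' method. The names `Triple`, `realize`, and `IRREGULAR_PAST`, the small irregular-verb table, and the assumption of a third-person singular subject are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical, deliberately tiny table of irregular past forms (illustrative only).
IRREGULAR_PAST = {"run": "ran", "eat": "ate", "throw": "threw", "ride": "rode"}

VOWELS = "aeiou"


@dataclass
class Triple:
    """Assumed output of the classification stage: labels plus a tense tag."""
    subject: str
    verb: str   # base form, e.g. "kick"
    obj: str
    tense: str  # "past", "present", or "future"


def _ends_in_consonant_y(verb: str) -> bool:
    return len(verb) >= 2 and verb.endswith("y") and verb[-2] not in VOWELS


def past_form(verb: str) -> str:
    """Naive past-tense inflection; does not handle consonant doubling."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    if _ends_in_consonant_y(verb):
        return verb[:-1] + "ied"
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"


def third_person(verb: str) -> str:
    """Naive present-tense inflection, assuming a third-person singular subject."""
    if _ends_in_consonant_y(verb):
        return verb[:-1] + "ies"
    if verb.endswith(("s", "sh", "ch", "x", "z", "o")):
        return verb + "es"
    return verb + "s"


def realize(t: Triple) -> str:
    """Fill a fixed subject-verb-object template according to the tense tag."""
    if t.tense == "past":
        verb_phrase = past_form(t.verb)
    elif t.tense == "present":
        verb_phrase = third_person(t.verb)
    elif t.tense == "future":
        verb_phrase = "will " + t.verb
    else:
        raise ValueError(f"unknown tense: {t.tense!r}")
    sentence = f"{t.subject} {verb_phrase} {t.obj}."
    return sentence[0].upper() + sentence[1:]


if __name__ == "__main__":
    for tense in ("past", "present", "future"):
        print(realize(Triple("a man", "kick", "a ball", tense)))
```

A fixed template keeps the output grammatical for simple subject-verb-object events, but it would need a proper morphological component to cover plural subjects, consonant doubling, and verbs outside the small lookup table.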