Structuring lecture videos for distance learning applications
C. Ngo, Feng Wang, T. Pong
Fifth International Symposium on Multimedia Software Engineering, 2003. Proceedings.
DOI: 10.1109/MMSE.2003.1254444 (https://doi.org/10.1109/MMSE.2003.1254444)
Citation count: 43
Abstract
We present a novel automatic approach to structuring and indexing lecture videos for distance learning applications. By structuring video content, we can support both topic indexing and semantic querying of multimedia documents. Our aim is to link the discussion topics extracted from the electronic slides with their associated video and audio segments. The two major techniques in our proposed approach are video text analysis and speech recognition. Initially, a video is partitioned into shots based on slide transitions. For each shot, the embedded video texts are detected, reconstructed, and segmented as high-resolution foreground texts for recognition by a commercial OCR system. The recognized texts can then be matched with their associated slides for video indexing. Meanwhile, both phrases (titles) and keywords (content) are extracted from the electronic slides and spotted in the speech signals. The spotted phrases and keywords are further used as queries to retrieve the most similar slide for speech indexing.
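The abstract does not say how recognized text or spotted keywords are compared against slide text. As a minimal sketch of the "retrieve the most similar slide" step, the example below assumes a bag-of-words representation with cosine similarity. That measure, the function names (`tokenize`, `cosine_similarity`, `most_similar_slide`), and the sample slide texts are illustrative assumptions, not the authors' method or data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_slide(query_text, slides):
    """Return (index, score) of the slide whose text best matches the query.

    Assumes `slides` is a non-empty list of slide texts.
    """
    query = Counter(tokenize(query_text))
    scores = [cosine_similarity(query, Counter(tokenize(s))) for s in slides]
    best = max(range(len(slides)), key=lambda i: scores[i])
    return best, scores[best]


# Hypothetical slide texts, invented for illustration only.
slides = [
    "Introduction to video indexing and distance learning",
    "Shot detection based on slide transitions",
    "Speech recognition and keyword spotting",
]

# Noisy text such as an OCR system might return for one shot (made up).
ocr_text = "shot detecton based on slide transitions"
index, score = most_similar_slide(ocr_text, slides)
print(index, round(score, 3))
```

The same function could serve both indexing paths described above: the query is either the OCR output of a shot (video indexing) or the phrases and keywords spotted in the speech (speech indexing). A real system would likely need a matcher that tolerates recognition errors better than exact token overlap does.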