Title: Interactive Audio Indexing and Speech Recognition based Navigation Assist Tool for Tutoring Videos
Authors: Gareshma N, V. P, P. S
Published in: 2022 International Conference on Sustainable Computing and Data Communication Systems (ICSCDS)
Publication date: 2022-04-07
DOI: https://doi.org/10.1109/ICSCDS53736.2022.9760784
Citation count: 0
Abstract
The volume of multimedia data worldwide increases daily, and its growth and usage were especially large during the pandemic period. Content-based audio retrieval allows more sophisticated use of multimedia content. While listening to multimedia, the user typically navigates to the desired location either manually or with the help of subtitles. Both methods are time-consuming and not user-friendly. To address this issue, we have developed a speech-based searchable system that transcribes the video material to text and allows the user to navigate to the desired content. The user may listen to video lectures and be guided to the topic of interest with a voice command. The system analyses the acoustic signature of the input voice command and matches it against the video. The proposed system maintains an index table, which maps each transcribed term to the instants at which it occurs. The interactive play control retrieves the instant at which the keyword occurs and plays the multimedia content from that instant. The combination of effective transcription, storage and retrieval complements the system and improves the ease of access.
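The index table described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the transcription step has already produced a list of (word, start time in seconds) pairs and that the voice command has already been recognised as a text keyword. The function names `build_index` and `find_seek_time`, and the wrap-around behaviour when no later occurrence exists, are hypothetical choices made for illustration.

```python
import bisect
from collections import defaultdict


def build_index(timed_words):
    """Map each lower-cased word to a sorted list of its start times.

    `timed_words` is an assumed input format: an iterable of
    (word, start_seconds) pairs produced by some transcription step.
    """
    index = defaultdict(list)
    for word, start in timed_words:
        index[word.lower()].append(start)
    for starts in index.values():
        starts.sort()
    return dict(index)


def find_seek_time(index, keyword, current_time=0.0):
    """Return the first occurrence of `keyword` at or after `current_time`.

    Wraps around to the earliest occurrence if none is later, and
    returns None when the keyword never occurs in the transcript.
    """
    starts = index.get(keyword.lower())
    if not starts:
        return None
    i = bisect.bisect_left(starts, current_time)
    return starts[i] if i < len(starts) else starts[0]


# Hypothetical transcript fragment: (word, start time in seconds).
timed = [("gradient", 12.5), ("descent", 13.1), ("Gradient", 95.0)]
idx = build_index(timed)
seek_to = find_seek_time(idx, "gradient", current_time=20.0)
```

A player would then set its playback position to `seek_to`. Keeping the start times sorted per word lets the lookup use binary search, so navigation stays fast even for long lectures.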