Automatic video summarization and classification by CNN model: Deep learning
Surendra Reddy Vinta, P. Singh, Ajoy Batta, N. Shilpa
2023 International Conference on Computer Communication and Informatics (ICCCI), 23 January 2023
DOI: 10.1109/ICCCI56745.2023.10128303
Abstract
As smartphones and other camera-equipped devices become more widespread and easier to use, more people are recording and sharing videos through social media and video-streaming websites, making video an essential medium for spreading information. Watching and evaluating so many videos, however, is impractical. An automatic video summary gives a concise overview of the source material, which is useful for indexing and categorizing long videos in a video database, but producing a good synopsis is a difficult task. This research aims to automate video summarization with a two-stream architecture in which a deep convolutional neural network in each stream extracts a video's spatial and temporal components. Highlight scores for video segments are generated by a two-dimensional convolutional neural network (CNN) that exploits spatial information, while a three-dimensional CNN captures temporal information. The scores from the two streams are averaged for each segment to determine which portions of the video are the most compelling. Because a highlight score conveys only a relative degree of interest, the deep CNN in each stream is trained with a pairwise deep-ranking model, which pushes highlight segments to score higher than the rest of the video. Video summaries are then assembled from the retrieved highlight clips.
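The abstract does not specify the network architectures, so the following is a minimal PyTorch sketch of the two-stream scorer under illustrative assumptions: a ResNet-18 backbone for the spatial stream, an R3D-18 backbone for the temporal stream, and simple averaging of the two per-segment scores as the fusion step described above. The class name, backbones, and segment length are all hypothetical choices, not the authors' published configuration.

```python
# Hedged sketch of a two-stream highlight scorer: a 2-D CNN scores a
# representative frame (spatial stream) and a 3-D CNN scores the full
# clip (temporal stream); the two scores are averaged per segment.
# Backbones and fusion-by-averaging are assumptions for illustration.
import torch
import torch.nn as nn
from torchvision.models import resnet18
from torchvision.models.video import r3d_18

class TwoStreamHighlightScorer(nn.Module):
    def __init__(self):
        super().__init__()
        # Spatial stream: 2-D CNN mapping one frame to a scalar score.
        spatial = resnet18(weights=None)
        spatial.fc = nn.Linear(spatial.fc.in_features, 1)
        self.spatial = spatial
        # Temporal stream: 3-D CNN mapping a T-frame clip to a scalar score.
        temporal = r3d_18(weights=None)
        temporal.fc = nn.Linear(temporal.fc.in_features, 1)
        self.temporal = temporal

    def forward(self, frame, clip):
        # frame: (B, 3, H, W)    one representative frame per segment
        # clip:  (B, 3, T, H, W) the full segment
        s_spatial = self.spatial(frame)    # (B, 1) spatial highlight score
        s_temporal = self.temporal(clip)   # (B, 1) temporal highlight score
        # Fuse by averaging the two per-segment scores.
        return (s_spatial + s_temporal) / 2
```

At inference time, each video segment would be scored this way and the highest-scoring segments concatenated into the summary.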
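Because the highlight score is only relative, training uses a pairwise ranking objective rather than a per-segment label. A minimal sketch of one such training step follows, assuming a hinge-style margin ranking loss (`nn.MarginRankingLoss`) over (highlight, non-highlight) segment pairs from the same video; the margin value and the helper `pairwise_ranking_step` are hypothetical, not taken from the paper.

```python
# Hedged sketch of the pairwise deep-ranking objective: a highlight
# segment should score at least `margin` higher than a non-highlight
# segment. nn.MarginRankingLoss implements exactly this hinge:
# loss = max(0, -y * (s_pos - s_neg) + margin), with y = +1.
import torch
import torch.nn as nn

ranking_loss = nn.MarginRankingLoss(margin=1.0)  # margin is an assumption

def pairwise_ranking_step(model, highlight_batch, background_batch, optimizer):
    """One training step on (highlight, non-highlight) segment pairs."""
    s_pos = model(*highlight_batch)    # scores for highlight segments
    s_neg = model(*background_batch)   # scores for non-highlight segments
    target = torch.ones_like(s_pos)    # +1 => s_pos should exceed s_neg
    loss = ranking_loss(s_pos, s_neg, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```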