Educational video classification by using a transcript to image transform and supervised learning

2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA). Pub Date: 2017-09-30. DOI: 10.1109/IPTA.2017.8853988

Houssem Chatbri, Marlon Oliveira, Kevin McGuinness, S. Little, K. Kameyama, P. Kwan, Alistair Sutherland, N. O’Connor

Citations: 5

Abstract

In this work, we present a method for automatic topic classification of educational videos using a speech transcript transform. Our method works as follows. First, speech recognition is used to generate video transcripts. Then, the transcripts are converted into images using a statistical co-occurrence transformation that we designed. Finally, a classifier produces video category labels from a transcript image input. For our classifiers, we report results using a convolutional neural network (CNN) and a principal component analysis (PCA) model. To evaluate our method, we used the Khan Academy on a Stick dataset, which contains 2,545 videos, each labeled with one or two of 13 categories. Experiments show that our method is effective and strongly competitive against other supervised learning-based methods.
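The abstract does not specify the co-occurrence transformation or the PCA model, so the following is only a rough sketch of the three-step pipeline under stated assumptions, not the authors' method. It assumes transcripts are already available as text, hashes each word to a row/column index, counts word pairs within a short window to build a fixed-size "image", and classifies by nearest class centroid in a PCA subspace. The function names, the 32×32 image size, the window length, the hashing scheme, the single-label output, and the toy transcripts are all hypothetical choices made for illustration.

```python
import re
import zlib

import numpy as np


def transcript_to_image(text, size=32, window=2):
    """Turn a transcript into a size x size co-occurrence 'image'.

    Assumption: words are hashed to indices and pairs are counted within a
    sliding window; the paper's actual transform is not described in the abstract.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    # crc32 is deterministic across runs, unlike Python's built-in hash().
    idx = [zlib.crc32(tok.encode("utf-8")) % size for tok in tokens]
    img = np.zeros((size, size))
    for i, a in enumerate(idx):
        for b in idx[i + 1 : i + 1 + window]:
            img[a, b] += 1.0
            img[b, a] += 1.0  # keep the image symmetric
    total = img.sum()
    return img / total if total > 0 else img


def fit_pca(X, n_components):
    """Return the data mean and the top principal directions (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(X, mean, components):
    """Project flattened images onto the PCA subspace."""
    return (X - mean) @ components.T


def predict(text, mean, components, centroids, size=32):
    """Assign the label whose class centroid is nearest in PCA space."""
    z = project(transcript_to_image(text, size).ravel(), mean, components)
    return min(centroids, key=lambda label: np.linalg.norm(z - centroids[label]))


# Hypothetical toy transcripts standing in for speech-recognition output.
train = [
    ("the derivative of a function measures its rate of change", "math"),
    ("we integrate the function to find the area under the curve", "math"),
    ("the cell membrane controls what enters and leaves the cell", "biology"),
    ("proteins are built in the cell from chains of amino acids", "biology"),
]
X = np.stack([transcript_to_image(text).ravel() for text, _ in train])
labels = np.array([label for _, label in train])

mean, components = fit_pca(X, n_components=2)
Z = project(X, mean, components)
centroids = {
    label: Z[labels == label].mean(axis=0)
    for label in sorted({label for _, label in train})
}

print(predict("the slope of the function at a point", mean, components, centroids))
```

This sketch returns a single label per transcript. Since each video in the dataset carries one or two of 13 categories, a fuller version would need a multi-label decision rule, and the CNN variant would take the same image as input in place of the PCA step.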
