Construction of a machine learning dataset for multiple AI tasks using Korean commercial multimodal video clips

Saim Shin, J. Jang, Minyoung Jung, Jieun Kim, Yoonyoung Jung, Hyedong Jung
{"title":"使用韩国商业多模态视频片段构建用于多个AI任务的机器学习数据集","authors":"Saim Shin, J. Jang, Minyoung Jung, Jieun Kim, Yoonyoung Jung, Hyedong Jung","doi":"10.1109/ICTC49870.2020.9289319","DOIUrl":null,"url":null,"abstract":"Accordingly a lot of broadcasting medias pursuing various concepts have been appeared and the major type of contents consumed on the web has been changed to multimodal contents, the attempt to actively utilize multimedia content in artificial intelligence research is also starting. This paper introduces a study that constructs a converged information dataset in an integrated form by analyzing various types of multimodal information on video clips. The constructed dataset was released with various semantic labels for artificial intelligence research about various information classification. The labels and descriptions in this dataset include various context, intention and emotion information describing with vision, speech and language in each video clips. The constructed dataset can be resolved the problem of lack of public data for multimodal interaction research with Korean. It is expected that this dataset can be applied in the constructions of various artificial intelligence services like Korean dialogue processing, visual information extractions and various multimodal data analysis tasks.","PeriodicalId":282243,"journal":{"name":"2020 International Conference on Information and Communication Technology Convergence (ICTC)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Construction of a machine learning dataset for multiple AI tasks using Korean commercial multimodal video clips\",\"authors\":\"Saim Shin, J. Jang, Minyoung Jung, Jieun Kim, Yoonyoung Jung, Hyedong Jung\",\"doi\":\"10.1109/ICTC49870.2020.9289319\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Accordingly a lot of broadcasting medias pursuing various concepts have been appeared and the major type of contents consumed on the web has been changed to multimodal contents, the attempt to actively utilize multimedia content in artificial intelligence research is also starting. This paper introduces a study that constructs a converged information dataset in an integrated form by analyzing various types of multimodal information on video clips. The constructed dataset was released with various semantic labels for artificial intelligence research about various information classification. The labels and descriptions in this dataset include various context, intention and emotion information describing with vision, speech and language in each video clips. The constructed dataset can be resolved the problem of lack of public data for multimodal interaction research with Korean. 
It is expected that this dataset can be applied in the constructions of various artificial intelligence services like Korean dialogue processing, visual information extractions and various multimodal data analysis tasks.\",\"PeriodicalId\":282243,\"journal\":{\"name\":\"2020 International Conference on Information and Communication Technology Convergence (ICTC)\",\"volume\":\"117 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-10-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on Information and Communication Technology Convergence (ICTC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICTC49870.2020.9289319\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Information and Communication Technology Convergence (ICTC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTC49870.2020.9289319","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
As broadcast media pursuing diverse concepts have proliferated and multimodal content has become the dominant form consumed on the web, attempts to actively utilize multimedia content in artificial intelligence research have also begun. This paper introduces a study that constructs a converged information dataset in an integrated form by analyzing various types of multimodal information in video clips. The constructed dataset is released with a variety of semantic labels to support artificial intelligence research on information classification. The labels and descriptions in the dataset cover context, intention, and emotion information expressed through vision, speech, and language in each video clip. The dataset addresses the lack of public data for multimodal interaction research in Korean. It is expected to be applicable to the construction of various artificial intelligence services, such as Korean dialogue processing, visual information extraction, and diverse multimodal data analysis tasks.
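The abstract does not specify the released annotation schema. As a minimal sketch, assuming one record per clip that combines the label types mentioned above (context, intention, emotion) with per-modality descriptions, such a record might be modeled as follows; all field and function names here are hypothetical illustrations, not the paper's actual format:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical annotation record for one video clip. The paper's
# exact schema is not given in the abstract, so every field name
# below is an assumption for illustration only.
@dataclass
class ClipAnnotation:
    clip_id: str                 # identifier of the source video clip
    transcript: str              # Korean speech/dialogue transcript
    context: str                 # situational-context label
    intention: str               # speaker-intention label
    emotion: str                 # emotion label
    visual_descriptions: List[str] = field(default_factory=list)  # scene/object notes

# Example downstream use: select clips carrying a given emotion label,
# e.g. to assemble training data for an emotion classifier.
def filter_by_emotion(records: List[ClipAnnotation], emotion: str) -> List[ClipAnnotation]:
    return [r for r in records if r.emotion == emotion]

# Usage sketch with a made-up record:
sample = ClipAnnotation(
    clip_id="clip_0001",
    transcript="안녕하세요, 오늘 날씨가 정말 좋네요.",
    context="greeting",
    intention="small_talk",
    emotion="joy",
    visual_descriptions=["two people talking outdoors"],
)
print(filter_by_emotion([sample], "joy"))
```

A flat per-clip record like this would let the same release serve the multiple tasks the paper targets (dialogue processing, visual information extraction, multimodal analysis), since each task can read only the fields it needs.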