Construction of a machine learning dataset for multiple AI tasks using Korean commercial multimodal video clips

Saim Shin, J. Jang, Minyoung Jung, Jieun Kim, Yoonyoung Jung, Hyedong Jung
{"title":"使用韩国商业多模态视频片段构建用于多个AI任务的机器学习数据集","authors":"Saim Shin, J. Jang, Minyoung Jung, Jieun Kim, Yoonyoung Jung, Hyedong Jung","doi":"10.1109/ICTC49870.2020.9289319","DOIUrl":null,"url":null,"abstract":"Accordingly a lot of broadcasting medias pursuing various concepts have been appeared and the major type of contents consumed on the web has been changed to multimodal contents, the attempt to actively utilize multimedia content in artificial intelligence research is also starting. This paper introduces a study that constructs a converged information dataset in an integrated form by analyzing various types of multimodal information on video clips. The constructed dataset was released with various semantic labels for artificial intelligence research about various information classification. The labels and descriptions in this dataset include various context, intention and emotion information describing with vision, speech and language in each video clips. The constructed dataset can be resolved the problem of lack of public data for multimodal interaction research with Korean. It is expected that this dataset can be applied in the constructions of various artificial intelligence services like Korean dialogue processing, visual information extractions and various multimodal data analysis tasks.","PeriodicalId":282243,"journal":{"name":"2020 International Conference on Information and Communication Technology Convergence (ICTC)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Construction of a machine learning dataset for multiple AI tasks using Korean commercial multimodal video clips\",\"authors\":\"Saim Shin, J. Jang, Minyoung Jung, Jieun Kim, Yoonyoung Jung, Hyedong Jung\",\"doi\":\"10.1109/ICTC49870.2020.9289319\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Accordingly a lot of broadcasting medias pursuing various concepts have been appeared and the major type of contents consumed on the web has been changed to multimodal contents, the attempt to actively utilize multimedia content in artificial intelligence research is also starting. This paper introduces a study that constructs a converged information dataset in an integrated form by analyzing various types of multimodal information on video clips. The constructed dataset was released with various semantic labels for artificial intelligence research about various information classification. The labels and descriptions in this dataset include various context, intention and emotion information describing with vision, speech and language in each video clips. The constructed dataset can be resolved the problem of lack of public data for multimodal interaction research with Korean. 
It is expected that this dataset can be applied in the constructions of various artificial intelligence services like Korean dialogue processing, visual information extractions and various multimodal data analysis tasks.\",\"PeriodicalId\":282243,\"journal\":{\"name\":\"2020 International Conference on Information and Communication Technology Convergence (ICTC)\",\"volume\":\"117 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-10-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 International Conference on Information and Communication Technology Convergence (ICTC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICTC49870.2020.9289319\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Information and Communication Technology Convergence (ICTC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTC49870.2020.9289319","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
As broadcast media pursuing diverse concepts have proliferated and multimodal content has become the dominant form consumed on the web, attempts to actively utilize multimedia content in artificial intelligence research have also begun. This paper introduces a study that constructs a converged information dataset in an integrated form by analyzing various types of multimodal information in video clips. The constructed dataset is released with a variety of semantic labels to support artificial intelligence research on information classification. The labels and descriptions in the dataset cover context, intention, and emotion information expressed through vision, speech, and language in each video clip. The dataset addresses the lack of public data for multimodal interaction research in Korean. It is expected to be applicable to the construction of various artificial intelligence services, such as Korean dialogue processing, visual information extraction, and diverse multimodal data analysis tasks.
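The abstract does not specify the released annotation schema. As a minimal sketch, assuming one record per clip that combines the label types mentioned above (context, intention, emotion) with per-modality descriptions, such a record might be modeled as follows; all field and function names here are hypothetical illustrations, not the paper's actual format:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical annotation record for one video clip. The paper's
# exact schema is not given in the abstract, so every field name
# below is an assumption for illustration only.
@dataclass
class ClipAnnotation:
    clip_id: str                 # identifier of the source video clip
    transcript: str              # Korean speech/dialogue transcript
    context: str                 # situational-context label
    intention: str               # speaker-intention label
    emotion: str                 # emotion label
    visual_descriptions: List[str] = field(default_factory=list)  # scene/object notes

# Example downstream use: select clips carrying a given emotion label,
# e.g. to assemble training data for an emotion classifier.
def filter_by_emotion(records: List[ClipAnnotation], emotion: str) -> List[ClipAnnotation]:
    return [r for r in records if r.emotion == emotion]

# Usage sketch with a made-up record:
sample = ClipAnnotation(
    clip_id="clip_0001",
    transcript="안녕하세요, 오늘 날씨가 정말 좋네요.",
    context="greeting",
    intention="small_talk",
    emotion="joy",
    visual_descriptions=["two people talking outdoors"],
)
print(filter_by_emotion([sample], "joy"))
```

A flat per-clip record like this would let the same release serve the multiple tasks the paper targets (dialogue processing, visual information extraction, multimodal analysis), since each task can read only the fields it needs.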