{"title":"考虑框架和文本阅读顺序的喜剧说话人估计算法","authors":"Yuga Omori, Kota Nagamizo, Daisuke Ikeda","doi":"10.1109/IIAIAAI55812.2022.00080","DOIUrl":null,"url":null,"abstract":"Machine learning methods in recent years have focused on multimodal input and cross-modal tasks, and they are used as approaches to problems in various domains. Associating comic texts and characters using these approaches is informative for commercial activities such as speech synthesis and automatic translation of texts. In this study, we address the task of associating a text with a speaker in comics. It is challenging to correspond between them because these are not self-evidently attached, and few studies have attempted. These previous studies have less considered the continuity of comics such as narrative flow or contextual information. We assume that considering the continuity of comics is effective for speaker estimation. This paper proposes algorithms for estimating the reading order of frames or texts, and it also proposes methods for estimating speakers based on these orders. As a result, our proposed method improves accuracy compared to previous methods. Consideration of the frame order is an effective clue to the comic speaker estimation.","PeriodicalId":156230,"journal":{"name":"2022 12th International Congress on Advanced Applied Informatics (IIAI-AAI)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Algorithms for estimation of comic speakers considering reading order of frames and texts\",\"authors\":\"Yuga Omori, Kota Nagamizo, Daisuke Ikeda\",\"doi\":\"10.1109/IIAIAAI55812.2022.00080\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine learning methods in recent years have focused on multimodal input and cross-modal tasks, and they are used as approaches to problems in various domains. Associating comic texts and characters using these approaches is informative for commercial activities such as speech synthesis and automatic translation of texts. In this study, we address the task of associating a text with a speaker in comics. It is challenging to correspond between them because these are not self-evidently attached, and few studies have attempted. These previous studies have less considered the continuity of comics such as narrative flow or contextual information. We assume that considering the continuity of comics is effective for speaker estimation. This paper proposes algorithms for estimating the reading order of frames or texts, and it also proposes methods for estimating speakers based on these orders. As a result, our proposed method improves accuracy compared to previous methods. Consideration of the frame order is an effective clue to the comic speaker estimation.\",\"PeriodicalId\":156230,\"journal\":{\"name\":\"2022 12th International Congress on Advanced Applied Informatics (IIAI-AAI)\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 12th International Congress on Advanced Applied Informatics (IIAI-AAI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IIAIAAI55812.2022.00080\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 12th International Congress on Advanced Applied Informatics (IIAI-AAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IIAIAAI55812.2022.00080","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Algorithms for estimation of comic speakers considering reading order of frames and texts
Machine learning methods in recent years have focused on multimodal input and cross-modal tasks, and they are used as approaches to problems in various domains. Associating comic texts and characters using these approaches is informative for commercial activities such as speech synthesis and automatic translation of texts. In this study, we address the task of associating a text with a speaker in comics. It is challenging to correspond between them because these are not self-evidently attached, and few studies have attempted. These previous studies have less considered the continuity of comics such as narrative flow or contextual information. We assume that considering the continuity of comics is effective for speaker estimation. This paper proposes algorithms for estimating the reading order of frames or texts, and it also proposes methods for estimating speakers based on these orders. As a result, our proposed method improves accuracy compared to previous methods. Consideration of the frame order is an effective clue to the comic speaker estimation.