Subtitle Synthesis using Inter and Intra utterance Prosodic Alignment for Automatic Dubbing

Giridhar Pamisetty, S. Kodukula
Published in: 2022 National Conference on Communications (NCC)
Publication date: 2022-05-24
DOI: 10.1109/NCC55593.2022.9806799 (https://doi.org/10.1109/NCC55593.2022.9806799)

Abstract

Automatic dubbing, or machine dubbing, replaces the speech in a source video with speech in the desired language, synthesized by a text-to-speech (TTS) system. For the result to look realistic, the synthesized speech must align with the events in the source video. Most existing prosodic alignment methods operate on the synthesized speech itself by controlling the speaking rate. In this paper, we propose subtitle synthesis, a unified approach to prosodic alignment that operates at the feature level. Modifying the prosodic parameters at the feature level does not degrade the naturalness of the synthesized speech. The alignment process uses both inter-utterance and intra-utterance alignment. Aligning at the feature level, so that the synthesized speech is synchronized with the source speech, requires control over phoneme durations. We therefore synthesize the speech with the Prosody-TTS system, which allows the phoneme durations and the fundamental frequency (f0) to be controlled during synthesis. In a subjective evaluation of the translated audiovisual content (lecture videos), the proposed method obtained a mean opinion score (MOS) of 4.104, indicating that the prosodic alignment process is effective.
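
The abstract does not spell out how phoneme durations are adjusted, so the following is only a minimal sketch of one plausible reading: predicted phoneme durations (in frames) are rescaled proportionally so that each segment fills the time slot taken from the source subtitles. The function names `fit_durations`, `seconds_to_frames` and `align_utterance`, the proportional scaling rule, and the 10 ms frame shift are all assumptions made for illustration; none of them come from the paper or from the Prosody-TTS system.

```python
from typing import List, Sequence, Tuple


def fit_durations(durations: Sequence[int], target_frames: int) -> List[int]:
    """Rescale phoneme durations (in frames) so they sum to target_frames.

    Assumption: uniform proportional scaling. Integer largest-remainder
    rounding keeps the total exact. Very short targets can leave some
    phonemes with zero frames; a real system would need a minimum length.
    """
    total = sum(durations)
    if total <= 0 or target_frames <= 0 or any(d < 0 for d in durations):
        raise ValueError("durations must be non-negative with a positive sum, "
                         "and target_frames must be positive")
    # Integer arithmetic avoids floating-point drift in the total.
    scaled = [d * target_frames // total for d in durations]
    remainders = [d * target_frames % total for d in durations]
    leftover = target_frames - sum(scaled)
    # Hand the leftover frames to the phonemes with the largest remainders.
    order = sorted(range(len(durations)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        scaled[i] += 1
    return scaled


def seconds_to_frames(seconds: float, frame_shift: float = 0.01) -> int:
    """Convert a time span to a frame count (hypothetical 10 ms frame shift)."""
    return round(seconds / frame_shift)


def align_utterance(
    segments: Sequence[Sequence[int]],
    slots: Sequence[Tuple[float, float]],
    frame_shift: float = 0.01,
) -> List[List[int]]:
    """Fit each segment of an utterance to its own (start, end) slot in seconds.

    One slot per utterance corresponds to inter-utterance alignment; several
    slots inside one utterance correspond to intra-utterance alignment.
    """
    if len(segments) != len(slots):
        raise ValueError("need exactly one time slot per segment")
    return [
        fit_durations(seg, seconds_to_frames(end - start, frame_shift))
        for seg, (start, end) in zip(segments, slots)
    ]
```

For example, `fit_durations([5, 10, 5], 40)` doubles every duration so that the segment spans exactly 40 frames. Because the scaling is applied to the durations fed to the synthesizer rather than to the output waveform, it mirrors the feature-level idea described in the abstract, although the paper's actual alignment rule may differ.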