Neural Style Transfer Based Voice Mimicking for Personalized Audio Stories

Syeda Maryam Fatima, Marina Shehzad, Syed Sami Murtuza, S. S. Raza
{"title":"Neural Style Transfer Based Voice Mimicking for Personalized Audio Stories","authors":"Syeda Maryam Fatima, Marina Shehzad, Syed Sami Murtuza, S. S. Raza","doi":"10.1145/3422839.3423063","DOIUrl":null,"url":null,"abstract":"This paper demonstrates a CNN based neural style transfer on audio dataset to make storytelling a personalized experience by asking users to record a few sentences that are used to mimic their voice. User audios are converted to spectrograms, the style of which is transferred to the spectrogram of a base voice narrating the story. This neural style transfer is similar to the style transfer on images. This approach stands out as it needs a small dataset and therefore, also takes less time to train the model. This project is intended specifically for children who prefer digital interaction and are also increasingly leaving behind the storytelling culture and for working parents who are not able to spend enough time with their children. By using a parent's initial recording to narrate a given story, it is designed to serve as a conjunction between storytelling and screen-time to incorporate children's interest through the implicit ethical themes of the stories, connecting children to their loved ones simultaneously ensuring an innocuous and meaningful learning experience.","PeriodicalId":270338,"journal":{"name":"Proceedings of the 2nd International Workshop on AI for Smart TV Content Production, Access and Delivery","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd International Workshop on AI for Smart TV Content Production, Access and Delivery","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3422839.3423063","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

This paper demonstrates CNN-based neural style transfer on an audio dataset to make storytelling a personalized experience: users record a few sentences, which are used to mimic their voice. The user recordings are converted to spectrograms, and their style is transferred onto the spectrogram of a base voice narrating the story, analogous to neural style transfer on images. The approach stands out because it requires only a small dataset and therefore less time to train the model. The project is intended specifically for children who prefer digital interaction and are increasingly leaving the storytelling culture behind, and for working parents who cannot spend enough time with their children. By using a parent's initial recording to narrate a given story, it is designed to bridge storytelling and screen time, engaging children through the implicit ethical themes of the stories and connecting them to their loved ones while ensuring an innocuous and meaningful learning experience.
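The abstract describes transferring the "style" of a parent's recording onto the spectrogram of a base narration, in the spirit of image style transfer. The paper's exact network architecture, losses, and hyperparameters are not given in the abstract, so the following is only a minimal sketch of that general approach (Gatys-style optimization over a spectrogram with a 1-D CNN and Gram-matrix style loss); the file names, layer sizes, loss weights, and iteration count are assumptions for illustration.

```python
# Minimal sketch of CNN-based audio style transfer on spectrograms.
# Assumptions: placeholder file paths, a single random-weight conv layer,
# and hand-picked loss weights -- not the paper's actual configuration.
import numpy as np
import librosa
import soundfile as sf
import torch
import torch.nn as nn

N_FFT = 2048
SR = 22050

def audio_to_spectrogram(path):
    """Load audio and return its log-magnitude STFT as (freq_bins, frames)."""
    y, _ = librosa.load(path, sr=SR)
    return np.log1p(np.abs(librosa.stft(y, n_fft=N_FFT)))

# Content: base voice narrating the story; style: the parent's short recording.
content = audio_to_spectrogram("base_narration.wav")   # placeholder path
style = audio_to_spectrogram("parent_sample.wav")      # placeholder path
frames = min(content.shape[1], style.shape[1])
content, style = content[:, :frames], style[:, :frames]

# Treat frequency bins as input channels of a 1-D CNN over time.
content_t = torch.tensor(content, dtype=torch.float32).unsqueeze(0)
style_t = torch.tensor(style, dtype=torch.float32).unsqueeze(0)

cnn = nn.Sequential(
    nn.Conv1d(content.shape[0], 512, kernel_size=11, padding=5),
    nn.ReLU(),
)
for p in cnn.parameters():
    p.requires_grad_(False)

def gram(features):
    """Channel-wise Gram matrix used as the style representation."""
    _, c, t = features.shape
    f = features.view(c, t)
    return f @ f.t() / (c * t)

content_feat = cnn(content_t).detach()
style_gram = gram(cnn(style_t)).detach()

# Optimize the spectrogram itself, starting from the content narration.
x = content_t.clone().requires_grad_(True)
opt = torch.optim.Adam([x], lr=0.05)
alpha, beta = 1.0, 1e3   # assumed content/style weights

for step in range(500):
    opt.zero_grad()
    feat = cnn(x)
    loss = alpha * nn.functional.mse_loss(feat, content_feat) \
         + beta * nn.functional.mse_loss(gram(feat), style_gram)
    loss.backward()
    opt.step()

# Invert the stylized log-magnitude spectrogram back to audio with Griffin-Lim.
S_out = np.expm1(x.detach().squeeze(0).numpy())
y_out = librosa.griffinlim(S_out, n_fft=N_FFT)
sf.write("personalized_story.wav", y_out, SR)
```

Because only the Gram statistics of a short parent recording are matched, the data requirement stays small, which is consistent with the abstract's claim of a small dataset and short training time; phase is recovered here with Griffin-Lim, whereas the paper may use a different inversion step.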