利用结构和韵律特征对口语文化遗产档案中的民间文学进行检测和标注

F. Valente, P. Motlícek
{"title":"利用结构和韵律特征对口语文化遗产档案中的民间文学进行检测和标注","authors":"F. Valente, P. Motlícek","doi":"10.1109/CBMI.2012.6269839","DOIUrl":null,"url":null,"abstract":"Spoken cultural heritage can present considerably heterogeneous content as tales, stories, recitals, poems, theatrical representations and other form of folk literature. This work investigates the automatic detection and classification of those data type in large spoken audio archives. The corpus used for this study consists of 90 radio broadcast shows collected for preserving a large variety of Swiss French dialects. Given the variability of the language spoken in the recordings, the paper proposes a language-independent system based on structural features obtained using a speaker diarization system and various acoustic/prosodic features. Results reveal that such a system can achieve an F-measure equal to 0.85 (Precision 0.88/Recall 0.84) in retrieving folk literature in those archives. Prosodic features appear more effective and complementary to structural features. Furthermore, the paper investigates whether the same approach can be used to label speech segments into five large classes (Storytelling, Poetry, Theatre, Interviews, Functionals) showing F-measures ranging from 0.52 to 0.88. As last contribution, prosodic features for disambiguating between spoken prose and spoken poetry are investigated. In summary the study shows that simple structural and acoustic/prosodic features can be used to effectively retrieve and label folk literature in broadcast archives.","PeriodicalId":120769,"journal":{"name":"2012 10th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Detecting and labeling folk literature in spoken cultural heritage archives using structural and prosodic features\",\"authors\":\"F. Valente, P. Motlícek\",\"doi\":\"10.1109/CBMI.2012.6269839\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Spoken cultural heritage can present considerably heterogeneous content as tales, stories, recitals, poems, theatrical representations and other form of folk literature. This work investigates the automatic detection and classification of those data type in large spoken audio archives. The corpus used for this study consists of 90 radio broadcast shows collected for preserving a large variety of Swiss French dialects. Given the variability of the language spoken in the recordings, the paper proposes a language-independent system based on structural features obtained using a speaker diarization system and various acoustic/prosodic features. Results reveal that such a system can achieve an F-measure equal to 0.85 (Precision 0.88/Recall 0.84) in retrieving folk literature in those archives. Prosodic features appear more effective and complementary to structural features. Furthermore, the paper investigates whether the same approach can be used to label speech segments into five large classes (Storytelling, Poetry, Theatre, Interviews, Functionals) showing F-measures ranging from 0.52 to 0.88. As last contribution, prosodic features for disambiguating between spoken prose and spoken poetry are investigated. In summary the study shows that simple structural and acoustic/prosodic features can be used to effectively retrieve and label folk literature in broadcast archives.\",\"PeriodicalId\":120769,\"journal\":{\"name\":\"2012 10th International Workshop on Content-Based Multimedia Indexing (CBMI)\",\"volume\":\"55 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-06-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 10th International Workshop on Content-Based Multimedia Indexing (CBMI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CBMI.2012.6269839\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 10th International Workshop on Content-Based Multimedia Indexing (CBMI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CBMI.2012.6269839","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

口头文化遗产可以呈现相当多样化的内容,如故事、故事、朗诵、诗歌、戏剧表现和其他形式的民间文学。本文研究了大型语音档案中数据类型的自动检测与分类。本研究使用的语料库包括90个电台广播节目,收集这些节目是为了保存各种各样的瑞士法语方言。考虑到录音中所讲语言的可变性,本文提出了一个基于使用说话人音化系统和各种声学/韵律特征获得的结构特征的语言独立系统。结果表明,该系统在民间文献检索中的f值为0.85 (Precision 0.88/Recall 0.84)。韵律特征对结构特征显得更加有效和互补。此外,本文还研究了是否可以使用相同的方法将语音片段标记为五大类(讲故事,诗歌,戏剧,采访,功能),其f值范围从0.52到0.88。作为最后的贡献,韵律特征之间消除歧义的口语散文和口语诗歌进行了研究。综上所述,研究表明,简单的结构特征和声学/韵律特征可以有效地检索和标记广播档案中的民间文学。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Detecting and labeling folk literature in spoken cultural heritage archives using structural and prosodic features
Spoken cultural heritage can present considerably heterogeneous content as tales, stories, recitals, poems, theatrical representations and other form of folk literature. This work investigates the automatic detection and classification of those data type in large spoken audio archives. The corpus used for this study consists of 90 radio broadcast shows collected for preserving a large variety of Swiss French dialects. Given the variability of the language spoken in the recordings, the paper proposes a language-independent system based on structural features obtained using a speaker diarization system and various acoustic/prosodic features. Results reveal that such a system can achieve an F-measure equal to 0.85 (Precision 0.88/Recall 0.84) in retrieving folk literature in those archives. Prosodic features appear more effective and complementary to structural features. Furthermore, the paper investigates whether the same approach can be used to label speech segments into five large classes (Storytelling, Poetry, Theatre, Interviews, Functionals) showing F-measures ranging from 0.52 to 0.88. As last contribution, prosodic features for disambiguating between spoken prose and spoken poetry are investigated. In summary the study shows that simple structural and acoustic/prosodic features can be used to effectively retrieve and label folk literature in broadcast archives.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信