Automatic title generation for Chinese spoken documents with a delicate scored Viterbi algorithm

Sheng-yi Kong, Chien-Chih Wang, Ko-chien Kuo, Lin-Shan Lee
{"title":"Automatic title generation for Chinese spoken documents with a delicate scored Viterbi algorithm","authors":"Sheng-yi Kong, Chien-Chih Wang, Ko-chien Kuo, Lin-Shan Lee","doi":"10.1109/SLT.2008.4777866","DOIUrl":null,"url":null,"abstract":"Automatic title generation for spoken documents is believed to be an important key for browsing and navigation over huge quantities of multimedia content. A new framework of automatic title generation for Chinese spoken documents is proposed in this paper using a delicate scored Viterbi algorithm performed over automatically generated text summaries of the testing spoken documents. The Viterbi beam search is guided by a delicate score evaluated from three sets of models: term selection model tells the most suitable terms to be included in the title, term ordering model gives the best ordering of the terms to make the title readable, and title length model tells the reasonable length of the title. The models are trained from a training corpus which is not required to be matched with the testing spoken documents. Both objective evaluation based on F1 measure and subjective human evaluation for relevance and readability indicated the approach is very attractive.","PeriodicalId":186876,"journal":{"name":"2008 IEEE Spoken Language Technology Workshop","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 IEEE Spoken Language Technology Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SLT.2008.4777866","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

Automatic title generation for spoken documents is believed to be an important key for browsing and navigation over huge quantities of multimedia content. A new framework of automatic title generation for Chinese spoken documents is proposed in this paper using a delicate scored Viterbi algorithm performed over automatically generated text summaries of the testing spoken documents. The Viterbi beam search is guided by a delicate score evaluated from three sets of models: term selection model tells the most suitable terms to be included in the title, term ordering model gives the best ordering of the terms to make the title readable, and title length model tells the reasonable length of the title. The models are trained from a training corpus which is not required to be matched with the testing spoken documents. Both objective evaluation based on F1 measure and subjective human evaluation for relevance and readability indicated the approach is very attractive.
基于精细评分维特比算法的中文口语文档自动标题生成
语音文档的自动标题生成被认为是浏览和导航大量多媒体内容的重要关键。本文提出了一种新的中文口语文档标题自动生成框架,该框架采用精细评分Viterbi算法对测试口语文档自动生成的文本摘要进行处理。Viterbi束搜索由三组模型的精细评分指导:术语选择模型告诉标题中包含最合适的术语,术语排序模型给出术语的最佳排序以使标题可读,标题长度模型告诉标题的合理长度。这些模型是从一个训练语料库中训练出来的,这个语料库不需要与测试口语文档相匹配。无论是基于F1测度的客观评价,还是对相关性和可读性的主观评价,都表明该方法非常有吸引力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信