Spoken document summarization using relevant information

2007 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU) Pub Date: 2007-12-01 DOI: 10.1109/ASRU.2007.4430107

Yi-Ting Chen, Shih-Hsiang Lin, H. Wang, Berlin Chen

Citations: 3

Abstract

Extractive summarization typically selects indicative sentences from a document according to a target summarization ratio and then sequences them to form a summary. In this paper, we investigate extractive spoken document summarization within a probabilistic generative framework, using information from relevant documents that are retrieved from a contemporary text collection for each sentence of the spoken document to be summarized. In the proposed methods, the probability of the document being generated by a sentence is modeled by a hidden Markov model (HMM), and the retrieved relevant text documents are used to estimate both the HMM's parameters and the sentence's prior probability. Experiments on Chinese broadcast news compiled in Taiwan show that the new methods outperform the previous HMM approach.
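The ranking idea in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: each sentence S is scored by log P(D|S) + log P(S), where P(D|S) is a two-component unigram mixture (a sentence model interpolated with a background model using a hypothetical weight `lam`), and text retrieved for a sentence is simply pooled into that sentence's word counts. The function names, the add-one smoothing, and the pooling step are stand-ins and are not the estimation procedure of the paper.

```python
import math
from collections import Counter


def sentence_log_likelihood(doc_tokens, sent_tokens, bg_counts,
                            lam=0.5, relevant_tokens=None):
    """Return log P(D|S) under an assumed mixture of two unigram models.

    Each document word is generated either by the sentence model
    (weight lam) or by a background model (weight 1 - lam).
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")

    # Sentence model: word counts of the sentence, optionally pooled with
    # words from documents retrieved for it (an illustrative assumption).
    sent_counts = Counter(sent_tokens)
    sent_counts.update(relevant_tokens or [])
    sent_total = sum(sent_counts.values())

    # Background model with add-one smoothing; the extra +1 in the
    # denominator reserves mass for a single unseen word type.
    bg_total = sum(bg_counts.values())
    bg_denominator = bg_total + len(bg_counts) + 1

    score = 0.0
    for word in doc_tokens:
        p_sent = sent_counts[word] / sent_total if sent_total else 0.0
        p_bg = (bg_counts.get(word, 0) + 1) / bg_denominator
        score += math.log(lam * p_sent + (1.0 - lam) * p_bg)
    return score


def summarize(sentences, bg_counts, ratio=0.3, lam=0.5,
              log_priors=None, relevant=None):
    """Pick sentence indices by score until a word budget is reached.

    sentences  : list of token lists (e.g. recognized words per sentence)
    ratio      : target summarization ratio, as a fraction of document words
    log_priors : optional log P(S) per sentence (uniform if omitted)
    relevant   : optional list of retrieved-token lists, one per sentence
    """
    doc_tokens = [word for sent in sentences for word in sent]
    if log_priors is None:
        log_priors = [0.0] * len(sentences)
    if relevant is None:
        relevant = [None] * len(sentences)

    scores = [
        sentence_log_likelihood(doc_tokens, sent, bg_counts, lam, rel) + prior
        for sent, prior, rel in zip(sentences, log_priors, relevant)
    ]

    budget = max(1, round(ratio * len(doc_tokens)))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)

    chosen, used = [], 0
    for i in ranked:
        if used >= budget:
            break
        chosen.append(i)
        used += len(sentences[i])

    # Restore document order so the summary reads in sequence.
    return sorted(chosen)
```

Calling `summarize(sentences, bg_counts, ratio=0.3)` with tokenized sentences and a `Counter` of background word counts returns the indices of the selected sentences in document order. Returning indices rather than text keeps the sequencing step explicit, and the optional `log_priors` argument marks where a sentence prior estimated from retrieved documents would enter.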


